Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Chukwa_Processes_and_Data_Flow" page has been changed by BillGraham.
http://wiki.apache.org/hadoop/Chukwa_Processes_and_Data_Flow?action=diff&rev1=2&rev2=3

--------------------------------------------------

   1. Collectors close chunks and rename them to {{{*.done}}}
    * from: {{{logs/*.chukwa}}}
    * to: {{{logs/*.done}}}
-  1. DemuxManager wakes up every 20 seconds, runs M/R to merges {{{*.done}}} files and moves them.
+  1. DemuxManager checks for {{{*.done}}} files every 20 seconds.
+  1. If {{{*.done}}} files exist, moves files in place for demux processing:
-   * from: {{{logs/*.done}}}
+   * from: {{{logs/*.done}}}
-   * to: {{{demuxProcessing/mrInput}}}
+   * to: {{{demuxProcessing/mrInput}}}
+  1. If demux is successful within 3 attempts, archives the completed files:
-   * to: {{{demuxProcessing/mrOutput}}}
+   * from: {{{demuxProcessing/mrOutput}}}
-   * to: {{{{{{dataSinkArchives/[yyyyMMdd]/*/*.done}}}
+   * to: {{{dataSinkArchives/[yyyyMMdd]/*/*.done}}}
+  1. Otherwise moves the completed files to an error folder:
+   * from: {{{demuxProcessing/mrOutput}}}
+   * to: {{{dataSinkArchives/InError/[yyyyMMdd]/*/*.done}}}
   1. PostProcessManager wakes up every few minutes and aggregates, orders and de-dups record files.
    * from: postProcess/demuxOutputDir_*/[clusterName]/[dataType]/[dataType]_[yyyyMMdd]_[HH].R.evt}}}
    * to: {{{repos/[clusterName]/[dataType]/[yyyyMMdd]/[HH]/[mm]/[dataType]_[yyyyMMdd]_[HH]_[N].[N].evt}}}
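
The revised DemuxManager steps can be read as a small decision procedure: check for {{{*.done}}} files, stage them, try demux up to 3 times, then archive or move to the error folder. Below is a minimal Python sketch of that reading, not Chukwa's actual implementation (which is Java). The function names, the {{{run_demux}}} callable, and the {{{day}}} argument are hypothetical. The sketch only plans the moves as (from, to) path pairs and does not touch a filesystem. It stops at the {{{[yyyyMMdd]}}} directory because the page does not say what the {{{*/}}} level beneath it contains.

```python
from typing import Callable, List, Tuple

MAX_ATTEMPTS = 3   # "within 3 attempts" on the page
POLL_SECONDS = 20  # check interval from the page; unused in this one-shot sketch


def demux_succeeded(run_demux: Callable[[], bool],
                    max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Call the (hypothetical) demux job until it succeeds or attempts run out."""
    for _ in range(max_attempts):
        if run_demux():
            return True
    return False


def plan_cycle(done_files: List[str],
               run_demux: Callable[[], bool],
               day: str) -> List[Tuple[str, str]]:
    """Return the (from, to) moves for one DemuxManager check."""
    if not done_files:
        return []  # no *.done files: nothing to do until the next check

    # Stage the closed chunks for demux processing.
    moves = [("logs/" + name, "demuxProcessing/mrInput/" + name)
             for name in done_files]

    # Archive on success, otherwise route to the error folder.
    if demux_succeeded(run_demux):
        target = "dataSinkArchives/" + day
    else:
        target = "dataSinkArchives/InError/" + day
    moves.append(("demuxProcessing/mrOutput", target))
    return moves


# Usage: a job that fails twice and succeeds on the third attempt is archived.
attempts = iter([False, False, True])
print(plan_cycle(["a.done"], lambda: next(attempts), "20100301")[-1])
```

A job that never succeeds would instead end with a move to {{{dataSinkArchives/InError/20100301}}}, matching the "Otherwise" step added in this revision.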
