CharFileTailingAdaptorUTF8 should handle log rotation gracefully. Is the log rotating rapidly?
Run these commands on the Chukwa agent:

    telnet localhost 9093
    list

This should show a list of the files being tailed. Check the offset number for the log file in question: the rightmost number should be smaller than the size of your log file. If it is bigger and not changing, there is most likely a bug that we haven't seen before. It might be useful to turn on debug logging on the Chukwa agent and see whether this can be reproduced, to nail down the root cause.

Thanks

regards,
Eric

On Jul 26, 2011, at 6:13 AM, Ying Tang wrote:

> Is there a possibility that, when the log file reaches the size set in
> the log4j config file, log4j renames this log file and creates a new
> file with the same name as the log file? At that time, the chukwa
> adaptor doesn't tail the log properly, and this causes the chukwa agent
> to stop collecting the log.
>
> On Tue, Jul 26, 2011 at 2:07 PM, Ying Tang <[email protected]> wrote:
> The log file is a log4j log file, the size is 10M, and the maxbackupindex is 1.
>
> On Tue, Jul 26, 2011 at 1:42 PM, Eric Yang <[email protected]> wrote:
> Can you run "ls -l" to show the size and date of the log files that you
> are streaming?
>
> regards,
> Eric
>
> On Mon, Jul 25, 2011 at 7:36 PM, Ying Tang <[email protected]> wrote:
> > The chukwa version is 0.4.0 and the adaptor is
> > org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
> >
> > On Mon, Jul 25, 2011 at 11:50 PM, Eric Yang <[email protected]> wrote:
> >>
> >> Hi Ivy,
> >>
> >> When data is sent from agent to collector, the collector sends an
> >> acknowledgment of receipt of the chunks. At 00:03:28, there are 5
> >> chunks acknowledged. This means communication between collector and
> >> agent was working at that point in time. However, there is no activity
> >> after 00:04:28. This looks like the adaptor did not handle the log
> >> rotation properly close to midnight.
> >> Which version of Chukwa are you using and which adaptor are you using?
> >>
> >> regards,
> >> Eric
> >>
> >> On Jul 25, 2011, at 12:40 AM, Ying Tang wrote:
> >>
> >> > Hi all,
> >> >
> >> > In my cluster, I have two chukwa agents and one collector.
> >> > At one point, both chukwa agents' logs show:
> >> > 2011-07-18 00:03:28,688 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 5
> >> > 2011-07-18 00:04:28,697 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
> >> > 2011-07-18 00:05:28,706 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
> >> > 2011-07-18 00:06:28,714 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
> >> > 2011-07-18 00:07:29,340 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
> >> >
> >> > And the collector:
> >> > 2011-07-17 11:02:32,155 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >> > 2011-07-17 11:02:43,074 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> >> > 2011-07-17 11:03:02,162 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >> > 2011-07-17 11:03:32,168 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >> > 2011-07-17 11:03:43,085 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> >> > 2011-07-17 11:04:02,174 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >> > 2011-07-17 11:04:32,180 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >> > 2011-07-17 11:04:43,096 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> >> > 2011-07-17 11:05:02,185 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >> >
> >> > (the collector and agents are in different time zones)
> >> > And the collector didn't collect any log.
> >> >
> >> > What does "http chunks ACK'ed since last report: 0" mean?
> >> > From the time "http chunks ACK'ed since last report: 0" appears in
> >> > this log until the agent crashes, the chukwa port is still open, but
> >> > after several days both agents crashed without exceptions.
> >> >
> >> > --
> >> > Best regards,
> >> >
> >> > Ivy Tang
> >
> > --
> > Best regards,
> > Ivy Tang
>
> --
> Best regards,
>
> Ivy Tang
>
> --
> Best regards,
>
> Ivy Tang
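
For anyone who wants to script the offset check suggested at the top of this thread, here is a rough Python sketch. It rests on assumptions that are not confirmed anywhere above: that the agent control port answers a plain-text `list` command with one line per adaptor, that the line for a file contains the file's path, and that the connection stays open after the reply (so the read simply stops when the agent goes quiet). The only detail taken from the thread is that the rightmost number on the line is the offset. The function names and the log path are hypothetical.

```python
import os
import socket
import time


def read_adaptor_list(host="localhost", port=9093, timeout=5.0):
    """Send `list` to the agent control port and return the non-empty reply lines."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"list\n")
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # assumption: the agent keeps the socket open, so stop when it is quiet
    text = b"".join(chunks).decode("utf-8", errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


def last_offset(list_line):
    """Return the rightmost field of a `list` line as an int, or None if it is not a number."""
    fields = list_line.split()
    if fields and fields[-1].isdigit():
        return int(fields[-1])
    return None


def find_offset(lines, log_path):
    """Return the offset from the first line that mentions log_path, or None."""
    for line in lines:
        if log_path in line:
            return last_offset(line)
    return None


def diagnose(first_offset, second_offset, file_size):
    """Compare two offset samples, taken some time apart, against the file size."""
    if second_offset <= file_size:
        return "ok"
    if second_offset == first_offset:
        return "stuck past end of file"
    return "past end of file but still moving"


if __name__ == "__main__":
    log_path = "/var/log/myapp/app.log"  # hypothetical path, replace with the tailed file
    first = find_offset(read_adaptor_list(), log_path)
    time.sleep(60)
    second = find_offset(read_adaptor_list(), log_path)
    if first is None or second is None:
        print("no adaptor line found for", log_path)
    else:
        print(diagnose(first, second, os.path.getsize(log_path)))
```

A result of "stuck past end of file" corresponds to the case described above where the offset is bigger than the file and not changing, which is the point at which turning on debug logging on the agent would be worthwhile.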
