Does this mean the adaptor is still transferring the previous log? Is the current log missing?
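
For reference, the check Eric describes in the quoted thread (compare the right-most number in the agent's `list` output with the current size of the log file) could be sketched roughly as below. This is only a sketch in Python: the function names are made up for illustration, and it assumes the last field of each `list` line is the offset and the field before it is the file path. That matches the output quoted in this thread, but it is not taken from the Chukwa documentation.

```python
import os


def parse_list_line(line):
    """Split one line of the agent's `list` output into (path, offset).

    Assumption: the last whitespace-separated field is the offset and the
    field before it is the tailed file's path (no spaces in the path).
    """
    fields = line.split()
    return fields[-2], int(fields[-1])


def offset_past_end(line, size=None):
    """Return True when the reported offset is larger than the file size.

    If `size` is not given, the current size is read from disk.
    """
    path, offset = parse_list_line(line)
    if size is None:
        size = os.path.getsize(path)
    return offset > size


# The line and the file size below are the ones quoted in this thread.
line = (
    "adaptor_2963225a90653a309cf779d4a1d815a3) "
    "org.apache.hadoop.chukwa.datacollection.adaptor.filetailer."
    "CharFileTailingAdaptorUTF8 Gamelog 0 /var/log/gamelog 10487067"
)
print(offset_past_end(line, size=1150872))
```

With the numbers from the thread (offset 10487067, file size 1150872) the offset is past the end of the current file, which is the situation Eric calls a likely bug when the number also stops changing.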
On Wed, Jul 27, 2011 at 1:08 PM, Eric Yang <[email protected]> wrote:
> This looks like a bug, the last number should be in sync with the
> current file's size, but the UTF adaptor is still tailing the previous
> file (which rotated at 10487067)
> It means there is a bug in handling the file rotation, but the adaptor
> did not pick up the change.
>
> Please open a jira. Thanks
>
> regards,
> Eric
>
> On Tue, Jul 26, 2011 at 8:05 PM, Ying Tang <[email protected]> wrote:
> > The log didn't rotate very rapidly.
> >
> > Now I can't reproduce the scenario, but the chukwa agent log looks ok:
> >
> > 2011-07-27 10:57:38,967 INFO Timer-0 ChukwaAgent - writing checkpoint 1307083
> > 2011-07-27 10:57:42,571 INFO HTTP post thread ChukwaHttpSender - collected 1 chunks for post_745
> > 2011-07-27 10:57:42,571 INFO HTTP post thread ChukwaHttpSender - >>>>>> HTTP post_745 to http://chukwacollector1.xingcloud.com:9095/ length = 1837
> > 2011-07-27 10:57:42,574 INFO HTTP post thread ChukwaHttpSender - >>>>>> HTTP Got success back from http://chukwacollector1.xingcloud.com:9095/chukwa; response length 43
> > 2011-07-27 10:57:42,574 INFO HTTP post thread ChukwaHttpSender - post_745 sent 0 chunks, got back 1 acks
> >
> > The list in telnet agent 9093 is:
> > adaptor_2963225a90653a309cf779d4a1d815a3) org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8 Gamelog 0 /var/log/gamelog 10487067
> > After several minutes, the list is still
> > adaptor_2963225a90653a309cf779d4a1d815a3) org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8 Gamelog 0 /var/log/gamelog 10487067
> >
> > Is 10487067 the offset number? The number didn't change, and the log
> > file's size goes from 0 to 10M. Right now the log file's size is 1150872.
> >
> > On Wed, Jul 27, 2011 at 12:26 AM, Eric Yang <[email protected]> wrote:
> >>
> >> CharFileTailingAdaptorUTF should handle log rotation gracefully. Is the
> >> log rotating rapidly?
> >> Run these commands on the chukwa agent:
> >> telnet localhost 9093
> >> list
> >> This should show a list of tailing files, and check the offset number of
> >> the tailing log file. The right-most number should be smaller than the size
> >> of your log file. If it is bigger and not changing, it is most likely there
> >> is a bug that we haven't seen before. It might be useful to turn on debug
> >> on chukwa agent and see if this can be reproduced to nail down the root
> >> cause. Thanks
> >> regards,
> >> Eric
> >> On Jul 26, 2011, at 6:13 AM, Ying Tang wrote:
> >>
> >> Is there the possibility that
> >> when the log file reaches the size set in the log4j config, log4j will
> >> rename this log file and create a new file with this name as the log file?
> >> At that time, the chukwa adaptor doesn't tail the log properly, and this
> >> causes the chukwa agent to stop collecting the log.
> >>
> >> On Tue, Jul 26, 2011 at 2:07 PM, Ying Tang <[email protected]> wrote:
> >>>
> >>> The log file is a log4j log file, the size is 10M, and the maxbackupindex
> >>> is 1.
> >>>
> >>>
> >>> On Tue, Jul 26, 2011 at 1:42 PM, Eric Yang <[email protected]> wrote:
> >>>>
> >>>> Can you run "ls -l" to show the size and date of the log files that you
> >>>> are streaming?
> >>>>
> >>>> regards,
> >>>> Eric
> >>>>
> >>>> On Mon, Jul 25, 2011 at 7:36 PM, Ying Tang <[email protected]> wrote:
> >>>> > The chukwa version is 0.4.0 and the adaptor is
> >>>> >
> >>>> > org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
> >>>> >
> >>>> > On Mon, Jul 25, 2011 at 11:50 PM, Eric Yang <[email protected]> wrote:
> >>>> >>
> >>>> >> Hi Ivy,
> >>>> >>
> >>>> >> When data is sent from the agent to the collector, the collector sends
> >>>> >> an acknowledgment for the chunks it received. At 00:03:28, there are
> >>>> >> 5 chunks acknowledged. This means communication between collector and
> >>>> >> agent was working at that point in time. However, there is no activity
> >>>> >> after 00:04:28. It looks like the adaptor did not handle the log
> >>>> >> rotation properly close to midnight.
> >>>> >> Which version of Chukwa are you using and which adaptor are you
> >>>> >> using?
> >>>> >>
> >>>> >> regards,
> >>>> >> Eric
> >>>> >>
> >>>> >> On Jul 25, 2011, at 12:40 AM, Ying Tang wrote:
> >>>> >>
> >>>> >> > Hi all,
> >>>> >> >
> >>>> >> > In my cluster, I have two chukwa agents and one collector.
> >>>> >> > At one point, both chukwa agents' logs showed:
> >>>> >> > 2011-07-18 00:03:28,688 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 5
> >>>> >> > 2011-07-18 00:04:28,697 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
> >>>> >> > 2011-07-18 00:05:28,706 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
> >>>> >> > 2011-07-18 00:06:28,714 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
> >>>> >> > 2011-07-18 00:07:29,340 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
> >>>> >> >
> >>>> >> > And the collector
> >>>> >> > 2011-07-17 11:02:32,155 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >>>> >> > 2011-07-17 11:02:43,074 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> >>>> >> > 2011-07-17 11:03:02,162 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >>>> >> > 2011-07-17 11:03:32,168 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >>>> >> > 2011-07-17 11:03:43,085 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> >>>> >> > 2011-07-17 11:04:02,174 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >>>> >> > 2011-07-17 11:04:32,180 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >>>> >> > 2011-07-17 11:04:43,096 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> >>>> >> > 2011-07-17 11:05:02,185 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> >>>> >> >
> >>>> >> > (the collector and agent are in different timezones)
> >>>> >> > And the collector didn't collect any log.
> >>>> >> >
> >>>> >> >
> >>>> >> > What does "http chunks ACK'ed since last report: 0" mean?
> >>>> >> > After the log line "http chunks ACK'ed since last report: 0" appears,
> >>>> >> > the agent seems to have crashed: the chukwa port is still open, but
> >>>> >> > after several days both agents crashed without exceptions.
> >>>> >> >
> >>>> >> >
> >>>> >> > --
> >>>> >> > Best regards,
> >>>> >> >
> >>>> >> > Ivy Tang
> >>>> >> >
> >>>> >>
> >>>> >
> >>>> > --
> >>>> > Best regards,
> >>>> > Ivy Tang
> >>>> >
> >>>
> >>> --
> >>> Best regards,
> >>> Ivy Tang
> >>>
> >>
> >> --
> >> Best regards,
> >> Ivy Tang
> >>
> >
> > --
> > Best regards,
> > Ivy Tang
> >
>

--
Best regards,
Ivy Tang
