And after I restart the chukwa agent, the last number in the telnet adaptor list is still 10487067.
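For reference, the offset check Eric describes in the quoted thread (the rightmost number in the `list` output should not exceed the current size of the tailed file) could be scripted. This is only a minimal sketch in Python. It assumes each adaptor entry ends with the file path followed by the offset, as in the sample quoted in this thread, and that the path contains no spaces. The function names are hypothetical and not part of Chukwa.

```python
import os

# Sample entry copied from the `list` output quoted in this thread.
SAMPLE_ENTRY = (
    "adaptor_2963225a90653a309cf779d4a1d815a3) "
    "org.apache.hadoop.chukwa.datacollection.adaptor.filetailer."
    "CharFileTailingAdaptorUTF8 Gamelog 0 /var/log/gamelog 10487067"
)


def parse_adaptor_entry(entry):
    """Return (path, offset) from one adaptor entry.

    Assumption: the last two whitespace-separated fields are the
    tailed file's path and the adaptor's current offset.
    """
    fields = entry.split()
    if len(fields) < 2:
        raise ValueError("unexpected adaptor entry: %r" % entry)
    return fields[-2], int(fields[-1])


def offset_past_end(offset, file_size):
    """True when the adaptor offset is larger than the file on disk,
    which is the symptom reported after the log rotated."""
    return offset > file_size


if __name__ == "__main__":
    path, offset = parse_adaptor_entry(SAMPLE_ENTRY)
    if os.path.exists(path):
        size = os.path.getsize(path)
        state = "past end of file" if offset_past_end(offset, size) else "within file"
        print("%s: offset=%d size=%d (%s)" % (path, offset, size, state))
    else:
        print("%s: offset=%d (file not found on this host)" % (path, offset))
```

With the numbers from this thread (offset 10487067, file size 1150872), `offset_past_end` reports the offset as past the end of the file.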

On Wed, Jul 27, 2011 at 1:52 PM, Ying Tang <[email protected]> wrote:

> Does this mean the adaptor is still transferring the previous log? Is the
> current log missing?
>
> On Wed, Jul 27, 2011 at 1:08 PM, Eric Yang <[email protected]> wrote:
>
>> This looks like a bug. The last number should be in sync with the
>> current file's size, but the UTF adaptor is still tailing the previous
>> file (which rotated at 10487067).
>> It means there is a bug in handling the file rotation: the adaptor
>> did not pick up the change.
>>
>> Please open a jira. Thanks
>>
>> regards,
>> Eric
>>
>> On Tue, Jul 26, 2011 at 8:05 PM, Ying Tang <[email protected]> wrote:
>> > The log didn't rotate very rapidly.
>> >
>> > Now I can't reproduce the scenario, but the chukwa agent log looks ok:
>> >
>> > 2011-07-27 10:57:38,967 INFO Timer-0 ChukwaAgent - writing checkpoint 1307083
>> > 2011-07-27 10:57:42,571 INFO HTTP post thread ChukwaHttpSender - collected 1 chunks for post_745
>> > 2011-07-27 10:57:42,571 INFO HTTP post thread ChukwaHttpSender - >>>>>> HTTP post_745 to http://chukwacollector1.xingcloud.com:9095/ length = 1837
>> > 2011-07-27 10:57:42,574 INFO HTTP post thread ChukwaHttpSender - >>>>>> HTTP Got success back from http://chukwacollector1.xingcloud.com:9095/chukwa ; response length 43
>> > 2011-07-27 10:57:42,574 INFO HTTP post thread ChukwaHttpSender - post_745 sent 0 chunks, got back 1 acks
>> >
>> > The list in telnet agent 9093 is:
>> > adaptor_2963225a90653a309cf779d4a1d815a3) org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8 Gamelog 0 /var/log/gamelog 10487067
>> > After several minutes, the list is still:
>> > adaptor_2963225a90653a309cf779d4a1d815a3) org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8 Gamelog 0 /var/log/gamelog 10487067
>> >
>> > Is 10487067 the offset number? The number didn't change, and the log
>> > file's size goes from 0 to 10M. Right now the log file's size is 1150872.
>> >
>> > On Wed, Jul 27, 2011 at 12:26 AM, Eric Yang <[email protected]> wrote:
>> >>
>> >> CharFileTailingAdaptorUTF should handle log rotation gracefully. Is the
>> >> log rotating rapidly?
>> >> Run these commands on the chukwa agent:
>> >> telnet localhost 9093
>> >> list
>> >> This should show a list of tailing files. Check the offset number of
>> >> the tailing log file. The rightmost number should be smaller than the
>> >> size of your log file. If it is bigger and not changing, it is most
>> >> likely a bug that we haven't seen before. It might be useful to turn on
>> >> debug on the chukwa agent and see if this can be reproduced to nail down
>> >> the root cause. Thanks
>> >> regards,
>> >> Eric
>> >> On Jul 26, 2011, at 6:13 AM, Ying Tang wrote:
>> >>
>> >> Is it possible that when the log file reaches the size configured in
>> >> log4j, log4j renames the log file and creates a new file with the same
>> >> name as the log file? At that time the chukwa adaptor doesn't tail the
>> >> log properly, and this causes the chukwa agent to stop collecting the log.
>> >>
>> >> On Tue, Jul 26, 2011 at 2:07 PM, Ying Tang <[email protected]> wrote:
>> >>>
>> >>> The log file is a log4j log file, the size is 10M, and the
>> >>> maxbackupindex is 1.
>> >>>
>> >>> On Tue, Jul 26, 2011 at 1:42 PM, Eric Yang <[email protected]> wrote:
>> >>>>
>> >>>> Can you run "ls -l" to show the size and date of the log files that
>> >>>> you are streaming?
>> >>>>
>> >>>> regards,
>> >>>> Eric
>> >>>>
>> >>>> On Mon, Jul 25, 2011 at 7:36 PM, Ying Tang <[email protected]> wrote:
>> >>>> > The chukwa version is 0.4.0 and the adaptor is
>> >>>> > org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
>> >>>> >
>> >>>> > On Mon, Jul 25, 2011 at 11:50 PM, Eric Yang <[email protected]> wrote:
>> >>>> >>
>> >>>> >> Hi Ivy,
>> >>>> >>
>> >>>> >> When data is sent from agent to collector, the collector sends an
>> >>>> >> acknowledgment of receipt of the chunks. At 00:03:28, there are 5
>> >>>> >> chunks acknowledged. This means communication between collector and
>> >>>> >> agent was working at that point in time. However, there is no
>> >>>> >> activity after 00:04:28. This looks like the adaptor did not handle
>> >>>> >> the log rotation properly close to midnight.
>> >>>> >> Which version of Chukwa are you using and which adaptor are you
>> >>>> >> using?
>> >>>> >>
>> >>>> >> regards,
>> >>>> >> Eric
>> >>>> >>
>> >>>> >> On Jul 25, 2011, at 12:40 AM, Ying Tang wrote:
>> >>>> >>
>> >>>> >> > Hi all,
>> >>>> >> >
>> >>>> >> > In my cluster, I have two chukwa agents and one collector.
>> >>>> >> > At one point, both chukwa agents' logs show:
>> >>>> >> > 2011-07-18 00:03:28,688 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 5
>> >>>> >> > 2011-07-18 00:04:28,697 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
>> >>>> >> > 2011-07-18 00:05:28,706 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
>> >>>> >> > 2011-07-18 00:06:28,714 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
>> >>>> >> > 2011-07-18 00:07:29,340 INFO Timer-1 HttpConnector - # http chunks ACK'ed since last report: 0
>> >>>> >> >
>> >>>> >> > And the collector:
>> >>>> >> > 2011-07-17 11:02:32,155 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
>> >>>> >> > 2011-07-17 11:02:43,074 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
>> >>>> >> > 2011-07-17 11:03:02,162 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
>> >>>> >> > 2011-07-17 11:03:32,168 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
>> >>>> >> > 2011-07-17 11:03:43,085 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
>> >>>> >> > 2011-07-17 11:04:02,174 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
>> >>>> >> > 2011-07-17 11:04:32,180 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
>> >>>> >> > 2011-07-17 11:04:43,096 INFO Timer-1 root - stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
>> >>>> >> > 2011-07-17 11:05:02,185 INFO Timer-3 SeqFileWriter - stat:datacollection.writer.hdfs dataSize=0 dataRate=0
>> >>>> >> >
>> >>>> >> > (the collector and agent are in different timezones)
>> >>>> >> > And the collector didn't collect any log.
>> >>>> >> >
>> >>>> >> > What does "http chunks ACK'ed since last report: 0" mean?
>> >>>> >> > From the point where "http chunks ACK'ed since last report: 0"
>> >>>> >> > appears, the agent seems to crash. The chukwa port is still on,
>> >>>> >> > but after several days both agents crashed without exceptions.
>> >>>> >> >
>> >>>> >> > --
>> >>>> >> > Best regards,
>> >>>> >> >
>> >>>> >> > Ivy Tang
>> >>>> >>
>> >>>> >
>> >>>> > --
>> >>>> > Best regards,
>> >>>> > Ivy Tang
>> >>>
>> >>> --
>> >>> Best regards,
>> >>> Ivy Tang
>> >>
>> >> --
>> >> Best regards,
>> >> Ivy Tang
>> >
>> > --
>> > Best regards,
>> > Ivy Tang
>
> --
> Best regards,
>
> Ivy Tang

--
Best regards,
Ivy Tang
