Hi Ivy,

In Http Sender, there is a throttle to sleep 5 seconds, if the data sending 
frequency is high.  If the lastPosted time is more recent than the current 
time, then the value becomes negative.  This could happen when ntp makes time 
correction.  When sleeping on negative millisecond, then it throws exception, 
and prevents agent to send data to collector.  I filed a JIRA (CHUKWA-595) for 
this, thanks for catching this.  :)

regards,
Eric

On Jul 28, 2011, at 11:59 PM, Ying Tang wrote:

> Hi Eric,
>     I meet with an exception.
>  
>   Exception in thread "HTTP post thread" java.lang.IllegalArgumentException: 
> timeout value is negative
>         at java.lang.Thread.sleep(Native Method)
>         at 
> org.apache.hadoop.chukwa.datacollection.connector.http.HttpConnector.run(HttpConnector.java:187)
>         at java.lang.Thread.run(Thread.java:636)
>  
> And the log " INFO Timer-1 HttpConnector - # http chunks ACK'ed since last 
> report: 0 " appears again.
> On Thu, Jul 28, 2011 at 3:23 AM, Eric Yang <[email protected]> wrote:
> We need two numbers to track rotated logs properly without sending 
> duplicates.  A number is the current file offset, and another number is the 
> incremental sequence number.  The checkpoint file only preserves the 
> incremental sequence number.  The current file offset is not tracked.  If 
> agent crashes, agent will stream from offset 0 of the current file.  Hence, 
> there maybe duplicate entries, when system recovers from agent crash.  I 
> tested this, after file rotated, it is sending the new data from the current 
> log file, but the file current offset is tracked in memory only.  It is not 
> visible from port 9093.
> 
> This issue is not directly related to the issue that you experienced which 
> the file tailing adaptor.  From the code, it is still not clear why agent 
> crashed.  However, I would recommend to use Chukwa trunk with SocketAdaptor 
> with Log4JSocketAppender.  It is more reliable than UTF8 adaptor.
> 
> See: http://wiki.apache.org/hadoop/Chukwa_Quick_Start
> 
> for installation instructions.
> 
> regards,
> Eric
> 
> On Jul 26, 2011, at 10:54 PM, Ying Tang wrote:
> 
> > And after i restart the chukwa agent , the telnet adaptor last number is 
> > still 10487067.
> >
> > On Wed, Jul 27, 2011 at 1:52 PM, Ying Tang <[email protected]> wrote:
> > Does this mean the adaptor still transfer the previous log ? The current 
> > log is missing?
> >
> >
> > On Wed, Jul 27, 2011 at 1:08 PM, Eric Yang <[email protected]> wrote:
> > This looks like a bug, the last number should be in sync with the
> > current file's size, but the UTF adaptor is still tailing the previous
> > file (which rotated at 10487067)
> > It means there is a bug in handling the file rotation, but the adaptor
> > did not pick up the change.
> >
> > Please open a jira.  Thanks
> >
> > regards,
> > Eric
> >
> > On Tue, Jul 26, 2011 at 8:05 PM, Ying Tang <[email protected]> wrote:
> > > The log didn't rotate very  rapidly.
> > >
> > > Now i can't rebuild the scenario . But when the chukwa agent log looks ok,
> > >
> > >  2011-07-27 10:57:38,967 INFO Timer-0 ChukwaAgent - writing checkpoint
> > > 1307083
> > > 2011-07-27 10:57:42,571 INFO HTTP post thread ChukwaHttpSender - 
> > > collected 1
> > > chunks for post_745
> > > 2011-07-27 10:57:42,571 INFO HTTP post thread ChukwaHttpSender - >>>>>> 
> > > HTTP
> > > post_745 to http://chukwacollector1.xingcloud.com:9095/ length = 1837
> > > 2011-07-27 10:57:42,574 INFO HTTP post thread ChukwaHttpSender - >>>>>> 
> > > HTTP
> > > Got success back from http://chukwacollector1.xingcloud.com:9095/chukwa;
> > > response length 43
> > > 2011-07-27 10:57:42,574 INFO HTTP post thread ChukwaHttpSender - post_745
> > > sent 0 chunks, got back 1 acks
> > >
> > > The list in telnet agent 9093 is:
> > > adaptor_2963225a90653a309cf779d4a1d815a3)
> > > org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
> > > Gamelog 0 /var/log/gamelog 10487067
> > > After several minites ,  the list is still
> > > adaptor_2963225a90653a309cf779d4a1d815a3)
> > > org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
> > > Gamelog 0 /var/log/gamelog 10487067
> > >
> > > Is the 10487067 the offset number ?The number didn't changed , and the log
> > > file's size is from 0 to 10M .And now the log file's size is 1150872.
> > >
> > > On Wed, Jul 27, 2011 at 12:26 AM, Eric Yang <[email protected]> wrote:
> > >>
> > >> CharFileTailingAdaptorUTF should handle log rotation gracefully.  Is the
> > >> log rotating rapidly?
> > >> Run those command on chukwa agent:
> > >> telnet localhost 9093
> > >> list
> > >> This should show a list of tailing files, and check the offset number of
> > >> the tailing log file.  The most right number should be smaller than the 
> > >> size
> > >> of your log file.  If it is bigger and not changing, it is most likely 
> > >> there
> > >> is a bug that we haven't seen before.  It might be useful to turn on 
> > >> debug
> > >> on chukwa agent and see if this can be reproduced to nail down the root
> > >> cause.  Thanks
> > >> regards,
> > >> Eric
> > >> On Jul 26, 2011, at 6:13 AM, Ying Tang wrote:
> > >>
> > >> Is there the possibility that
> > >> when the log file reaches the log4g config file size ,the log4j will
> > >> rename this log file and create a new file with this name as the log 
> > >> file .
> > >> At the time ,the chukwa adaptor doesn't tail the log properly , and this
> > >> cause the chuwa agent can't collector the log any more.
> > >>
> > >> On Tue, Jul 26, 2011 at 2:07 PM, Ying Tang <[email protected]> wrote:
> > >>>
> > >>> The log file is log4j log file ,and the size is 10M ,the maxbackupindex
> > >>> is 1.
> > >>>
> > >>>
> > >>> On Tue, Jul 26, 2011 at 1:42 PM, Eric Yang <[email protected]> wrote:
> > >>>>
> > >>>> Can you run "ls -l" to show the size and dateof the log files that you
> > >>>> are streaming?
> > >>>>
> > >>>> regards,
> > >>>> Eric
> > >>>>
> > >>>> On Mon, Jul 25, 2011 at 7:36 PM, Ying Tang <[email protected]>
> > >>>> wrote:
> > >>>> > The chukwa version is 0.4.0 and the adaptor is
> > >>>> >
> > >>>> > org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
> > >>>> >
> > >>>> > On Mon, Jul 25, 2011 at 11:50 PM, Eric Yang <[email protected]> 
> > >>>> > wrote:
> > >>>> >>
> > >>>> >> Hi Ivy,
> > >>>> >>
> > >>>> >> When data is send from agent to collector, collector send
> > >>>> >> acknowledgment
> > >>>> >> of receiving of the chunks.  At 00:03:28, there are 5 chunks
> > >>>> >> acknowledged.
> > >>>> >>  This means communication between collector and agent are working at
> > >>>> >> that
> > >>>> >> point in time.  However, there is no activity after 00:04:28.  This
> > >>>> >> looks
> > >>>> >> like adaptor did not handle the log rotation properly at close to
> > >>>> >> midnight.
> > >>>> >>  Which version of Chukwa are you using and which adaptor are you
> > >>>> >> using?
> > >>>> >>
> > >>>> >> regards,
> > >>>> >> Eric
> > >>>> >>
> > >>>> >> On Jul 25, 2011, at 12:40 AM, Ying Tang wrote:
> > >>>> >>
> > >>>> >> > Hi all,
> > >>>> >> >
> > >>>> >> > In my cluster , i have two chukwa agent and one collector .
> > >>>> >> > At a time ,  both chukwa agents's log :
> > >>>> >> > 2011-07-18 00:03:28,688 INFO Timer-1 HttpConnector - # http chunks
> > >>>> >> > ACK'ed since last report: 5
> > >>>> >> > 2011-07-18 00:04:28,697 INFO Timer-1 HttpConnector - # http chunks
> > >>>> >> > ACK'ed since last report: 0
> > >>>> >> > 2011-07-18 00:05:28,706 INFO Timer-1 HttpConnector - # http chunks
> > >>>> >> > ACK'ed since last report: 0
> > >>>> >> > 2011-07-18 00:06:28,714 INFO Timer-1 HttpConnector - # http chunks
> > >>>> >> > ACK'ed since last report: 0
> > >>>> >> > 2011-07-18 00:07:29,340 INFO Timer-1 HttpConnector - # http chunks
> > >>>> >> > ACK'ed since last report: 0
> > >>>> >> >
> > >>>> >> > And the collector
> > >>>> >> > 2011-07-17 11:02:32,155 INFO Timer-3 SeqFileWriter -
> > >>>> >> > stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> > >>>> >> > 2011-07-17 11:02:43,074 INFO Timer-1 root -
> > >>>> >> > stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> > >>>> >> > 2011-07-17 11:03:02,162 INFO Timer-3 SeqFileWriter -
> > >>>> >> > stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> > >>>> >> > 2011-07-17 11:03:32,168 INFO Timer-3 SeqFileWriter -
> > >>>> >> > stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> > >>>> >> > 2011-07-17 11:03:43,085 INFO Timer-1 root -
> > >>>> >> > stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> > >>>> >> > 2011-07-17 11:04:02,174 INFO Timer-3 SeqFileWriter -
> > >>>> >> > stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> > >>>> >> > 2011-07-17 11:04:32,180 INFO Timer-3 SeqFileWriter -
> > >>>> >> > stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> > >>>> >> > 2011-07-17 11:04:43,096 INFO Timer-1 root -
> > >>>> >> > stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> > >>>> >> > 2011-07-17 11:05:02,185 INFO Timer-3 SeqFileWriter -
> > >>>> >> > stat:datacollection.writer.hdfs dataSize=0 dataRate=0
> > >>>> >> >
> > >>>> >> > (the collector and agent has  different  timezone)
> > >>>> >> > And the collector didn't collect any log.
> > >>>> >> >
> > >>>> >> >
> > >>>> >> > What dons the "http chunks ACK'ed since last report: 0" means?
> > >>>> >> > And from this log "http chunks ACK'ed since last report: 0" 
> > >>>> >> > appears
> > >>>> >> > to
> > >>>> >> >  agent crash, the chukwa port still on , but after several days,
> > >>>> >> > both agents
> > >>>> >> > crashed without exceptions.
> > >>>> >> >
> > >>>> >> >
> > >>>> >> > --
> > >>>> >> > Best regards,
> > >>>> >> >
> > >>>> >> > Ivy Tang
> > >>>> >> >
> > >>>> >> >
> > >>>> >> >
> > >>>> >>
> > >>>> >
> > >>>> >
> > >>>> >
> > >>>> > --
> > >>>> > Best regards,
> > >>>> > Ivy Tang
> > >>>> >
> > >>>> >
> > >>>> >
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Best regards,
> > >>> Ivy Tang
> > >>>
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >> Ivy Tang
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Ivy Tang
> > >
> > >
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> > Ivy Tang
> >
> >
> >
> >
> >
> >
> > --
> > Best regards,
> >
> > Ivy Tang
> >
> >
> >
> 
> 
> 
> 
> -- 
> Best regards,
> 
> Ivy Tang
> 
> 
> 

Reply via email to