Thanks for your reply! I will do this in a JIRA issue.

On Tue, Jul 10, 2012 at 4:36 PM, Ahmed Fathalla <afatha...@gmail.com> wrote:

> Ivy,
>
> Thanks a lot for your contributions!
>
> However, a better way to submit patches to Chukwa is:
>
> 1- Open a JIRA issue on
>
> https://issues.apache.org/jira/browse/CHUKWA
>
> I think you will need to register on JIRA first in order to create issues.
>
> 2- Generate a .patch file using Eclipse or command-line tools. The .patch
> file will be a single file with all the changes you have made.
>
> 3- Using the "Submit Patch" option on the JIRA issue, upload the file. Your
> changes will be reviewed by a committer and committed if everything is okay.
>
> Thank you again for your interest in Chukwa; we really appreciate your
> proactive approach in contributing back to the project!
>
> On Tue, Jul 10, 2012 at 10:04 AM, IvyTang <ivytang0...@gmail.com> wrote:
>
>> Our team has used the Chukwa *CharFileTailingAdaptorUTF8* to collect
>> log4j rotated log files for several months. It helps us collect logs
>> from everywhere into our Hadoop center.
>> During this work we met several problems. I raised them on this mailing
>> list, but I still haven't got a good solution.
>> So we read the source code and made some changes.
>>
>> Our log files are generated by log4j, and the log4j appender is
>> org.apache.log4j.DailyRollingFileAppender.
>> If you use log4j to generate rotated logs, this mail may help you.
>>
>> These two problems are the reasons why we had to modify the source code.
>>
>> 1. The mismatch between checkpoint size and file size.
>>
>> I raised this problem on May 14: "the check point offset is bigger
>> than the log file size". Ariel Rabkin and Eric answered my question;
>> thanks for your replies.
>>
>> When Chukwa starts, it reads the checkpoint file and uses the size
>> stored there as the fileReadOffset. The size in the checkpoint indicates
>> how many bytes the adaptor has sent.
>>
>> If the log source is a stream or a file that won't rotate, this size is
>> right; it is indeed the fileReadOffset. But if the file is rotated, the
>> checkpoint size is often bigger than the file size, and this causes
>> Chukwa to resend the whole log file.
>>
>> So we added a log.info("chunk seqID:"+c.getSeqID()); in
>> ChukwaHttpSender:send.
>>
>>     for (Chunk c : toSend) {
>>       DataOutputBuffer b = new DataOutputBuffer(c.getSerializedSizeEstimate());
>>       try {
>>         c.write(b);
>>       } catch (IOException err) {
>>         log.error("serialization threw IOException", err);
>>       }
>>       serializedEvents.add(b);
>>       // store a CLE for this chunk which we will use to ack this chunk to the
>>       // caller of send()
>>       // (e.g. the agent will use the list of CLE's for checkpointing)
>>       log.info("chunk seqID:"+c.getSeqID());
>>       commitResults.add(new CommitListEntry(c.getInitiator(), c.getSeqID(),
>>           c.getSeqID() - c.getData().length));
>>     }
>>
>> The seqID is the offset of the sent chunks in this log file.
>> So when we need to restart Chukwa, we just stop Chukwa, change the size
>> in the checkpoint to the last chunk seqID in the log, and start Chukwa.
>> We could also directly apply the seqID to the checkpoint size, but we
>> don't know if this would cause other problems.
>>
>> 2. The method tailFile in FileTailingAdaptor is the core code for
>> collecting the log. The code uses fileReadOffset and the file length to
>> detect the rotated file.
>>
>>     RandomAccessFile newReader = new RandomAccessFile(toWatch, "r");
>>
>>     len = reader.length();
>>     long newLength = newReader.length();
>>     if (newLength < len && fileReadOffset >= len) {
>>       if (reader != null) {
>>         reader.close();
>>       }
>>
>>       reader = newReader;
>>       fileReadOffset = 0L;
>>       log.debug("Adaptor|"+ adaptorID + "| File size mismatched, rotating: "
>>           + toWatch.getAbsolutePath());
>>     } else {
>>       try {
>>         if (newReader != null) {
>>           newReader.close();
>>         }
>>         newReader =null;
>>       } catch (Throwable e) {
>>         // do nothing.
>>       }
>>     }
>>
>> This logic works in most cases. But there is one case where it fails:
>> when Chukwa starts, the log file size is 0, and it stays 0 until the file
>> is rotated. After rotation, because its size is 0, this log is removed.
>> A new file is generated, and its size isn't 0.
>> But len is still 0 and newLength is > 0. So the condition
>> if (newLength < len && fileReadOffset >= len) will never be satisfied.
>> The new log file will never be detected.
>>
>> So we changed the implementation of this method to use a timestamp to
>> detect the new log file. lastSlurpTime is the timestamp of the last
>> slurp; it is declared and assigned in LWFTAdaptor.
>>
>>     try {
>>       len = reader.length();
>>       if(lastSlurpTime == 0){
>>         lastSlurpTime = System.currentTimeMillis();
>>       }
>>       if (offsetOfFirstByte > fileReadOffset) {
>>         // If the file rotated, the recorded offsetOfFirstByte is greater than
>>         // file size, reset the first byte position to beginning of the file.
>>         fileReadOffset = 0;
>>         offsetOfFirstByte = 0L;
>>         log.warn("offsetOfFirstByte>fileReadOffset, resetting offset to 0");
>>       }
>>       if (len == fileReadOffset) {
>>         File fixedNameFile = new File(toWatch.getAbsolutePath());
>>         long fixedNameLastModified = fixedNameFile.lastModified();
>>         if (fixedNameLastModified > lastSlurpTime) {
>>           // If len == fileReadOffset, the file stopped rolling or has rotated.
>>           // But fixedNameLastModified > lastSlurpTime means the file has been
>>           // written since the last slurp, so the file has been rotated.
>>           boolean hasLeftData = true;
>>           while(hasLeftData){ // read any log data that may have been generated
>>             hasLeftData = slurp(len, reader);
>>           }
>>           RandomAccessFile newReader = new RandomAccessFile(toWatch, "r");
>>           if (reader != null) {
>>             reader.close();
>>           }
>>           reader = newReader;
>>           fileReadOffset = 0L;
>>           len = reader.length();
>>           log.debug("Adaptor|" + adaptorID + "| File size mismatched, rotating: "
>>               + toWatch.getAbsolutePath());
>>         }
>>         hasMoreData = slurp(len, reader);
>>       } else if (len < fileReadOffset) {
>>         // file has rotated and no detection
>>         if (reader != null) {
>>           reader.close();
>>         }
>>         reader = null;
>>         fileReadOffset = 0L;
>>         offsetOfFirstByte = 0L;
>>         hasMoreData = true;
>>         log.warn("Adaptor|" + adaptorID + "| file: " + toWatch.getPath()
>>             + ", has rotated and no detection - reset counters to 0L");
>>       } else {
>>         hasMoreData = slurp(len, reader);
>>       }
>>
>> We hope these two changes will help the adaptor collect rotated files
>> more reliably.
>>
>> If there is anything wrong, please let me know.
>>
>> Thanks!
>>
>> --
>> Best regards,
>>
>> Ivy Tang
>
> --
> Ahmed Fathalla

--
Best regards,
Ivy Tang
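
P.S. For anyone reading this thread later, the timestamp-based rotation check described in the quoted message can be condensed into a small decision function. This is only a rough sketch under my own assumptions, not the Chukwa code: the class RotationCheck, the enum Action, and the method decide are hypothetical names used for illustration, and the actual reading and reopening of the file is left out.

```java
// Hypothetical sketch: names below are illustrative and not part of Chukwa.
final class RotationCheck {

    /** What a tailing pass should do next. */
    enum Action {
        READ_MORE,           // keep reading from the current reader
        REOPEN_AFTER_DRAIN,  // drain the old reader, reopen the file, offset = 0
        RESET_TO_START       // file shrank below our offset: reset counters to 0
    }

    /**
     * Decide how to proceed, given the current file length, the offset already
     * read, the file's last-modified time, and the time of the last slurp.
     */
    static Action decide(long fileLength, long readOffset,
                         long lastModified, long lastSlurpTime) {
        if (fileLength < readOffset) {
            // The file is shorter than what we have read: it was rotated.
            return Action.RESET_TO_START;
        }
        if (fileLength == readOffset && lastModified > lastSlurpTime) {
            // Nothing new behind our reader, yet the named file was written
            // after the last slurp: treat it as a freshly rotated file.
            return Action.REOPEN_AFTER_DRAIN;
        }
        return Action.READ_MORE;
    }
}
```

The point of the sketch is the empty-file case: with fileLength and readOffset both 0, a size-only comparison never fires, while the last-modified comparison still reports a rotation.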