Thanks, so is this a bug? My issue is that I am storing the number of "bytes" served from my Apache log, and when it's 0, I end up storing 48, which skews the reports.
Any thoughts? Thanks for the find.

_AD

On Tue, Oct 4, 2011 at 2:56 PM, Mingjie Lai <[email protected]> wrote:

> AD.
>
> I noticed the issue before. It's actually not a regex problem, but the way
> flume printing byte array as string at collector side.
>
> You can also reproduce it by:
> # bin/flume node_nowatch -1 -s -n dump -c 'dump: tail("/tmp/integer") | {
> value("bb", "b") => console};
>
> Below is the piece of code (Attributes.java). It takes a bytes array whose
> length is 1, 4, or 8 and print them as int or long. In case of length 1, it
> only prints the byte value.
>
> ---------------
>     // this is a hack that prints in int, string and double format when there
>     // are 8 bytes.
>     // TODO (jon) this gets grosser and grosser. make a final decision on how
>     // these attributes are going to be
>     if (bytes.length == 8) {
>       return "(long)" + readLong(e, attr).toString() + " (string) '"
>           + readString(e, attr) + "'" + " (double)"
>           + readDouble(e, attr).toString();
>     }
>
>     // this is a similar hack that prints in int and string format when there
>     // are 4 bytes.
>     if (bytes.length == 4) {
>       return readInt(e, attr).toString() + " '" + readString(e, attr) + "'";
>     }
>
>     if (bytes.length == 1) {
>       return "" + (((int) bytes[0]) & 0xff);
>     }
> ---------------
>
> -mingjie
>
> On 10/03/2011 07:40 PM, AD wrote:
>
>> Hello,
>>
>> I noticed when trying to use regex to parse an integer from a file, a
>> number of 0 was populating the number 48 into the output on the flume
>> command line instead. has anyone come across this before? Example below:
>>
>> bash-3.2# cat /tmp/integer
>> 0
>>
>> bash-3.2# cat parse.int
>> ./flume node_nowatch -1 -s -n dump -c 'dump: tail("/tmp/integer") | {
>> regexAll("^(\\d+)","mynum") => console }; '
>>
>> bash-3.2# ./parse.int 2>&1 | grep mynum
>> 2011-10-03 22:37:49,526 [main] INFO agent.FlumeNode: System property
>> sun.java.command=com.cloudera.flume.agent.FlumeNode -1 -s -n dump -c
>> dump: tail("/tmp/integer") | { regexAll("^(\\d+)","mynum") => console };
>> 2011-10-03 22:37:49,966 [main] INFO agent.FlumeNode: Loading spec from
>> command line: 'dump: tail("/tmp/integer") | {
>> regexAll("^(\\d+)","mynum") => console }; '
>> lilmac.home [INFO Mon Oct 03 22:37:50 EDT 2011] { *mynum : 48* } { tailSrcFile : integer } 0
>>
>> Cheers,
>> AD
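
For anyone following along, here is a small standalone sketch of why a captured "0" shows up as 48. It is not Flume code: the class and method names are made up for illustration, and only the expression in asUnsignedNumber is copied from the length-1 branch quoted above. It assumes the regex capture is stored as the bytes of the matched text, which is what the output in this thread suggests.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Flume.
class SingleByteAttrDemo {

    // Same expression as the length-1 branch quoted from Attributes.java:
    // the single byte is rendered as an unsigned number, not as a character.
    static String asUnsignedNumber(byte[] bytes) {
        return "" + (((int) bytes[0]) & 0xff);
    }

    // Assumed alternative for illustration: decode the bytes as text instead.
    static String asText(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The matched text "0" is one byte long: 0x30, the ASCII code for '0'.
        byte[] captured = "0".getBytes(StandardCharsets.UTF_8);

        System.out.println(asUnsignedNumber(captured)); // prints 48
        System.out.println(asText(captured));           // prints 0

        // Parsing the decoded text gives the numeric value that was logged.
        long bytesServed = Long.parseLong(asText(captured));
        System.out.println(bytesServed);                // prints 0
    }
}
```

If that assumption holds, the same effect would apply to any one-character capture ("1" would display as 49, and so on), while a four-character capture such as "1234" would go through the length-4 branch and display as an int followed by the string.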
