Hi Israel,

I copied out the portions of my config that pertain to the server where I'm seeing this bad behavior (and sanitized it a little). Otherwise my config is about 1400 lines now; I'm trying to stay with a single config and distribute it to each server for consistency, to save headaches.

Dave

# define sources, channels, and sinks for each log file
node_usvsm01.sources = source21061 source21062 source21063
node_usvsm01.channels = channel21061 channel21062 channel21063
node_usvsm01.sinks = sink21061 sink21062 sink21063

# source file usvsm01 - smaccess.log
node_usvsm01.sources.source21061.type = exec
node_usvsm01.sources.source21061.command = tail -F /opt/siteminder/CA/log/smaccess.log
node_usvsm01.sources.source21061.channels = channel21061

# source file imsnolusvsm01 - smps.log
node_usvsm01.sources.source21062.type = exec
node_usvsm01.sources.source21062.command = tail -F /opt/siteminder/CA/log/smps.log
node_usvsm01.sources.source21062.channels = channel21062

# source file imsnolusvsm01 - smtracedefault.log
node_usvsm01.sources.source21063.type = exec
node_usvsm01.sources.source21063.command = tail -F /opt/siteminder/CA/log/smtracedefault.log
node_usvsm01.sources.source21063.channels = channel21063

node_usvsm01.channels.channel21061.type = memory
node_usvsm01.channels.channel21061.capacity = 100000
node_usvsm01.channels.channel21061.transactionCapactiy = 1000

node_usvsm01.channels.channel21062.type = memory
node_usvsm01.channels.channel21062.capacity = 100000
node_usvsm01.channels.channel21062.transactionCapactiy = 1000

node_usvsm01.channels.channel21063.type = memory
node_usvsm01.channels.channel21063.capacity = 100000
node_usvsm01.channels.channel21063.transactionCapactiy = 1000

# send channels --> flume @ usvinf01
node_usvsm01.sinks.sink21061.type = avro
node_usvsm01.sinks.sink21061.channel = channel21061
node_usvsm01.sinks.sink21061.hostname = usinf01
node_usvsm01.sinks.sink21061.port = 21061

node_usvsm01.sinks.sink21062.type = avro
node_usvsm01.sinks.sink21062.channel = channel21062
node_usvsm01.sinks.sink21062.hostname = usinf01
node_usvsm01.sinks.sink21062.port = 21062

node_usvsm01.sinks.sink21063.type = avro
node_usvsm01.sinks.sink21063.channel = channel21063
node_usvsm01.sinks.sink21063.hostname = usinf01
node_usvsm01.sinks.sink21063.port = 21063


node102.sources = source21061 source21062 source21063
node102.channels = channel21061 channel21062 channel21063
node102.sinks = sink21061 sink21062 sink21063

# - usvsm01 -
# source file usvsm01 - smaccess.log
node102.sources.source21061.type = avro
node102.sources.source21061.bind = 0.0.0.0
node102.sources.source21061.port = 21061
node102.sources.source21061.channels = channel21061

# source file usvsm01 - smps.log
node102.sources.source21062.type = avro
node102.sources.source21062.bind = 0.0.0.0
node102.sources.source21062.port = 21062
node102.sources.source21062.channels = channel21062

# source file usvsm01 - smtracedefault.log
node102.sources.source21063.type = avro
node102.sources.source21063.bind = 0.0.0.0
node102.sources.source21063.port = 21063
node102.sources.source21063.channels = channel21063

# - usvsm01 -
node102.channels.channel21061.type = memory
node102.channels.channel21061.capacity = 100000
node102.channels.channel21061.transactionCapactiy = 1000

node102.channels.channel21062.type = memory
node102.channels.channel21062.capacity = 100000
node102.channels.channel21062.transactionCapactiy = 1000

node102.channels.channel21063.type = memory
node102.channels.channel21063.capacity = 100000
node102.channels.channel21063.transactionCapactiy = 1000

# usvsm01 -
# source file usvsm01 - smaccess.log
node102.sinks.sink21061.type = FILE_ROLL
node102.sinks.sink21061.channel = channel21061
node102.sinks.sink21061.sink.directory = /flume_logs/usvsm01/siteminder/smaccess_log
node102.sinks.sink21061.sink.rollInterval = 86400
node102.sinks.sink21061.sink.serializer = TEXT

# source file usvsm01 - smps.log
node102.sinks.sink21062.type = FILE_ROLL
node102.sinks.sink21062.channel = channel21062
node102.sinks.sink21062.sink.directory = /flume_logs/usvsm01/siteminder/smps_log
node102.sinks.sink21062.sink.rollInterval = 86400
node102.sinks.sink21062.sink.serializer = TEXT

# source file usvsm01 - smtracedefault.log
node102.sinks.sink21063.type = FILE_ROLL
node102.sinks.sink21063.channel = channel21063
node102.sinks.sink21063.sink.directory = /flume_logs/usvsm01/siteminder/smtracedefault_log
node102.sinks.sink21063.sink.rollInterval = 86400
node102.sinks.sink21063.sink.serializer = TEXT


On Fri, Apr 5, 2013 at 9:00 AM, Israel Ekpo <[email protected]> wrote:

> Hi Dave,
>
> Could you post your agent's configuration file?
>
> Sometimes, little misconfigurations can result in unintended or
> undefined behaviors.
>
> On Fri, Apr 5, 2013 at 9:52 AM, Cochran, David <[email protected]> wrote:
>
>> I'm seeing a LOT of random dupes in some of my log files....
>>
>> This is pretty consistent in one in particular that's being tail'ed; it
>> averages ~20M per day, every day. On the only sink (FILE_ROLL) the
>> resulting 24-hour log is 55M. Some quick counts from grep'ing a random time
>> (i.e. 07:23) show the sink log with a dozen or so more lines with the same
>> timestamp than the source has, every minute.
>>
>> This has been happening like clockwork every day for the last couple of
>> months, since I started using Flume on this box.
>>
>> I did check that there wasn't another source from this or another server
>> sending to the same port, and the entries of the log file look proper for
>> that app.
>>
>> The logs are not rolling at the same time on the source/sink, and I've not
>> yet taken the time to set up copies of each beginning and ending at the same
>> times and run a diff against them, but a preliminary 'eyeball diff' just
>> shows dupes. I will note that on the source a line with the exact same text
>> may appear more than once, as the logging mechanism does not log more
>> precisely than hour/minute.
>>
>> All in all, dupes are better than drops, but is there anything in
>> particular I should look for to try to find the cause of this and eliminate it?
>>
>> Thanks in advance for any thoughts,
>> Dave
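
For the comparison described in the quoted message (counting lines per timestamp on the source and on the sink, then checking for extra copies), a minimal Python sketch might look like the following. It is only an illustration under stated assumptions: the function names are hypothetical, the HH:MM timestamp pattern is a guess at the log format, and both inputs are assumed to cover the same time window on a single day.

```python
import re
from collections import Counter

# Assumed timestamp shape (HH:MM); the real SiteMinder log format may differ.
MINUTE_RE = re.compile(r"\b(?:[01]\d|2[0-3]):[0-5]\d\b")


def count_per_minute(lines):
    """Count lines per HH:MM stamp; lines without a stamp are skipped."""
    counts = Counter()
    for line in lines:
        match = MINUTE_RE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


def surplus_per_minute(source_lines, sink_lines):
    """Return {minute: extra_lines} for minutes where the sink has more lines."""
    src = count_per_minute(source_lines)
    snk = count_per_minute(sink_lines)
    return {m: snk[m] - src[m] for m in sorted(snk) if snk[m] > src[m]}


def duplicated_lines(source_lines, sink_lines):
    """Return a Counter of lines that occur more often in the sink than in the source."""
    src = Counter(line.rstrip("\n") for line in source_lines)
    snk = Counter(line.rstrip("\n") for line in sink_lines)
    # Counter subtraction keeps only positive differences, so a line that is
    # legitimately repeated in the source is not flagged unless the sink has
    # even more copies of it.
    return snk - src
```

Because the source can legitimately log the same text more than once within a minute, the per-line comparison subtracts the source counts rather than flagging every repeated line. The result is only meaningful once the source and sink files have been cut to the same start and end times.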
