We recently modified the RollSink to work around our problem by giving it a few seconds to finish writing before rolling. We are going to test it out, and if it fixes our issue, we will provide a patch later today.

On Oct 19, 2011 1:27 PM, "AD" <straightfl...@gmail.com> wrote:
> Yea, I am using the HBase sink, so I guess it's possible something is getting
> hung up there and causing the collector to die. The number of file
> descriptors seems safely under the limit.
>
> On Wed, Oct 19, 2011 at 3:16 PM, Cameron Gandevia <cgande...@gmail.com> wrote:
>
>> We were seeing the same issue when our HDFS instance was overloaded and
>> taking over a second to respond. I assume if whatever backend is down the
>> collector will die and need to be restarted when it becomes available again?
>> Doesn't seem very reliable.
>>
>> On Wed, Oct 19, 2011 at 8:13 AM, Ralph Goers
>> <ralph.go...@dslextreme.com> wrote:
>>
>>> We saw this problem when it was taking more than 1 second for a response
>>> from writing to Cassandra (our back end). A single long response will kill
>>> the collector. We had to revert to the version of Flume that uses
>>> synchronization instead of read/write locking to get around this.
>>>
>>> Ralph
>>>
>>> On Oct 18, 2011, at 1:55 PM, AD wrote:
>>>
>>> > Hello,
>>> >
>>> > My collector keeps dying with the following error. Is this a known
>>> > issue? Any idea how to prevent it or find out what is causing it? Is
>>> > format("%{nanos}" an issue?
>>> >
>>> > 2011-10-17 23:16:33,957 INFO com.cloudera.flume.core.connector.DirectDriver: Connector logicalNode flume1-18 exited with error: null
>>> > java.lang.InterruptedException
>>> >     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1246)
>>> >     at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1009)
>>> >     at com.cloudera.flume.handlers.rolling.RollSink.close(RollSink.java:296)
>>> >     at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>> >     at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>> >
>>> > source: collectorSource("35853")
>>> > sink: regexAll("^([0-9.]+)\\s\\[([0-9a-zA-z\\/: -]+)\\]\\s([A-Z]+)\\s([a-zA-Z0-9.:]+)\\s\"([^\\s]+)\"\\s([0-9]+)\\s([0-9]+)\\s\"([^\\s]+)\"\\s\"([a-zA-Z0-9\\/()_ -;]+)\"\\s(hit|miss)\\s([0-9.]+)","hbase_remote_host","hbase_request_date","hbase_request_method","hbase_request_host","hbase_request_url","hbase_response_status","hbase_response_bytes","hbase_referrer","hbase_user_agent","hbase_cache_hitmiss","hbase_origin_firstbyte")
>>> >   format("%{nanos}:") split(":", 0, "hbase_")
>>> >   format("%{node}:") split(":",0,"hbase_node")
>>> >   digest("MD5","hbase_md5")
>>> >   collector(10000) { attr2hbase("apache_logs","f1","","hbase_") }
>>>
>>
>> --
>> Thanks
>>
>> Cameron Gandevia
>>
>
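
For anyone following the thread, the timing problem discussed above can be reproduced outside Flume with a small standalone sketch. This is not Flume's code: the class `TimedCloseSketch`, the method `closeWithin`, and the assumption that a slow backend write holds the read lock while a close waits on the write lock are all hypothetical, chosen only to mirror the `WriteLock.tryLock` call visible in the stack trace. It shows a timed `tryLock` giving up when the holder is slower than the timeout, and succeeding once the timeout is longer. The trace in the original report is an interrupt during that same wait, which this sketch does not reproduce.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; none of these names come from Flume.
class TimedCloseSketch {

    /**
     * Starts a thread that holds the read lock for holdMs (standing in for a
     * slow backend write), then tries to take the write lock with a timeout
     * of timeoutMs (standing in for a close or roll). Returns true if the
     * write lock was acquired before the timeout expired.
     */
    static boolean closeWithin(long holdMs, long timeoutMs) throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch readerHasLock = new CountDownLatch(1);

        Thread slowWrite = new Thread(() -> {
            lock.readLock().lock();
            try {
                readerHasLock.countDown();
                Thread.sleep(holdMs); // simulated slow backend response
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });
        slowWrite.start();
        readerHasLock.await(); // make sure the read lock is held first

        // Timed attempt: returns false if the read lock is not released in time.
        boolean acquired = lock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.writeLock().unlock();
        }
        slowWrite.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        // Backend takes 500 ms, close waits only 100 ms: the attempt times out.
        System.out.println("short timeout acquired: " + closeWithin(500, 100));
        // Backend takes 100 ms, close waits up to 2 s: the attempt succeeds.
        System.out.println("longer timeout acquired: " + closeWithin(100, 2000));
    }
}
```

A longer timeout only moves the threshold: a backend that stalls for longer than the new limit would still cause the timed attempt to fail, so the caller needs to decide what to do in that case.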