[ 
https://issues.apache.org/jira/browse/FLUME-659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Paliwal resolved FLUME-659.
----------------------------------
    Resolution: Won't Fix

Won't fix. The 0.X branch is no longer maintained.

> Agent with Thrift rpcSource closes source after receiving new config from 
> master
> --------------------------------------------------------------------------------
>
>                 Key: FLUME-659
>                 URL: https://issues.apache.org/jira/browse/FLUME-659
>             Project: Flume
>          Issue Type: Bug
>          Components: Master, Node, Sinks+Sources
>    Affects Versions: v0.9.3
>         Environment: Ubuntu 10.10 Maverick Meerkat
>            Reporter: Disabled imported user
>              Labels: rpc, thrift
>
> You can reproduce this problem by following these steps:
> Set up:
> * Master
> * Agent: rpcSource(35092) | agent*(...) # agent*Sink and agent*Chain all have 
> this problem
> * Collector: collectorSource(...) | collectorSink(...)
> Start sending events to the agent using Thrift.  Then use the flume shell on 
> master to configure the agent -- you can even use the exact same config as 
> the agent had in the first place.  Make sure the agent receives this 
> configuration while still being sent events.  After the agent receives its 
> configuration, it will close its source server for some reason and thereafter 
> become unresponsive to new configurations.  This is the sample output from 
> the agent logs:
> 2011-06-15 07:29:04,086 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSink: ThriftEventSink on port 
> 35853 closed
> 2011-06-15 07:29:05,088 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Closed server on port 
> 35092...
> 2011-06-15 07:29:05,088 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Queue still has 4 
> elements ...
> And of course, the fact that the server is closed results in lots of the 
> following types of errors in the application that's sending events:
> Thrift::TransportException: Broken pipe
> Thrift::TransportException: Could not connect to localhost:35092: Connection 
> refused - connect(2)
> Another variation to reproduce this type of error is to bring the master 
> down, then bring it back up, at which point it will send its configuration to 
> the agent node.  Upon receiving the new configuration, the agent closes its 
> source server and becomes unresponsive to new configurations.  The following 
> is output from an agent that was configured with two logical nodes, one that 
> was rpcSource(35090) | agentE2EChain(...) and one that was rpcSource(35092) | 
> agentBEChain(...)
> 2011-06-15 05:37:46,731 INFO com.cloudera.flume.agent.ThriftMasterRPC: 
> Connected to master at flume-master:35872
> 2011-06-15 05:37:51,770 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Closed server on port 
> 35090...
> 2011-06-15 05:37:51,771 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Queue still has 0 
> elements ...
> 2011-06-15 05:37:51,787 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSink: ThriftEventSink on port 
> 35853 closed
> 2011-06-15 05:37:51,868 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Closed server on port 
> 35090...
> 2011-06-15 05:37:51,868 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Queue still has 0 
> elements ...
> 2011-06-15 05:37:51,868 WARN 
> com.cloudera.flume.handlers.debug.LazyOpenDecorator: Closing a lazy sink that 
> was not logically opened
> 2011-06-15 05:37:51,868 INFO com.cloudera.flume.agent.LogicalNode: 
> flume-agent: Connector stopped: LazyOpenSource | LazyOpenDecorator
> 2011-06-15 05:37:51,875 INFO com.cloudera.flume.agent.LogicalNode: Node 
> config successfully set to com.cloudera.flume.conf.FlumeConfigData@42143753
> 2011-06-15 05:37:51,880 INFO com.cloudera.flume.agent.LogicalNode: Connector 
> started: LazyOpenSource | LazyOpenDecorator
> 2011-06-15 05:37:51,881 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Starting blocking 
> thread pool server on port 35090...
> 2011-06-15 05:37:52,788 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Closed server on port 
> 35092...
> 2011-06-15 05:37:52,788 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Queue still has 6 
> elements ...
> I once produced an exception using this master-down/master-up procedure:
> 2011-06-15 04:50:45,543 ERROR com.cloudera.flume.core.connector.DirectDriver: 
> Driving src/sink failed! LazyOpenSource | LazyOpenDecorator because 
> NaiveFileWALDeco not open for append
> java.lang.IllegalStateException: NaiveFileWALDeco not open for append
>       at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:145)
>       at 
> com.cloudera.flume.agent.durability.NaiveFileWALDeco.append(NaiveFileWALDeco.java:133)
>       at com.cloudera.flume.core.CompositeSink.append(CompositeSink.java:61)
>       at 
> com.cloudera.flume.agent.AgentFailChainSink.append(AgentFailChainSink.java:103)
>       at 
> com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>       at 
> com.cloudera.flume.handlers.debug.LazyOpenDecorator.append(LazyOpenDecorator.java:75)
>       at 
> com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:93)
> 2011-06-15 04:50:45,544 INFO com.cloudera.flume.agent.LogicalNode: Connector 
> xxxxxxxx.internal-E2E exited with error NaiveFileWALDeco not open for append
> 2011-06-15 04:50:46,544 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Closed server on port 
> 35090...
> 2011-06-15 04:50:46,545 INFO 
> com.cloudera.flume.handlers.thrift.ThriftEventSource: Queue still has 6 
> elements ...
> 2011-06-15 04:50:50,443 INFO com.cloudera.flume.agent.AgentFailChainSink: 
> Setting e2e failover chain to  { ackedWriteAhead => { stubbornAppend => { 
> insistentOpen => failChain(" %s 
> ","tsink(\"collector1\",35853)","tsink(\"collector2\",35853)") } } }
> 2011-06-15 04:50:50,443 INFO com.cloudera.flume.agent.AgentFailChainSink: 
> Setting failover chain to  { ackedWriteAhead => { stubbornAppend => { 
> insistentOpen => failChain(" %s 
> ","tsink(\"collector2\",35853)","tsink(\"collector2\",35853)") } } }
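
For anyone still triaging this on a 0.9.x agent, the symptom in the logs above (a "Closed server on port N" entry that is never followed by a "Starting blocking thread pool server on port N" entry) can be checked mechanically. The following is a minimal sketch, not part of Flume: the function name ports_left_closed is hypothetical, and it assumes the raw agent log with one entry per line rather than the wrapped form quoted in this message. The sample lines are taken from the second log excerpt in the report.

```python
import re

# Patterns match the ThriftEventSource messages quoted in the report.
CLOSED = re.compile(r"ThriftEventSource: Closed server on port (\d+)")
STARTED = re.compile(
    r"ThriftEventSource: Starting blocking thread pool server on port (\d+)"
)


def ports_left_closed(lines):
    """Return the sorted ports whose most recent source event was a close."""
    state = {}
    for line in lines:
        closed = CLOSED.search(line)
        if closed:
            state[int(closed.group(1))] = "closed"
            continue
        started = STARTED.search(line)
        if started:
            state[int(started.group(1))] = "open"
    return sorted(port for port, s in state.items() if s == "closed")


# Sample entries from the report, rejoined onto single lines.
sample = [
    "2011-06-15 05:37:51,770 INFO com.cloudera.flume.handlers.thrift."
    "ThriftEventSource: Closed server on port 35090...",
    "2011-06-15 05:37:51,881 INFO com.cloudera.flume.handlers.thrift."
    "ThriftEventSource: Starting blocking thread pool server on port 35090...",
    "2011-06-15 05:37:52,788 INFO com.cloudera.flume.handlers.thrift."
    "ThriftEventSource: Closed server on port 35092...",
]

print(ports_left_closed(sample))
```

On these three entries the sketch reports port 35092, the rpcSource that was closed and not restarted. It only reads the log; it does not explain why the source stays closed.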



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
