Hi Paul,

Would you kindly attach the logs from both tier2 collectors where you
observe the sinks occasionally stepping on each other? Could you also
attach your Flume config and note the version of flume-ng?
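On your load-balancing question (quoted below): as I read it, a batch is
not pinned to the bad sink; a failed sink.process() rolls its transaction
back to the tier1 channel. What likely fills your channels instead is that
without backoff the processor keeps offering turns to the dead sink, and
every failed attempt burns a connect/request timeout, throttling the
overall drain rate below your input rate. Enabling backoff, so a failing
sink is temporarily blacklisted, should get you the behavior you expected.
A minimal sketch of a tier1 sink group, assuming your version supports the
backoff knob and with the sink names avroSink1 and avroSink2 assumed for
your two tier2 collectors:

    agent1.sinks = avroSink1 avroSink2
    agent1.sinkgroups = g1
    agent1.sinkgroups.g1.sinks = avroSink1 avroSink2
    agent1.sinkgroups.g1.processor.type = load_balance
    agent1.sinkgroups.g1.processor.selector = round_robin
    # blacklist a failing sink for a growing interval instead of
    # offering it a turn on every rotation
    agent1.sinkgroups.g1.processor.backoff = true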
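If you do switch to a failover processor, the equivalent sketch (same
assumed sink names) sends everything to the highest-priority sink and only
fails over while that sink is penalized:

    agent1.sinkgroups = g1
    agent1.sinkgroups.g1.sinks = avroSink1 avroSink2
    agent1.sinkgroups.g1.processor.type = failover
    agent1.sinkgroups.g1.processor.priority.avroSink1 = 10
    agent1.sinkgroups.g1.processor.priority.avroSink2 = 5
    # upper bound, in milliseconds, on how long a failed sink is penalized
    agent1.sinkgroups.g1.processor.maxpenalty = 10000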
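For the RPC connection error in JR's message further down the thread: an
Avro sink's hostname is the address it connects *to*, so it must name the
tier2 box where the downstream Avro source is listening; 0.0.0.0 is only
meaningful as a bind address on a source. Note also that sources take the
plural channels key, while agent2.sources.avroSource2 uses the singular
channel. A sketch of the corrected lines, assuming node2 resolves to the
second box (substitute its real hostname or IP):

    # the tier1 sink must point at the tier2 host, not a wildcard address
    agent1.sinks.avroSink.type = avro
    agent1.sinks.avroSink.hostname = node2
    agent1.sinks.avroSink.port = 41415
    agent1.sinks.avroSink.channel = ch1

    # sources take the plural 'channels' key
    agent2.sources.avroSource2.channels = ch2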
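As for the launch commands JR asked about: each agent is started with the
stock flume-ng script, pointed at the properties file and told which agent
name to read. A sketch, assuming both agents are defined in a single
conf/flume.conf (the file name is a placeholder):

    # on node2 first, so the downstream Avro source is listening
    bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name agent2 -Dflume.root.logger=INFO,console

    # then on node1
    bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name agent1 -Dflume.root.logger=INFO,console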
Best,
Jeff

On Sun, Mar 31, 2013 at 7:12 PM, JR <[email protected]> wrote:
> Hi Paul,
>
> I apologize that I am not giving you a solution, but I have a question
> in turn about your avro sink to tier2 avro source.
>
> Could you please share the conf file? I have tried to set up the sink
> and source as follows, but I still get an RPC connection failure.
>
> If you have had success, could you please tell me how you got yours to
> work?
>
> What do the commands / shell scripts you wrote to connect
> tier1 --> tier2 --> HDFS look like?
>
> Thanks!
>
>
> Avro source ---> mem channel ----> Avro sink --> (next node) Avro
> source --> mem channel ---> HDFS sink
>
> # agent1 on node1
> agent1.sources = avroSource
> agent1.channels = ch1
> agent1.sinks = avroSink
>
> # agent2 on node2
> agent2.sources = avroSource2
> agent2.channels = ch2
> agent2.sinks = hdfsSink
>
> # first source - avro
> agent1.sources.avroSource.type = avro
> agent1.sources.avroSource.bind = 0.0.0.0
> agent1.sources.avroSource.port = 41414
> agent1.sources.avroSource.channels = ch1
>
> # first sink - avro
> agent1.sinks.avroSink.type = avro
> agent1.sinks.avroSink.hostname = 0.0.0.0
> agent1.sinks.avroSink.port = 41415
> agent1.sinks.avroSink.channel = ch1
>
> # second source - avro
> agent2.sources.avroSource2.type = avro
> agent2.sources.avroSource2.bind = node2 ip
> agent2.sources.avroSource2.port = 41415
> agent2.sources.avroSource2.channel = ch2
>
> # second sink - hdfs
> agent2.sinks.hdfsSink.type = hdfs
> agent2.sinks.hdfsSink.channel = ch2
> agent2.sinks.hdfsSink.hdfs.writeFormat = Text
> agent2.sinks.hdfsSink.hdfs.filePrefix = testing
> agent2.sinks.hdfsSink.hdfs.path = hdfs://node2:9000/flume/
>
> # channels
> agent1.channels.ch1.type = memory
> agent1.channels.ch1.capacity = 1000
> agent2.channels.ch2.type = memory
> agent2.channels.ch2.capacity = 1000
>
>
> I am getting errors with the ports. Could someone please check whether
> I have connected the sink in node1 to the source in node2 properly?
>
> 13/03/24 04:32:16 INFO source.AvroSource: Starting Avro source
> avroSource: { bindAddress: 0.0.0.0, port: 41414 }...
> 13/03/24 04:32:16 INFO instrumentation.MonitoredCounterGroup:
> Monitoried counter group for type: SINK, name: avroSink, registered
> successfully.
> 13/03/24 04:32:16 INFO instrumentation.MonitoredCounterGroup: Component
> type: SINK, name: avroSink started
> 13/03/24 04:32:16 INFO sink.AvroSink: Avro sink avroSink: Building
> RpcClient with hostname: 0.0.0.0, port: 41415
> 13/03/24 04:32:16 WARN sink.AvroSink: Unable to create avro client
> using hostname: 0.0.0.0, port: 41415
> org.apache.flume.FlumeException: NettyAvroRpcClient { host: 0.0.0.0,
> port: 41415 }: RPC connection error
>         at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:117)
>         at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:93)
>         at org.apache.flume.api.NettyAvroRpcClient.configure(NettyAvroRpcClient.java:507)
>         at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:88)
>         at org.apache.flume.sink.AvroSink.createConnection(AvroSink.java:182)
>         at org.apache.flume.sink.AvroSink.start(AvroSink.java:242)
>         at org.apache.flume.sink.DefaultSinkProcessor.start(DefaultSinkProcessor.java:46)
>         at org.apache.flume.SinkRunner.start(SinkRunner.java:79)
>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:452)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:328)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:161)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:109)
>
>
> On Fri, Mar 29, 2013 at 4:20 PM, Paul Chavez <[email protected]> wrote:
>
>> I am curious about the observed behavior of a set of agents configured
>> with a Load Balancing sink processor.
>>
>> I have 4 'tier1' agents receiving events directly from app servers
>> that feed into 2 'tier2' agents that write to HDFS. They are connected
>> via Avro sinks/sources and a Load Balancing sink processor.
>>
>> Both 'tier2' agents write to the same directory, and I have observed
>> that they occasionally step on each other; at that point one of the
>> tier2 agents 'loses' and gets hung up on a file lease exception. I'm
>> not concerned with that at the moment, as I know it's not best
>> practice and this is more of a pilot architecture.
>>
>> My concern is that once a tier2 agent gets stuck it eventually fills
>> its channel and then stops accepting put requests from the Avro
>> source. At this point my *expectation* is that the upstream tier1
>> agents will continue to round-robin to the tier2 nodes, with every
>> other 'put' request failing. Assuming the remaining tier2 node can
>> handle the throughput (which it can), I would not expect the tier1
>> agents to ever fill their channels.
>>
>> In actuality, the tier1 agents slowly fill their channels and
>> eventually start refusing put attempts from the application servers.
>> It seems that once a given batch has been allocated to the bad sink,
>> it never gets released to be processed by the other, working sink.
>>
>> Is this the way it should work? Is this a defect or as designed? I
>> will probably switch to a failover processor, because I really only
>> need one HDFS writer to keep up with my data, but I do think this
>> isn't working as intended.
>>
>> thanks,
>> Paul
