Re: Do I have to use 2 collector ports if I have 2 different flows?

alo alt Wed, 09 May 2012 23:19:09 -0700

That's correct.
Another "workaround" is to start Flume and configure the flows via the CLI; 
that works too and can be scripted:


Script
=====

exec config NODE 'SOURCE' autoE2EChain
exec map VNAME NODENAME
exec config VNAME 'autoCollectorSource' 'batch(1000,36000) collectorSink("hdfs://NN:PORT/PATH", "OPTIONS", 36000)'
exec config NODE FLOWNAME 'SOURCE' autoE2EChain
exec config VNAME FLOWNAME 'autoCollectorSource' 'batch(1000,36000) collectorSink("hdfs://NN:PORT/PATH", "OPTIONS", 36000)'

NODE = nodename
VNAME = mapping
FLOWNAME = flow

Then load it via (assuming the master is on localhost):
flume shell -c localhost -s /tmp/flow.script 
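
If you want to generate the script instead of editing the placeholders by hand, something along these lines might do. It is only a rough sketch: render_flow_script and its argument order are made up for illustration, and it assumes that NODENAME in the map line is the same value as NODE. It just prints the five commands from the script above with your values filled in.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flume): prints the flow script from above
# with the placeholders replaced by the given values.
# Arguments: NODE VNAME FLOWNAME SOURCE HDFS_URI OPTIONS
# Assumption: the second argument of "exec map" is the same value as NODE.
render_flow_script() {
    node=$1
    vname=$2
    flow=$3
    src=$4
    uri=$5
    opts=$6

    # The sink spec is identical for the plain and the flow-specific config.
    sink="batch(1000,36000) collectorSink(\"$uri\", \"$opts\", 36000)"

    # Each command has to stay on a single line in the generated file.
    cat <<EOF
exec config $node '$src' autoE2EChain
exec map $vname $node
exec config $vname 'autoCollectorSource' '$sink'
exec config $node $flow '$src' autoE2EChain
exec config $vname $flow 'autoCollectorSource' '$sink'
EOF
}
```

Redirect the output to /tmp/flow.script (render_flow_script NODE VNAME FLOWNAME 'SOURCE' hdfs://NN:PORT/PATH OPTIONS > /tmp/flow.script) and load it with the flume shell command above. Calling it once per flow with a different FLOWNAME gives you one script per flow.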

Best,
 Alex 

--
Alexander Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF

On May 9, 2012, at 10:59 PM, Eric Sammer wrote:

> Steve:
> 
> It's been a while since I've looked at OG, to be perfectly honest, so take 
> this with a grain of salt. If I recall, you do need to use separate ports. 
> Flow names were primarily used to detect where separation was necessary in 
> autoChains. In other words, if you properly labeled flows and used 
> autoChains, we were able to figure out that flow A and B can't be sent to the 
> same group of collectors. When not using autoChains (and few people could due 
> to some of the limitations) these labels don't mean much. Separate ports were 
> required so the internals could be kept relatively simple (e.g. a source need 
> not make routing decisions except in highly customized cases).
> 
> Hope this helps.
> 
> On Wed, May 9, 2012 at 9:59 AM, Steve Hoffman <[email protected]> wrote:
> I had set up 1 flow using flow isolation from several nodes to a single
> collector (default port 35853).  Things have been running just fine.
> 
> When I added a second flow to the same collector, not only did the
> second collector sink not receive any data, but the first went into
> ERROR state (I assume because the first flow uses a different format
> than the second and isn't written to deal with the second flow's
> format).
> 
> When I moved the second flow-id to another port (35854) my data came
> through, which makes me believe you can't send different flow-ids to the
> same collector port.  That would seem very odd.  After all, that is
> the point of the flow identifier -- if I had to have a separate port
> for every flow id, why bother with flow names?
> 
> This is cloudera cdh3u3 flume (not NG).  And before you suggest moving
> to NG, not an option at this time.
> 
> Thanks,
> Steve
> 
> 
> 
> -- 
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
