Thanks Andrew for your quick response. My sources (server PUD) can't put events into an aggregation point. For this reason I'm following a PollingSource scheme, where my agent needs to be configured with thousands of sources. Any clues for use cases where data is ingested via a polling process?
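For illustration, the polling scheme described above can be sketched as a single agent cycling through a list of configured endpoints and forwarding whatever each one returns. This is a minimal hypothetical sketch, not Flume's actual PollingSource API; the endpoint names and the `fetch`/`poll_once` helpers are invented for the example.

```python
# Hypothetical sketch of a polling ingest loop: one agent iterates over a
# (potentially large) list of configured endpoints and forwards each result.
# Endpoint names and helper functions are illustrative, not a real Flume API.

def fetch(endpoint):
    # Placeholder for the real poll (HTTP GET, JMX read, file tail, ...).
    return f"event from {endpoint}"

def poll_once(endpoints, sink):
    """Poll every endpoint once and append non-empty events to the sink."""
    for endpoint in endpoints:
        event = fetch(endpoint)
        if event is not None:
            sink.append(event)

# Example run with three illustrative endpoints.
endpoints = [f"host{i}.example.com" for i in range(1, 4)]
collected = []
poll_once(endpoints, collected)
print(len(collected))  # one event per endpoint in this sketch
```

The pain point in the question is exactly the `endpoints` list: with thousands of entries it has to come from configuration management rather than be maintained by hand.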
Regards!
---
JuanFra Rodriguez Cardoso

2014-09-04 17:41 GMT+02:00 Andrew Ehrlich <[email protected]>:

> One way to avoid managing so many sources would be to have an aggregation
> point between the data generators and the Flume sources. For example, maybe
> you could have the data generators put events into a message queue(s), then
> have Flume consume from there?
>
> Andrew
>
> ---- On Thu, 04 Sep 2014 08:29:04 -0700 *JuanFra Rodriguez
> Cardoso <[email protected]>* wrote ----
>
> Hi all:
>
> Considering an environment with thousands of sources, what are the best
> practices for managing the agent configuration (flume.conf)? Is it
> recommended to create a multi-layer topology where each agent takes control
> of a subset of sources?
>
> In that case, a conf mgmt server (such as Puppet) would be responsible for
> editing flume.conf with parameters 'agent.sources' from source1 to
> source3000 (assuming we have 3000 source machines).
>
> Are my thoughts aligned with those scenarios of large-scale data ingest?
>
> Thanks a lot!
> ---
> JuanFra
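The quoted question suggests having a configuration-management tool template flume.conf for thousands of sources. A minimal sketch of that idea, assuming a Puppet/ERB-style generation step, is a script that emits the `agent.sources` list and one property block per source. The source type, host names, port, and channel name below are placeholders, not a recommended Flume configuration.

```python
# Hypothetical sketch: generate the 'agent.sources' section of flume.conf
# for a large number of sources, as a config-management template would.
# Source type, hosts, port, and channel are illustrative placeholders.

def generate_flume_conf(agent="agent", n_sources=3000):
    names = [f"source{i}" for i in range(1, n_sources + 1)]
    lines = [f"{agent}.sources = " + " ".join(names)]
    for i, name in enumerate(names, start=1):
        # Placeholder properties; a real deployment would set the actual
        # source type and its required parameters per source.
        lines.append(f"{agent}.sources.{name}.type = http")
        lines.append(f"{agent}.sources.{name}.bind = host{i}.example.com")
        lines.append(f"{agent}.sources.{name}.port = 4141")
        lines.append(f"{agent}.sources.{name}.channels = ch1")
    return "\n".join(lines)

# Small example run with 3 sources instead of 3000.
print(generate_flume_conf(n_sources=3))
```

Generating the file this way keeps the per-source boilerplate out of human hands, which is the point of putting Puppet (or similar) in charge of flume.conf; whether one agent should own all 3000 sources, or a multi-layer topology should split them, is the open question in the thread.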
