Hi Mahesh,

The concepts of Flume 1.x (NG) are different from those of Flume 0.9.x. For a quick primer on the changed concepts, please glance through the blog post we did earlier [2]. Due to these changes, components developed for earlier versions of Flume are not compatible with the new implementation.
Regarding implementing custom sinks in Flume 1.x, it is fairly straightforward. You create an implementation of the interface org.apache.flume.Sink. If your implementation class is com.example.custom.MySink, you can plug it into the system via the following configuration:

agent.channels = c1
agent.sinks = s1
agent.sinks.s1.type = com.example.custom.MySink
agent.sinks.s1.channel = c1
agent.sinks.s1.sink_property = value
...

Any configuration within the agent.sinks.s1 namespace will be passed to the configure() method implemented by your sink before it is start()ed. Likewise, when the system shuts down, the sink will be stop()ped first. For an even easier route to implementing custom sinks for Flume 1.x, simply extend an existing sink such as the LoggerSink and override the process() method.

Hope this helps.

Thanks,
Arvind Prabhakar

[2] https://blogs.apache.org/flume/entry/flume_ng_architecture

On Fri, May 18, 2012 at 2:04 PM, M@he$h <[email protected]> wrote:
> Hello Arvind,
>
> I was using the flume-0.9.x version and had everything working nicely; the
> only issue I had was tailing a specific file, which is under discussion in
> another thread. My query is this: I had my own regexAll extractor and
> HBase sink Java programs, so if I upgrade to Flume NG, can I still
> use the custom extractor and HBase sink programs with Flume NG?
>
> The Flume NG wiki,
> http://archive.cloudera.com/cdh4/cdh/4/flume-ng-1.1.0-cdh4.0.0b2/FlumeUserGuide.html,
> does not give much explanation or many samples on how to use custom sinks.
> Could you please let me know about it?
>
> I look forward to your response.
>
> On Fri, May 18, 2012 at 8:54 AM, Arvind Prabhakar <[email protected]> wrote:
>
>> Hi Simon,
>>
>> The wiki page is dated, to say the least. At the moment there are many
>> active deployments of Flume NG that are in staging if not production. I
>> encourage you to look at the performance numbers that were recently
>> published on the wiki [1].
>>
>> The use case you have described seems like something that Flume should be
>> able to handle very easily. I encourage you to look at the log4j appender,
>> the Memory/File channels, and the HDFS event sink. Of course, you could plan
>> on using other components as well if these do not fit well with your
>> application.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements
>>
>> Thanks,
>> Arvind Prabhakar
>>
>>
>> On Fri, May 18, 2012 at 4:58 AM, Simon Kelly <[email protected]> wrote:
>>
>>> Hi
>>>
>>> I'm interested in using Flume to store audit logs in HDFS, which can then
>>> be queried with Hive. I see that the links on the Flume page point to Flume
>>> NG, which says it's not ready for production use yet. Is that still the case?
>>>
>>> Our use case would likely look something like this:
>>>
>>> - 15 servers running a Java web server and logging audit data (1-2K
>>>   per event, 20-90 events per second per server)
>>> - Hadoop running on a 5-machine cluster (4x2.4GHz processors, 8GB RAM,
>>>   8TB total storage)
>>>
>>> It's important that all data makes it into HDFS.
>>>
>>> I'd appreciate any comments on how to proceed with this.
>>>
>>> Best regards
>>> Simon Kelly
>
>
> --
> Thanks and Regards,
> Mahesh
> 619-816-7011.
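The custom-sink recipe Arvind describes can be sketched in Java. This is a minimal, hedged example, not code from the thread: it assumes the Flume 1.x API (org.apache.flume.sink.AbstractSink, org.apache.flume.conf.Configurable), the class and property names (MySink, sink_property) mirror the sample configuration, and the actual delivery step is left as a placeholder comment:

```java
import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;

// Hypothetical custom sink matching the sample configuration above.
public class MySink extends AbstractSink implements Configurable {

  private String sinkProperty;

  @Override
  public void configure(Context context) {
    // Receives every key under the agent.sinks.s1 namespace,
    // e.g. agent.sinks.s1.sink_property = value
    sinkProperty = context.getString("sink_property", "default");
  }

  @Override
  public Status process() throws EventDeliveryException {
    Channel channel = getChannel();
    Transaction txn = channel.getTransaction();
    try {
      txn.begin();
      Event event = channel.take();
      if (event == null) {
        // Channel is empty; commit and ask the runner to back off briefly.
        txn.commit();
        return Status.BACKOFF;
      }
      // Deliver event.getBody() to the destination system here.
      txn.commit();
      return Status.READY;
    } catch (Exception e) {
      // Roll back so the event stays in the channel for a retry.
      txn.rollback();
      throw new EventDeliveryException("Failed to deliver event", e);
    } finally {
      txn.close();
    }
  }
}
```

The transaction begin/commit/rollback pattern is what gives a sink its delivery guarantees: an event is only removed from the channel once the transaction commits.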

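For Simon's audit-log use case, the components Arvind points at (log4j appender feeding an agent, a durable File channel, the HDFS event sink) could be wired together roughly as follows. This is a hedged sketch only: the agent and component names (a1, r1, c1, k1), the port, and the HDFS path are hypothetical, and the application-side Flume log4j appender would be pointed at this agent's Avro source:

```
# Hypothetical agent: Avro source (fed by the Flume log4j appender),
# durable file channel, HDFS sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414
a1.sources.r1.channels = c1

# File channel persists events to disk so data survives agent restarts,
# which matters when all data must make it into HDFS.
a1.channels.c1.type = file

a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/audit
a1.sinks.k1.hdfs.fileType = DataStream
```

Choosing the File channel over the Memory channel trades some throughput for durability, which fits the stated requirement that no events be lost.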