The only missing link in the Flume architecture I see in this conversation is the channels and brokers themselves, which orchestrate this lovely undertaking of data collection. One opportunity I do see (and I may be wrong) is for the data to be offloaded into a system such as Apache Mahout before being sent to the sink. Perhaps the concept of a ChannelAdapter of sorts, i.e., a MahoutAdapter? Just thinking out loud; it may well be out of the question.
Thanks,
Steve

From: Nitin Pawar <[email protected]>
Reply-To: <[email protected]>
Date: Thu, 7 Feb 2013 16:52:12 +0530
To: <[email protected]>
Subject: Re: Analysis of Data

1) Flume is an isolated distributed system, in the sense that one agent has no idea about any other agent.
2) When Flume needs to collect data from multiple sources and work across different data sets, it may not have the entire data set needed.
3) Let us assume we have the required data on agents for processing in batches: do we really want to put pressure on a live production server for data processing that can be done by systems like Storm or Hadoop?

These are my ideas; I could be totally wrong, but from a systems point of view it looks like a good option to keep data acquisition separate from data processing, and then store the processed data for further data serving.

On Thu, Feb 7, 2013 at 4:29 PM, Mike Percy <[email protected]> wrote:
> Let's take this conversation further. What is missing?
>
> On Thu, Feb 7, 2013 at 2:39 AM, Inder Pall <[email protected]> wrote:
>> Flume is a platform to get events to the right sink (HDFS, local file, ...);
>> analytics is not something that falls in its territory.
>>
>> - Inder
>>
>> On Thu, Feb 7, 2013 at 3:22 PM, Surindhar <[email protected]> wrote:
>>> Hi,
>>>
>>> Does Flume support analysis of data?
>>>
>>> Br,
>>
>> --
>> - Inder
>> "You are the average of the 5 people you spend the most time with"

--
Nitin Pawar
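For what it's worth, Steve's ChannelAdapter idea at the top of the thread maps fairly closely onto an extension point Flume already has: interceptors (org.apache.flume.interceptor.Interceptor), which sit between a source and a channel and can inspect or tag each event in flight. Below is a minimal, self-contained Java sketch of that pattern. To keep it compilable without Flume on the classpath, the `Event` interface here is a stand-in for Flume's real one, and `MahoutTaggingInterceptor` plus the `analytics.route` header are made-up names for illustration; the adapter does not run Mahout itself, it only tags events so a downstream processing system (Mahout, Storm, Hadoop) can pick them up, in keeping with Nitin's point about separating acquisition from processing.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for org.apache.flume.Event, so this sketch
// compiles without Flume on the classpath.
interface Event {
    Map<String, String> getHeaders();
    byte[] getBody();
}

class SimpleEvent implements Event {
    private final Map<String, String> headers = new HashMap<>();
    private final byte[] body;
    SimpleEvent(byte[] body) { this.body = body; }
    public Map<String, String> getHeaders() { return headers; }
    public byte[] getBody() { return body; }
}

// Hypothetical "MahoutAdapter" in the shape of a Flume interceptor:
// it tags matching events with a routing header rather than doing any
// analytics inline, leaving the heavy processing to a downstream system.
class MahoutTaggingInterceptor {
    public Event intercept(Event event) {
        String text = new String(event.getBody(), StandardCharsets.UTF_8);
        // Illustrative rule: route anything mentioning "purchase" to analytics.
        if (text.contains("purchase")) {
            event.getHeaders().put("analytics.route", "mahout");
        }
        return event;
    }
}

public class Demo {
    public static void main(String[] args) {
        MahoutTaggingInterceptor interceptor = new MahoutTaggingInterceptor();
        Event e = interceptor.intercept(
            new SimpleEvent("purchase id=42".getBytes(StandardCharsets.UTF_8)));
        System.out.println(e.getHeaders().get("analytics.route")); // prints "mahout"
    }
}
```

The design point is that the channel stays a dumb, reliable buffer; the per-event hook only annotates, and a multiplexing channel selector (or the eventual sink) decides where tagged events go.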
