Re: Realtime sensor's tcpip data to hadoop

Hardik Pandya Wed, 14 May 2014 15:06:07 -0700

If I were you, I would ask the following questions to get to an answer:

> Forget about Hadoop for a minute and ask yourself how the TCP/IP data is
currently being stored: in a file system or in an RDBMS?
> Hadoop is for offline batch processing. If you are looking for a
real-time streaming solution, there is Storm (open-sourced by Twitter),
which goes well with Kafka (a messaging queue, from LinkedIn), or Spark
Streaming (which processes a stream in memory as a series of small
batches). Spark Streaming takes real-time streams and has a built-in
Twitter API, but you would need to write your own service to poll the
sensor data every few seconds and send it in as RDDs.
> Storm is complementary to Hadoop. Spark in conjunction with Hadoop will
allow you to do both offline and real-time data analytics.
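
To make the "write your own service" point a bit more concrete, here is a
rough Python sketch of a collector that accepts sensor connections over TCP
and groups readings into small batches every couple of seconds. It uses only
the standard library. The wire format (one "sensor_id,timestamp,value" line
per reading), the names parse_reading, MicroBatcher and serve, and the port
number are all assumptions made for illustration; the sink is a placeholder
for whatever you forward to (a Kafka producer, an HBase client, a file).

```python
import socketserver
import threading
import time


def parse_reading(line):
    """Parse one 'sensor_id,timestamp,value' line (hypothetical wire format)."""
    sensor_id, ts, value = line.strip().split(",")
    return {"sensor_id": sensor_id, "ts": float(ts), "value": float(value)}


class MicroBatcher:
    """Buffer readings and pass them to `sink` as a list.

    A batch is flushed when it reaches `max_size` readings or when a reading
    arrives at least `interval` seconds after the previous flush. The age
    check only runs inside add(), so an idle buffer waits for the next
    reading or an explicit flush().
    """

    def __init__(self, sink, max_size=1000, interval=2.0):
        self.sink = sink  # callable taking a list of readings (placeholder)
        self.max_size = max_size
        self.interval = interval
        self._buf = []
        self._last_flush = time.monotonic()
        self._lock = threading.Lock()  # handlers run in separate threads

    def add(self, reading):
        with self._lock:
            self._buf.append(reading)
            too_big = len(self._buf) >= self.max_size
            too_old = time.monotonic() - self._last_flush >= self.interval
            if too_big or too_old:
                self._flush_locked()

    def flush(self):
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        if self._buf:
            batch, self._buf = self._buf, []
            self.sink(batch)
        self._last_flush = time.monotonic()


class SensorHandler(socketserver.StreamRequestHandler):
    """Read newline-delimited readings from one sensor connection."""

    def handle(self):
        for raw in self.rfile:
            line = raw.decode("utf-8", errors="replace").strip()
            if not line:
                continue
            try:
                self.server.batcher.add(parse_reading(line))
            except ValueError:
                continue  # skip malformed lines instead of dropping the sensor


def serve(host, port, batcher):
    """Accept sensor connections, one thread each, until interrupted."""
    with socketserver.ThreadingTCPServer((host, port), SensorHandler) as server:
        server.batcher = batcher
        server.serve_forever()
```

To try it, something like serve("0.0.0.0", 9999, MicroBatcher(print)) would
print each batch. One thread per connection will not scale to a very large
number of sensors, so treat this as a starting point for experiments rather
than a production design.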

On Tue, May 6, 2014 at 10:48 PM, Alex Lee <[email protected]> wrote:

> Sensors may send TCP/IP data to a server. Each sensor may send its data
> as a stream, and both the quantity of sensors and the data rate
> are high.
>
> Firstly, how can the TCP/IP data be put into Hadoop? It needs some
> processing and then storage in HBase. Does it need to be saved to data
> files first and then put into Hadoop, or can it be done directly from
> TCP/IP? Is there any software module that can take care of this? I found
> that Ganglia, Nagios and Flume might do it, but on looking into the
> details, Ganglia and Nagios are more for monitoring the Hadoop cluster
> itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the limit
> of one LAN port, how can the load be shared? Is there any component in
> Hadoop that does this automatically?
>
> Any suggestions? Thanks.
>
