Re: Real time data processing on Hadoop WITHOUT Java

2014-07-17 Thread Sam Goodwin
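Summingbird is a library from Twitter for writing MapReduce-style jobs that
can run in batch mode on Hadoop (via Scalding), in real time on Storm, or on
both. Part of its aim is to provide a functional interface to real-time data
analysis:
https://github.com/twitter/summingbird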


On Thu, Jul 17, 2014 at 11:59 PM, Darryl Stoflet  wrote:

> There is some exploratory work underway to translate pig jobs into storm.
> I know that doesn't help you now, but things/tools evolve quickly.
>
> Hopefully Yahoo will open source their implementation soon :-)


Re: Real time data processing on Hadoop WITHOUT Java

2014-07-17 Thread Darryl Stoflet
There is some exploratory work underway to translate pig jobs into storm. I
know that doesn't help you now, but things/tools evolve quickly.

Hopefully Yahoo will open source their implementation soon :-)


RE: Real time data processing on Hadoop WITHOUT Java

2014-07-17 Thread Huang, Roger
Hi Saurabh,
In general Hadoop is not a good fit for real-time processing; it's better for 
batch processing.  
If you don't want to use Java, Hadoop Streaming lets you write MapReduce jobs
in any language that can read from standard input and write to standard output.
http://wiki.apache.org/hadoop/HadoopStreaming
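
As a rough illustration of the Hadoop Streaming approach, here is a minimal
word-count sketch in Python. The script name (wordcount.py), the map/reduce
command-line argument, and the function names are assumptions made for this
example, not part of Hadoop. The only contract it relies on is that records
arrive on standard input, leave on standard output as tab-separated key/value
lines, and are sorted by key before the reduce phase.

```python
#!/usr/bin/env python
"""Word count for Hadoop Streaming (illustrative sketch).

Hypothetical usage: `wordcount.py map` or `wordcount.py reduce`.
"""
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one tab-separated (word, 1) record per word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word.lower()


def reduce_lines(lines):
    """Reducer: sum the counts for each word.

    Assumes the input is already sorted by key, which the framework
    does between the map and reduce phases.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


# Only act when explicitly asked to map or reduce, so that the module
# can also be imported (or tested) without touching standard input.
if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

The same script can be tried locally by piping a text file through the map
step, then sort, then the reduce step. On a cluster it would be passed to the
streaming jar through its -mapper and -reducer options; the jar location
depends on the distribution, so check your installation.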

For real-time analytics use Storm.  You can implement Storm bolts and spouts in 
a non-JVM language as per the multilang protocol.   
http://storm.incubator.apache.org/documentation/Using-non-JVM-languages-with-Storm.html
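
To make the multilang idea concrete, here is a minimal sketch of a
sentence-splitting bolt written directly against the protocol in Python. It
assumes the protocol as documented at the link above: JSON messages on
stdin/stdout, each terminated by a line containing only "end", with an initial
handshake that passes a pidDir. The function names (read_message,
send_message, run_bolt) are made up for this example. In practice the helper
module that ships with Storm would normally handle this plumbing, and the
topology still has to be wired up on the JVM side.

```python
import json
import os


def read_message(stream):
    """Read one message: JSON text followed by a line containing 'end'."""
    lines = []
    for line in iter(stream.readline, ""):
        if line.strip() == "end":
            return json.loads("".join(lines))
        lines.append(line)
    return None  # the stream was closed


def send_message(stream, message):
    """Write one JSON message followed by the 'end' terminator."""
    stream.write(json.dumps(message) + "\nend\n")
    stream.flush()


def run_bolt(stdin, stdout):
    """Split each incoming sentence into words, then ack the input tuple."""
    # Handshake: Storm sends conf, context and pidDir. We create an empty
    # file named after our pid in pidDir and report the pid back.
    setup = read_message(stdin)
    pid = os.getpid()
    open(os.path.join(setup["pidDir"], str(pid)), "w").close()
    send_message(stdout, {"pid": pid})

    while True:
        msg = read_message(stdin)
        if msg is None:
            break
        if isinstance(msg, list):
            # Task ids that Storm sends back in response to an emit.
            continue
        if msg.get("stream") == "__heartbeat":
            # Assumption: newer Storm versions send heartbeat tuples
            # that expect a sync reply.
            send_message(stdout, {"command": "sync"})
            continue
        for word in msg["tuple"][0].split():
            # Anchor each word to the input tuple so failures are replayed.
            send_message(stdout, {"command": "emit",
                                  "anchors": [msg["id"]],
                                  "tuple": [word]})
        send_message(stdout, {"command": "ack", "id": msg["id"]})
```

To run it as a real bolt process, the script would end with a call such as
run_bolt(sys.stdin, sys.stdout) after importing sys. Passing the streams in as
arguments keeps the protocol logic easy to exercise with in-memory streams.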

For low-latency "SQL" / Hive Query Language, look at Impala or Spark SQL.

-R

-----Original Message-----
From: Db-Blog [mailto:mpp.databa...@gmail.com] 
Sent: Thursday, July 17, 2014 4:03 PM
To: user@storm.incubator.apache.org
Subject: Real time data processing on Hadoop WITHOUT Java

Hello Experts,

I am new to real-time processing on Hadoop, and to Storm as well. I checked the
implementation details in the Storm documentation, but it seems to be all Java
coding. I'm a database guy and haven't worked with Java before.

I have worked with Hive/Impala/Pig for batch processing and Oozie for
orchestration. I have shell scripting experience for automation and have done
some beginner-level tasks on NoSQL (HBase).

It would be really helpful if you could guide me regarding:
- any alternative tool for real-time data processing on Hadoop that fits my
technical experience
- where to start learning Java essentials for Hadoop
- whether any SQL dialect is available for real-time data processing on big data

Thanks in advance for your time on this.

Regards,
Saurabh

Sent from my iPhone, please excuse typos.


Real time data processing on Hadoop WITHOUT Java

2014-07-17 Thread Db-Blog
Hello Experts,

I am new to real-time processing on Hadoop, and to Storm as well. I checked the
implementation details in the Storm documentation, but it seems to be all Java
coding. I'm a database guy and haven't worked with Java before.

I have worked with Hive/Impala/Pig for batch processing and Oozie for
orchestration. I have shell scripting experience for automation and have done
some beginner-level tasks on NoSQL (HBase).

It would be really helpful if you could guide me regarding:
- any alternative tool for real-time data processing on Hadoop that fits my
technical experience
- where to start learning Java essentials for Hadoop
- whether any SQL dialect is available for real-time data processing on big data

Thanks in advance for your time on this.

Regards,
Saurabh

Sent from my iPhone, please excuse typos.