Hi Saurabh,

In general, Hadoop is not a good fit for real-time processing; it's better for batch processing. If you don't want to use Java, Hadoop Streaming allows you to write M/R jobs in any language that can read from standard input and write to standard output:
http://wiki.apache.org/hadoop/HadoopStreaming

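To give a feel for what a Streaming job looks like, here is a rough word-count sketch in Python. It assumes Python is installed on the task nodes; the file name wordcount.py and the function names map_lines and reduce_lines are made up for illustration, not part of Hadoop. The only contract Streaming imposes is the one described above: read lines on standard input, write tab-separated key/value lines on standard output.

```python
#!/usr/bin/env python
"""Word count for Hadoop Streaming.

Run as `wordcount.py map` for the map step, `wordcount.py reduce` for the reduce step.
"""
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word.lower()


def reduce_lines(lines):
    """Sum the counts per word.

    Relies on the reducer input arriving sorted by key, which is what
    Streaming provides, so equal words are adjacent.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


if __name__ == "__main__":
    # Hypothetical convention: the first argument selects the step.
    step = map_lines if sys.argv[1:] == ["map"] else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

You can try it locally without a cluster: cat input.txt | python wordcount.py map | sort | python wordcount.py reduce. On a cluster you would hand the same script to the streaming jar with -mapper "wordcount.py map" and -reducer "wordcount.py reduce", and ship it to the nodes with -file wordcount.py; where the streaming jar lives depends on your distribution.
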
For real-time analytics, use Storm. You can implement Storm bolts and spouts in a non-JVM language using the multilang protocol:
http://storm.incubator.apache.org/documentation/Using-non-JVM-languages-with-Storm.html

For low-latency "SQL" / Hive Query Language, look at Impala or SparkSQL.

-R

-----Original Message-----
From: Db-Blog [mailto:[email protected]]
Sent: Thursday, July 17, 2014 4:03 PM
To: [email protected]
Subject: Real time data processing on Hadoop WITHOUT Java

Hello Experts,

I am new to real-time processing on Hadoop, and to Storm as well. I checked the implementation details in the Storm documentation, but it all seems to be Java code. I'm a database guy and haven't worked with Java before. I have worked with Hive, Impala, and Pig for batch processing and with Oozie for orchestration. I have shell scripting experience for automation, and have done some beginner-level tasks on NoSQL (HBase).

It would be really helpful if you could guide me regarding:
- Any alternative tool that can be used for real-time data processing on Hadoop, given my technical experience
- Where to start learning Java essentials for Hadoop
- Whether there is any SQL dialect available for real-time data processing on big data

Thanks in advance for your time on this.

Regards,
Saurabh

Sent from my iPhone, please avoid typos.
