Hey there,

Thanks for suggesting the links below. However, I am already familiar with how Hadoop works and have referred to these links in detail since I started with Hadoop. My apologies if my earlier email did not explain my problem statement clearly enough.
Starting fresh again! I have experience with Hadoop and have worked on bare-metal and cloud implementations of big data, e.g. Cloudera HD, Hortonworks HD, and Amazon EMR. Along the way I got a chance to explore Hive, Impala, Sqoop, and Pig in detail and processed large data sets residing in HDFS. I also enjoyed writing shell scripts to automate commands and orchestrate processes. All of this was batch processing and mostly SQL-related.

Now I want to move on to real-time implementations and the other technologies (mentioned in the quoted mails below), which definitely need Java expertise. I am seeking guidance on the specific Java topics needed for Hadoop only. Links, posts, or courses on these would be really helpful.

I also look forward to contributing and sharing my knowledge with the community. :)

Thanks,
Saurabh

> On 15-Aug-2014, at 5:09 am, Nishant Kelkar <nishant....@gmail.com> wrote:
>
> Hi Saurabh,
>
> Welcome to the world of Apache Hadoop! Here are a few good places to start:
>
> 1. Apache Hadoop Definitive Guide book:
> http://shop.oreilly.com/product/0636920021773.do (you could find a free
> e-copy if you Google some :) )
> 2. Hadoop Javadocs: https://hadoop.apache.org/docs/current/api/
> 3. If you want to install Hadoop on your local, Noll's tutorial on how to do
> so for a pseudo-distributed mode is really nice:
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
> 4. The way I started, is by experimenting with Hadoop on my Linux box
> terminal. You should definitely try out basic operations, like adding a file
> to HDFS from your local filesystem, copying a file from HDFS to your local,
> looking at filesystem size, moving files around in HDFS, etc. Here's where
> you can start: http://hadoop.apache.org/docs/r0.18.3/hdfs_shell.html
>
> In general, I think you should also look at blogs/posts that help you
> distinguish Java from the other languages you've used (like HiveQL for
> example).
> How is Java different from C++? What is the difference between a
> declarative programming language and an object-oriented programming language?
> How does Java create objects? How does it manage them, and dispose of them?
> These are the questions you want to look into first, even before starting to
> write code in Java.
>
> Welcome to the group once again, and hope you'll be able to start
> contributing to the open-source community real quick! :)
>
> Best Regards,
> Nishant Kelkar
>
>
>> On Thu, Aug 14, 2014 at 3:27 PM, Db-Blog <mpp.databa...@gmail.com> wrote:
>> Greetings to everyone.
>>
>> I am a newbie in Java and seeks guidance in learning "Java specifically
>> required for Hadoop". It will be really helpful if someone can pass on the
>> links/topics/online-courses which can be helpful to get started on it.
>>
>> I come from ETL & DB- SQL background and currently working on
>> Hive/Impala/Pig/Sqoop since couple of years.
>>
>> I have done some research on other tools of Big Data and Java will be
>> required in depth. Below is the list of tools analysed :
>> - Real time processing (Apache Kafka and Storm)
>> - Advance Searching (Solr/Lucene)
>> - Machine learning (Apache Mahout)
>>
>> Please feel free to comment if I am off-base on anything.
>>
>> Kindly suggest regarding the same and thanks for going thru the post and
>> providing your valuable time.
>>
>> Thanks,
>> Saurabh
>