A fun one I did to learn some of the pieces was:
- Python and Beautiful Soup to crawl whatever you want (I did stock forums)
- write it to Kafka
- then Flume to S3
- then Spark to read in the data and make pretty graphs using seaborn in
  Jupyter/Zeppelin

I was trying to score sentiment and graph it vs. price.
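
To give a rough idea of the sentiment-vs-price step, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the tiny word lists, the function names (score_post, daily_sentiment, join_with_price), and the (date, text) input shape are all made up, and a real project would swap in a proper sentiment model and run this in Spark rather than on in-memory lists.

```python
from collections import defaultdict

# Hypothetical toy lexicon, for illustration only; not a real sentiment model.
POSITIVE = {"buy", "bullish", "up", "great", "strong"}
NEGATIVE = {"sell", "bearish", "down", "bad", "weak"}


def score_post(text):
    """Return (positive words - negative words) divided by total words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def daily_sentiment(posts):
    """Average the per-post score for each day.

    posts is an iterable of (date_string, text) pairs (assumed shape).
    """
    by_day = defaultdict(list)
    for day, text in posts:
        by_day[day].append(score_post(text))
    return {day: sum(scores) / len(scores) for day, scores in by_day.items()}


def join_with_price(sentiment, prices):
    """Pair sentiment with price for days present in both, sorted by date."""
    return [(d, sentiment[d], prices[d]) for d in sorted(sentiment) if d in prices]
```

The list of (date, sentiment, price) rows that comes out of join_with_price is the kind of thing you would then hand to seaborn for plotting.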

You could also just pull in the Twitter sample stream and do word counts, etc.
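
For the word count idea, the core logic is small enough to sketch in plain Python before moving it to Spark. This is only an illustration under assumptions: word_counts is a made-up helper, the messages are assumed to already be plain strings (pulling them from the stream is not shown), and the tokenizing regex is a simple guess at what counts as a word.

```python
import re
from collections import Counter


def word_counts(messages, top_n=10):
    """Count lowercase tokens across messages and return the top_n most common."""
    counts = Counter()
    for msg in messages:
        # Keep letters, digits, apostrophes, and the # / @ used in tweets.
        counts.update(re.findall(r"[a-z0-9#@']+", msg.lower()))
    return counts.most_common(top_n)
```

The same shape (split into tokens, then count per token) is what you would express as a map followed by a reduce-by-key once the data lives in Spark.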

It really depends on what you want to learn, of course. Just pick something you
find interesting and work through the steps. It would be best to pick a goal,
then research which tech is relevant for which piece of your pipeline.

RE: interviews, you probably don't want to be interviewing for big data jobs
with no experience in it. If it's a junior role and you're already a solid
dev, it makes sense. But interviewing, or worse yet landing a job, and being
expected to manage any of these pieces having only played with them in
tutorials would be bad for both parties.

> On Jan 12, 2017, at 7:22 AM, Jeya Vimalan <[email protected]> wrote:
> 
> Dear All,
> 
> Apologies for naive questions.
> 
> I am learning Hadoop on my own and have 3+ years of SQL background.
> 
> 1) I need your suggestions on which topics and tutorials to focus on
> to become an expert.
> 
> 2) Any suggestions for working on a real time project?
> 
> 3) Also, in this regard, would you mind sharing some important questions
> that have been asked in your interviews, to help me
> get prepared?
> 
> I am going through all the weblinks and tutorials,
> but your answers may help me the most.
> 
> Looking forward to hearing from you.
> Thanks in advance.
> 
> Thanks and best regards,
> Vimal
