Good Afternoon Ashwini,

You can find information about the project on the Nutch project wiki, here:
https://wiki.apache.org/nutch/GoogleSummerOfCode#NUTCH-1936_GSoC_2015_-_Move_Nutch_to_Hadoop_2.X

We are looking for students to write their project proposals based on the format defined here:
https://wiki.apache.org/nutch/GoogleSummerOfCode#Student_Proposals

If you require access to the Nutch wiki (which you will), then please sign up and send us your wiki username. We will then grant you contributor rights.

Best
Lewis

On Tue, Mar 10, 2015 at 2:37 AM, ASHWINI TOKEKAR <[email protected]> wrote:

> Dear Sir,
>
> Greetings from Hyderabad, India.
>
> I am pursuing an M.Tech in Computer Science Engineering at the International
> Institute of Information Technology, Hyderabad. My areas of interest are
> Information Retrieval and Extraction, Opinion Mining and Big Data Analytics.
>
> I wrote a white paper on Big Data as my undergraduate project, in a project
> team of 4 members including me. We also had a member from industry
> (Computer Sciences Corporation). This project ran from January 2012 to
> June 2012.
>
> As part of the white paper, we studied the various tools used for storage,
> processing and visualization of Big Data. We also studied how Big Data can
> be classified and its various dimensions. It also covered how Big Data
> analytics can be helpful in day-to-day life.
>
> The next project I did in my undergraduate study was building a Hadoop-based
> search engine. This was also an industry project, mentored by professionals
> from Computer Sciences Corporation. Our search engine searched txt, pdf and
> html files. It also supported synonym search.
>
> In my M.Tech, as a part of my Information Retrieval and Extraction course,
> I have built a search engine on a Wiki dump of size 42 GB. It answers
> searches in 700-800 ms. The size of the index is 7.4 GB. I have also
> implemented relevance ranking of the documents using the tf-idf mechanism.
>
> I am interested in doing the project "Move Nutch to Hadoop 2.X" under you,
> using the power of Hadoop as a distributed system to crawl a network.
> Sir, I would like to know how Nutch is built so that I can start thinking
> about how we can port Nutch onto Hadoop and prepare a project proposal.
>
> The URL of my LinkedIn profile is:
> http://in.linkedin.com/pub/ashwini-tokekar/78/150/403
>
> --
> Ashwini Tokekar
> M.Tech CSE
> IIIT Hyderabad
> India
>

--
*Lewis*

