Welcome Thamme Gowda!

Cheers,
Markus

-----Original message-----
> From: Thamme Gowda <[email protected]>
> Sent: Monday 23rd May 2016 0:56
> To: [email protected]; [email protected]
> Subject: Re: [ANNOUNCE] New Nutch committer and PMC - Thamme Gowda N.
>
> Hi Sebastian,
> thanks for the invitation and setting this up.
>
> Hello everybody,
>
> I am so glad to be on board.
>
> About me:
> I'm currently a grad student (master's) at the Univ. of Southern California
> (USC), Los Angeles. I'm fortunate to have met professor Chris Mattmann at
> USC.
> Prior to my grad studies, I worked as a full-stack developer at a few startups
> in Bangalore, India. I am also a tech co-founder of a text analysis platform,
> http://datoin.com. I found my interest in A.I., so here I
> am at USC grad school. I am on my way to an internship at NASA JPL this
> summer.
>
> How I met Nutch:
> In 2014, my team at Datoin.com and I integrated a Crawler/Input component into
> our platform. We picked Nutch because we had the rest of the platform on Hadoop.
> Boom! That was when I first put my hands on Nutch code.
> Last fall I took a graduate-level Information Retrieval (IR) course at USC
> taught by Prof. Mattmann. I then joined hands with his team at NASA JPL to work
> on IR-related projects. We use and improve Nutch.
>
> Some of my recent work related to Nutch:
> Added an extension point and an extension to pass certain external URLs when
> db.ignore.external is set. Fixed bugs and improved the common crawl dumper. A
> clustering toolkit for clustering Nutch output based on CSS styles and DOM
> structures [2]...
>
> More coming soon this summer!
>
> I am interested in after-crawl analyses and bringing them back to Nutch as
> extensions.
> I also presented "Clustering the output of Nutch ...." at the recent ApacheCon NA
> [1].
>
> I would also love to work on these:
> reusable JVM containers to make it fast and efficient. Thinking of a
> Spark execution backend (a step ahead - a switchable execution backend to
> support MR and Spark, just like what Gora did for the storage backend).
> stats and analytics of crawl jobs in real time
> I am excited to be involved with the community to improve Nutch.
>
> -
> Thanks and Regards,
> Thamme
>
> [1] http://www.slideshare.net/thammegowda/clustering-output-of-apache-nutch-using-apache-spark
> [2] https://github.com/uscdataScience/autoextractor/wiki/Clustering-Tutorial
>
> --
> Thamme Gowda
> Grad Student at USC <http://usc.edu>
> @thammegowda <https://twitter.com/thammegowda> | 213-536-3552
> http://scf.usc.edu/~tnarayan/
>
> On Sun, May 22, 2016 at 1:02 PM, Sebastian Nagel <[email protected]> wrote:
> Dear all,
>
> it is my pleasure to announce that Thamme Gowda N. has joined us
> as committer and member of the Nutch PMC. Congratulations on your
> new role within the Apache Nutch community!
>
> Thamme, would you mind telling us about yourself, your relation
> to Nutch, what you've done so far, etc.?
>
> Cheers and welcome on board!
>
> Sebastian (on behalf of the Nutch PMC)
>

