Hi Chris,
My user name is AnkitGoel.
Glad to be able to contribute. Thanks.

PS: I tried to send you an email and got an auto-response. Congrats, if I may
say so.

On Thu, Jul 23, 2015 at 8:47 PM, Mattmann, Chris A (3980) <
[email protected]> wrote:

> Yes that would be fantastic. How about a wiki page on getting up
> and running and overcoming problems with the most recent Nutch?
>
> The Nutch wiki is here:
>
> http://wiki.apache.org/nutch/
>
> Please sign up for an account and tell me your username. Then I’ll
> grant you permissions to edit the wiki.
>
> Thank you Ankit!
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: [email protected]
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
> -----Original Message-----
> From: Ankit Goel <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Thursday, July 23, 2015 at 7:22 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Nutch on the cloud
>
> >Hey,
> >@Chris, I would love to help with the wiki (honored, in fact), but my
> >input is not about the getting-started process. It is more along the lines
> >of frequent errors after that. For example, the redirect plugin doesn't
> >work how you expect it to (not even with the latest one). Or sometimes the
> >parsechecker will give results that a normal Nutch run won't, even though
> >it's the same regex filter, or where to check it. Or which Solr you need to
> >start with, because 5.x has a different file structure. Things like that,
> >on which you spend a long time.
> >
> >If there is a wiki page for such things I will gladly step up to the
> >plate. It isn't exactly an FAQ either. I was thinking I could blog about
> >it, but I think your idea of a wiki would be better, so that it can be
> >updated by later authors as the problems are removed. So should I create
> >one on the Nutch site? Also, many of the problems are asked about multiple
> >times on the mailing list, and a Google search just doesn't cut it. So
> >maybe a repository of frequent problems? That sort of thing?
> >Thanks for the heads-up on the other guide. It gave me a starting point.
> >
> >
> >On Thu, Jul 23, 2015 at 6:24 AM, Mattmann, Chris A (3980) <
> >[email protected]> wrote:
> >
> >> Thanks Ankit for the honest feedback. Would you be willing to update
> >> our wiki and improve the instructions based on your experiences for
> >> our gotchas?
> >>
> >> We have a guide we have been working on ourselves for getting Nutch
> >> running and churning on Elastic MapReduce. That’s where I’d recommend
> >> starting.
> >>
> >> Cheers,
> >> Chris
> >>
> >>
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Ankit Goel <[email protected]>
> >> Reply-To: "[email protected]" <[email protected]>
> >> Date: Wednesday, July 22, 2015 at 5:51 PM
> >> To: "[email protected]" <[email protected]>
> >> Subject: Nutch on the cloud
> >>
> >> >Hi,
> >> >After my runs on my laptop, I'm ready to port my work to the cloud. I'm
> >> >planning to use Amazon. One thing I noticed when I started with Nutch is
> >> >that there were a lot of things left unsaid on the site/wiki, which took
> >> >me a lot of time to figure out. Pitfalls, if I may call them that. I
> >> >don't really have code or scripts, but I need Nutch to run all the time
> >> >on the cloud.
> >> >
> >> >So before I port to the cloud, are there any things I should beware of or
> >> >look out for? Is AWS fine with Nutch? Are there any configurations I
> >> >should remember? Any advice on implementation to ease my transition and
> >> >run Nutch 24 hours a day? I will be running a seed file and crawling the
> >> >net in general.
> >> >Thanks
> >> >
> >> >--
> >> >Regards,
> >> >Ankit Goel
> >> >http://about.me/ankitgoel
> >>
> >>
> >
> >
> >--
> >Regards,
> >Ankit Goel
> >http://about.me/ankitgoel
>
>


-- 
Regards,
Ankit Goel
http://about.me/ankitgoel
