Re: About Apache Nutch 1.1 Final Release

2010-04-17 Thread Mattmann, Chris A (388J)
Hey Andrzej,

You got it. I got bogged down yesterday but will apply this patch (was going to 
ask you about it) before I roll the RC.

Safe travels buddy!

Cheers,
Chris


On 4/16/10 11:55 PM, "Andrzej Bialecki" wrote:

On 2010-04-17 05:45, Phil Barnett wrote:
> On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:
>
>> More details on this (your environment, OS, JDK version) and
>> logs/stacktraces would be highly appreciated! You mentioned that you
>> have some scripts - if you could extract relevant portions from them (or
>> copy the scripts) it would help us to ensure that it's not a simple
>> command-line error.
>
> I posted another thread tonight with the fixed code.

See here: https://issues.apache.org/jira/browse/NUTCH-812

>
> Can you please commit it for all of us?

I'm traveling today ... Chris, can you perhaps apply the patch before
you roll another RC?

--
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++



Re: About Apache Nutch 1.1 Final Release

2010-04-16 Thread Andrzej Bialecki
On 2010-04-17 05:45, Phil Barnett wrote:
> On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:
> 
>> More details on this (your environment, OS, JDK version) and
>> logs/stacktraces would be highly appreciated! You mentioned that you
>> have some scripts - if you could extract relevant portions from them (or
>> copy the scripts) it would help us to ensure that it's not a simple
>> command-line error.
> 
> I posted another thread tonight with the fixed code.

See here: https://issues.apache.org/jira/browse/NUTCH-812

> 
> Can you please commit it for all of us?

I'm traveling today ... Chris, can you perhaps apply the patch before
you roll another RC?

-- 
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: About Apache Nutch 1.1 Final Release

2010-04-16 Thread Phil Barnett
On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:

> More details on this (your environment, OS, JDK version) and
> logs/stacktraces would be highly appreciated! You mentioned that you
> have some scripts - if you could extract relevant portions from them (or
> copy the scripts) it would help us to ensure that it's not a simple
> command-line error.

I posted another thread tonight with the fixed code.

Can you please commit it for all of us?

Thanks.



Re: About Apache Nutch 1.1 Final Release

2010-04-13 Thread Phil Barnett
On Sat, Apr 10, 2010 at 11:04 PM, Phil Barnett wrote:

> On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:
> > On 2010-04-10 17:49, Phil Barnett wrote:
> > > On Thu, 2010-04-08 at 21:31 -0700, Mattmann, Chris A (388J) wrote:
> > >> Hi there,
> > >>
> > >> Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC
> > >> member that's VOTE'd +1 on the release.
> > >>
> > >> Hopefully in the next few days someone will have a chance to check...
> > >
> > > I tried to get the Release Candidate (latest nightly build) running
> > > yesterday and I ran into problems with both of the scripts that I use to
> > > crawl with 1.0.
> > >
> > > But the smaller bin/crawl method finished the crawl and then immediately
> > > had a java exception when starting the next step.
> > >
> > > Sorry I don't have more specifics, but I'm at home, the setup is at work
> > > and I had to revert to get things back running. But I built a dev
> > > machine so I can play with 1.1 and get more specific.
> >
> > More details on this (your environment, OS, JDK version) and
> > logs/stacktraces would be highly appreciated! You mentioned that you
> > have some scripts - if you could extract relevant portions from them (or
> > copy the scripts) it would help us to ensure that it's not a simple
> > command-line error.
>
> Will do, Monday.
>
> Basics.
>
> HP DL-360 G4 Dual Xeon, 4G ram, Mirrored SCSI.
>
> Fresh install of CentOS 5.4
>
> Java from Sun.
>
> ant from repository, compiled from nightly build.
>
> I'll try to get you more details Monday evening. I'm driving down to
> work tonight to get the -dev machine running so I'll have something to
> break on Monday. ;-)
>
Wow, it's been a brutal week at work so far. I did manage to get the dev
server up and tried the crawl again. This is a full from-scratch
install.

I'm seeing two things.

1. When I run bin/nutch crawl, it finds the seed site and spiders it. When I
run deepcrawl it never finds anything. They both use the same seed
directory.

2. During bin/nutch crawl, I get a null pointer exception in function main
right after it decides it has crawled the last page. From memory, it was
line 133.

The logs/hadoop.log file doesn't show anything of merit.

I started documenting exactly what was going on but I worked from 9 am to
12:30 am working some nasty network problems and I never got it gathered up.

I will be able to get it to you tomorrow. Sorry for the delay.

Phil Barnett
Senior Analyst
Walt Disney World.


Re: About Apache Nutch 1.1 Final Release

2010-04-10 Thread Phil Barnett
On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:
> On 2010-04-10 17:49, Phil Barnett wrote:
> > On Thu, 2010-04-08 at 21:31 -0700, Mattmann, Chris A (388J) wrote:
> >> Hi there,
> >>
> >> Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC 
> >> member that's VOTE'd +1 on the release.
> >>
> >> Hopefully in the next few days someone will have a chance to check...
> > 
> > I tried to get the Release Candidate (latest nightly build) running
> > yesterday and I ran into problems with both of the scripts that I use to
> > crawl with 1.0.
> > 
> > But the smaller bin/crawl method finished the crawl and then immediately
> > had a java exception when starting the next step.
> > 
> > Sorry I don't have more specifics, but I'm at home, the setup is at work
> > and I had to revert to get things back running. But I built a dev
> > machine so I can play with 1.1 and get more specific.
> 
> More details on this (your environment, OS, JDK version) and
> logs/stacktraces would be highly appreciated! You mentioned that you
> have some scripts - if you could extract relevant portions from them (or
> copy the scripts) it would help us to ensure that it's not a simple
> command-line error.

Will do, Monday.

Basics. 

HP DL-360 G4 Dual Xeon, 4G ram, Mirrored SCSI.

Fresh install of CentOS 5.4

Java from Sun.

ant from repository, compiled from nightly build.

I'll try to get you more details Monday evening. I'm driving down to
work tonight to get the -dev machine running so I'll have something to
break on Monday. ;-)



Re: About Apache Nutch 1.1 Final Release

2010-04-10 Thread Andrzej Bialecki
On 2010-04-10 17:49, Phil Barnett wrote:
> On Thu, 2010-04-08 at 21:31 -0700, Mattmann, Chris A (388J) wrote:
>> Hi there,
>>
>> Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC 
>> member that's VOTE'd +1 on the release.
>>
>> Hopefully in the next few days someone will have a chance to check...
> 
> I tried to get the Release Candidate (latest nightly build) running
> yesterday and I ran into problems with both of the scripts that I use to
> crawl with 1.0.
> 
> But the smaller bin/crawl method finished the crawl and then immediately
> had a java exception when starting the next step.
> 
> Sorry I don't have more specifics, but I'm at home, the setup is at work
> and I had to revert to get things back running. But I built a dev
> machine so I can play with 1.1 and get more specific.

More details on this (your environment, OS, JDK version) and
logs/stacktraces would be highly appreciated! You mentioned that you
have some scripts - if you could extract relevant portions from them (or
copy the scripts) it would help us to ensure that it's not a simple
command-line error.



-- 
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: About Apache Nutch 1.1 Final Release

2010-04-10 Thread Phil Barnett
On Thu, 2010-04-08 at 21:31 -0700, Mattmann, Chris A (388J) wrote:
> Hi there,
> 
> Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC member 
> that's VOTE'd +1 on the release.
> 
> Hopefully in the next few days someone will have a chance to check...

I tried to get the Release Candidate (latest nightly build) running
yesterday and I ran into problems with both of the scripts that I use to
crawl with 1.0.

But the smaller bin/crawl method finished the crawl and then immediately
had a java exception when starting the next step.

Sorry I don't have more specifics, but I'm at home, the setup is at work
and I had to revert to get things back running. But I built a dev
machine so I can play with 1.1 and get more specific.

Phil Barnett
Senior Analyst
Walt Disney World.



Re: About Apache Nutch 1.1 Final Release

2010-04-08 Thread Mattmann, Chris A (388J)
Hi there,

Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC member 
that's VOTE'd +1 on the release.

Hopefully in the next few days someone will have a chance to check...

Cheers,
Chris


On 4/8/10 8:54 PM, "yhdelgado" wrote:



Hi. I have a question. When will the Apache Nutch 1.1 Final Release be
released? Greetings.



++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++