Re: About Apache Nutch 1.1 Final Release
Hey Andrzej,

You got it. I got bogged down yesterday but will apply this patch (was going to ask you about it) before I roll the RC.

Safe travels buddy!

Cheers,
Chris

On 4/16/10 11:55 PM, "Andrzej Bialecki" wrote:

On 2010-04-17 05:45, Phil Barnett wrote:
> On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:
>
>> More details on this (your environment, OS, JDK version) and
>> logs/stacktraces would be highly appreciated! You mentioned that you
>> have some scripts - if you could extract relevant portions from them (or
>> copy the scripts) it would help us to ensure that it's not a simple
>> command-line error.
>
> I posted another thread tonight with the fixed code.

See here:

https://issues.apache.org/jira/browse/NUTCH-812

> Can you please commit it for all of us?

I'm traveling today ... Chris, can you perhaps apply the patch before you roll another RC?

--
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
Re: About Apache Nutch 1.1 Final Release
On 2010-04-17 05:45, Phil Barnett wrote:
> On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:
>
>> More details on this (your environment, OS, JDK version) and
>> logs/stacktraces would be highly appreciated! You mentioned that you
>> have some scripts - if you could extract relevant portions from them (or
>> copy the scripts) it would help us to ensure that it's not a simple
>> command-line error.
>
> I posted another thread tonight with the fixed code.

See here:

https://issues.apache.org/jira/browse/NUTCH-812

> Can you please commit it for all of us?

I'm traveling today ... Chris, can you perhaps apply the patch before you roll another RC?

--
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: About Apache Nutch 1.1 Final Release
On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:

> More details on this (your environment, OS, JDK version) and
> logs/stacktraces would be highly appreciated! You mentioned that you
> have some scripts - if you could extract relevant portions from them (or
> copy the scripts) it would help us to ensure that it's not a simple
> command-line error.

I posted another thread tonight with the fixed code.

Can you please commit it for all of us?

Thanks.
Re: About Apache Nutch 1.1 Final Release
On Sat, Apr 10, 2010 at 11:04 PM, Phil Barnett wrote:

> On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:
> > On 2010-04-10 17:49, Phil Barnett wrote:
> > > On Thu, 2010-04-08 at 21:31 -0700, Mattmann, Chris A (388J) wrote:
> > >> Hi there,
> > >>
> > >> Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC
> > >> member that's VOTE'd +1 on the release.
> > >>
> > >> Hopefully in the next few days someone will have a chance to check...
> > >
> > > I tried to get the Release Candidate (latest nightly build) running
> > > yesterday and I ran into problems with both of the scripts that I use to
> > > crawl with 1.0.
> > >
> > > But the smaller bin/crawl method finished the crawl and then immediately
> > > had a java exception when starting the next step.
> > >
> > > Sorry I don't have more specifics, but I'm at home, the setup is at work
> > > and I had to revert to get things back running. But I built a dev
> > > machine so I can play with 1.1 and get more specific.
> >
> > More details on this (your environment, OS, JDK version) and
> > logs/stacktraces would be highly appreciated! You mentioned that you
> > have some scripts - if you could extract relevant portions from them (or
> > copy the scripts) it would help us to ensure that it's not a simple
> > command-line error.
>
> Will do, Monday.
>
> Basics.
>
> HP DL-360 G4 Dual Xeon, 4G ram, Mirrored SCSI.
>
> Fresh install of CentOS 5.4
>
> Java from Sun.
>
> ant from repository, compiled from nightly build.
>
> I'll try to get you more details Monday evening. I'm driving down to
> work tonight to get the -dev machine running so I'll have something to
> break on Monday. ;-)

Wow, it's been a brutal week at work so far.

I did manage to get the dev server up and managed to try again to crawl. This is a full from scratch install. I'm seeing two things.

1. When I run bin/nutch crawl, it finds the seed site and spiders it. When I run deepcrawl it never finds anything. They both use the same seed directory.

2. During bin/nutch crawl, I get a null pointer exception in function main right after it decides it has crawled the last page. From memory, it was line 133.

The logs/hadoop.log file doesn't show anything of merit.

I started documenting exactly what was going on but I worked from 9 am to 12:30 am working some nasty network problems and I never got it gathered up. I will be able to get it to you tomorrow.

Sorry for the delay.

Phil Barnett
Senior Analyst
Walt Disney World.
Re: About Apache Nutch 1.1 Final Release
On Sat, 2010-04-10 at 18:22 +0200, Andrzej Bialecki wrote:
> On 2010-04-10 17:49, Phil Barnett wrote:
> > On Thu, 2010-04-08 at 21:31 -0700, Mattmann, Chris A (388J) wrote:
> >> Hi there,
> >>
> >> Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC
> >> member that's VOTE'd +1 on the release.
> >>
> >> Hopefully in the next few days someone will have a chance to check...
> >
> > I tried to get the Release Candidate (latest nightly build) running
> > yesterday and I ran into problems with both of the scripts that I use to
> > crawl with 1.0.
> >
> > But the smaller bin/crawl method finished the crawl and then immediately
> > had a java exception when starting the next step.
> >
> > Sorry I don't have more specifics, but I'm at home, the setup is at work
> > and I had to revert to get things back running. But I built a dev
> > machine so I can play with 1.1 and get more specific.
>
> More details on this (your environment, OS, JDK version) and
> logs/stacktraces would be highly appreciated! You mentioned that you
> have some scripts - if you could extract relevant portions from them (or
> copy the scripts) it would help us to ensure that it's not a simple
> command-line error.

Will do, Monday.

Basics.

HP DL-360 G4 Dual Xeon, 4G ram, Mirrored SCSI.

Fresh install of CentOS 5.4

Java from Sun.

ant from repository, compiled from nightly build.

I'll try to get you more details Monday evening. I'm driving down to work tonight to get the -dev machine running so I'll have something to break on Monday. ;-)
Re: About Apache Nutch 1.1 Final Release
On 2010-04-10 17:49, Phil Barnett wrote:
> On Thu, 2010-04-08 at 21:31 -0700, Mattmann, Chris A (388J) wrote:
>> Hi there,
>>
>> Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC
>> member that's VOTE'd +1 on the release.
>>
>> Hopefully in the next few days someone will have a chance to check...
>
> I tried to get the Release Candidate (latest nightly build) running
> yesterday and I ran into problems with both of the scripts that I use to
> crawl with 1.0.
>
> But the smaller bin/crawl method finished the crawl and then immediately
> had a java exception when starting the next step.
>
> Sorry I don't have more specifics, but I'm at home, the setup is at work
> and I had to revert to get things back running. But I built a dev
> machine so I can play with 1.1 and get more specific.

More details on this (your environment, OS, JDK version) and logs/stacktraces would be highly appreciated! You mentioned that you have some scripts - if you could extract relevant portions from them (or copy the scripts) it would help us to ensure that it's not a simple command-line error.

--
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: About Apache Nutch 1.1 Final Release
On Thu, 2010-04-08 at 21:31 -0700, Mattmann, Chris A (388J) wrote:
> Hi there,
>
> Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC member
> that's VOTE'd +1 on the release.
>
> Hopefully in the next few days someone will have a chance to check...

I tried to get the Release Candidate (latest nightly build) running yesterday and I ran into problems with both of the scripts that I use to crawl with 1.0.

But the smaller bin/crawl method finished the crawl and then immediately had a java exception when starting the next step.

Sorry I don't have more specifics, but I'm at home, the setup is at work and I had to revert to get things back running. But I built a dev machine so I can play with 1.1 and get more specific.

Phil Barnett
Senior Analyst
Walt Disney World.
Re: About Apache Nutch 1.1 Final Release
Hi there,

Well as soon as we have 3 +1 binding VOTEs. Right now I'm the only PMC member that's VOTE'd +1 on the release.

Hopefully in the next few days someone will have a chance to check...

Cheers,
Chris

On 4/8/10 8:54 PM, "yhdelgado" wrote:

Hi. I have a question. When will the Apache Nutch 1.1 Final Release be released?

Greetings.

--
View this message in context: http://n3.nabble.com/About-Apache-Nutch-1-1-Final-Release-tp707586p707586.html
Sent from the Nutch - User mailing list archive at Nabble.com.

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++