Thanks for coming, everyone! We had around 25 people. A *huge*
success, for Seattle. And a big thanks to 10gen for sending Richard.
Can't wait to see you all next month.
On Wed, Feb 24, 2010 at 2:15 PM, Bradford Stephens
bradfordsteph...@gmail.com wrote:
The Seattle Hadoop/Scalability/NoSQL
You can add a more specific rule before that exclusion rule.
Something like:
+.*/?page=.*
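
To illustrate why the position of the rule matters: in regex-urlfilter.txt the first pattern that matches a URL decides whether it is kept (+) or dropped (-), and a URL that matches nothing is dropped. The sketch below imitates that behaviour in Python; it is not Nutch code. The helper names (load_rules, accepts), the example.com URLs, and the exclusion rule -[?*!@=] (written from memory of the stock file, since the rule is not quoted in full in this thread) are all assumptions made for the example.

```python
import re

def load_rules(text):
    """Parse '+regex' / '-regex' lines; skip blank lines and # comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """First matching rule wins; a URL that matches no rule is rejected."""
    for accept, pattern in rules:
        if pattern.search(url):  # partial match anywhere in the URL
            return accept
    return False

# Assumed rule set: the suggested rule placed BEFORE the exclusion rule.
SPECIFIC_FIRST = load_rules("""
+.*/?page=.*
-[?*!@=]
+.
""")

# Same rules, but with the exclusion rule first (hypothetical, for contrast).
EXCLUSION_FIRST = load_rules("""
-[?*!@=]
+.*/?page=.*
+.
""")

paged = "http://example.com/list?page=2"
other_query = "http://example.com/list?sort=asc"
plain = "http://example.com/about.html"

print(accepts(SPECIFIC_FIRST, paged))        # kept: the page= rule matches first
print(accepts(SPECIFIC_FIRST, other_query))  # dropped by the exclusion rule
print(accepts(SPECIFIC_FIRST, plain))        # kept by the catch-all +.
print(accepts(EXCLUSION_FIRST, paged))       # dropped: the exclusion rule matches first
```

Note that the ? in +.*/?page=.* is unescaped, so it only makes the preceding slash optional and the rule matches any URL that contains page=. If only a literal "?page=" should be let through, writing \?page= would be stricter.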
2010/2/25, Ian M. Evans ianev...@digitalhit.com:
I suck at regex and in keeping with the Olympic spirit, I probably suck
at giant slalom too.
In the regex-urlfilter.txt there's the suggested probable queries
Replace it with this: -...@!*]
That's it...
Best regards,
---
Andreas P. Koenzen
On 25/02/2010, at 03:06 a.m., Ian M. Evans wrote:
I suck at regex and in keeping with the Olympic spirit, I probably suck
at giant slalom too.
In the regex-urlfilter.txt there's the suggested probable
On 2010-02-24 17:34, Pedro Bezunartea López wrote:
Hi Ashley,
Hi,
I'm looking to reproduce program analysis results based on Nutch v0.4. I
realize this is a very old release, but is it possible to obtain the source
from somewhere? I see some of the classes I'm looking for in v0.7, but I need
I was curious about this, and after a little browsing through sourceforge, I
found the CVS link:
http://nutch.cvs.sourceforge.net/viewvc/nutch/nutch/?pathrev=nutch_0_4
HTH,
Pedro.
2010/2/25 Andrzej Bialecki a...@getopt.org
On 2010-02-24 17:34, Pedro Bezunartea López wrote:
Hi Ashley,
Great, thanks!
2010/2/25 Pedro Bezunartea López pe...@bezunartea.net:
I was curious about this, and after a little browsing through sourceforge, I
found the CVS link:
http://nutch.cvs.sourceforge.net/viewvc/nutch/nutch/?pathrev=nutch_0_4
HTH,
Pedro.
2010/2/25 Andrzej Bialecki
Hello,
I'm trying to upgrade from Nutch 0.9 to Nutch 1.0 and I've solved all of the
issues that I seem to be having, except for one.
When I run a web crawl, everything fetches fine until it gets to dedup, at
which point I get this stack trace:
2010-02-25 14:31:46,592 WARN mapred.LocalJobRunner -