..and here is to a Vote: +1
Oh, per usual, forgot to throw in my +1. So, +1!
Cheers,
Chris
On 4/7/10 1:14 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Folks,
I have posted a candidate for the Apache Nutch 1.1 release. The source
code is at:
Running nutch 0.9 for a long time without problems, but have just
started to see this error when executing (all from within the nutch 0.9
bin directory) :-
./nutch mergesegs $crawldir/MERGEDsegments $crawldir/segements/*
The error is :-
Exception in thread main java.io.IOException: No input
Hi. What is this command supposed to do?
Also, do you know whether there is any way I can parse and save the text of
HTML files while crawling?
On 7 April 2010 14:32, Gareth Gale gareth.g...@hp.com wrote:
Running nutch 0.9 for a long time without problems, but have just started
to see this error when
Hi all,
I ran a web-crawl using a domain filter for a specific country.
---regex-urlfilter.txt--
# URLs ending with the domain .al
+^http://(.*(\.al|\.al/.*)$)
# skip anything else
-.
--
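To see which URLs a filter like the one above lets through, the two rules can be tried outside of Nutch. This is only a rough sketch: it assumes rules are checked top to bottom, that the first rule whose pattern is found anywhere in the URL decides the outcome, and that a URL matching no rule is dropped. The `accepts` helper and the sample URLs are made up for illustration and are not part of Nutch.

```python
import re

# The two rules from the filter file above, in file order: (sign, pattern).
# "+" means accept, "-" means reject.
RULES = [
    ("+", re.compile(r"^http://(.*(\.al|\.al/.*)$)")),
    ("-", re.compile(r".")),
]


def accepts(url):
    """Return True if the first rule whose pattern is found in url is a '+' rule.

    Assumption: first matching rule wins, and a URL that matches no rule
    is rejected.
    """
    for sign, pattern in RULES:
        if pattern.search(url):
            return sign == "+"
    return False


if __name__ == "__main__":
    # Hypothetical sample URLs, chosen only to exercise the pattern.
    samples = [
        "http://www.example.al",
        "http://www.example.al/page.html",
        "http://www.example.com/",
        "https://www.example.al/",
        "http://www.example.com/file.al",
    ]
    for url in samples:
        print(url, "accepted" if accepts(url) else "rejected")
```

Under these assumptions the pattern is not tied to the host name: because `.*` can also span the path, a URL such as http://www.example.com/file.al passes as well, and https:// URLs are rejected since the pattern only allows http://.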
Even though I didn't set the topN parameter,
I'm not sure what exactly changed that made all my nullpointer errors go
away, but I'm grateful for it, whatever it was.
So, +1 from me, not that I'm even sure I get a vote in the matter, but if
it's open to anyone on the list, I'm on board.
Hi folks,
Do you know whether I can save the parsed text while crawling?
How can I do this?
Thanks
On 7 April 2010 20:11, tsmori tim_m...@ncsu.edu wrote:
I'm not sure what exactly changed that made all my nullpointer errors go
away, but I'm grateful for it, whatever it was.
So, +1 from me, not that I'm
Hi,
This is a VOTE thread. Please do not post your user question on this thread as
we are VOTE'ing on a particular release.
You can re-post a new thread with your question, and I would highly encourage
it.
Thanks!
Cheers,
Chris
On 4/7/10 6:26 PM, cefurkan0 cefurkan0 cefurk...@gmail.com