Hey Markus,
Don't beat yourself up over it -- you did awesome work and have been
contributing a ton, so who cares!
If we need to do another patch release, we can easily do it (especially with
super release guy Lewis!)
Cheers,
Chris
On Jun 26, 2012, at 3:55 PM, Markus Jelsma wrote:
> Hi,
>
>
Hi,
The HostURLNormalizer is not supposed to be in 1.5.1; this is true for other
issues as well. Nutch 1.5.1 is a bugfix release and should not be pulled from
trunk but from the tag plus the required patches. I didn't notice it was pulled
from trunk until now.
The build issue has for that plugin
-1
The plugin urlnormalizer-host (NUTCH-1319 listed in CHANGES.txt)
is missing in the bin package.
It also does not build for the src package: it's missing in
src/plugins/build.xml of 1.5.1.
@Markus:
You are right: up to 1.4 there was a top-level folder apache-nutch-1.x/
in the package (src and bi
OK, JIRA and fix for 1.6?
On 26 June 2012 17:32, Markus Jelsma wrote:
> This was command line. I didn't notice it with 1.5 because I unpacked that
> in a GUI. It really unpacks in the cwd, or my system makes a fool out of me
> :)
>
> wget
> http://people.apache.org/~lewismc/apache-nutch-1.5.1-rc
This was command line. I didn't notice it with 1.5 because I unpacked that in a
GUI. It really unpacks in the cwd, or my system makes a fool out of me :)
wget
http://people.apache.org/~lewismc/apache-nutch-1.5.1-rc1/apache-nutch-1.5.1-src.tar.gz
tar -xvzf apache-nutch-1.5.1-src.tar.gz
ls
apach
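For anyone who wants to check this without unpacking, here is a minimal Python sketch that lists the top-level entries of a tarball and reports whether everything sits under one root folder. The function names (path_parts, top_level_entries, has_single_root_dir) are made up for illustration and are not part of Nutch or its build; only the standard tarfile module is used.

```python
import tarfile


def path_parts(member_name):
    """Split a member name into components, ignoring "." and empty parts."""
    return [part for part in member_name.split("/") if part not in ("", ".")]


def top_level_entries(archive_path):
    """Return the set of first path components across all archive members."""
    # "r:*" lets tarfile detect the compression (gzip, bzip2, ...) itself.
    with tarfile.open(archive_path, "r:*") as tar:
        names = tar.getnames()
    return {parts[0] for parts in map(path_parts, names) if parts}


def has_single_root_dir(archive_path):
    """True if every member lives under one common top-level directory."""
    with tarfile.open(archive_path, "r:*") as tar:
        all_parts = [path_parts(name) for name in tar.getnames()]
    roots = {parts[0] for parts in all_parts if parts}
    # At least one member must sit below the root; otherwise the single
    # entry could just be a lone file rather than a directory.
    nested = any(len(parts) > 1 for parts in all_parts)
    return len(roots) == 1 and nested
```

Calling has_single_root_dir on the downloaded apache-nutch-1.5.1-src.tar.gz should then tell whether the archive unpacks into its own folder or straight into the cwd, independent of the tool used to extract it.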
Thanks Lewis, but I don't think it's related to NUTCH-769.
From what I understand of NUTCH-769, it concerns scenarios in which the
hosts are indeed unresponsive and an exception is thrown on same url over
and over.
My problem here is with protocol-httpclient.
The urls and hosts are responsive, but
Probably depends on the tool you are using to open the archive. It does
that with File Roller on Ubuntu but works fine on the command line or when
doing "extract here" from the file menu.
Not a blocker IMHO
On 26 June 2012 08:04, Markus Jelsma wrote:
> Hi,
>
> It builds and runs smoothly but the
Hi,
On Tue, Jun 26, 2012 at 1:39 PM, nutch.bu...@gmail.com
wrote:
> after a while fetcher starts throwing
> httpclient.connectionPoolTimeoutException: Timeout waiting for connection
> for almost each url.
>
> Any solution for this issue?
This looks like it's related to the fix in NUTCH-769 can y
Hi Markus,
I've just unpacked both the src.zip and src.tar.gz and they both
create a directory apache-nutch-1.5.1-src with everything inside... is
this what you require?
Lewis
On Tue, Jun 26, 2012 at 8:04 AM, Markus Jelsma
wrote:
> Hi,
>
> It builds and runs smoothly but there's something that
update (or whatever the actual name of the command is) after parsing?
On 25 June 2012 22:35, wrote:
> Hello,
>
> I have tested nutch-2.0 with hbase and mysql trying to index only one url
> with depth 1.
>
> I tried to fetch an html tag value and parse it to metadata column in
> webpage object b
Hi,
It builds and runs smoothly, but there's something that didn't catch my eye with
1.5, since I then used a GUI to unpack the src file: the src and bin packages
decompress everything in the cwd, which means no apache-nutch-1.5 folder is
created. Such a folder was created with 1.4 and earlier. I believ