[
https://issues.apache.org/jira/browse/NUTCH-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Dean updated NUTCH-673:
Priority: Minor (was: Major)
Priority has been changed to "minor".
> Upgrade the Carro
Versions: 0.9.0
Environment: All Nutch deployments.
Reporter: Sean Dean
Fix For: 1.0.0
Version 3.0 of the Carrot2 plug-in was released recently.
We currently have version 2.1 in the source tree and upgrading it to the latest
version before 1.0-release might make
to work with trunk (and the
future 1.0 release). I would personally like to see NUTCH-92 (or some form of
it) included in trunk for a legitimate evaluation before the next release.
Sean Dean
From: Andrzej Bialecki <[EMAIL PROTECTED]>
To: nut
Folks,
I was wondering if anyone could shed some light on the status of this issue
heading into a potential 1.0 (or 0.x) release over the next few months?
I realize many upgrades have been made to Hadoop and Lucene, and in addition to
that bug fixes in just about every element of the system but does
+1 for my official non-binding vote :)
You might want to correct the word "confiquration" at "1." in CHANGES-0.9.txt,
and CHANGES.txt inside the package.
Everything else looks great and more importantly, runs! Good work guys.
- Original Message
From: Chris Mattmann <[EMAIL PROTECTED
It looks like we might want to at least give it a try then, with the worst
possible case of Nutch users having to keep speculative execution disabled if
it causes grief again. If other problems arise, then we can just revert back to
0.11.2 which seems to be stable in terms of all the Nutch opera
thing in performance as the Java processes didn't lock despite
the lowering of total threads.
- Original Message ----
From: Sean Dean <[EMAIL PROTECTED]>
To: nutch-dev@lucene.apache.org
Sent: Wednesday, March 7, 2007 6:52:05 PM
Subject: Re: 0.9 release
Great, thanks a lot.
With NUTCH-233 the issue is independent of Hadoop and lies with the
regex-urlfilter. The last solution posted in JIRA gives you more room to work
with; it allowed me to fetch a segment over 1-2 million but I ran into the
same issue when the segment approached 10 million in size.
Unless you
.
All this testing will be based off revision 515791 in trunk.
- Original Message
From: Andrzej Bialecki <[EMAIL PROTECTED]>
To: nutch-dev@lucene.apache.org
Sent: Wednesday, March 7, 2007 5:04:21 PM
Subject: Re: 0.9 release
Sean Dean wrote:
> As it stands now with whats in tr
As it stands now with what's in trunk under 0.9-dev, one of the biggest problems
is the version of Hadoop we have included. It fails on anything above 200k
URLs, and should be considered a "blocker" issue.
It's my understanding that Andrzej has a newer Hadoop JAR with some custom
patches applied
sion hang reduce process for ever) - I
> propose to apply the fix provided by Sean Dean and close this issue for
> now.
yes that was the resolution also last time :)
> * NUTCH-427 (protocol-smb). This relies on a LGPL library, and it's
> certainly not critical (as this is an opt
This was corrected in Hadoop as per issue HADOOP-917, but I'm thinking some
code in Nutch might have to be changed also. I reported this issue (via mailing
list) a while ago and I'm glad it was fixed, but I have been purposely staying
with revision 495214 of trunk which seems to provide the best
I have had a common error come up now on two separate fetches, both using the
new Hadoop 0.10.1. The first error came up on my regular fetch using my large
Nutch DB, but to rule out any problems with that (possibly related to the new
fetch statuses) I created a brand new DB using the standard DM
[
http://issues.apache.org/jira/browse/NUTCH-417?page=comments#action_12459073 ]
Sean Dean commented on NUTCH-417:
-
Speculative execution is now off by default with Hadoop 0.9.2 as per issue
HADOOP-827. Since there were only two other fixes with
[
http://issues.apache.org/jira/browse/NUTCH-224?page=comments#action_12455065 ]
Sean Dean commented on NUTCH-224:
-
Just a note on my comment above, it seems JIRA can't display (or won't display)
Korean text after I accept the comment.
If your
[
http://issues.apache.org/jira/browse/NUTCH-224?page=comments#action_12455064 ]
Sean Dean commented on NUTCH-224:
-
I just tested this today using 0.9-dev and it seems the changes made back in
0.7.2 to Lucene didn't fix the issue. At some point
[
http://issues.apache.org/jira/browse/NUTCH-233?page=comments#action_12453919 ]
Sean Dean commented on NUTCH-233:
-
Could I suggest that this change, from ".*(/.+?)/.*?\1/.*?\1/" to
".*(/[^/]+)/[^/]+\1/[^/]+\1/" be committ
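To illustrate the difference between the two patterns quoted above, here is a minimal sketch. It uses Python's re module rather than the Java regex engine Nutch runs on (the constructs involved, backreferences and lazy quantifiers, behave the same way in both for these inputs). The helper name is_rejected, the example.com URLs, and the choice of a search-anywhere match are assumptions made for illustration, not taken from the Nutch source.

```python
import re

# Pattern strings copied verbatim from the comment above.
OLD_PATTERN = re.compile(r".*(/.+?)/.*?\1/.*?\1/")
NEW_PATTERN = re.compile(r".*(/[^/]+)/[^/]+\1/[^/]+\1/")


def is_rejected(pattern, url):
    """Hypothetical helper: True if the pattern matches somewhere in the URL."""
    return pattern.search(url) is not None


# Hypothetical URLs, chosen only to show how the two patterns differ.
urls = [
    "http://example.com/a/b/a/c/a/",            # segment repeats with one segment between
    "http://example.com/a/x/y/a/z/a/",          # segment repeats with a wider gap
    "http://example.com/docs/guide/index.html", # no repeated segment
]

for url in urls:
    print(url, is_rejected(OLD_PATTERN, url), is_rejected(NEW_PATTERN, url))
```

In the old pattern the gaps between repeats are ".*?", which can span any number of path segments. In the new pattern each gap is "/[^/]+", exactly one segment, and the captured group is itself limited to one segment. The new pattern therefore flags a narrower set of URLs (the second URL above is matched only by the old pattern) and leaves the matcher fewer alternatives to try on long URLs.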
[
http://issues.apache.org/jira/browse/NUTCH-224?page=comments#action_12416108 ]
Sean Dean commented on NUTCH-224:
-
I'm still using 0.7.1 and also see this problem.
In the Nutch 0.7.2 release they upgraded to Lucene 1.9.1, which included the
above fixes