Sami Siren wrote:
Andrzej Bialecki wrote:
Sami Siren wrote:
I am planning to build the first RC for Nutch 1.0 on Tuesday morning,
3.3.2009 (EET). There are still some issues marked as fix-for-1.0 in
Jira. Neither of the two remaining _bugs_ seems too important to me;
actually, I only count the issues assigned to developers as real
candidates to be included in 1.0:
NUTCH-578 (kubes)
NUTCH-477 (ab)
NUTCH-669 (siren)
There's one Critical issue reported, related to NekoHTML (NUTCH-700).
I'm not sure what the feature differences (pertinent to Nutch) are
between 0.9.4 and 1.9.11 - perhaps downgrading is the safest course
of action.
I will take care of that.
I am also volunteering to push all open issues to 1.1 before
starting the RC build on Tuesday. Any objections on the proposed
procedure or timing?
Sounds good.
great!
--
Sami Siren
What about the new scoring and the new indexing? Will the new scoring
be integrated as the primary scoring algorithm? I have a problem with
LinkRank:
2009-03-02 20:43:45,708 INFO webgraph.LinkRank - Starting link counter job
2009-03-02 20:43:47,838 INFO webgraph.LinkRank - Finished link counter job
2009-03-02 20:43:47,839 INFO webgraph.LinkRank - Reading numlinks temp file
2009-03-02 20:43:47,840 INFO webgraph.LinkRank - Deleting numlinks temp file
2009-03-02 20:43:47,842 FATAL webgraph.LinkRank - LinkAnalysis: java.lang.NullPointerException
    at org.apache.nutch.scoring.webgraph.LinkRank.runCounter(LinkRank.java:113)
    at org.apache.nutch.scoring.webgraph.LinkRank.analyze(LinkRank.java:582)
    at org.apache.nutch.scoring.webgraph.LinkRank.run(LinkRank.java:657)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.scoring.webgraph.LinkRank.main(LinkRank.java:627)
Another question: what about the indexing framework mentioned here:
http://www.mail-archive.com/nutch-u...@lucene.apache.org/msg11764.html
Having all of the new scoring and indexing would be a real step forward.
Thanks,
Bartosz