Sami Siren wrote:
Andrzej Bialecki wrote:
Sami Siren wrote:
I am planning to build the first RC for Nutch 1.0 on Tuesday 3.3.2009, in the morning (EET). There are still some issues marked as fix-for 1.0 in Jira. Neither of the two remaining _bugs_ seems too important to me; actually, I only count the issues assigned to developers as real candidates to be included in 1.0:

NUTCH-578 (kubes)
NUTCH-477 (ab)
NUTCH-669 (siren)

There's one Critical issue reported, related to NekoHTML (NUTCH-700). I'm not sure what the feature differences (pertinent to Nutch) are between 0.9.4 and 1.9.11 - perhaps downgrading is the safest course of action.
I will take care of that.


I am also volunteering to push all open issues to 1.1 before starting the RC build on Tuesday. Any objections to the proposed procedure or timing?

Sounds good.
great!

--
Sami Siren



What about the new scoring and the new indexing? Will the new scoring be integrated as the primary scoring algorithm? I have a problem with it in LinkRank:

2009-03-02 20:43:45,708 INFO  webgraph.LinkRank - Starting link counter job
2009-03-02 20:43:47,838 INFO  webgraph.LinkRank - Finished link counter job
2009-03-02 20:43:47,839 INFO  webgraph.LinkRank - Reading numlinks temp file
2009-03-02 20:43:47,840 INFO  webgraph.LinkRank - Deleting numlinks temp file
2009-03-02 20:43:47,842 FATAL webgraph.LinkRank - LinkAnalysis: java.lang.NullPointerException
       at org.apache.nutch.scoring.webgraph.LinkRank.runCounter(LinkRank.java:113)
       at org.apache.nutch.scoring.webgraph.LinkRank.analyze(LinkRank.java:582)
       at org.apache.nutch.scoring.webgraph.LinkRank.run(LinkRank.java:657)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
       at org.apache.nutch.scoring.webgraph.LinkRank.main(LinkRank.java:627)

Another question: what about the indexing framework mentioned here:
http://www.mail-archive.com/nutch-u...@lucene.apache.org/msg11764.html


Having all the new scoring and indexing would be a real step forward.

Thanks,
Bartosz
