[jira] Commented: (NUTCH-779) Mechanism for passing metadata from parse to crawldb

2010-03-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851719#action_12851719 ] Hudson commented on NUTCH-779: Integrated in Nutch-trunk #1112 (See

[jira] Commented: (NUTCH-779) Mechanism for passing metadata from parse to crawldb

2010-03-29 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850915#action_12850915 ] Julien Nioche commented on NUTCH-779: Could anyone please review this issue? I would

[jira] Commented: (NUTCH-779) Mechanism for passing metadata from parse to crawldb

2010-03-29 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850939#action_12850939 ] Andrzej Bialecki commented on NUTCH-779: CrawlDbReducer, the cramped line {{if

Re: [jira] Commented: (NUTCH-779) Mechanism for passing metadata from parse to crawldb

2010-01-20 Thread MilleBii
I'd like to use Julien's approach because I found the scoring filter complex to understand. My use case is the following:
1. During scoring after parsing, I want to tag the pages that interest me, say meta=HIT.
2. In the next step (to be created), I would like to prune the segment of NON-HIT content

[jira] Commented: (NUTCH-779) Mechanism for passing metadata from parse to crawldb

2010-01-19 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802172#action_12802172 ] Julien Nioche commented on NUTCH-779: The property needs some documentation in

[jira] Commented: (NUTCH-779) Mechanism for passing metadata from parse to crawldb

2010-01-19 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802175#action_12802175 ] Andrzej Bialecki commented on NUTCH-779: Personally I would use ScoringFilters

[jira] Commented: (NUTCH-779) Mechanism for passing metadata from parse to crawldb

2010-01-18 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801875#action_12801875 ] Andrzej Bialecki commented on NUTCH-779: You can already achieve this with