Increasing the score of specific pages

2010-03-11 Thread Santiago Pérez
I want to modify Nutch to increase the score of some pages in their CrawlDatum. The goal is to recognize which pages include a certain token. Raising the score to a high value should make those pages more likely to be chosen again in the next segment generation. I modified it like this:

[jira] Resolved: (NUTCH-798) Upgrade to SOLR1.4

2010-03-11 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche resolved NUTCH-798. - Resolution: Fixed Updated SOLRJ's dependencies at the same time : Deleting

[jira] Resolved: (NUTCH-801) Remove RTF and MP3 parse plugins

2010-03-11 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche resolved NUTCH-801. - Resolution: Fixed Committed revision 921840. Remove RTF and MP3 parse plugins

Creating new linked entries in crawlDB

2010-03-11 Thread nikinch
Hi everyone, I've been using Nutch for a while now and I've hit a snag. I'm trying to find where new linked pages are added to the segment as a specific entry. To be clear, I've been through the fetch class and the crawlDBFilter and reducer, but I'm looking for the initial entry

Re: Creating new linked entries in crawlDB

2010-03-11 Thread Jesiel Trevisan
Please keep me out of this group. Thanks. ___ Jesiel A.S. Trevisan Email: jesieltrevi...@gmail.com.br MSN: jesieltrevi...@hotmail.com Skype AIM: jesieltrevisan YahooMessager: jesiel.trevisan ICQ:: 46527510

[jira] Commented: (NUTCH-801) Remove RTF and MP3 parse plugins

2010-03-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844352#action_12844352 ] Hudson commented on NUTCH-801: -- Integrated in Nutch-trunk #1093 (See

[jira] Commented: (NUTCH-798) Upgrade to SOLR1.4

2010-03-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844351#action_12844351 ] Hudson commented on NUTCH-798: -- Integrated in Nutch-trunk #1093 (See