Re: NUTCH-119 :: how hard to fix

2007-06-28 Thread Doğacan Güney
On 6/27/07, Kai_testing Middleton [EMAIL PROTECTED] wrote: Wow, setting db.max.outlinks.per.page immediately fixed my problem. It looks like I totally misdiagnosed things. May I pose two questions: 1) How did you view all the outlinks? bin/nutch plugin parse-html

[jira] Commented: (NUTCH-474) Fetcher2 sets server-delay and blocking checks incorrectly

2007-06-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508747 ] Hudson commented on NUTCH-474: Integrated in Nutch-Nightly #131 (See

[jira] Commented: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation

2007-06-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508748 ] Hudson commented on NUTCH-498: Integrated in Nutch-Nightly #131 (See

[jira] Commented: (NUTCH-499) Refactor LinkDb and LinkDbMerger to reuse code

2007-06-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508749 ] Hudson commented on NUTCH-499: Integrated in Nutch-Nightly #131 (See

Re: [jira] Commented: (NUTCH-474) Fetcher2 sets server-delay and blocking checks incorrectly

2007-06-28 Thread Doğacan Güney
On 6/28/07, Hudson (JIRA) [EMAIL PROTECTED] wrote: [ https://issues.apache.org/jira/browse/NUTCH-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508747 ] Hudson commented on NUTCH-474: Integrated in Nutch-Nightly #131

[jira] Commented: (NUTCH-392) OutputFormat implementations should pass on Progressable

2007-06-28 Thread Doğacan Güney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508812 ] Doğacan Güney commented on NUTCH-392: OK, I have done a bit of testing on compression but I'm stuck. Here it is:

[jira] Commented: (NUTCH-392) OutputFormat implementations should pass on Progressable

2007-06-28 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508816 ] Andrzej Bialecki commented on NUTCH-392: Re: Content versioning - we can use negative int values as version

[jira] Commented: (NUTCH-392) OutputFormat implementations should pass on Progressable

2007-06-28 Thread Doğacan Güney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508818 ] Doğacan Güney commented on NUTCH-392: Re: Content versioning - we can use negative int values as version

[jira] Commented: (NUTCH-392) OutputFormat implementations should pass on Progressable

2007-06-28 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508820 ] Sami Siren commented on NUTCH-392: But why is parse_text_block's size so close to parse_text data of parse_text

[jira] Commented: (NUTCH-392) OutputFormat implementations should pass on Progressable

2007-06-28 Thread Doğacan Güney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508823 ] Doğacan Güney commented on NUTCH-392: data of parse_text is already compressed so recompressing it does not

[jira] Commented: (NUTCH-392) OutputFormat implementations should pass on Progressable

2007-06-28 Thread Doğacan Güney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508861 ] Doğacan Güney commented on NUTCH-392: After changing ParseText to not do any internal compression, segment

problem with nutch 0.8.1 compile

2007-06-28 Thread Tsengtan A Shuy
Where can I find the library that provides com.etranslate.tm.processing.rtf.ParseException, which the Java source code imports?

[jira] Commented: (NUTCH-392) OutputFormat implementations should pass on Progressable

2007-06-28 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508900 ] Andrzej Bialecki commented on NUTCH-392: Excellent work, Doğacan - thank you. The numbers for RECORD

RE: problem with nutch 0.8.1 compile

2007-06-28 Thread Tsengtan A Shuy
I found the jar file. I would like to join the Nutch developer team. Where should I get started? Adam Shuy President ePacific Web Design Hosting Professional Web/Software developer TEL: 408-272-6946 www.epacificweb.com -Original Message- From: Tsengtan A Shuy [mailto:[EMAIL PROTECTED] Sent:

RE: problem with nutch 0.8.1 compile

2007-06-28 Thread Tsengtan A Shuy
I tried to create an Eclipse launcher, but I got the following error: Exception in thread "main" java.io.IOException: Input directory C:/JavaSearchEngine/nutch-0.8.1/urls in local is invalid. How can I solve this error? Adam Shuy President ePacific Web Design Hosting Professional Web/Software