[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread JIRA
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359564 ] Lutischán Ferenc commented on NUTCH-133: Dear Stephan, Please see http://issues.apache.org/jira/browse/NUTCH-123. This problem also occurs in cached.jsp. Regards,

[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359603 ] Chris A. Mattmann commented on NUTCH-133: Just another comment on the issue. The reported bug is listed as the following: Problem: Actually the configuration of parser

[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread Stefan Groschupf (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359610 ] Stefan Groschupf commented on NUTCH-133: Jerome: For 3 months or so, URL extensions and magic content type detection have never been used. I suggest to assign this

Re: RCP known limitation or bug?

2005-12-07 Thread Doug Cutting
This should work. TestRPC.java has a case which returns void (ping). Can you send a simple test case that fails? Doug Stefan Groschupf wrote: Hi, I have never used the RPC that intensively, so I was surprised to find this limitation. Is it known that the RPC.call method can only call methods that

Re: RCP known limitation or bug?

2005-12-07 Thread Stefan Groschupf
Sure... the test case: ### package org.apache.nutch.ipc; import java.lang.reflect.Method; import java.net.InetSocketAddress; import junit.framework.TestCase; public class TestMultiCall extends TestCase { private Server fServer; public TestMultiCall() { fServer =

[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359624 ] Doug Cutting commented on NUTCH-133: It would be great to have some junit tests which illustrate these problems. If we can first all agree on the desired behaviour, then

[jira] Commented: (NUTCH-134) Summarizer doesn't select the best snippets

2005-12-07 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-134?page=comments#action_12359626 ] Doug Cutting commented on NUTCH-134: Can we yet replace Nutch's summarizer with the summarizer in Lucene's contrib directory? Are there features that Nutch requires that

[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread Stefan Groschupf (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359627 ] Stefan Groschupf commented on NUTCH-133: Doug, I already attached a unit test that calls ParseUtil.parse(Content) and simulates the different scenarios. I can extend the

[jira] Commented: (NUTCH-134) Summarizer doesn't select the best snippets

2005-12-07 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-134?page=comments#action_12359629 ] Andrzej Bialecki commented on NUTCH-134: I _think_ the Lucene summarizer requires more CPU than this one... but this has to be checked. I'll work on that.

[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359634 ] Doug Cutting commented on NUTCH-133: Stefan, sorry I missed the test case. If others agree that these cases should pass, then we should commit the test case alone as a

[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359645 ] Chris A. Mattmann commented on NUTCH-133: Hi Stefan, Thanks for your reply. I actually like a lot of your proposed changes having to do with the MimeType cleansing

[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359647 ] Jerome Charron commented on NUTCH-133: Stefan: 1. URL extensions and also magic content type detection are used. This is the only way protocol-file and protocol-ftp can

[jira] Commented: (NUTCH-134) Summarizer doesn't select the best snippets

2005-12-07 Thread byron miller (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-134?page=comments#action_12359649 ] byron miller commented on NUTCH-134: I would take more CPU for better summaries any day :) CPU power is cheaper than manual intervention! If any testing is needed, don't

[jira] Commented: (NUTCH-133) ParserFactory does not work as expected

2005-12-07 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359679 ] Chris A. Mattmann commented on NUTCH-133: Hi Doug, I like this idea for the getContentType method. In general, I completely agree that the server provided content

Re: Nutch 0.8 update issue

2005-12-07 Thread Jack Tang
Guys, my fault! I missed copying the segments dir. Sorry about that. Please ignore this message. /Jack On 12/8/05, Jack Tang [EMAIL PROTECTED] wrote: Hi All, I recently updated my Nutch from 0.7 to 0.8-dev (svn version) and came across a question about the searcher. I wrote my own indexer and searcher