[jira] Commented: (NUTCH-616) Reset Fetch Retry counter when fetch is successful

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578705#action_12578705 ] Andrzej Bialecki commented on NUTCH-616: I'm considering a different approach to

[jira] Updated: (NUTCH-616) Reset Fetch Retry counter when fetch is successful

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated NUTCH-616: Attachment: NUTCH-616-v2.patch This patch uses FetchSchedule to maintain the counter.

[jira] Assigned: (NUTCH-616) Reset Fetch Retry counter when fetch is successful

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki reassigned NUTCH-616: Assignee: Andrzej Bialecki Reset Fetch Retry counter when fetch is successful

[jira] Closed: (NUTCH-613) Empty Summaries and Cached Pages

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-613. Resolution: Fixed Fix Version/s: (was: 0.9.0) Assignee: Andrzej Bialecki

[jira] Commented: (NUTCH-613) Empty Summaries and Cached Pages

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578754#action_12578754 ] Andrzej Bialecki commented on NUTCH-613: Patch committed to trunk. Thank you!

[jira] Commented: (NUTCH-615) Redirected URL are fetched without setting any FetchInterval

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578742#action_12578742 ] Andrzej Bialecki commented on NUTCH-615: I think the code in ParseOutputFormat

[jira] Commented: (NUTCH-612) URL filtering is always disabled in Generator when invoked by Crawl

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578770#action_12578770 ] Andrzej Bialecki commented on NUTCH-612: Patch committed to trunk rev. 637114.

[jira] Closed: (NUTCH-612) URL filtering is always disabled in Generator when invoked by Crawl

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-612. Resolution: Fixed Assignee: Andrzej Bialecki URL filtering is always disabled in

[jira] Closed: (NUTCH-601) Recrawling on existing crawl directory using force option

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-601. Resolution: Fixed Fix Version/s: 1.0.0 Assignee: Andrzej Bialecki

[jira] Commented: (NUTCH-601) Recrawling on existing crawl directory using force option

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578781#action_12578781 ] Andrzej Bialecki commented on NUTCH-601: Patch v. 1.0 applied to trunk in rev.

[jira] Closed: (NUTCH-592) Fetcher2 : NPE for page with status ProtocolStatus.TEMP_MOVED

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-592. Resolution: Duplicate Assignee: Andrzej Bialecki (was: Emmanuel Joke) Fetcher2 : NPE

[jira] Closed: (NUTCH-590) Index multiple docs per call using IndexingFilter extension point

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-590. Resolution: Won't Fix Assignee: Andrzej Bialecki Index multiple docs per call using

[jira] Commented: (NUTCH-592) Fetcher2 : NPE for page with status ProtocolStatus.TEMP_MOVED

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578786#action_12578786 ] Andrzej Bialecki commented on NUTCH-592: Duplicate of NUTCH-597 and NUTCH-615.

[jira] Commented: (NUTCH-590) Index multiple docs per call using IndexingFilter extension point

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578788#action_12578788 ] Andrzej Bialecki commented on NUTCH-590: No further comments or patches provided.

[jira] Commented: (NUTCH-610) Can't Update or modify an index while web gui is running

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578773#action_12578773 ] Andrzej Bialecki commented on NUTCH-610: If there are no objections I would like

[jira] Commented: (NUTCH-575) NPE in OpenSearchServlet when summary is null

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578795#action_12578795 ] Andrzej Bialecki commented on NUTCH-575: I applied the remaining patch

[jira] Closed: (NUTCH-575) NPE in OpenSearchServlet when summary is null

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-575. Resolution: Fixed Assignee: Andrzej Bialecki NPE in OpenSearchServlet when summary is

Re: [jira] Commented: (NUTCH-575) NPE in OpenSearchServlet when summary is null

2008-03-14 Thread Jesiel Trevisan
Please, I want to leave this mailing list about Nutch. I already sent an e-mail to get off this mailing list, but I'm still receiving many e-mails from it, with FROM: nutch-dev@lucene.apache.org. Please let me know how to stop receiving these emails. Thanks so much. On Fri, Mar 14, 2008 at 12:10 PM,

Re: [jira] Commented: (NUTCH-575) NPE in OpenSearchServlet when summary is null

2008-03-14 Thread Andrzej Bialecki
Jesiel Trevisan wrote: Please, I want to leave this mailing list about Nutch. I already sent an e-mail to get off this mailing list, but I'm still receiving many e-mails from it, with FROM: nutch-dev@lucene.apache.org. Hi, Have you sent the email as described here

Problem in running Nutch where proxy authentication is required.

2008-03-14 Thread naveen.goswami
Hi All, I am facing a problem running Nutch where proxy authentication is required to crawl the site (e.g. google.com, yahoo.com). I am able to crawl sites that do not require proxy authentication from our domain (e.g. abc.com); it successfully creates a crawl folder and 5

Re: Problem in running Nutch where proxy authentication is required.

2008-03-14 Thread Susam Pal
I still can't see any DEBUG logs in your log file. Did you go through my earlier mail? Regards, Susam Pal On Wed, Mar 12, 2008 at 9:39 PM, [EMAIL PROTECTED] wrote: Hi All, I am facing a problem running Nutch where proxy authentication is required to crawl the site (e.g. google.com,

[jira] Commented: (NUTCH-566) Sun's URL class has bug in creation of relative query URLs

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578957#action_12578957 ] Andrzej Bialecki commented on NUTCH-566: I agree that this should be put into a

[jira] Commented: (NUTCH-126) Fetching via https does not work with a proxy (patch)

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578969#action_12578969 ] Andrzej Bialecki commented on NUTCH-126: Patch applied to trunk, rev. 637308.

[jira] Commented: (NUTCH-157) Problem during parsing msword document . It fetching properly but parsing is not working. Please show me the way how can i parse it

2008-03-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578972#action_12578972 ] Andrzej Bialecki commented on NUTCH-157: This branch is in End Of Life status.

[jira] Commented: (NUTCH-612) URL filtering is always disabled in Generator when invoked by Crawl

2008-03-14 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579003#action_12579003 ] Hudson commented on NUTCH-612: Integrated in Nutch-trunk #390 (See