[jira] Closed: (NUTCH-149) outlinks not shown properly in cached.jsp

2006-02-07 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-149?page=all ] Chris A. Mattmann closed NUTCH-149: --- Closed at request of reporter: not a bug > outlinks not shown properly in cached.jsp > - > > Key: NUTCH-

[jira] Resolved: (NUTCH-149) outlinks not shown properly in cached.jsp

2006-02-07 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-149?page=all ] Chris A. Mattmann resolved NUTCH-149: - Resolution: Invalid Closed at request of the reporter: not a bug. > outlinks not shown properly in cached.jsp > -

[jira] Commented: (NUTCH-140) Add alias capability in parse-plugins.xml file that allows mimeType->extensionId mapping

2006-02-14 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-140?page=comments#action_12366376 ] Chris A. Mattmann commented on NUTCH-140: - Hi Folks, I've gone ahead and created an initial patch for this issue. I'll be attaching it to JIRA within the next day for

[jira] Created: (NUTCH-210) Context.xml file for Nutch web application

2006-02-14 Thread Chris A. Mattmann (JIRA)
Context.xml file for Nutch web application -- Key: NUTCH-210 URL: http://issues.apache.org/jira/browse/NUTCH-210 Project: Nutch Type: Improvement Components: web gui Versions: 0.7.1, 0.7, 0.6, 0.7.2-dev, 0.8-dev En

[jira] Updated: (NUTCH-140) Add alias capability in parse-plugins.xml file that allows mimeType->extensionId mapping

2006-02-15 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-140?page=all ] Chris A. Mattmann updated NUTCH-140: Attachment: NUTCH-140.20051502.patch.txt An initial patch for NUTCH-140 for everyone's review. > Add alias capability in parse-plugins.xml file that all

[jira] Updated: (NUTCH-218) need DOAP file for Nutch

2006-02-28 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-218?page=all ] Chris A. Mattmann updated NUTCH-218: Attachment: doap_Nutch.rdf I generated this off the DOAP generator page. Feel free to use it, or not. > need DOAP file for Nutch > -

[jira] Updated: (NUTCH-210) Context.xml file for Nutch web application

2006-03-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-210?page=all ] Chris A. Mattmann updated NUTCH-210: Attachment: NUTCH-210.Mattmann.patch.txt Initial NUTCH-210 patch. Uses an XSL stylesheet to read searcher., plugin., extension.clustering and extension.

[jira] Commented: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection

2006-03-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-236?page=comments#action_12371664 ] Chris A. Mattmann commented on NUTCH-236: - >I'd be happy to make these changes and submit a patch, but I wanted to know if >the change would be welcome first. I think t

[jira] Commented: (NUTCH-220) PDF Box can't parse document: java.lang.NullPointerException

2006-03-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-220?page=comments#action_12371669 ] Chris A. Mattmann commented on NUTCH-220: - Could you provide some more detail on this issue? For instance, a stack trace here would be quite helpful in trying to debug

[jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin.

2006-03-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-185?page=comments#action_12371671 ] Chris A. Mattmann commented on NUTCH-185: - I propose that either this issue be closed and the patch files moved to NUTCH-23, or that NUTCH-23 be closed, as the two are

[jira] Resolved: (NUTCH-34) Parsing different content formats

2006-03-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-34?page=all ] Chris A. Mattmann resolved NUTCH-34: Fix Version: 0.7.2-dev 0.8-dev Resolution: Fixed This issue was addressed via the application of NUTCH-88 applied to Nutch 0.7

[jira] Closed: (NUTCH-34) Parsing different content formats

2006-03-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-34?page=all ] Chris A. Mattmann closed NUTCH-34: -- Issue addressed by NUTCH-88. > Parsing different content formats > - > > Key: NUTCH-34 > URL: http://issue

[jira] Resolved: (NUTCH-24) Cannot handle incorrectly cased Content-Type

2006-03-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-24?page=all ] Chris A. Mattmann resolved NUTCH-24: Fix Version: 0.8-dev Resolution: Fixed This issue was addressed by NUTCH-139, the fault tolerant Metadata container. > Cannot handle incorrect

[jira] Closed: (NUTCH-24) Cannot handle incorrectly cased Content-Type

2006-03-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-24?page=all ] Chris A. Mattmann closed NUTCH-24: -- issue addressed by NUTCH-139. > Cannot handle incorrectly cased Content-Type > > > Key: NUTCH-24 >

[jira] Commented: (NUTCH-210) Context.xml file for Nutch web application

2006-03-25 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-210?page=comments#action_12371849 ] Chris A. Mattmann commented on NUTCH-210: - Hi Jerome, The updates look fine. No objections from my end. I hope people find the patch useful. Cheers, Chris > Cont

[jira] Resolved: (NUTCH-23) content text/xml parser

2006-03-25 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-23?page=all ] Chris A. Mattmann resolved NUTCH-23: Resolution: Duplicate Duplicate of NUTCH-185. > content text/xml parser > --- > > Key: NUTCH-23 > URL: http:/

[jira] Closed: (NUTCH-23) content text/xml parser

2006-03-25 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-23?page=all ] Chris A. Mattmann closed NUTCH-23: -- Duplicate of NUTCH-185. > content text/xml parser > --- > > Key: NUTCH-23 > URL: http://issues.apache.org/jira/browse/

[jira] Created: (NUTCH-245) XML Schemas for xml configuration files in conf directory

2006-04-07 Thread Chris A. Mattmann (JIRA)
XML Schemas for xml configuration files in conf directory - Key: NUTCH-245 URL: http://issues.apache.org/jira/browse/NUTCH-245 Project: Nutch Type: New Feature Components: fetcher, indexer, ndfs, searcher, we

[jira] Updated: (NUTCH-245) DTD Schemas for plugin.xml configuration files in conf directory

2006-04-07 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-245?page=all ] Chris A. Mattmann updated NUTCH-245: Summary: DTD Schemas for plugin.xml configuration files in conf directory (was: XML Schemas for xml configuration files in conf directory) > DTD Schema

[jira] Updated: (NUTCH-245) DTD Schemas for plugin.xml configuration files in conf directory

2006-04-11 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-245?page=all ] Chris A. Mattmann updated NUTCH-245: Description: Currently, the plugin.xml file does not have a DTD or XML Schema associated with it, and most people just go look at an existing plugin's p

[jira] Updated: (NUTCH-245) DTD Schemas for plugin.xml configuration files in conf directory

2006-04-11 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-245?page=all ] Chris A. Mattmann updated NUTCH-245: Attachment: NUTCH-245.Mattmann.patch.txt Here's the patch for the plugin DTD file. I got a lot of info from: http://help.eclipse.org/help31/index.jsp?to

[jira] Commented: (NUTCH-245) DTD Schemas for plugin.xml configuration files in conf directory

2006-04-12 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-245?page=comments#action_12374217 ] Chris A. Mattmann commented on NUTCH-245: - Hey Doug, I'm fine with that, I think it makes sense. Want an updated patch or just, the person who commits it can move it

[jira] Updated: (NUTCH-245) DTD for plugin.xml configuration files

2006-04-12 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-245?page=all ] Chris A. Mattmann updated NUTCH-245: Summary: DTD for plugin.xml configuration files (was: DTD Schemas for plugin.xml configuration files in conf directory) update title to reflect core is

[jira] Commented: (NUTCH-294) Topic-maps of related searchwords

2006-06-03 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-294?page=comments#action_12414597 ] Chris A. Mattmann commented on NUTCH-294: - Hi Stefan, I'm wondering if this issue is in any way related to the existing clustering-carrot plugin submitted by D. Weiss

[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-06-03 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=comments#action_12414598 ] Chris A. Mattmann commented on NUTCH-258: - Hi there, I believe that the fetcher halting on a LOG.Severe is the intended behavior of the system. The use of this SEVERE

[jira] Assigned: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection

2006-06-03 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-236?page=all ] Chris A. Mattmann reassigned NUTCH-236: --- Assign To: Chris A. Mattmann > PdfParser and RSSParser Log4j appender redirection > -- > >

[jira] Commented: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection

2006-06-03 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-236?page=comments#action_12414599 ] Chris A. Mattmann commented on NUTCH-236: - Hi Jason, I'll have a patch prepared for this issue shortly, and I'll attach it to JIRA by this Sunday night. Thanks,

[jira] Updated: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection

2006-06-03 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-236?page=all ] Chris A. Mattmann updated NUTCH-236: Due Date: 05/Jun/06 > PdfParser and RSSParser Log4j appender redirection > -- > > Key: NUTCH-236

[jira] Updated: (NUTCH-187) Cannot start Nutch datanodes on Windows outside of a cygwin environment because of DF

2006-06-03 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-187?page=all ] Chris A. Mattmann updated NUTCH-187: Summary: Cannot start Nutch datanodes on Windows outside of a cygwin environment because of DF (was: Run Nutch on Windows without Cygwin) Update comme

[jira] Resolved: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-06-04 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=all ] Chris A. Mattmann resolved NUTCH-258: - Resolution: Won't Fix The use of LOG.severe in the fetcher indicates an unrecoverable error: thus, this issue is not a bug, and in fact describes

[jira] Closed: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-06-04 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=all ] Chris A. Mattmann closed NUTCH-258: --- Won't fix: issue describes intended behavior of system (fetcher component). > Once Nutch logs a SEVERE log item, Nutch fails forevermore > --

[jira] Reopened: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-06-05 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=all ] Chris A. Mattmann reopened NUTCH-258: - Assign To: Chris A. Mattmann Issue found to in fact be a real issue with the Fetcher: here's the proposed solution: * add flag field (preferabl

[jira] Updated: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection

2006-06-08 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-236?page=all ] Chris A. Mattmann updated NUTCH-236: Attachment: NUTCH-236.Mattmann.060806.patch.txt Okay a bit late, but as usual with me :-) This patch implements Jason's suggestion for the following two

[jira] Created: (NUTCH-304) Change JIRA email address for nutch issues from apache incubator

2006-06-08 Thread Chris A. Mattmann (JIRA)
Change JIRA email address for nutch issues from apache incubator Key: NUTCH-304 URL: http://issues.apache.org/jira/browse/NUTCH-304 Project: Nutch Type: Task Environment: Dell Pentium M mobile 1.4 Ghz

[jira] Updated: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-06-09 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=all ] Chris A. Mattmann updated NUTCH-258: Attachment: NUTCH-258.Mattmann.060906.patch.txt Hi Folks, Attached is a patch that implements the suggested two fixes to this issue. I had to go thro

[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-06-15 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=comments#action_12416379 ] Chris A. Mattmann commented on NUTCH-258: - > Thanks for this patch Chris - even if now it is outdated by NUTCH-303 :-( > Since Nutch no more use the deprecated Hadoop Log

[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-07-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=comments#action_12422962 ] Chris A. Mattmann commented on NUTCH-258: - Guys, This issue slipped off my radar for a bit, but I'll have some free time this week to work on it. If there

[jira] Updated: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-07-25 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=all ] Chris A. Mattmann updated NUTCH-258: Fix Version/s: 0.8-dev > Once Nutch logs a SEVERE log item, Nutch fails forevermore > -- > >

[jira] Created: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml

2006-08-03 Thread Chris A. Mattmann (JIRA)
Remove the text parser as an option for parsing PDF files in parse-plugins.xml -- Key: NUTCH-338 URL: http://issues.apache.org/jira/browse/NUTCH-338 Project: Nutch I

[jira] Updated: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml

2006-08-03 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-338?page=all ] Chris A. Mattmann updated NUTCH-338: Attachment: NUTCH-338.Mattmann.patch.txt simple patch for removing the parse-text plugin from being mapped to PDF content type in parse-plugins.xml. >

[jira] Updated: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-08-04 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=all ] Chris A. Mattmann updated NUTCH-258: Attachment: NUTCH-258.Mattmann.080406.patch.txt Hi Folks, Sorry I'm a little later than I expected on this one. Attached is a patch that implements th

[jira] Commented: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml

2006-08-18 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-338?page=comments#action_12429033 ] Chris A. Mattmann commented on NUTCH-338: - Hi Andrzej, A patch is available that you can apply quickly to remove the text parser as an option for pdf. Cou

[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-08-18 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=comments#action_12429035 ] Chris A. Mattmann commented on NUTCH-258: - Hi Folks, A patch is available on this issue. Has anyone who was experiencing the original problem tried out t

[jira] Commented: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml

2006-08-18 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-338?page=comments#action_12429042 ] Chris A. Mattmann commented on NUTCH-338: - Hi Sami, Thanks much. It's weird that it was broken seeing as it was a one line patch, however, I tried it agai

[jira] Commented: (NUTCH-356) Plugin repository cache can lead to memory leak

2006-08-21 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-356?page=comments#action_12429548 ] Chris A. Mattmann commented on NUTCH-356: - -1 for closing this issue. If there is a demonstrable memory leak in the plugin system, then I think it should b

[jira] Created: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory

2006-10-04 Thread Chris A. Mattmann (JIRA)
ParseUtil does not pass through the content's URL to the ParserFactory -- Key: NUTCH-379 URL: http://issues.apache.org/jira/browse/NUTCH-379 Project: Nutch Issue Type: Bug

[jira] Work started: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory

2006-10-04 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-379?page=all ] Work on NUTCH-379 started by Chris A. Mattmann. > ParseUtil does not pass through the content's URL to the ParserFactory > -- > > Key: NUTCH-379 >

[jira] Updated: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory

2006-10-04 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-379?page=all ] Chris A. Mattmann updated NUTCH-379: Attachment: NUTCH-379.Mattmann.100406.patch.txt Small patch that at least gets started on fixing the larger issue of content urls and parser mapping, in

[jira] Updated: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly

2006-10-11 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-384?page=all ] Chris A. Mattmann updated NUTCH-384: Summary: Protocol-file plugin does not allow the parse plugins framework to operate properly (was: When using the file protocol one can not map a p

[jira] Assigned: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly

2006-10-11 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-384?page=all ] Chris A. Mattmann reassigned NUTCH-384: --- Assignee: Chris A. Mattmann > Protocol-file plugin does not allow the parse plugins framework to operate > properly >

[jira] Updated: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-406?page=all ] Chris A. Mattmann updated NUTCH-406: Assignee: Chris A. Mattmann > Metadata tries to write null values > --- > > Key: NUTCH-406 >

[jira] Work started: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-406?page=all ] Work on NUTCH-406 started by Chris A. Mattmann. > Metadata tries to write null values > --- > > Key: NUTCH-406 > URL: http://issues.apache.org/jira/browse/NUTCH-406 >

[jira] Commented: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-406?page=comments#action_12452275 ] Chris A. Mattmann commented on NUTCH-406: - Hi Andrzej, Doğacan, +1. I think it makes a lot of sense to just not include the null key in the Met container.

[jira] Commented: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-406?page=comments#action_12452285 ] Chris A. Mattmann commented on NUTCH-406: - Hi Doğacan, Looking at your latest patch, I'm not sure that it completely does the right behavior. For exampl

[jira] Commented: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-406?page=comments#action_12452286 ] Chris A. Mattmann commented on NUTCH-406: - Hi Andrzej, Yup, you caught the same thing as me. +1 for your solution. I will extend my above patch by writin

[jira] Resolved: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-406?page=all ] Chris A. Mattmann resolved NUTCH-406. - Fix Version/s: 0.9.0 Resolution: Fixed Fix applied and tested in trunk. > Metadata tries to write null values > --

[jira] Closed: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-406?page=all ] Chris A. Mattmann closed NUTCH-406. --- Patch applied to trunk: http://svn.apache.org/viewvc?view=rev&revision=478619 > Metadata tries to write null values > ---

[jira] Assigned: (NUTCH-390) Javadoc warnings

2006-11-24 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-390?page=all ] Chris A. Mattmann reassigned NUTCH-390: --- Assignee: Chris A. Mattmann > Javadoc warnings > > > Key: NUTCH-390 > URL: http://issues.apache.or

[jira] Assigned: (NUTCH-185) XMLParser is configurable xml parser plugin.

2006-11-24 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-185?page=all ] Chris A. Mattmann reassigned NUTCH-185: --- Assignee: Chris A. Mattmann > XMLParser is configurable xml parser plugin. > > > Key:

[jira] Commented: (NUTCH-407) Make Nutch crawling parent directories for file protocol configurable

2006-11-28 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-407?page=comments#action_12453934 ] Chris A. Mattmann commented on NUTCH-407: - I'm not entirely sure what the right answer to this is. One thing that I do know is that a colleague at my own wor

[jira] Created: (NUTCH-431) Move plugin specific properties out of nutch-site.xml and into specific conf files for plugins

2007-01-20 Thread Chris A. Mattmann (JIRA)
Move plugin specific properties out of nutch-site.xml and into specific conf files for plugins -- Key: NUTCH-431 URL: https://issues.apache.org/jira/browse/NUTCH-431

[jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time

2007-01-20 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466285 ] Chris A. Mattmann commented on NUTCH-353: - Doug, Let's see what you got. I'd be happy to take a look at it.

[jira] Assigned: (NUTCH-431) Move plugin specific properties out of nutch-site.xml and into specific conf files for plugins

2007-01-26 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-431: --- Assignee: Chris A. Mattmann > Move plugin specific properties out of nutch-site.xml an

[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2007-01-26 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467887 ] Chris A. Mattmann commented on NUTCH-258: - Guys, From recent conversations on the mailing list where Doug me

[jira] Work started: (NUTCH-390) Javadoc warnings

2007-01-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-390 started by Chris A. Mattmann. > Javadoc warnings > > > Key: NUTCH-390 > URL: https://issues.apache.org/j

[jira] Resolved: (NUTCH-390) Javadoc warnings

2007-01-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-390. - Resolution: Fixed Fix Version/s: 0.9.0 I've fixed this issue in the trunk. I had to

[jira] Closed: (NUTCH-390) Javadoc warnings

2007-01-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann closed NUTCH-390. --- Fixed in the trunk: http://svn.apache.org/viewvc?view=rev&revision=501315 > Javadoc warnings >

[jira] Work started: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly

2007-01-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-384 started by Chris A. Mattmann. > Protocol-file plugin does not allow the parse plugins framework to operate > properly >

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471780 ] Chris A. Mattmann commented on NUTCH-443: - Nutch Newbie, What exactly do you mean when you mention Apache

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471955 ] Chris A. Mattmann commented on NUTCH-444: - Hi Renaud, In fact, Rome does appear to be quite easy to use, giv

[jira] Assigned: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-443: --- Assignee: Chris A. Mattmann > allow parsers to return multiple Parse object, this will

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471956 ] Chris A. Mattmann commented on NUTCH-443: - I'll take the lead on evaluating these patches, and getting them in

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-02-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472005 ] Chris A. Mattmann commented on NUTCH-444: - Nutch Newbie: >From the commons-feedparser site: >http://jakarta.

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472692 ] Chris A. Mattmann commented on NUTCH-443: - Hi Nutch Newbie: I've already contacted Doğacan off-list and am cu

[jira] Assigned: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-02-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-444: --- Assignee: Chris A. Mattmann > Possibly use a different library to parse RSS feed for i

[jira] Resolved: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2007-02-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-258. - Resolution: Cannot Reproduce With recent API changes to Hadoop, and with the note from Sco

[jira] Closed: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2007-02-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann closed NUTCH-258. --- Bug not reproducible in current system, and no active users experiencing bug. > Once Nutch logs a

[jira] Assigned: (NUTCH-309) Uses commons logging Code Guards

2007-02-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-309: --- Assignee: Chris A. Mattmann (was: Jerome Charron) > Uses commons logging Code Guards

[jira] Work started: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-16 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-443 started by Chris A. Mattmann. > allow parsers to return multiple Parse object, this will speed up the rss > parser > ---

[jira] Updated: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-25 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-443: Attachment: NUTCH-443.022507.patch.txt Hi Folks, Attached is a candidate patch for commit

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-02-25 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475794 ] Chris A. Mattmann commented on NUTCH-444: - Hi Nick, Thanks for your insightful comments on this issue. I thi

[jira] Commented: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly

2007-03-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12479430 ] Chris A. Mattmann commented on NUTCH-384: - Thanks for your patch Heiko! I am looking at this right now. If the

[jira] Resolved: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly

2007-03-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-384. - Resolution: Fixed Fix Version/s: 0.9.0 Fixed tested in local crawl, works sufficien

[jira] Closed: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly

2007-03-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann closed NUTCH-384. --- Patch applied, with whitespace changes, and unit test (contributed by yours truly): http://svn.a

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-05-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494764 ] Chris A. Mattmann commented on NUTCH-444: - Hi Doğacan, Well I must say, with all the discussion that's gon

[jira] Reopened: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-05-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reopened NUTCH-443: - Assignee: Chris A. Mattmann (was: Andrzej Bialecki ) Per Doğacan's comment, we need to

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-05-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495381 ] Chris A. Mattmann commented on NUTCH-444: - Doğacan -- I will check this out tomorrow (Monday) night, latest Tu

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-05-30 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500133 ] Chris A. Mattmann commented on NUTCH-444: - Hi Guys, Okay, here is the way that I currently see this issue, a

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-06-16 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505501 ] Chris A. Mattmann commented on NUTCH-443: - Doğacan, Whoops :) This one kind of fell off the radar screen.

[jira] Commented: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object

2007-06-16 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505502 ] Chris A. Mattmann commented on NUTCH-485: - Doğacan, +1. As for your question, IMO, these type of minor change

[jira] Resolved: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-06-17 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-443. - Resolution: Fixed Patch tested and contributed by Dogacan. This update is a fix and semant

[jira] Closed: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-06-17 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann closed NUTCH-443. --- Patch applied to trunk: http://svn.apache.org/viewvc?rev=548076&view=rev > allow parsers to retu

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-06-17 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505607 ] Chris A. Mattmann commented on NUTCH-444: - Hi Nutch Newbie: I will take a look at this today, and take an act

[jira] Work started: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-06-17 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-444 started by Chris A. Mattmann. > Possibly use a different library to parse RSS feed for improved performance > and compatibility > --

[jira] Updated: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-06-17 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-444: Attachment: NUTCH-444.Mattmann.061707.patch.txt Hi Folks, Here is a patch that brings this

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-06-19 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506160 ] Chris A. Mattmann commented on NUTCH-444: - I agree with Doğacan in his above comments. I will go ahead and com

[jira] Closed: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-06-19 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann closed NUTCH-444. --- Patch applied to trunk: http://svn.apache.org/viewvc?rev=548730&view=rev > Possibly use a differ

[jira] Resolved: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-06-19 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-444. - Resolution: Fixed Applied my revised version of Dogacan's patch (posted earlier). Tested a

[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-06-19 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506187 ] Chris A. Mattmann commented on NUTCH-444: - Doğacan, Sounds great. Please, go ahead and update the plugin to

[jira] Created: (NUTCH-562) Port mime type framework to use Tika mime detection framework

2007-09-28 Thread Chris A. Mattmann (JIRA)
Port mime type framework to use Tika mime detection framework - Key: NUTCH-562 URL: https://issues.apache.org/jira/browse/NUTCH-562 Project: Nutch Issue Type: Improvement
