[jira] Work started: (NUTCH-816) Add zip target to build.xml

2010-05-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-816 started by Chris A. Mattmann. Add zip target to build.xml --- Key: NUTCH-816 URL:

[jira] Resolved: (NUTCH-816) Add zip target to build.xml

2010-05-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-816. - Resolution: Fixed - fixed in r942427 Add zip target to build.xml

[jira] Created: (NUTCH-816) Add zip target to build.xml

2010-04-27 Thread Chris A. Mattmann (JIRA)
Add zip target to build.xml --- Key: NUTCH-816 URL: https://issues.apache.org/jira/browse/NUTCH-816 Project: Nutch Issue Type: Improvement Components: build Affects Versions: 1.0.0 Environment:

[jira] Work started: (NUTCH-812) Crawl.java incorrectly uses the Generator API resulting in NPE

2010-04-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-812 started by Chris A. Mattmann. Crawl.java incorrectly uses the Generator API resulting in NPE

[jira] Assigned: (NUTCH-812) Crawl.java incorrectly uses the Generator API resulting in NPE

2010-04-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-812: --- Assignee: Chris A. Mattmann Crawl.java incorrectly uses the Generator API resulting

[jira] Resolved: (NUTCH-812) Crawl.java incorrectly uses the Generator API resulting in NPE

2010-04-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-812. - Fix Version/s: 1.1 Resolution: Fixed - fixed in r935453. Thanks, Phil and Andrzej!

[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java

2010-04-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854767#action_12854767 ] Chris A. Mattmann commented on NUTCH-570: - Hi Otis: I think your logic perfectly

[jira] Commented: (NUTCH-789) Improvements to Tika parser

2010-04-04 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853285#action_12853285 ] Chris A. Mattmann commented on NUTCH-789: - Hey Julien, Tika 0.7 is available from

[jira] Commented: (NUTCH-789) Improvements to Tika parser

2010-04-03 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853212#action_12853212 ] Chris A. Mattmann commented on NUTCH-789: - Hey Julien -- okey dok, Tika 0.7 has been

[jira] Updated: (NUTCH-249) black- white list url filtering

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-249: Fix Version/s: (was: 1.1) - push out per http://bit.ly/c7tBv9 black- white list url

[jira] Updated: (NUTCH-309) Uses commons logging Code Guards

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-309: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Uses commons

[jira] Updated: (NUTCH-763) Separate configuration files from resources to be included in the job file

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-763: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Separate

[jira] Updated: (NUTCH-577) Use explicit tika-config.xml file to enable mime magic detection to be turned on and off

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-577: Due Date: 30/Nov/07 (was: 30/Nov/07) Fix Version/s: (was: 1.1) - pushing this

[jira] Updated: (NUTCH-310) Review Log Levels

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-310: Fix Version/s: (was: 1.1) Assignee: Chris A. Mattmann (was: Jerome Charron) -

[jira] Updated: (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-673: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Upgrade the

[jira] Updated: (NUTCH-664) Possibility to update already stored documents.

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-664: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Possibility to

[jira] Updated: (NUTCH-750) HtmlParser plugin - page title extraction

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-750: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 HtmlParser

[jira] Updated: (NUTCH-564) External parser supports encoding attribute

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-564: Patch Info: [Patch Available] Fix Version/s: (was: 1.1) - pushing this out per

[jira] Updated: (NUTCH-477) Extend URLFilters to support different filtering chains

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-477: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Extend

[jira] Updated: (NUTCH-251) Administration GUI

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-251: Patch Info: [Patch Available] Fix Version/s: (was: 1.1) - pushing this out per

[jira] Updated: (NUTCH-609) Allow Plugins to be Loaded from Jar File(s)

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-609: Due Date: 13/Feb/08 (was: 13/Feb/08) Patch Info: [Patch Available] Fix

[jira] Resolved: (NUTCH-794) Language Identification must use check the parse metadata for language values

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-794. - Resolution: Fixed @julien -- I think this issue has been fixed in Tika right? If not,

[jira] Updated: (NUTCH-578) URL fetched with 403 is generated over and over again

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-578: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 URL fetched

[jira] Updated: (NUTCH-540) some problem about the Nutch cache

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-540: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 some problem

[jira] Updated: (NUTCH-455) dedup on tokenized fields is faulty

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-455: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 dedup on

[jira] Updated: (NUTCH-747) injectIndex metadatas and inherit these metadatas to all matching suburls

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-747: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 injectIndex

[jira] Updated: (NUTCH-479) Support for OR queries

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-479: Patch Info: [Patch Available] Fix Version/s: (was: 1.1) - pushing this out per

[jira] Updated: (NUTCH-677) Segment merge filering based on segment content

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-677: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Segment merge

[jira] Updated: (NUTCH-774) Retry interval in crawl date is set to 0

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-774: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Retry interval

[jira] Updated: (NUTCH-460) RDF parser plugin

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-460: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 RDF parser

[jira] Updated: (NUTCH-460) RDF parser plugin

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-460: Patch Info: [Patch Available] - pushing this out per http://bit.ly/c7tBv9 RDF parser

[jira] Updated: (NUTCH-729) NPE in FieldIndexer when BasicFields url doesn't exist

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-729: Due Date: 26/Mar/09 (was: 26/Mar/09) Patch Info: [Patch Available] Fix

[jira] Updated: (NUTCH-573) Multiple Domains - Query Search

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-573: - pushing this out per http://bit.ly/c7tBv9 Multiple Domains - Query Search

[jira] Updated: (NUTCH-717) Make Nutch Solr integration easier

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-717: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Make Nutch Solr

[jira] Updated: (NUTCH-541) Index url field untokenized

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-541: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Index url field

[jira] Updated: (NUTCH-628) Host database to keep track of host-level information

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-628: Patch Info: [Patch Available] Fix Version/s: (was: 1.1) - pushing this out per

[jira] Updated: (NUTCH-650) Hbase Integration

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-650: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Hbase

[jira] Updated: (NUTCH-583) FeedParser empty links for items

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-583: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 FeedParser

[jira] Updated: (NUTCH-666) Analysis plugins for multiple language and new Language Identifier Tool

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-666: Due Date: 27/Nov/08 (was: 27/Nov/08) Fix Version/s: (was: 1.1) - pushing this

[jira] Updated: (NUTCH-666) Analysis plugins for multiple language and new Language Identifier Tool

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-666: Patch Info: [Patch Available] Analysis plugins for multiple language and new Language

[jira] Updated: (NUTCH-475) Adaptive crawl delay

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-475: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Adaptive crawl

[jira] Updated: (NUTCH-771) Add WebGraph classes to the bin/nutch script

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-771: Fix Version/s: (was: 1.1) - pushing this out per http://bit.ly/c7tBv9 Add WebGraph

[jira] Commented: (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852047#action_12852047 ] Chris A. Mattmann commented on NUTCH-673: - Folks: if you get time to put together a

[jira] Commented: (NUTCH-789) Improvements to Tika parser

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852048#action_12852048 ] Chris A. Mattmann commented on NUTCH-789: - Folks, I'm going to put together an RC

[jira] Commented: (NUTCH-794) Language Identification must use check the parse metadata for language values

2010-03-31 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852101#action_12852101 ] Chris A. Mattmann commented on NUTCH-794: - Hey Julien, yepper, I posted an RC of

[jira] Commented: (NUTCH-801) Remove RTF and MP3 parse plugins

2010-03-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843576#action_12843576 ] Chris A. Mattmann commented on NUTCH-801: - +1 on this from me, Julien. Sounds good.

[jira] Commented: (NUTCH-790) Some external javadoc links are broken

2010-02-14 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833591#action_12833591 ] Chris A. Mattmann commented on NUTCH-790: - +1 to commit this. Thanks, Sami! Some

[jira] Commented: (NUTCH-766) Tika parser

2010-02-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832565#action_12832565 ] Chris A. Mattmann commented on NUTCH-766: - Hi Julien: {quote} @Chris : I just did a

[jira] Commented: (NUTCH-766) Tika parser

2010-02-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832588#action_12832588 ] Chris A. Mattmann commented on NUTCH-766: - @Julien: Sigh, no I didn't! :( That's

[jira] Commented: (NUTCH-766) Tika parser

2010-02-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832866#action_12832866 ] Chris A. Mattmann commented on NUTCH-766: - - forgot to add in dep libs, added in

[jira] Commented: (NUTCH-766) Tika parser

2010-02-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832255#action_12832255 ] Chris A. Mattmann commented on NUTCH-766: - {quote} +1 to commit this... {quote}

[jira] Commented: (NUTCH-766) Tika parser

2010-02-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832398#action_12832398 ] Chris A. Mattmann commented on NUTCH-766: - I'm going to hold off on committing this

[jira] Commented: (NUTCH-766) Tika parser

2010-01-25 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804546#action_12804546 ] Chris A. Mattmann commented on NUTCH-766: - Hi Sami: {quote} Chris, can you please

[jira] Commented: (NUTCH-766) Tika parser

2010-01-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803709#action_12803709 ] Chris A. Mattmann commented on NUTCH-766: - {quote} Sure, but it would be silly to

[jira] Issue Comment Edited: (NUTCH-766) Tika parser

2010-01-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803709#action_12803709 ] Chris A. Mattmann edited comment on NUTCH-766 at 1/22/10 2:38 PM:

[jira] Commented: (NUTCH-766) Tika parser

2010-01-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798718#action_12798718 ] Chris A. Mattmann commented on NUTCH-766: - Hi Julien: I have had a look and was

[jira] Created: (NUTCH-777) Upgrading to jetty6 broke unit tests

2009-12-18 Thread Chris A. Mattmann (JIRA)
Upgrading to jetty6 broke unit tests Key: NUTCH-777 URL: https://issues.apache.org/jira/browse/NUTCH-777 Project: Nutch Issue Type: Bug Components: build Environment: My MacBook pro,

[jira] Work started: (NUTCH-777) Upgrading to jetty6 broke unit tests

2009-12-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-777 started by Chris A. Mattmann. Upgrading to jetty6 broke unit tests Key: NUTCH-777

[jira] Commented: (NUTCH-777) Upgrading to jetty6 broke unit tests

2009-12-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792565#action_12792565 ] Chris A. Mattmann commented on NUTCH-777: - Here is what I was getting with the

[jira] Commented: (NUTCH-777) Upgrading to jetty6 broke unit tests

2009-12-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792579#action_12792579 ] Chris A. Mattmann commented on NUTCH-777: - Okay with the changes I'm about to

[jira] Updated: (NUTCH-766) Tika parser

2009-12-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-766: Fix Version/s: 1.1 Tika parser --- Key: NUTCH-766

[jira] Assigned: (NUTCH-766) Tika parser

2009-12-15 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-766: --- Assignee: Chris A. Mattmann Tika parser --- Key:

[jira] Work started: (NUTCH-766) Tika parser

2009-12-15 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-766 started by Chris A. Mattmann. Tika parser --- Key: NUTCH-766 URL:

[jira] Resolved: (NUTCH-185) XMLParser is configurable xml parser plugin.

2009-11-25 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-185. - Resolution: Won't Fix Fix Version/s: 1.1 See comments related to NUTCH-767 in this

[jira] Commented: (NUTCH-767) Update version of Tika for the MimeType detection

2009-11-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779476#action_12779476 ] Chris A. Mattmann commented on NUTCH-767: - Hi Julien, Thanks for pushing this

[jira] Commented: (NUTCH-714) Need a SFTP and SCP Protocol Handler

2009-03-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680348#action_12680348 ] Chris A. Mattmann commented on NUTCH-714: - Hi Sanjoy, When you get a patch, let me

[jira] Assigned: (NUTCH-714) Need a SFTP and SCP Protocol Handler

2009-03-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-714: --- Assignee: Chris A. Mattmann Need a SFTP and SCP Protocol Handler

[jira] Commented: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException

2009-02-17 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674219#action_12674219 ] Chris A. Mattmann commented on NUTCH-631: - Sami, +1. Sorry I didn't have time to get

[jira] Assigned: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException

2009-02-02 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-631: --- Assignee: Chris A. Mattmann MoreIndexingFilter fails with NoSuchElementException

[jira] Work started: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException

2009-02-02 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-631 started by Chris A. Mattmann. MoreIndexingFilter fails with NoSuchElementException

[jira] Resolved: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-09-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-621. - Resolution: Fixed - resolved in r699866 Nutch needs to declare it's crypto usage

[jira] Updated: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-09-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-621: Affects Version/s: 0.7 0.7.1 0.7.2

[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-09-28 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635241#action_12635241 ] Chris A. Mattmann commented on NUTCH-621: - Folks, Based on Jukka's comments, I've

[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-09-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630445#action_12630445 ] Chris A. Mattmann commented on NUTCH-621: - Grant: Great, thanks. Okay, once you get

[jira] Updated: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-09-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-621: Attachment: NUTCH-621.step1.Mattmann.091008.patch.txt Hey Grant: Sorry about this, but I

[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-06-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604409#action_12604409 ] Chris A. Mattmann commented on NUTCH-621: - Hi Grant: Thanks. The code does exist in

[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-06-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603884#action_12603884 ] Chris A. Mattmann commented on NUTCH-621: - Hi Grant: Thanks for the poke on this. I

[jira] Updated: (NUTCH-618) Tika error Media type alias already exists

2008-06-01 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-618: Attachment: NUTCH-618.Mattmann.patch.060108.txt Hey Guys: Okey dok: here's a candidate

[jira] Work started: (NUTCH-618) Tika error Media type alias already exists

2008-03-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-618 started by Chris A. Mattmann. Tika error Media type alias already exists Key: NUTCH-618

[jira] Commented: (NUTCH-618) Tika error Media type alias already exists

2008-03-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576051#action_12576051 ] Chris A. Mattmann commented on NUTCH-618: - Hey Andrzej: bq. I noticed also another

[jira] Closed: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-12 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann closed NUTCH-608. --- - Patch applied to trunk: http://svn.apache.org/viewvc?rev=620811&view=rev Thanks for the

[jira] Resolved: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-12 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-608. - Resolution: Fixed - added MimeUtil facade class to insulate Nutch from underlying mime

[jira] Updated: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-608: Attachment: NUTCH-608.Mattmann.021108.patch.v4.txt For completeness sake, an attached patch

[jira] Updated: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-608: Attachment: NUTCH-608.Mattmann.021108.patch.v3.txt Hi Andrzej, Thanks for your comments.

[jira] Updated: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-608: Attachment: NUTCH-608.Mattmann.021108.patch.v2.txt Hi Andrzej: Good idea. The facade

[jira] Updated: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-608: Attachment: NUTCH-608.Mattmann.021108.patch.txt - updated patch, removes unintentional

[jira] Updated: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-608: Attachment: tika-0.1-incubating.jar apache tika 0.1-incubating Upgrade nutch to use

[jira] Updated: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-608: Attachment: NUTCH-608.Mattmann.021008.patch.txt Initial patch, horrendously late :)

[jira] Commented: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567384#action_12567384 ] Chris A. Mattmann commented on NUTCH-608: - If there are no objections, I'd like to

[jira] Work started: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-608 started by Chris A. Mattmann. Upgrade nutch to use released apache-tika-0.1-incubating

[jira] Commented: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567395#action_12567395 ] Chris A. Mattmann commented on NUTCH-608: - Sorry folks, the patch didn't go through

[jira] Commented: (NUTCH-609) Allow Plugins to be Loaded from Jar File(s)

2008-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567407#action_12567407 ] Chris A. Mattmann commented on NUTCH-609: - bq. the downside to this is we could end

[jira] Commented: (NUTCH-607) Update build.xml to include tika jar in war file

2008-02-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567263#action_12567263 ] Chris A. Mattmann commented on NUTCH-607: - +1, looks good! Update build.xml to

[jira] Created: (NUTCH-608) Upgrade nutch to use released apache-tika-0.1-incubating

2008-02-08 Thread Chris A. Mattmann (JIRA)
Upgrade nutch to use released apache-tika-0.1-incubating Key: NUTCH-608 URL: https://issues.apache.org/jira/browse/NUTCH-608 Project: Nutch Issue Type: Improvement

[jira] Created: (NUTCH-577) Use explicit tika-config.xml file to enable mime magic detection to be turned on and off

2007-11-17 Thread Chris A. Mattmann (JIRA)
Use explicit tika-config.xml file to enable mime magic detection to be turned on and off Key: NUTCH-577 URL: https://issues.apache.org/jira/browse/NUTCH-577

[jira] Commented: (NUTCH-547) Redirection handling: YahooSlurp's algorithm

2007-11-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540884 ] Chris A. Mattmann commented on NUTCH-547: - +1: without having tested it, it improves an existing significant

[jira] Commented: (NUTCH-574) Including inlink anchor text in index can create irrelevant search results.

2007-11-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540877 ] Chris A. Mattmann commented on NUTCH-574: - IMHO what Dennis suggest is fine so long as it's a configurable

[jira] Resolved: (NUTCH-562) Port mime type framework to use Tika mime detection framework

2007-10-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-562. - Resolution: Fixed - Applied patch, with minor changes to use static version of MimeUtils

[jira] Closed: (NUTCH-562) Port mime type framework to use Tika mime detection framework

2007-10-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann closed NUTCH-562. --- - Patch applied to trunk in r583016 Port mime type framework to use Tika mime detection

[jira] Updated: (NUTCH-562) Port mime type framework to use Tika mime detection framework

2007-10-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-562: Attachment: NUTCH-562.Mattmann.patch.txt Initial patch for comments: 1. This patch removes
