[jira] Commented: (NUTCH-798) Upgrade to SOLR1.4

2010-03-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843546#action_12843546 ] Sami Siren commented on NUTCH-798: -- +1 Upgrade to SOLR1.4 --

[jira] Created: (NUTCH-793) search.jsp compile errors

2010-02-15 Thread Sami Siren (JIRA)
search.jsp compile errors - Key: NUTCH-793 URL: https://issues.apache.org/jira/browse/NUTCH-793 Project: Nutch Issue Type: Bug Components: web gui Reporter: Sami Siren Assignee: Sami

[jira] Resolved: (NUTCH-793) search.jsp compile errors

2010-02-15 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-793. -- Resolution: Fixed committed a fix search.jsp compile errors -

[jira] Resolved: (NUTCH-788) search.jsp typo causing searches to fail

2010-02-15 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-788. -- Resolution: Fixed Fix Version/s: 1.1 Assignee: Sami Siren Thanks Sammy for the fix, I

[jira] Commented: (NUTCH-789) Improvements to Tika parser

2010-02-15 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833714#action_12833714 ] Sami Siren commented on NUTCH-789: -- It would be really useful to include the improvements

[jira] Created: (NUTCH-790) Some external javadoc links are broken

2010-02-14 Thread Sami Siren (JIRA)
Some external javadoc links are broken -- Key: NUTCH-790 URL: https://issues.apache.org/jira/browse/NUTCH-790 Project: Nutch Issue Type: Improvement Components: build Reporter: Sami

[jira] Updated: (NUTCH-790) Some external javadoc links are broken

2010-02-14 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-790: - Attachment: NUTCH-790.patch proposed patch, fixes links for lucene and hadoop, also updates j2se link to

[jira] Created: (NUTCH-791) External links for published javadocs are partially broken

2010-02-14 Thread Sami Siren (JIRA)
External links for published javadocs are partially broken -- Key: NUTCH-791 URL: https://issues.apache.org/jira/browse/NUTCH-791 Project: Nutch Issue Type: Bug Components:

[jira] Resolved: (NUTCH-790) Some external javadoc links are broken

2010-02-14 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-790. -- Resolution: Fixed Fix Version/s: 1.1 committed Some external javadoc links are broken

[jira] Updated: (NUTCH-792) Nutch version still contains 1.0

2010-02-14 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-792: - Attachment: NUTCH-792.patch bump version to 1.1-dev Nutch version still contains 1.0

[jira] Created: (NUTCH-792) Nutch version still contains 1.0

2010-02-14 Thread Sami Siren (JIRA)
Nutch version still contains 1.0 Key: NUTCH-792 URL: https://issues.apache.org/jira/browse/NUTCH-792 Project: Nutch Issue Type: Task Components: build Reporter: Sami Siren

[jira] Resolved: (NUTCH-792) Nutch version still contains 1.0

2010-02-14 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-792. -- Resolution: Fixed committed Nutch version still contains 1.0

[jira] Commented: (NUTCH-766) Tika parser

2010-02-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832406#action_12832406 ] Sami Siren commented on NUTCH-766: -- I suggest that we would still drive this a bit further

[jira] Updated: (NUTCH-766) Tika parser

2010-02-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-766: - Attachment: NutchTikaConfig.java Extended TikaConfig that is able to load parsers and can be used with

[jira] Updated: (NUTCH-766) Tika parser

2010-02-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-766: - Attachment: TikaParser.java Modified parser that can process package formats too. To get rid of the mime

[jira] Commented: (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0

2010-02-05 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830053#action_12830053 ] Sami Siren commented on NUTCH-673: -- {quote} Any plans or reasons not to upgrade to Lucene

[jira] Commented: (NUTCH-781) Update Tika to v0.6 for the MimeType detection

2010-02-02 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828561#action_12828561 ] Sami Siren commented on NUTCH-781: -- {quote} the version we had was the same as the one

[jira] Resolved: (NUTCH-775) Enhance Searcher interface

2010-02-01 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-775. -- Resolution: Fixed I committed this Enhance Searcher interface --

[jira] Commented: (NUTCH-781) Update Tika to v0.6 for the MimeType detection

2010-02-01 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828275#action_12828275 ] Sami Siren commented on NUTCH-781: -- did you forget to update conf/tika-mimetypes.xml?

[jira] Commented: (NUTCH-775) Enhance Searcher interface

2010-01-28 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806019#action_12806019 ] Sami Siren commented on NUTCH-775: -- If there are no objections I'll commit the proposed

[jira] Commented: (NUTCH-775) Enhance Searcher interface

2010-01-28 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806051#action_12806051 ] Sami Siren commented on NUTCH-775: -- {quote}IMHO this could go as it is ... one suggestion

[jira] Commented: (NUTCH-766) Tika parser

2010-01-27 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805661#action_12805661 ] Sami Siren commented on NUTCH-766: -- {quote} Sure, it's more of a configuration

[jira] Commented: (NUTCH-766) Tika parser

2010-01-25 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804448#action_12804448 ] Sami Siren commented on NUTCH-766: -- +1, I'm going to agree on this one here Julien. Other

[jira] Commented: (NUTCH-766) Tika parser

2010-01-22 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803664#action_12803664 ] Sami Siren commented on NUTCH-766: -- I took a brief look into the proposed patch, some

[jira] Commented: (NUTCH-766) Tika parser

2010-01-22 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803673#action_12803673 ] Sami Siren commented on NUTCH-766: -- Sure, but it would be silly to block the whole Tika

[jira] Updated: (NUTCH-775) Enhance Searcher interface

2009-12-30 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-775: - Attachment: NUTCH-775.patch I ended up changing the Query API instead since the changes were smaller from

[jira] Commented: (NUTCH-666) Analysis plugins for multiple language and new Language Identifier Tool

2009-12-16 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791829#action_12791829 ] Sami Siren commented on NUTCH-666: -- We should also consider switching to Tika for language

[jira] Created: (NUTCH-775) Enhance Searcher interface

2009-12-15 Thread Sami Siren (JIRA)
Enhance Searcher interface -- Key: NUTCH-775 URL: https://issues.apache.org/jira/browse/NUTCH-775 Project: Nutch Issue Type: Improvement Components: searcher Reporter: Sami Siren

[jira] Resolved: (NUTCH-743) Site search powered by Lucene/Solr

2009-07-02 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-743. -- Resolution: Fixed committed Site search powered by Lucene/Solr --

[jira] Created: (NUTCH-743) Site search powered by Lucene/Solr

2009-06-23 Thread Sami Siren (JIRA)
Site search powered by Lucene/Solr -- Key: NUTCH-743 URL: https://issues.apache.org/jira/browse/NUTCH-743 Project: Nutch Issue Type: New Feature Components: documentation Reporter: Sami

[jira] Updated: (NUTCH-743) Site search powered by Lucene/Solr

2009-06-23 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-743: - Attachment: NUTCH-743.patch If there are no objections I will commit this within a week or so. Site

[jira] Updated: (NUTCH-730) NPE in LinkRank if no nodes with which to create the WebGraph

2009-03-27 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-730: - Fix Version/s: (was: 1.0.0) NPE in LinkRank if no nodes with which to create the WebGraph

[jira] Resolved: (NUTCH-722) Nutch contains jars that we cannot redistribute

2009-03-23 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-722. -- Resolution: Fixed removed the jars and added note about this in README.txt Nutch contains jars that

[jira] Commented: (NUTCH-728) Improve nutch release packaging

2009-03-20 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683814#action_12683814 ] Sami Siren commented on NUTCH-728: -- not really, it just happens to be the mirror I use.

[jira] Created: (NUTCH-722) Nutch contains jars that we cannot redistribute

2009-03-19 Thread Sami Siren (JIRA)
Nutch contains jars that we cannot redistribute --- Key: NUTCH-722 URL: https://issues.apache.org/jira/browse/NUTCH-722 Project: Nutch Issue Type: Bug Reporter: Sami Siren

[jira] Created: (NUTCH-723) LICENCE.txt is lacking info that should be there

2009-03-19 Thread Sami Siren (JIRA)
LICENCE.txt is lacking info that should be there Key: NUTCH-723 URL: https://issues.apache.org/jira/browse/NUTCH-723 Project: Nutch Issue Type: Bug Components: build Affects

[jira] Created: (NUTCH-725) NOTICE.txt is lacking info that should be there

2009-03-19 Thread Sami Siren (JIRA)
NOTICE.txt is lacking info that should be there --- Key: NUTCH-725 URL: https://issues.apache.org/jira/browse/NUTCH-725 Project: Nutch Issue Type: Bug Affects Versions: 1.0.0

[jira] Created: (NUTCH-726) README.txt is lacking info that should be there

2009-03-19 Thread Sami Siren (JIRA)
README.txt is lacking info that should be there --- Key: NUTCH-726 URL: https://issues.apache.org/jira/browse/NUTCH-726 Project: Nutch Issue Type: Bug Components: build Affects

[jira] Created: (NUTCH-727) Add KEYS file to release artifact

2009-03-19 Thread Sami Siren (JIRA)
Add KEYS file to release artifact - Key: NUTCH-727 URL: https://issues.apache.org/jira/browse/NUTCH-727 Project: Nutch Issue Type: Bug Affects Versions: 1.0.0 Reporter: Sami Siren comment

[jira] Resolved: (NUTCH-726) README.txt is lacking info that should be there

2009-03-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-726. -- Resolution: Fixed Fix Version/s: 1.0.0 committed README.txt is lacking info that should be

[jira] Resolved: (NUTCH-724) Drop the JAI libraries

2009-03-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-724. -- Resolution: Duplicate Drop the JAI libraries -- Key: NUTCH-724

[jira] Commented: (NUTCH-722) Nutch contains jars that we cannot redistribute

2009-03-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683482#action_12683482 ] Sami Siren commented on NUTCH-722: -- +1, I am fine with this solution too Nutch contains

[jira] Resolved: (NUTCH-725) NOTICE.txt is lacking info that should be there

2009-03-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-725. -- Resolution: Fixed went through the libs and added copyright notices NOTICE.txt is lacking info that

[jira] Resolved: (NUTCH-723) LICENCE.txt is lacking info that should be there

2009-03-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-723. -- Resolution: Fixed added licenses of 3rd party software LICENCE.txt is lacking info that should be

[jira] Issue Comment Edited: (NUTCH-723) LICENCE.txt is lacking info that should be there

2009-03-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683618#action_12683618 ] Sami Siren edited comment on NUTCH-723 at 3/19/09 2:11 PM: --- added

[jira] Updated: (NUTCH-728) Improve nutch release packaging

2009-03-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-728: - Attachment: NUTCH-728.patch add simple target to generate source release tgz from svn tag -did not touch

[jira] Commented: (NUTCH-722) Nutch contains jars that we cannot redistribute

2009-03-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683634#action_12683634 ] Sami Siren commented on NUTCH-722: -- if there are no objections I will commit this change

[jira] Resolved: (NUTCH-715) Subcollection plugin doesn't work with default subcollections.xml file

2009-03-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-715. -- Resolution: Fixed committed, thanks Dmitry! Subcollection plugin doesn't work with default

[jira] Commented: (NUTCH-705) parse-rtf plugin

2009-03-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680411#action_12680411 ] Sami Siren commented on NUTCH-705: -- I think we should start looking at Apache Tika for most

[jira] Created: (NUTCH-717) Make Nutch Solr integration easier

2009-03-10 Thread Sami Siren (JIRA)
Make Nutch Solr integration easier -- Key: NUTCH-717 URL: https://issues.apache.org/jira/browse/NUTCH-717 Project: Nutch Issue Type: New Feature Reporter: Sami Siren Fix For: 1.1

[jira] Commented: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

2009-03-04 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678691#action_12678691 ] Sami Siren commented on NUTCH-711: -- +1 Indexer failing after upgrade to Hadoop 0.19.1

[jira] Updated: (NUTCH-700) Neko1.9.11 goes into a loop

2009-03-02 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-700: - Fix Version/s: 1.0.0 Assignee: Sami Siren This one just bit me - the effect is that parsing

[jira] Resolved: (NUTCH-700) Neko1.9.11 goes into a loop

2009-03-02 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-700. -- Resolution: Fixed reverted to 0.9.4 Neko1.9.11 goes into a loop ---

[jira] Resolved: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-03-02 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-669. -- Resolution: Fixed replaced fetcher with fetcher2 Consolidate code for Fetcher and Fetcher2

[jira] Commented: (NUTCH-705) parse-rtf plugin

2009-02-27 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677508#action_12677508 ] Sami Siren commented on NUTCH-705: -- I think that the patch contains some lgpl code that we

[jira] Resolved: (NUTCH-699) Add an official solr schema for solr integration

2009-02-26 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-699. -- Resolution: Fixed committed Add an official solr schema for solr integration

[jira] Assigned: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-02-26 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren reassigned NUTCH-669: Assignee: Sami Siren Consolidate code for Fetcher and Fetcher2

[jira] Commented: (NUTCH-703) Upgrade to Hadoop 0.19.1

2009-02-26 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677266#action_12677266 ] Sami Siren commented on NUTCH-703: -- Andrzej, are you working with this now? Upgrade to

[jira] Resolved: (NUTCH-247) robot parser to restrict.

2009-02-24 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-247. -- Resolution: Fixed Assignee: Sami Siren (was: Dennis Kubes) committed this - added checking to

[jira] Created: (NUTCH-701) replace Fetcher with Fetcher2

2009-02-24 Thread Sami Siren (JIRA)
replace Fetcher with Fetcher2 - Key: NUTCH-701 URL: https://issues.apache.org/jira/browse/NUTCH-701 Project: Nutch Issue Type: Bug Components: fetcher Reporter: Sami Siren

[jira] Updated: (NUTCH-701) Replace Fetcher with Fetcher2

2009-02-24 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-701: - Summary: Replace Fetcher with Fetcher2 (was: replace Fetcher with Fetcher2) Replace Fetcher with

[jira] Resolved: (NUTCH-698) CrawlDb is corrupted after a few crawl cycles

2009-02-24 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-698. -- Resolution: Fixed committed. thanks guys CrawlDb is corrupted after a few crawl cycles

[jira] Commented: (NUTCH-699) Add an official solr schema for solr integration

2009-02-24 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676233#action_12676233 ] Sami Siren commented on NUTCH-699: -- We could put it under conf/ ? Add an official solr

[jira] Resolved: (NUTCH-701) Replace Fetcher with Fetcher2

2009-02-24 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-701. -- Resolution: Duplicate Replace Fetcher with Fetcher2 -

[jira] Updated: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-02-24 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-669: - Fix Version/s: (was: 1.1) 1.0.0 Moving this back to 1.0 Are you close with your

[jira] Resolved: (NUTCH-694) Distributed Search Server fails

2009-02-22 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-694. -- Resolution: Fixed Committed. Thanks for testing it. Distributed Search Server fails

[jira] Commented: (NUTCH-477) Extend URLFilters to support different filtering chains

2009-02-22 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675793#action_12675793 ] Sami Siren commented on NUTCH-477: -- It's your call. IMO the whole URLFIlters - URLFIlter,

[jira] Updated: (NUTCH-694) Distributed Search Server fails

2009-02-20 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-694: - Attachment: NUTCH-694-2.patch I rechecked this again and there was also something else wrong, I am

[jira] Updated: (NUTCH-694) Distributed Search Server fails

2009-02-20 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-694: - Patch Info: [Patch Available] Assignee: Sami Siren Distributed Search Server fails

[jira] Updated: (NUTCH-573) Multiple Domains - Query Search

2009-02-20 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-573: - Patch Info: [Patch Available] Multiple Domains - Query Search ---

[jira] Updated: (NUTCH-477) Extend URLFilters to support different filtering chains

2009-02-20 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-477: - Patch Info: [Patch Available] Extend URLFilters to support different filtering chains

[jira] Updated: (NUTCH-694) Distributed Search Server fails

2009-02-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-694: - Attachment: NUTCH-694.patch This fixed the problem for me. Distributed Search Server fails

[jira] Resolved: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin

2009-02-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-695. -- Resolution: Fixed Assignee: Sami Siren committed, thanks incorrect mime type detection by

[jira] Commented: (NUTCH-694) Distributed Search Server fails

2009-02-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674964#action_12674964 ] Sami Siren commented on NUTCH-694: -- Strange, did you update both ends (the server and the

[jira] Resolved: (NUTCH-687) Add RAT

2009-02-18 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-687. -- Resolution: Fixed Fix Version/s: 1.0.0 committed Add RAT --- Key:

[jira] Commented: (NUTCH-689) Swf parser doesn't seem to handle relative links

2009-02-18 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674520#action_12674520 ] Sami Siren commented on NUTCH-689: -- for some reason I cannot apply the patch: patching

[jira] Resolved: (NUTCH-591) StringIndexOutOfBoundsException when extracting text from a Word document.

2009-02-18 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-591. -- Resolution: Duplicate duplicate of NUTCH-691 StringIndexOutOfBoundsException when extracting text

[jira] Resolved: (NUTCH-688) Fix missing/wrong headers in source files

2009-02-18 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-688. -- Resolution: Fixed I think we are done with this. Fix missing/wrong headers in source files

[jira] Resolved: (NUTCH-691) Update jakarta poi jars to the most relevant version

2009-02-18 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-691. -- Resolution: Fixed Fix Version/s: 1.0.0 committed, Thanks Dmitry Update jakarta poi jars to the

[jira] Resolved: (NUTCH-563) Include custom fields in BasicQueryFilter

2009-02-18 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-563. -- Resolution: Fixed Assignee: Sami Siren committed, thanks Include custom fields in

[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19

2009-02-18 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674603#action_12674603 ] Sami Siren commented on NUTCH-692: -- Have you seen this outside of EC2? Only in multinode

[jira] Updated: (NUTCH-583) FeedParser empty links for items

2009-02-18 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-583: - Fix Version/s: (was: 1.0.0) 1.1 pushing this to 1.1 FeedParser empty links for

[jira] Updated: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-631: - Attachment: NUTCH-631.patch Attaching a patch that fixes the problem as proposed, If there are no

[jira] Created: (NUTCH-687) Add RAT

2009-02-17 Thread Sami Siren (JIRA)
Add RAT --- Key: NUTCH-687 URL: https://issues.apache.org/jira/browse/NUTCH-687 Project: Nutch Issue Type: Improvement Reporter: Sami Siren Assignee: Sami Siren Priority: Minor Attachments:

[jira] Updated: (NUTCH-687) Add RAT

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-687: - Attachment: NUTCH-687.patch Add RAT --- Key: NUTCH-687 URL:

[jira] Resolved: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-631. -- Resolution: Fixed Assignee: Sami Siren (was: Chris A. Mattmann) committed, thanks

[jira] Resolved: (NUTCH-582) Add missing type parameters

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-582. -- Resolution: Fixed yep, all of this has been committed Add missing type parameters

[jira] Updated: (NUTCH-86) LanguageIdentifier API enhancements

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-86?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-86: Fix Version/s: (was: 1.0.0) removing from 1.0 queue since there has been no activity lately

[jira] Updated: (NUTCH-609) Allow Plugins to be Loaded from Jar File(s)

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-609: - Fix Version/s: (was: 1.0.0) 1.1 pushing this to 1.1, feel free to put back if

[jira] Updated: (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-469: - Fix Version/s: (was: 1.0.0) 1.1 pushing this to 1.1 changes to geoPosition

[jira] Updated: (NUTCH-309) Uses commons logging Code Guards

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-309: - Fix Version/s: (was: 1.0.0) 1.1 pushing this to 1.1 Uses commons logging Code

[jira] Commented: (NUTCH-689) Swf parser doesn't seem to handle relative links

2009-02-17 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674360#action_12674360 ] Sami Siren commented on NUTCH-689: -- about development: check url

[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-06-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603890#action_12603890 ] Sami Siren commented on NUTCH-621: -- I agree, seems to me that we're in the same situation as

[jira] Commented: (NUTCH-602) Allow configurable number of handlers for search servers

2008-02-07 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566779#action_12566779 ] Sami Siren commented on NUTCH-602: -- +1 Allow configurable number of handlers for search

[jira] Resolved: (NUTCH-580) Remove deprecated hadoop api calls (FS)

2008-01-19 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-580. -- Resolution: Fixed Fix Version/s: 1.0.0 Committed. Remove deprecated hadoop api calls (FS)

[jira] Created: (NUTCH-580) Remove deprecated hadoop api calls (FS)

2007-11-21 Thread Sami Siren (JIRA)
Remove deprecated hadoop api calls (FS) --- Key: NUTCH-580 URL: https://issues.apache.org/jira/browse/NUTCH-580 Project: Nutch Issue Type: Improvement Affects Versions: 0.9.0 Reporter:

[jira] Updated: (NUTCH-582) Add missing type parameters

2007-11-21 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-582: - Attachment: typeparams.patch Add missing type parameters ---

[jira] Created: (NUTCH-582) Add missing type parameters

2007-11-21 Thread Sami Siren (JIRA)
Add missing type parameters --- Key: NUTCH-582 URL: https://issues.apache.org/jira/browse/NUTCH-582 Project: Nutch Issue Type: Improvement Reporter: Sami Siren Assignee: Sami Siren

[jira] Commented: (NUTCH-568) Indexer does not update the Lucene TITLE field

2007-10-22 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536756 ] Sami Siren commented on NUTCH-568: -- There is a BOM (Byte Order Mark) in the beginning of the file [feff] that seems

[jira] Commented: (NUTCH-565) Arc File to Nutch Segments Converter

2007-10-12 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534364 ] Sami Siren commented on NUTCH-565: -- I didn't actually test this, but it looks like a useful addition to Nutch, so +1
