Java.io.IOException with multiple <copyField/> directives
Hi! I've run into some strange behaviour while using Nutch (solrindexer) together with Solr 1.4.1. I'd like to copy the 'title' and 'content' fields to another field, say, 'foo'. In my first attempt I added the <copyField/> directives to schema.xml and got the Java exception, so I removed them from schema.xml. In my second attempt I added the <copyField/> directives to the 'solrindex-mapping.xml' file and ran into the same exception again! Is this a known issue or have I stumbled into unknown territory? Any workarounds? Many thanks! /Peter
Re: Java.io.IOException with multiple <copyField/> directives
On 2010-12-03 09:52, Peter Litsegård wrote:
> Hi! I've run into a strange behaviour while using Nutch (solrindexer) together with Solr 1.4.1. I'd like to copy the 'title' and 'content' field to another field, say, 'foo'. In my first attempt I added the <copyField/> directives in schema.xml and got the java exception so I removed them from schema.xml. In my second attempt I added the <copyField/> directives to the 'solrindex-mapping.xml' file and ran into the same exception again! Is this a known issue or have I stumbled into unknown territory? Any workarounds?

I suspect that the field type declared in your schema.xml is not multiValued. What was the exception?

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: Java.io.IOException with multiple <copyField/> directives
Hi Andrzej! The exception was java.io.IOException. Of course I forgot to make the dest field multivalued. Embarrassing :-) I'll update the schema.xml file and try again... Stay tuned! Cheers, /Peter

-----Original Message-----
From: Andrzej Bialecki [mailto:a...@getopt.org]
Sent: 3 December 2010 10:42
To: dev@nutch.apache.org
Subject: Re: Java.io.IOException with multiple <copyField/> directives

> I suspect that the field type declared in your schema.xml is not multiValued. What was the exception?
Re: Java.io.IOException with multiple <copyField/> directives
Hi Andrzej! OF COURSE I'd forgotten to set 'multiValued=true'! Thanks for pointing this out! Cheers, /Peter
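For anyone finding this thread later, the failure mode can be pictured with a toy model. The sketch below is not Solr's code; the class and method names (CopyFieldSketch, copyInto) are made up for illustration, and the exception type is an assumption. It only shows why two sources copied into a single-valued destination collide, while a multiValued destination accepts both.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical names, not Solr's implementation).
class CopyFieldSketch {

  // Copies the values of the given source fields into one destination field.
  // A single-valued destination rejects a second value.
  static List<String> copyInto(Map<String, String> doc, List<String> sources,
                               String dest, boolean destMultiValued) {
    List<String> destValues = new ArrayList<>();
    for (String source : sources) {
      String value = doc.get(source);
      if (value == null) {
        continue; // nothing to copy from this source
      }
      if (!destMultiValued && !destValues.isEmpty()) {
        throw new IllegalStateException(
            "multiple values for single-valued field '" + dest + "'");
      }
      destValues.add(value);
    }
    return destValues;
  }

  public static void main(String[] args) {
    Map<String, String> doc = new HashMap<>();
    doc.put("title", "Some title");
    doc.put("content", "Some page text");
    List<String> sources = Arrays.asList("title", "content");

    // multiValued destination: both values are kept
    System.out.println(copyInto(doc, sources, "foo", true));

    // single-valued destination: the second copy is rejected
    try {
      copyInto(doc, sources, "foo", false);
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

In the real setup the corresponding change is the one Andrzej pointed at: the destination field in schema.xml needs multiValued set to true when more than one source is copied into it.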
[jira] Created: (NUTCH-944) Increase the number of elements to look for URLs and add the ability to specify multiple attributes by elements
Increase the number of elements to look for URLs and add the ability to specify multiple attributes by elements
---
Key: NUTCH-944
URL: https://issues.apache.org/jira/browse/NUTCH-944
Project: Nutch
Issue Type: Improvement
Components: parser
Affects Versions: 1.3
Environment: GNU/Linux Fedora 12
Reporter: Jean-Francois Gingras
Priority: Minor
Fix For: 1.3

Here is a patch for DOMContentUtils.java that increases the number of elements searched for URLs. It also adds the ability to specify multiple attributes per element, for example:

linkParams.put("frame", new LinkParams("frame", "longdesc,src", 0));
linkParams.put("object", new LinkParams("object", "classid,codebase,data,usemap", 0));
linkParams.put("video", new LinkParams("video", "poster,src", 0)); // HTML 5

I have a patch for release-1.0 and branch-1.3. I would love to hear your comments about this.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
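To illustrate the idea of several link-bearing attributes per element, here is a minimal standalone sketch. It is not the attached patch and not the actual DOMContentUtils code: MultiAttrLinkParams, LinkAttrSketch and linkTargets are hypothetical names, and representing an element's attributes as a plain Map is an assumption made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a LinkParams that accepts a comma-separated
// attribute list, as in the examples quoted in the issue.
class MultiAttrLinkParams {
  final String elName;
  final List<String> attrNames;
  final int childLen;

  MultiAttrLinkParams(String elName, String attrNames, int childLen) {
    this.elName = elName;
    this.attrNames = Arrays.asList(attrNames.split(","));
    this.childLen = childLen;
  }
}

class LinkAttrSketch {
  static final Map<String, MultiAttrLinkParams> LINK_PARAMS = new HashMap<>();

  static {
    LINK_PARAMS.put("frame", new MultiAttrLinkParams("frame", "longdesc,src", 0));
    LINK_PARAMS.put("video", new MultiAttrLinkParams("video", "poster,src", 0));
  }

  // Returns the value of every configured link attribute present on the element,
  // in the order the attributes were configured.
  static List<String> linkTargets(String element, Map<String, String> attrs) {
    List<String> targets = new ArrayList<>();
    MultiAttrLinkParams params = LINK_PARAMS.get(element.toLowerCase());
    if (params == null) {
      return targets; // element is not configured as a link source
    }
    for (String attrName : params.attrNames) {
      String value = attrs.get(attrName);
      if (value != null && !value.isEmpty()) {
        targets.add(value);
      }
    }
    return targets;
  }

  public static void main(String[] args) {
    Map<String, String> attrs = new HashMap<>();
    attrs.put("poster", "preview.png");
    attrs.put("src", "movie.ogv");
    attrs.put("width", "320");
    System.out.println(linkTargets("video", attrs));
  }
}
```

With a single-attribute configuration only one of poster or src could be followed for a video element; splitting the configured string lets both be collected.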
[jira] Updated: (NUTCH-944) Increase the number of elements to look for URLs and add the ability to specify multiple attributes by elements
[ https://issues.apache.org/jira/browse/NUTCH-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Francois Gingras updated NUTCH-944:
Attachment: DOMContentUtils.java.path-1.3
            DOMContentUtils.java.path-1.0

I uploaded the patch for 1.0 because we currently use it.
[jira] Created: (NUTCH-945) Indexing to multiple SOLR Servers
Indexing to multiple SOLR Servers
---
Key: NUTCH-945
URL: https://issues.apache.org/jira/browse/NUTCH-945
Project: Nutch
Issue Type: Improvement
Components: indexer
Affects Versions: 1.2
Reporter: Charan Malemarpuram

It would be nice to have a default Indexer in Nutch that can submit docs to multiple SOLR servers. Partitioning is always the question when writing to multiple SOLR servers. The default partitioning could be a simple hashcode-based distribution, with additional hooks for customization.
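A rough sketch of what hashcode-based distribution with a customization hook might look like follows. Nothing here exists in Nutch: DocPartitioner, SolrPartitionSketch and the example.org server URLs are hypothetical, and using the document key's String hash code is only one possible default.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical hook: implementations decide which server index a document goes to.
interface DocPartitioner {
  int partition(String docKey, int numServers);
}

class SolrPartitionSketch {

  // Default strategy: hash code of the key, mapped into [0, numServers).
  // Math.floorMod keeps the result non-negative even for negative hash codes.
  static final DocPartitioner HASH_PARTITIONER =
      (docKey, numServers) -> Math.floorMod(docKey.hashCode(), numServers);

  // Picks the server URL for a document key using the given partitioner.
  static String serverFor(String docKey, List<String> serverUrls,
                          DocPartitioner partitioner) {
    if (serverUrls.isEmpty()) {
      throw new IllegalArgumentException("at least one server is required");
    }
    return serverUrls.get(partitioner.partition(docKey, serverUrls.size()));
  }

  public static void main(String[] args) {
    // Placeholder URLs, not real servers.
    List<String> servers = Arrays.asList(
        "http://solr1.example.org:8983/solr",
        "http://solr2.example.org:8983/solr",
        "http://solr3.example.org:8983/solr");

    for (String url : Arrays.asList("http://example.org/a", "http://example.org/b")) {
      System.out.println(url + " -> " + serverFor(url, servers, HASH_PARTITIONER));
    }
  }
}
```

The same key always maps to the same server as long as the server list is unchanged; adding or removing a server reshuffles most keys, which is one reason a pluggable partitioner would be useful.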
Build failed in Hudson: Nutch-trunk #1326
See https://hudson.apache.org/hudson/job/Nutch-trunk/1326/changes

Changes:

[ab] Fix breakage due to the changed Gora API.

--
[...truncated 1006 lines...]
A  src/plugin/subcollection/src/java/org/apache/nutch
A  src/plugin/subcollection/src/java/org/apache/nutch/collection
A  src/plugin/subcollection/src/java/org/apache/nutch/collection/Subcollection.java
A  src/plugin/subcollection/src/java/org/apache/nutch/collection/CollectionManager.java
A  src/plugin/subcollection/src/java/org/apache/nutch/collection/package.html
A  src/plugin/subcollection/src/java/org/apache/nutch/indexer
A  src/plugin/subcollection/src/java/org/apache/nutch/indexer/subcollection
A  src/plugin/subcollection/src/java/org/apache/nutch/indexer/subcollection/SubcollectionIndexingFilter.java
A  src/plugin/subcollection/README.txt
A  src/plugin/subcollection/plugin.xml
A  src/plugin/subcollection/build.xml
A  src/plugin/index-more
A  src/plugin/index-more/ivy.xml
A  src/plugin/index-more/src
A  src/plugin/index-more/src/test
A  src/plugin/index-more/src/test/org
A  src/plugin/index-more/src/test/org/apache
A  src/plugin/index-more/src/test/org/apache/nutch
A  src/plugin/index-more/src/test/org/apache/nutch/indexer
A  src/plugin/index-more/src/test/org/apache/nutch/indexer/more
A  src/plugin/index-more/src/test/org/apache/nutch/indexer/more/TestMoreIndexingFilter.java
A  src/plugin/index-more/src/java
A  src/plugin/index-more/src/java/org
A  src/plugin/index-more/src/java/org/apache
A  src/plugin/index-more/src/java/org/apache/nutch
A  src/plugin/index-more/src/java/org/apache/nutch/indexer
A  src/plugin/index-more/src/java/org/apache/nutch/indexer/more
A  src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java
A  src/plugin/index-more/src/java/org/apache/nutch/indexer/more/package.html
A  src/plugin/index-more/plugin.xml
A  src/plugin/index-more/build.xml
AU src/plugin/plugin.dtd
A  src/plugin/parse-ext
A  src/plugin/parse-ext/ivy.xml
A  src/plugin/parse-ext/src
A  src/plugin/parse-ext/src/test
A  src/plugin/parse-ext/src/test/org
A  src/plugin/parse-ext/src/test/org/apache
A  src/plugin/parse-ext/src/test/org/apache/nutch
A  src/plugin/parse-ext/src/test/org/apache/nutch/parse
A  src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext
A  src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext/TestExtParser.java
A  src/plugin/parse-ext/src/java
A  src/plugin/parse-ext/src/java/org
A  src/plugin/parse-ext/src/java/org/apache
A  src/plugin/parse-ext/src/java/org/apache/nutch
A  src/plugin/parse-ext/src/java/org/apache/nutch/parse
A  src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext
A  src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext/ExtParser.java
A  src/plugin/parse-ext/plugin.xml
A  src/plugin/parse-ext/build.xml
A  src/plugin/parse-ext/command
A  src/plugin/urlnormalizer-pass
A  src/plugin/urlnormalizer-pass/ivy.xml
A  src/plugin/urlnormalizer-pass/src
A  src/plugin/urlnormalizer-pass/src/test
A  src/plugin/urlnormalizer-pass/src/test/org
A  src/plugin/urlnormalizer-pass/src/test/org/apache
A  src/plugin/urlnormalizer-pass/src/test/org/apache/nutch
A  src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net
A  src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net/urlnormalizer
A  src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net/urlnormalizer/pass
AU src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net/urlnormalizer/pass/TestPassURLNormalizer.java
A  src/plugin/urlnormalizer-pass/src/java
A  src/plugin/urlnormalizer-pass/src/java/org
A  src/plugin/urlnormalizer-pass/src/java/org/apache
A  src/plugin/urlnormalizer-pass/src/java/org/apache/nutch
A  src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net
A  src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net/urlnormalizer
A  src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net/urlnormalizer/pass
AU src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net/urlnormalizer/pass/PassURLNormalizer.java
AU src/plugin/urlnormalizer-pass/plugin.xml
AU src/plugin/urlnormalizer-pass/build.xml
A  src/plugin/parse-html
A  src/plugin/parse-html/ivy.xml
A  src/plugin/parse-html/lib
A  src/plugin/parse-html/lib/tagsoup.LICENSE.txt
A  src/plugin/parse-html/src
A  src/plugin/parse-html/src/test
A  src/plugin/parse-html/src/test/org
A  src/plugin/parse-html/src/test/org/apache
A