[jira] [Updated] (NUTCH-1388) Optionally maintain custom fetch interval despite AdaptiveFetchSchedule
[ https://issues.apache.org/jira/browse/NUTCH-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1388:
    Attachment: NUTCH-1388-1.6-2.patch

Complete patch that actually builds against current trunk.

Optionally maintain custom fetch interval despite AdaptiveFetchSchedule
    Key: NUTCH-1388
    URL: https://issues.apache.org/jira/browse/NUTCH-1388
    Project: Nutch
    Issue Type: Improvement
    Components: injector
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Fix For: 1.6
    Attachments: NUTCH-1388-1.6-1.patch, NUTCH-1388-1.6-2.patch

During injection a custom fetch interval can be configured, but it is not maintained when AdaptiveFetchSchedule is enabled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1405:
    Patch Info: Patch Available

Allow to overwrite CrawlDatum's with injected entries
    Key: NUTCH-1405
    URL: https://issues.apache.org/jira/browse/NUTCH-1405
    Project: Nutch
    Issue Type: Improvement
    Components: injector
    Affects Versions: 1.5, 1.6
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Priority: Minor
    Fix For: 1.6
    Attachments: NUTCH-1405-1.6-1.patch

Injector's reducer does not permit overwriting existing CrawlDatum entries. It is, however, useful to optionally overwrite so users can reset metadata manually.
[jira] [Updated] (NUTCH-1342) Read time out protocol-http
[ https://issues.apache.org/jira/browse/NUTCH-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1342:
    Patch Info: Patch Available

Read time out protocol-http
    Key: NUTCH-1342
    URL: https://issues.apache.org/jira/browse/NUTCH-1342
    Project: Nutch
    Issue Type: Bug
    Components: fetcher
    Affects Versions: 1.4, 1.5
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Priority: Critical
    Fix For: 1.6
    Attachments: NUTCH-1342-1.6-1.patch

For some reason some URLs always time out with protocol-http but not with protocol-httpclient. The stack trace is always the same:

{code}
2012-04-20 11:25:44,275 ERROR http.Http - Failed to get protocol output
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at java.io.PushbackInputStream.read(PushbackInputStream.java:169)
        at java.io.FilterInputStream.read(FilterInputStream.java:90)
        at org.apache.nutch.protocol.http.HttpResponse.readPlainContent(HttpResponse.java:228)
        at org.apache.nutch.protocol.http.HttpResponse.init(HttpResponse.java:157)
        at org.apache.nutch.protocol.http.Http.getResponse(Http.java:64)
        at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:138)
{code}

Some example URLs:
* 404 http://www.fcgroningen.nl/tribunenamen/stemmen/
* 301 http://shop.fcgroningen.nl/aanbieding
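As background for the stack trace in NUTCH-1342, the following is a minimal, self-contained sketch of how a java.net.SocketTimeoutException arises on a read: the peer keeps the connection open but sends no bytes before the socket's read timeout expires. It does not diagnose the Nutch issue itself; the class and method names (ReadTimeoutSketch, readTimesOut) are hypothetical and not part of Nutch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; none of these names exist in Nutch.
final class ReadTimeoutSketch {

  /** Returns true if a read on a silent connection ends in a SocketTimeoutException. */
  static boolean readTimesOut(int timeoutMillis) {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    // A local server that accepts the connection but never writes anything.
    try (ServerSocket server = new ServerSocket(0, 1, loopback);
         Socket client = new Socket(loopback, server.getLocalPort());
         Socket accepted = server.accept()) {
      client.setSoTimeout(timeoutMillis); // read timeout in milliseconds
      try {
        client.getInputStream().read();   // blocks: no data is ever sent
        return false;
      } catch (SocketTimeoutException e) {
        return true;                      // the read gave up after the timeout
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("read timed out: " + readTimesOut(200));
  }
}
```

The sketch only shows the mechanism behind the exception type; which part of the response the fetcher was waiting for in the reported case is not stated in the issue.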
[jira] [Updated] (NUTCH-1388) Optionally maintain custom fetch interval despite AdaptiveFetchSchedule
[ https://issues.apache.org/jira/browse/NUTCH-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1388:
    Patch Info: Patch Available

Optionally maintain custom fetch interval despite AdaptiveFetchSchedule
    Key: NUTCH-1388
    URL: https://issues.apache.org/jira/browse/NUTCH-1388
    Project: Nutch
    Issue Type: Improvement
    Components: injector
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Fix For: 1.6
    Attachments: NUTCH-1388-1.6-1.patch, NUTCH-1388-1.6-2.patch

During injection a custom fetch interval can be configured, but it is not maintained when AdaptiveFetchSchedule is enabled.
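To make the behavior requested in NUTCH-1388 concrete, here is a rough sketch of an adaptive interval update that leaves a custom interval alone when one is set. It is not the attached patch: the class name, the method signature, the use of a nullable fixedInterval argument, and the rate and bound constants are all assumptions chosen for illustration.

```java
// Hypothetical illustration only; this is not the code from the NUTCH-1388 patch.
final class FetchIntervalSketch {
  // Assumed adjustment rates and bounds (seconds); not read from any Nutch config.
  static final float INC_RATE = 0.4f;
  static final float DEC_RATE = 0.2f;
  static final float MIN_INTERVAL = 60f;
  static final float MAX_INTERVAL = 31536000f; // 365 days

  /**
   * Computes the next fetch interval in seconds.
   * If fixedInterval is non-null, it is kept as-is; otherwise the interval
   * shrinks for modified pages and grows for unmodified ones.
   */
  static float nextInterval(float current, boolean modified, Float fixedInterval) {
    if (fixedInterval != null) {
      return fixedInterval; // custom interval is maintained, no adaptation
    }
    float adjusted = modified
        ? current * (1.0f - DEC_RATE)  // page changed: fetch sooner
        : current * (1.0f + INC_RATE); // page unchanged: fetch later
    return Math.max(MIN_INTERVAL, Math.min(MAX_INTERVAL, adjusted));
  }

  public static void main(String[] args) {
    System.out.println(nextInterval(1000f, true, null));
    System.out.println(nextInterval(1000f, false, null));
    System.out.println(nextInterval(1000f, true, 86400f));
  }
}
```

Passing the custom interval as an explicit argument keeps the sketch independent of how the real patch marks such entries, which the notification does not describe.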
[jira] [Updated] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1405:
    Attachment: (was: NUTCH-1405-1.6-1.patch)

Allow to overwrite CrawlDatum's with injected entries
    Key: NUTCH-1405
    URL: https://issues.apache.org/jira/browse/NUTCH-1405
    Project: Nutch
    Issue Type: Improvement
    Components: injector
    Affects Versions: 1.5, 1.6
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Priority: Minor
    Fix For: 1.6
    Attachments: NUTCH-1405-1.6-2.patch

Injector's reducer does not permit overwriting existing CrawlDatum entries. It is, however, useful to optionally overwrite so users can reset metadata manually.
[jira] [Updated] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1405:
    Attachment: NUTCH-1405-1.6-2.patch

Correct patch.

Allow to overwrite CrawlDatum's with injected entries
    Key: NUTCH-1405
    URL: https://issues.apache.org/jira/browse/NUTCH-1405
    Project: Nutch
    Issue Type: Improvement
    Components: injector
    Affects Versions: 1.5, 1.6
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Priority: Minor
    Fix For: 1.6
    Attachments: NUTCH-1405-1.6-2.patch

Injector's reducer does not permit overwriting existing CrawlDatum entries. It is, however, useful to optionally overwrite so users can reset metadata manually.
[jira] [Updated] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1405:
    Attachment: (was: NUTCH-1405-1.6-2.patch)

Allow to overwrite CrawlDatum's with injected entries
    Key: NUTCH-1405
    URL: https://issues.apache.org/jira/browse/NUTCH-1405
    Project: Nutch
    Issue Type: Improvement
    Components: injector
    Affects Versions: 1.5, 1.6
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Priority: Minor
    Fix For: 1.6
    Attachments: NUTCH-1405-1.6-3.patch

Injector's reducer does not permit overwriting existing CrawlDatum entries. It is, however, useful to optionally overwrite so users can reset metadata manually.
[jira] [Updated] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries
[ https://issues.apache.org/jira/browse/NUTCH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1405:
    Attachment: NUTCH-1405-1.6-3.patch

This time without a debug log line!!!

Allow to overwrite CrawlDatum's with injected entries
    Key: NUTCH-1405
    URL: https://issues.apache.org/jira/browse/NUTCH-1405
    Project: Nutch
    Issue Type: Improvement
    Components: injector
    Affects Versions: 1.5, 1.6
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Priority: Minor
    Fix For: 1.6
    Attachments: NUTCH-1405-1.6-3.patch

Injector's reducer does not permit overwriting existing CrawlDatum entries. It is, however, useful to optionally overwrite so users can reset metadata manually.
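The decision NUTCH-1405 asks for can be sketched roughly as follows: when a URL already has an entry and is injected again, an optional flag decides which of the two survives. This is a simplified stand-in rather than the attached patch. Entries are modeled as plain string maps instead of CrawlDatum objects, and the names InjectOverwriteSketch, resolve and overwrite are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; real Nutch entries are CrawlDatum objects.
final class InjectOverwriteSketch {

  /**
   * Picks the surviving entry for one URL and returns a copy of it.
   * At least one of existing/injected is assumed to be non-null.
   */
  static Map<String, String> resolve(Map<String, String> existing,
                                     Map<String, String> injected,
                                     boolean overwrite) {
    if (existing == null) {
      return new HashMap<>(injected); // new URL: the injected entry is used
    }
    if (injected == null || !overwrite) {
      return new HashMap<>(existing); // default: the existing entry is kept
    }
    return new HashMap<>(injected);   // opt-in: the injected entry replaces it
  }

  public static void main(String[] args) {
    Map<String, String> existing = new HashMap<>();
    existing.put("score", "1.0");
    Map<String, String> injected = new HashMap<>();
    injected.put("score", "5.0");
    System.out.println(resolve(existing, injected, false));
    System.out.println(resolve(existing, injected, true));
  }
}
```

Keeping the existing entry as the default mirrors the current behavior described in the issue, so overwriting stays something a user has to ask for.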
[jira] [Updated] (NUTCH-1406) Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:
    Attachment: index-metadata.patch

Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format and prevents parsing/indexing of metatags that do not contain any content. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>metatags.convert</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

I read that SimpleDateFormat is not a robust solution, so this improvement might have some problems. So far it has worked well for me. More details about the changes are below.

Please note: The attached jar-file was originally taken from NUTCH-809 (https://issues.apache.org/jira/browse/NUTCH-809). The plugin and tutorial there do not necessarily match the index-metadata plugin in subversion.
[jira] [Updated] (NUTCH-1406) Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:
    Description:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format and prevents parsing/indexing of metatags that do not contain any content. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

    was:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format and prevents parsing/indexing of metatags that do not contain any content. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>metatags.convert</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

I read that SimpleDateFormat is not a robust solution, so this improvement might have some problems. So far it has worked well for me. More details about the changes are below.

Please note: The attached jar-file was originally taken from NUTCH-809 (https://issues.apache.org/jira/browse/NUTCH-809). The plugin and tutorial there do not necessarily match the index-metadata plugin in subversion.

Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format and prevents parsing/indexing of metatags that do not contain any content. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
[jira] [Updated] (NUTCH-1406) Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:
    Attachment: (was: index-metadata-plugin.patch)

Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format and prevents parsing/indexing of metatags that do not contain any content. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>metatags.convert</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

I read that SimpleDateFormat is not a robust solution, so this improvement might have some problems. So far it has worked well for me. More details about the changes are below.

Please note: The attached jar-file was originally taken from NUTCH-809 (https://issues.apache.org/jira/browse/NUTCH-809). The plugin and tutorial there do not necessarily match the index-metadata plugin in subversion.
[jira] [Updated] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:
    Description:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

    was:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format and prevents parsing/indexing of metatags that do not contain any content. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

    Summary: metadata-index plugin: conversion to Solr date format (was: Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags)

metadata-index plugin: conversion to Solr date format
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
[jira] [Commented] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399222#comment-13399222 ]

Kristof commented on NUTCH-1406:

Thank you for the clarification. When I originally looked for a plugin to index metadata early this year, index-metatags was the one available. Hence I based my work on it, only realizing after trying to get it working with trunk that something did not add up. Obviously, building on the committed index-metadata version is the way to go. I attached what is hopefully the correct patch, and removed the wrong version and any information that might be misleading. I was not able to test extensively, though, as the work was done using the version initially posted in NUTCH-809.

metadata-index plugin: conversion to Solr date format
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
[jira] [Updated] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:
    Description:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements such as dcterms.modified, with the seed URL http://www.cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

    was:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. The example used is an extended Dublin Core element, dcterms.modified, with the seed URL http://www.cic.gc.ca/. dcterms.modified must also be defined in the metatags.names property.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

metadata-index plugin: conversion to Solr date format
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements such as dcterms.modified, with the seed URL http://www.cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
[jira] [Updated] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:
    Description:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. One subdomain whose pages carry the dcterms.modified meta tag is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

    was:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements such as dcterms.modified, with the seed URL http://www.cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

metadata-index plugin: conversion to Solr date format
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. One subdomain whose pages carry the dcterms.modified meta tag is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
[jira] [Updated] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:
    Description:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. The main benefit of this conversion is the possibility to create range facets. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. One subdomain whose pages carry the dcterms.modified meta tag is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

    was:

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. One subdomain whose pages carry the dcterms.modified meta tag is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}

metadata-index plugin: conversion to Solr date format
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. The main benefit of this conversion is the possibility to create range facets. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. One subdomain whose pages carry the dcterms.modified meta tag is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
[jira] [Commented] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399228#comment-13399228 ]

Markus Jelsma commented on NUTCH-1406:

Hello, a few notes on your patch:
* Nutch uses two spaces for a single level of indentation, not tabs;
* convertIndicatior seems to be misspelled;
* -MM-dd doesn't look like Solr's supported DateField format, as it's missing the time and the timezone designator Z.

metadata-index plugin: conversion to Solr date format
    Key: NUTCH-1406
    URL: https://issues.apache.org/jira/browse/NUTCH-1406
    Project: Nutch
    Issue Type: Improvement
    Components: indexer, parser
    Reporter: Kristof
    Priority: Minor
    Labels: conversion, date
    Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows for conversion of selected fields to the Solr date format. The main benefit of this conversion is the possibility to create range facets. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. One subdomain whose pages carry the dcterms.modified meta tag is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
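The last point of the review can be illustrated with a small sketch: Solr's date fields expect a full UTC timestamp such as 1995-12-31T23:59:59Z, so a date-only meta tag value has to be expanded with a time and the trailing Z. The code below is not taken from the attached patch. The class and method names are hypothetical, and the input pattern yyyy-MM-dd is an assumption about what the meta tag contains.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration only; not the code from index-metadata.patch.
final class SolrDateSketch {

  /** Converts a date-only value (assumed pattern yyyy-MM-dd) to a UTC timestamp ending in Z. */
  static String toSolrDate(String value) {
    TimeZone utc = TimeZone.getTimeZone("UTC");

    // Assumed input pattern; interpreted as midnight UTC.
    SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
    in.setTimeZone(utc);
    in.setLenient(false); // reject impossible dates such as 2012-13-40

    // Output pattern with time and a literal Z, formatted in UTC.
    SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
    out.setTimeZone(utc);

    try {
      Date parsed = in.parse(value);
      return out.format(parsed);
    } catch (ParseException e) {
      throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + value, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(toSolrDate("2012-06-22")); // prints 2012-06-22T00:00:00Z
  }
}
```

SimpleDateFormat instances are not thread-safe, which is why the sketch creates them per call instead of sharing them in static fields.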
[jira] [Commented] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399250#comment-13399250 ]

Julien Nioche commented on NUTCH-1406:

BTW we have formatting rules for Eclipse in the NutchGora branch (see eclipse-codeformat.xml). We could add this to the trunk as well.

metadata-index plugin: conversion to Solr date format

Key: NUTCH-1406
URL: https://issues.apache.org/jira/browse/NUTCH-1406
Project: Nutch
Issue Type: Improvement
Components: indexer, parser
Reporter: Kristof
Priority: Minor
Labels: conversion, date
Attachments: index-metadata.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows selected fields to be converted to the Solr date format. The main benefit of this conversion is the possibility of creating range facets. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. An example of a site whose pages carry the meta tag dcterms.modified is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
[jira] [Created] (NUTCH-1408) RobotRulesParser main doesn't take URL's
Markus Jelsma created NUTCH-1408:

Summary: RobotRulesParser main doesn't take URL's
Key: NUTCH-1408
URL: https://issues.apache.org/jira/browse/NUTCH-1408
Project: Nutch
Issue Type: Bug
Affects Versions: 1.5
Reporter: Markus Jelsma
Assignee: Markus Jelsma
Priority: Minor
Fix For: 1.6

lib-http's org.apache.nutch.protocol.http.api.RobotRulesParser main() takes a robot file and a URL file according to its usage output. It, however, expects URI paths, not URLs, and will therefore never work if the input contains URLs.
[jira] [Updated] (NUTCH-1408) RobotRulesParser main doesn't take URL's
[ https://issues.apache.org/jira/browse/NUTCH-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1408:

Attachment: NUTCH-1408-1.6-1.patch

Patch turns the input into URL objects, which are handled properly.

RobotRulesParser main doesn't take URL's

Key: NUTCH-1408
URL: https://issues.apache.org/jira/browse/NUTCH-1408
Project: Nutch
Issue Type: Bug
Affects Versions: 1.5
Reporter: Markus Jelsma
Assignee: Markus Jelsma
Priority: Minor
Fix For: 1.6
Attachments: NUTCH-1408-1.6-1.patch

lib-http's org.apache.nutch.protocol.http.api.RobotRulesParser main() takes a robot file and a URL file according to its usage output. It, however, expects URI paths, not URLs, and will therefore never work if the input contains URLs.
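For readers following NUTCH-1408, the difference between a URL and the path that robots rules are matched against can be illustrated with a short sketch. This is not the attached patch: the class name RobotsPathSketch and the method toRobotsPath are hypothetical, and the sketch only assumes that each input line is an absolute URL that needs to be reduced to its path (plus query string, if any) before it is checked.

```java
import java.net.MalformedURLException;
import java.net.URI;

// Hypothetical illustration only; not taken from NUTCH-1408-1.6-1.patch.
public class RobotsPathSketch {

  /**
   * Reduces an absolute URL to the part that robots rules are matched
   * against: the path plus the query string, or "/" when the URL has no path.
   */
  public static String toRobotsPath(String input) {
    try {
      // URL.getFile() returns the path followed by "?query" when a query exists.
      String file = URI.create(input.trim()).toURL().getFile();
      return file.isEmpty() ? "/" : file;
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Not a URL: " + input, e);
    }
  }

  public static void main(String[] args) {
    // A full URL is reduced to its path and query.
    System.out.println(toRobotsPath("http://example.org/private/page.html?x=1"));
    // A URL without a path maps to the root path.
    System.out.println(toRobotsPath("http://example.org"));
  }
}
```

A tool that matched the raw URL string against path-based rules would compare "http://example.org/private/page.html" with a rule such as "/private/", which never matches; converting the input first avoids that mismatch.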
[jira] [Created] (NUTCH-1409) Remove deprecated properties in nutch-default.xml
Matthias Agethle created NUTCH-1409:

Summary: Remove deprecated properties in nutch-default.xml
Key: NUTCH-1409
URL: https://issues.apache.org/jira/browse/NUTCH-1409
Project: Nutch
Issue Type: Improvement
Reporter: Matthias Agethle
Priority: Minor
Fix For: 1.6

1) Remove deprecated properties from nutch-default.xml (generate.max.per.host and db.default.fetch.interval).
2) The already removed properties generate.max.per.host.by.ip and db.max.fetch.interval are still used in source code.
[jira] [Updated] (NUTCH-1409) Remove deprecated properties in nutch-default.xml
[ https://issues.apache.org/jira/browse/NUTCH-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Agethle updated NUTCH-1409:

Attachment: NUTCH-1409.patch

Patch for trunk (rev 1352896)

Remove deprecated properties in nutch-default.xml

Key: NUTCH-1409
URL: https://issues.apache.org/jira/browse/NUTCH-1409
Project: Nutch
Issue Type: Improvement
Reporter: Matthias Agethle
Priority: Minor
Fix For: 1.6
Attachments: NUTCH-1409.patch

1) Remove deprecated properties from nutch-default.xml (generate.max.per.host and db.default.fetch.interval).
2) The already removed properties generate.max.per.host.by.ip and db.max.fetch.interval are still used in source code.
[jira] [Updated] (NUTCH-1409) Remove deprecated properties in nutch-default.xml
[ https://issues.apache.org/jira/browse/NUTCH-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Agethle updated NUTCH-1409:

Patch Info: Patch Available

Remove deprecated properties in nutch-default.xml

Key: NUTCH-1409
URL: https://issues.apache.org/jira/browse/NUTCH-1409
Project: Nutch
Issue Type: Improvement
Reporter: Matthias Agethle
Priority: Minor
Fix For: 1.6
Attachments: NUTCH-1409.patch

1) Remove deprecated properties from nutch-default.xml (generate.max.per.host and db.default.fetch.interval).
2) The already removed properties generate.max.per.host.by.ip and db.max.fetch.interval are still used in source code.
[jira] [Commented] (NUTCH-1408) RobotRulesParser main doesn't take URL's
[ https://issues.apache.org/jira/browse/NUTCH-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399356#comment-13399356 ]

Lewis John McGibbney commented on NUTCH-1408:

+1

RobotRulesParser main doesn't take URL's

Key: NUTCH-1408
URL: https://issues.apache.org/jira/browse/NUTCH-1408
Project: Nutch
Issue Type: Bug
Affects Versions: 1.5
Reporter: Markus Jelsma
Assignee: Markus Jelsma
Priority: Minor
Fix For: 1.6
Attachments: NUTCH-1408-1.6-1.patch

lib-http's org.apache.nutch.protocol.http.api.RobotRulesParser main() takes a robot file and a URL file according to its usage output. It, however, expects URI paths, not URLs, and will therefore never work if the input contains URLs.
Re: 1.5.1 release
Hey Guys,

(sorry for the top post)

There's no reason to freeze trunk during releases. In fact, during the RC, once the branch (or tag for that matter) is created, trunk can continue on, no need to stop. Heck, we can always just tag or branch from a specific revision too so it's not really a biggie.

Cheers,
Chris

On Jun 21, 2012, at 2:43 PM, Lewis John Mcgibbney wrote:

Hi Markus,

On Thu, Jun 21, 2012 at 10:02 PM, Markus Jelsma markus.jel...@openindex.io wrote:

It's still not clear to me what 1.5.1 is going to look like. Will it be current trunk incl. the script bugfix or just 1.5 plus the bugfix? I would vote for the latter as it makes more sense for a bugfix release.

I am easy on this one... I suggest we do it the normal way. Let's let folks chime in and see where we are on Saturday. It looks like 2.0 is going to be shifted with the new commits, so do we wish to try and keep at least minimal consistency between both releases?

There is another debate behind this, in my opinion, about freezing trunk prior to releases and thus stopping active development. This has been an issue in the past. Is this something for another thread?

Yeah, I must also agree that we should branch trunk, keep the branch for the release, then run the RC's from the branch regardless of how trunk comes on. My only suggestion is that patches should be backported from trunk to the release candidate branch only if they are pretty critical bug fixes, as we've now discovered in 1.5!

Additionally, there is another note here w.r.t. release managers. We've relied on the excellent work done by Chris (and others) as RM's for a number of releases, but during the release period (on occasion, more recently), as you mention, trunk has frozen temporarily. Of course the aim is to prevent this from happening should the RC not progress as we would all like.

Hopefully we are moving towards a more adaptable and sustainable RM process within Nutch, where the RM responsibility can be undertaken/overseen by more than one individual over the entire duration of the process. I think (and hope) we can consider the slight struggle we've had for 1.5 as an exception. As far back as I can remember, RC's have always been efficient and smooth, and I personally am committed to ensuring we return to the high precedent set by previous RM's.

We've also seen an alternative (and in my opinion an improved) publication of Nutch artifacts for 1.5. For reference I direct you to Julien's commentary [0] on this topic. Due to this, we've had to run additional RC's, which has taken a bit longer than usual, and I must personally apologise to everyone for at least one RC cock up which could have been avoided had I been more familiar with the Nutch-specific release process.

I think I'm ranting here so I'm going to give it a bye now.

Lewis

[0] http://digitalpebble.blogspot.co.uk/2012/06/whats-new-in-nutch-15.html

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
Re: Nutch 2.0 Press Announcement
Hello Lewis -- great to hear from you, as always. Hello Nutch DevTeam!

Of course; I'm happy to help. What's your timeframe?

Traditionally, these sorts of announcements are usually something I work with the PMC on, vs. dev (no offense, folks, it's more of an issue of public exposure prior to the announcement being made). Whatever works best for you is fine... I'm flexible.

Having said that, what is your timeframe? In other words, has v2.0 already been released (I hope not!)? Also, if you would like to include supporting testimonial quotes from highly-visible users (organizations), we are going to have to plan to set aside at least a week for those to come in (some companies have strict vetting/clearance requirements by their legal teams).

And finally, in an ideal situation, we'll work on the announcement together (usually there's a point-person assigned to take the lead on this, and we'll run drafts by the list during the final editing stages) so I can get a better grasp of the project and be able to highlight what's new/important/sexy/*.

Thanks again. I look forward to working with y'all <g>

Chat soon,
Sally

From: Lewis John Mcgibbney lewis.mcgibb...@gmail.com
To: Sally Khudairi s...@apache.org
Cc: dev@nutch.apache.org
Sent: Thursday, 21 June 2012, 16:49
Subject: Nutch 2.0 Press Announcement

Good Evening Sally,

First and foremost I hope you are keeping well and that the beginning of the summer has been kind to you... all the good weather still to come, not to worry :0)

The reason I contact you is that we (the Apache Nutch community) are nearly ready to release Nutch 2.0, which represents a pretty significant milestone for Apache Nutch as a project. Although Nutch 2.0 is not considered mainstream development (a decision made by the PMC some time ago), it still marks a real step forward for the project as a whole and also pays serious merit to users, developers and committers past and present.

Due to these reasons I think it would be excellent for the community if we could really get the message out that the project is rocking, in addition to the fact that it is an excellent, well followed, vibrant TLP within the foundation.

I wonder if it would be possible for us to get a formal press announcement constructed based on input from ourselves in collaboration with your experience in this area? I am coming into the official press releases from an almost blind tangent, so I would really appreciate your guidance and input on this one if possible.

Thanks in advance for any input you have.

Best
Lewis

N.B. Please anyone from dev@ chime in on this thread. I personally feel the better an announcement, the more our community grows. Thank you
[jira] [Updated] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:

Attachment: index-metadata_formatted.patch

Formatting done (correct?), spelling error corrected.

Regarding the format: you are right that Solr uses the date format yyyy-mm-ddThh:mm:ss.mmmZ. The SimpleDateFormat pattern used, yyyy-MM-dd, converts correctly to yyyy-mm-ddThh:mm:ss.mmmZ, but for dates only. I did not consider time when using it, as the fields I am looking at contain only a date. The conversion basically adds time information by interpreting the missing time as 00:00:00 and converting it to UTC based on the time zone settings of the machine used in the process.

I just tested with some altered files into which I included time information, and with several SimpleDateFormat patterns, trying to find one which works. So far I did not find any that works. With a pattern that goes beyond yyyy-MM-dd, original field values that contain only a date are not converted. So it seems this solution is limited to dates only.

metadata-index plugin: conversion to Solr date format

Key: NUTCH-1406
URL: https://issues.apache.org/jira/browse/NUTCH-1406
Project: Nutch
Issue Type: Improvement
Components: indexer, parser
Reporter: Kristof
Priority: Minor
Labels: conversion, date
Attachments: index-metadata_formatted.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows selected fields to be converted to the Solr date format. The main benefit of this conversion is the possibility of creating range facets. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. An example of a site whose pages carry the meta tag dcterms.modified is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
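The date-only conversion discussed in this thread can be sketched in a few lines of Java. This is an illustration rather than the code in the attached patch: the class name SolrDateSketch, the method toSolrDate, and the output pattern without milliseconds are assumptions. The source time zone is passed in explicitly so that the effect of the machine's time zone setting is visible.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration only; not taken from the NUTCH-1406 patch.
public class SolrDateSketch {

  // Assumed target pattern: a UTC timestamp such as 2012-06-21T00:00:00Z.
  private static final String SOLR_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

  /**
   * Parses a date-only value (yyyy-MM-dd) as midnight in sourceZone and
   * renders that instant in UTC using the pattern above.
   */
  public static String toSolrDate(String value, TimeZone sourceZone) {
    SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
    in.setLenient(false);
    in.setTimeZone(sourceZone); // the missing time is read as 00:00:00 in this zone

    SimpleDateFormat out = new SimpleDateFormat(SOLR_PATTERN, Locale.ROOT);
    out.setTimeZone(TimeZone.getTimeZone("UTC"));

    try {
      Date parsed = in.parse(value);
      return out.format(parsed);
    } catch (ParseException e) {
      throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + value, e);
    }
  }

  public static void main(String[] args) {
    // Source zone UTC: prints 2012-06-21T00:00:00Z
    System.out.println(toSolrDate("2012-06-21", TimeZone.getTimeZone("UTC")));
    // Source zone four hours behind UTC: prints 2012-06-21T04:00:00Z
    System.out.println(toSolrDate("2012-06-21", TimeZone.getTimeZone("GMT-04:00")));
  }
}
```

The second call shows why the result depends on the machine: the same date string yields a different UTC instant when the parser's zone differs. Pinning the input zone, as done here, is one way to make the output reproducible across machines.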
[jira] [Updated] (NUTCH-1406) metadata-index plugin: conversion to Solr date format
[ https://issues.apache.org/jira/browse/NUTCH-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristof updated NUTCH-1406:

Attachment: (was: index-metadata.patch)

metadata-index plugin: conversion to Solr date format

Key: NUTCH-1406
URL: https://issues.apache.org/jira/browse/NUTCH-1406
Project: Nutch
Issue Type: Improvement
Components: indexer, parser
Reporter: Kristof
Priority: Minor
Labels: conversion, date
Attachments: index-metadata_formatted.patch

This improvement to the index-metatags plugin (sometimes also referred to as the parse-metatags plugin) allows selected fields to be converted to the Solr date format. The main benefit of this conversion is the possibility of creating range facets. In order to convert the values of selected metatags to Solr date format, you must specify them in nutch-site.xml. This can be used, for example, with Dublin Core elements. An example of a site whose pages carry the meta tag dcterms.modified is cic.gc.ca. dcterms.modified must also be defined in the metatags.names and index.parse.md properties.

{code}
<property>
  <name>index.dateconvert.md</name>
  <value>metatag.dcterms.modified</value>
  <description>For plugin index-metadata: Indicate here the name of the html meta tag that should be converted to Solr date format.</description>
</property>
{code}
Build failed in Jenkins: Nutch-nutchgora #289
See https://builds.apache.org/job/Nutch-nutchgora/289/ -- Started by timer Building remotely on solaris1 in workspace https://builds.apache.org/job/Nutch-nutchgora/ws/ hudson.util.IOException2: remote file operation failed: https://builds.apache.org/job/Nutch-nutchgora/ws/ at hudson.remoting.Channel@30e3f2e6:solaris1 at hudson.FilePath.act(FilePath.java:838) at hudson.FilePath.act(FilePath.java:824) at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:743) at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:685) at hudson.model.AbstractProject.checkout(AbstractProject.java:1242) at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589) at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494) at hudson.model.Run.execute(Run.java:1460) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:239) Caused by: java.io.IOException: Remote call on solaris1 failed at hudson.remoting.Channel.call(Channel.java:655) at hudson.FilePath.act(FilePath.java:831) ... 
11 more Caused by: java.lang.LinkageError: duplicate class definition: hudson/model/Descriptor at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:621) at java.lang.ClassLoader.defineClass(ClassLoader.java:466) at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:152) at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at java.lang.ClassLoader.loadClass(ClassLoader.java:252) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320) at java.lang.Class.getDeclaredFields0(Native Method) at java.lang.Class.privateGetDeclaredFields(Class.java:2259) at java.lang.Class.getDeclaredField(Class.java:1852) at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1582) at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:52) at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:408) at java.security.AccessController.doPrivileged(Native Method) at java.io.ObjectStreamClass.init(ObjectStreamClass.java:400) at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:297) at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:531) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1552) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1466) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1552) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1466) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1699) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305) at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348) at hudson.remoting.UserRequest.deserialize(UserRequest.java:182) at hudson.remoting.UserRequest.perform(UserRequest.java:98) at hudson.remoting.UserRequest.perform(UserRequest.java:48) at hudson.remoting.Request$2.run(Request.java:287) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269) at java.util.concurrent.FutureTask.run(FutureTask.java:123) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651) at
Build failed in Jenkins: Nutch-trunk #1877
See https://builds.apache.org/job/Nutch-trunk/1877/ -- Started by timer Building remotely on solaris1 in workspace https://builds.apache.org/job/Nutch-trunk/ws/ hudson.util.IOException2: remote file operation failed: https://builds.apache.org/job/Nutch-trunk/ws/ at hudson.remoting.Channel@30e3f2e6:solaris1 at hudson.FilePath.act(FilePath.java:838) at hudson.FilePath.act(FilePath.java:824) at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:743) at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:685) at hudson.model.AbstractProject.checkout(AbstractProject.java:1242) at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589) at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494) at hudson.model.Run.execute(Run.java:1460) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:239) Caused by: java.io.IOException: Remote call on solaris1 failed at hudson.remoting.Channel.call(Channel.java:655) at hudson.FilePath.act(FilePath.java:831) ... 
11 more Caused by: java.lang.LinkageError: duplicate class definition: hudson/model/Descriptor at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:621) at java.lang.ClassLoader.defineClass(ClassLoader.java:466) at hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:152) at hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at java.lang.ClassLoader.loadClass(ClassLoader.java:252) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320) at java.lang.Class.getDeclaredFields0(Native Method) at java.lang.Class.privateGetDeclaredFields(Class.java:2259) at java.lang.Class.getDeclaredField(Class.java:1852) at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1582) at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:52) at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:408) at java.security.AccessController.doPrivileged(Native Method) at java.io.ObjectStreamClass.init(ObjectStreamClass.java:400) at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:297) at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:531) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1552) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1466) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1552) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1466) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1699) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305) at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348) at hudson.remoting.UserRequest.deserialize(UserRequest.java:182) at hudson.remoting.UserRequest.perform(UserRequest.java:98) at hudson.remoting.UserRequest.perform(UserRequest.java:48) at hudson.remoting.Request$2.run(Request.java:287) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269) at java.util.concurrent.FutureTask.run(FutureTask.java:123) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651) at