[jira] [Created] (NUTCH-1475) Nutch 2.1 Index-More Plugin -- A better fall back value for date field

2012-10-06 Thread James Sullivan (JIRA)
James Sullivan created NUTCH-1475:
-

 Summary: Nutch 2.1 Index-More Plugin -- A better fall back value 
for date field
 Key: NUTCH-1475
 URL: https://issues.apache.org/jira/browse/NUTCH-1475
 Project: Nutch
  Issue Type: Bug
Affects Versions: nutchgora, 2.1
 Environment: All
Reporter: James Sullivan
Priority: Minor
 Attachments: index-more-2x.patch

Among other fields, the more plugin for Nutch 2.x provides a last modified 
and date field for the Solr index. The last modified field is the last 
modified date from the http headers if available, if not available it is left 
empty. Currently, the date field is the same as the last modified field 
unless that field is empty in which case getFetchTime is used as a fall back. I 
think getFetchTime is not a good fall back as it is the next fetch time and 
often a month or more in the future which doesn't make sense for the date 
field. Users do not expect webpages/documents with future dates. A more 
sensible fallback would be current date at the time it is indexed. 

This is possible by simply changing line 97 of 
https://svn.apache.org/repos/asf/nutch/branches/2.x/src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java
 from


time = page.getFetchTime(); // use fetch time

to

time = new Date().getTime();


Users interested in the getFetchTime value can still get it from the tstamp 
field.




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (NUTCH-1475) Nutch 2.1 Index-More Plugin -- A better fall back value for date field

2012-10-06 Thread James Sullivan (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Sullivan updated NUTCH-1475:
--

Attachment: index-more-2x.patch

 Nutch 2.1 Index-More Plugin -- A better fall back value for date field
 --

 Key: NUTCH-1475
 URL: https://issues.apache.org/jira/browse/NUTCH-1475
 Project: Nutch
  Issue Type: Bug
Affects Versions: nutchgora, 2.1
 Environment: All
Reporter: James Sullivan
Priority: Minor
  Labels: index-more, plugins
 Attachments: index-more-2x.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Among other fields, the more plugin for Nutch 2.x provides a last modified 
 and date field for the Solr index. The last modified field is the last 
 modified date from the http headers if available, if not available it is left 
 empty. Currently, the date field is the same as the last modified field 
 unless that field is empty in which case getFetchTime is used as a fall back. 
 I think getFetchTime is not a good fall back as it is the next fetch time and 
 often a month or more in the future which doesn't make sense for the date 
 field. Users do not expect webpages/documents with future dates. A more 
 sensible fallback would be current date at the time it is indexed. 
 This is possible by simply changing line 97 of 
 https://svn.apache.org/repos/asf/nutch/branches/2.x/src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java
  from
 time = page.getFetchTime(); // use fetch time
 to
 time = new Date().getTime();
 Users interested in the getFetchTime value can still get it from the tstamp 
 field.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Build failed in Jenkins: Nutch-nutchgora #371

2012-10-06 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-nutchgora/371/

--
Started by timer
Building remotely on solaris1 in workspace 
https://builds.apache.org/job/Nutch-nutchgora/ws/
hudson.util.IOException2: remote file operation failed: 
https://builds.apache.org/job/Nutch-nutchgora/ws/ at 
hudson.remoting.Channel@1807c204:solaris1
at hudson.FilePath.act(FilePath.java:838)
at hudson.FilePath.act(FilePath.java:824)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:743)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:685)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1256)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
at hudson.model.Run.execute(Run.java:1502)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:236)
Caused by: java.io.IOException: Remote call on solaris1 failed
at hudson.remoting.Channel.call(Channel.java:673)
at hudson.FilePath.act(FilePath.java:831)
... 11 more
Caused by: java.lang.LinkageError: duplicate class definition: 
hudson/model/Descriptor
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
at 
hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:152)
at 
hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
at java.lang.Class.getDeclaredFields0(Native Method)
at java.lang.Class.privateGetDeclaredFields(Class.java:2259)
at java.lang.Class.getDeclaredField(Class.java:1852)
at 
java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1582)
at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:52)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:408)
at java.security.AccessController.doPrivileged(Native Method)
at java.io.ObjectStreamClass.init(ObjectStreamClass.java:400)
at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:297)
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:531)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1552)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1466)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1552)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1466)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1699)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
at hudson.remoting.UserRequest.deserialize(UserRequest.java:182)
at hudson.remoting.UserRequest.perform(UserRequest.java:98)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:326)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
at java.util.concurrent.FutureTask.run(FutureTask.java:123)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
at 

Build failed in Jenkins: Nutch-trunk #1980

2012-10-06 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-trunk/1980/

--
Started by timer
Building remotely on solaris1 in workspace 
https://builds.apache.org/job/Nutch-trunk/ws/
hudson.util.IOException2: remote file operation failed: 
https://builds.apache.org/job/Nutch-trunk/ws/ at 
hudson.remoting.Channel@1807c204:solaris1
at hudson.FilePath.act(FilePath.java:838)
at hudson.FilePath.act(FilePath.java:824)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:743)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:685)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1256)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
at hudson.model.Run.execute(Run.java:1502)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:236)
Caused by: java.io.IOException: Remote call on solaris1 failed
at hudson.remoting.Channel.call(Channel.java:673)
at hudson.FilePath.act(FilePath.java:831)
... 11 more
Caused by: java.lang.LinkageError: duplicate class definition: 
hudson/model/Descriptor
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
at java.lang.ClassLoader.defineClass(ClassLoader.java:466)
at 
hudson.remoting.RemoteClassLoader.loadClassFile(RemoteClassLoader.java:152)
at 
hudson.remoting.RemoteClassLoader.findClass(RemoteClassLoader.java:131)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
at java.lang.Class.getDeclaredFields0(Native Method)
at java.lang.Class.privateGetDeclaredFields(Class.java:2259)
at java.lang.Class.getDeclaredField(Class.java:1852)
at 
java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1582)
at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:52)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:408)
at java.security.AccessController.doPrivileged(Native Method)
at java.io.ObjectStreamClass.init(ObjectStreamClass.java:400)
at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:297)
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:531)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1552)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1466)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1552)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1466)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1699)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1910)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1834)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1719)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1305)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
at hudson.remoting.UserRequest.deserialize(UserRequest.java:182)
at hudson.remoting.UserRequest.perform(UserRequest.java:98)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:326)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
at java.util.concurrent.FutureTask.run(FutureTask.java:123)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
at