[ https://issues.apache.org/jira/browse/NUTCH-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703291#comment-14703291 ]

Asitang Mishra commented on NUTCH-1486:
---------------------------------------

Hey Lewis,
Your fix for the jar soup did not work for the Naive Bayes plugin. It still 
could not find the Mahout classes. Here is what I got:

java.lang.Exception: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
        at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:718)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:340)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
        ... 9 more
2015-08-19 09:27:41,936 ERROR naivebayes.NaiveBayesParseFilter - Error occured while training:: java.lang.IllegalStateException: Job failed!
        at org.apache.mahout.vectorizer.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:95)
        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:257)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:56)
        at org.apache.nutch.parsefilter.naivebayes.NaiveBayesClassifier.createModel(NaiveBayesClassifier.java:99)
        at org.apache.nutch.parsefilter.naivebayes.NaiveBayesParseFilter.train(NaiveBayesParseFilter.java:93)
        at org.apache.nutch.parsefilter.naivebayes.NaiveBayesParseFilter.setConf(NaiveBayesParseFilter.java:148)
        at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:163)
        at org.apache.nutch.plugin.PluginRepository.getOrderedPlugins(PluginRepository.java:441)
        at org.apache.nutch.parse.HtmlParseFilters.<init>(HtmlParseFilters.java:35)
        at org.apache.nutch.parse.html.HtmlParser.setConf(HtmlParser.java:343)
        at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:163)
        at org.apache.nutch.parse.ParserFactory.getParsers(ParserFactory.java:136)
        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:78)
        at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:104)
        at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:46)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
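
The innermost "Caused by" above shows Configuration.getClassByName resolving the mapper through sun.misc.Launcher$AppClassLoader, which suggests the Mahout jars are not visible to the application class loader. A minimal sketch for checking which loader can see the class follows. The class name ClassVisibility and the helper isVisible are hypothetical names introduced here for illustration, not part of Nutch, Hadoop, or Mahout; only JDK calls are used.

```java
public class ClassVisibility {

    /**
     * Returns true if the named class can be loaded by the given loader.
     * The class is located but not initialized, so no static blocks run.
     */
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace above.
        String mapper =
            "org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper";

        ClassLoader system = ClassLoader.getSystemClassLoader();
        ClassLoader context = Thread.currentThread().getContextClassLoader();

        System.out.println("system loader sees mapper:  "
            + isVisible(mapper, system));
        System.out.println("context loader sees mapper: "
            + isVisible(mapper, context));
    }
}
```

Running the same check from inside the plugin (for example with the loader returned by getClass().getClassLoader() in the filter class) would show whether the class is visible there but not to the application class loader.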



> Upgrade to Solr 4.10.2
> ----------------------
>
>                 Key: NUTCH-1486
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1486
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.6, 2.1
>         Environment: Solr 4.0, Nutch trunk 1.6-SNAPSHOT and probably 2.2-SNAPSHOT
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>              Labels: memex
>             Fix For: 1.11
>
>         Attachments: NUTCH-1486-1.8.patch, NUTCH-1486-1.9-trunk.patch, 
> NUTCH-1486-2.x-v3.patch, NUTCH-1486-2.x.patch, NUTCH-1486-2.x.v2.patch, 
> NUTCH-1486-nutchgora.patch, NUTCH-1486-trunk.patch, 
> NUTCH-1486-trunk.v2.patch, NUTCH-1486-trunk.v3.patch, 
> NUTCH-1486-trunkv4.patch, NUTCH-1486-trunkv5.patch
>
>
> When attempting to configure a four-core Solr 4.0 multicore instance with the 
> Nutch schema-solr4.xml file, I get the following exceptions.
> This has been discussed previously. As I see it, we have two options:
> 1. Keep maintaining both schema options
> 2. Ditch the more complex schema-solr4.xml in favour of vanilla schema.xml
> Thoughts?
> {code}
> SEVERE: Unable to create core: collection4
> org.apache.solr.common.SolrException: Unable to use updateLog: _version_field 
> must exist in schema, using indexed="true" stored="true" and 
> multiValued="false" (_version_ does not exist)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:721)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:566)
>       at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
>       at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
>       at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
>       at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:114)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:754)
>       at 
> org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:258)
>       at 
> org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1221)
>       at 
> org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:699)
>       at 
> org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:454)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:36)
>       at 
> org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:183)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:491)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:138)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:142)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:53)
>       at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:604)
>       at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:535)
>       at org.eclipse.jetty.util.Scanner.scan(Scanner.java:398)
>       at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:332)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:118)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:552)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:227)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:63)
>       at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:53)
>       at 
> org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:91)
>       at org.eclipse.jetty.server.Server.doStart(Server.java:263)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1215)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1138)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.eclipse.jetty.start.Main.invokeMain(Main.java:457)
>       at org.eclipse.jetty.start.Main.start(Main.java:602)
>       at org.eclipse.jetty.start.Main.main(Main.java:82)
> Caused by: org.apache.solr.common.SolrException: Unable to use updateLog: 
> _version_field must exist in schema, using indexed="true" stored="true" and 
> multiValued="false" (_version_ does not exist)
>       at org.apache.solr.update.UpdateLog.init(UpdateLog.java:236)
>       at org.apache.solr.update.UpdateHandler.initLog(UpdateHandler.java:94)
>       at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:123)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:97)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>       at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:476)
>       at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:544)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:705)
>       ... 45 more
> Caused by: org.apache.solr.common.SolrException: _version_field must exist in 
> schema, using indexed="true" stored="true" and multiValued="false" (_version_ 
> does not exist)
>       at 
> org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:57)
>       at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:83)
>       at org.apache.solr.update.UpdateLog.init(UpdateLog.java:233)
>       ... 55 more
> 01-Nov-2012 16:26:15 org.apache.solr.common.SolrException log
> SEVERE: null:org.apache.solr.common.SolrException: Unable to use updateLog: 
> _version_field must exist in schema, using indexed="true" stored="true" and 
> multiValued="false" (_version_ does not exist)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:721)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:566)
>       at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
>       at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
>       at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
>       at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:114)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:754)
>       at 
> org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:258)
>       at 
> org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1221)
>       at 
> org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:699)
>       at 
> org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:454)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:36)
>       at 
> org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:183)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:491)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:138)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:142)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:53)
>       at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:604)
>       at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:535)
>       at org.eclipse.jetty.util.Scanner.scan(Scanner.java:398)
>       at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:332)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:118)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:552)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:227)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:63)
>       at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:53)
>       at 
> org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:91)
>       at org.eclipse.jetty.server.Server.doStart(Server.java:263)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1215)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1138)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.eclipse.jetty.start.Main.invokeMain(Main.java:457)
>       at org.eclipse.jetty.start.Main.start(Main.java:602)
>       at org.eclipse.jetty.start.Main.main(Main.java:82)
> Caused by: org.apache.solr.common.SolrException: Unable to use updateLog: 
> _version_field must exist in schema, using indexed="true" stored="true" and 
> multiValued="false" (_version_ does not exist)
>       at org.apache.solr.update.UpdateLog.init(UpdateLog.java:236)
>       at org.apache.solr.update.UpdateHandler.initLog(UpdateHandler.java:94)
>       at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:123)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:97)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>       at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:476)
>       at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:544)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:705)
>       ... 45 more
> Caused by: org.apache.solr.common.SolrException: _version_field must exist in 
> schema, using indexed="true" stored="true" and multiValued="false" (_version_ 
> does not exist)
>       at 
> org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:57)
>       at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:83)
>       at org.apache.solr.update.UpdateLog.init(UpdateLog.java:233)
>       ... 55 more
> 01-Nov-2012 16:26:15 org.apache.solr.servlet.SolrDispatchFilter init
> INFO: user.dir=/home/lewis/ASF/solr/example
> 01-Nov-2012 16:26:15 org.apache.solr.servlet.SolrDispatchFilter init
> INFO: SolrDispatchFilter.init() done
> 2012-11-01 16:26:15.228:INFO:oejs.AbstractConnector:Started 
> [email protected]:8983
> {code}
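
The exception quoted above says that a `_version_` field must exist in the schema with indexed="true", stored="true" and multiValued="false". As a rough illustration, the sketch below checks a schema file's XML for such a field using only the JDK's DOM parser. The class name VersionFieldCheck and the method hasUsableVersionField are hypothetical, and the check is a simplification: it requires indexed and stored to be written explicitly on the `<field>` element and does not consider defaults inherited from the field type.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class VersionFieldCheck {

    /**
     * Returns true if the schema XML declares a field named _version_ that is
     * explicitly indexed and stored and is not marked multiValued.
     */
    static boolean hasUsableVersionField(String schemaXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    schemaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = doc.getElementsByTagName("field");
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                if ("_version_".equals(field.getAttribute("name"))) {
                    return "true".equals(field.getAttribute("indexed"))
                        && "true".equals(field.getAttribute("stored"))
                        && !"true".equals(field.getAttribute("multiValued"));
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse schema XML", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative schema snippet; the "long" type name is an assumption.
        String schema =
            "<schema><fields>"
            + "<field name=\"_version_\" type=\"long\" indexed=\"true\""
            + " stored=\"true\" multiValued=\"false\"/>"
            + "</fields></schema>";
        System.out.println(hasUsableVersionField(schema));
    }
}
```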


