EnwikiContentSource isn't thread safe
-------------------------------------
Key: LUCENE-1996
URL: https://issues.apache.org/jira/browse/LUCENE-1996
Project: Lucene - Java
Issue Type: Bug
Components: contrib/benchmark
Reporter: Michael McCandless
Priority: Minor
Fix For: 3.1
When I run this alg:
{code}
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
content.source=org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource
docs.file=/x/lucene/enwiki-20090724-pages-articles.xml.bz2
doc.tokenized = false
ram.flush.mb=32.0
doc.stored = false
doc.term.vector = false
log.step.AddDoc=10000
directory=FSDirectory
autocommit=false
compound=false
work.dir=/lucene/work.wiki.nd0.02M
{ "BuildIndex"
  - CreateIndex
  [ { "AddDocs" AddDoc > : 10000 } : 2
  - CloseIndex
}
RepSumByPrefRound BuildIndex
{code}
I hit exceptions in each thread like this:
{code}
Exception in thread "Thread-2" java.lang.RuntimeException: org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" associated with an element type "mdiiki".
    at org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:189)
    at java.lang.Thread.run(Thread.java:613)
Caused by: org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" associated with an element type "mdiiki".
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1441)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:802)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:578)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:222)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(XMLNSDocumentScannerImpl.java:779)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1794)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
    at org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:166)
    ... 1 more
{code}