Just extend XPathEntityProcessor and override nextRow() so that it returns
null after 100 rows. Use it as your processor.
--Noble
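
A minimal sketch of the suggestion above, for illustration only. To keep it
runnable without Solr on the classpath, StubXPathProcessor is a hypothetical
stand-in for XPathEntityProcessor, and LimitedRowProcessor, maxRows and
LimitedRowDemo are names invented here. In a real setup the subclass would
presumably extend org.apache.solr.handler.dataimport.XPathEntityProcessor and
be named in the entity's processor attribute. As far as I can tell, returning
null is how a processor signals that it has no more rows, so this caps the
rows taken from a file rather than splitting them into batches.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the row contract of XPathEntityProcessor:
// nextRow() hands back one row per call and null once the rows run out.
class StubXPathProcessor {
    private final Iterator<Map<String, Object>> rows;

    StubXPathProcessor(List<Map<String, Object>> rows) {
        this.rows = rows.iterator();
    }

    public Map<String, Object> nextRow() {
        return rows.hasNext() ? rows.next() : null;
    }
}

// The override described above: pass rows through until the limit is
// reached, then return null.
class LimitedRowProcessor extends StubXPathProcessor {
    private final int maxRows;
    private int emitted = 0;

    LimitedRowProcessor(List<Map<String, Object>> rows, int maxRows) {
        super(rows);
        this.maxRows = maxRows;
    }

    @Override
    public Map<String, Object> nextRow() {
        if (emitted >= maxRows) {
            return null; // limit reached: report "no more rows"
        }
        Map<String, Object> row = super.nextRow();
        if (row != null) {
            emitted++;
        }
        return row;
    }
}

class LimitedRowDemo {
    // Build `available` dummy rows and count how many the limited processor emits.
    static int countRows(int available, int maxRows) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int i = 0; i < available; i++) {
            Map<String, Object> row = new HashMap<>();
            row.put("pmid", String.valueOf(i));
            rows.add(row);
        }
        LimitedRowProcessor processor = new LimitedRowProcessor(rows, maxRows);
        int count = 0;
        while (processor.nextRow() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRows(250, 100)); // prints 100
    }
}
```

In a real subclass the counter would also need to be reset whenever the
processor is re-initialised for the next file; that detail is left out here.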
On Tue, Jun 24, 2008 at 10:45 AM, Noble Paul നോബിള് नोब्ळ्
<[EMAIL PROTECTED]> wrote:
> Just extend XPathEntityProcessor and override nextRow() to return null
> after 100 rows. Use it as your processor.
>
> On Tue, Jun 24, 2008 at 10:23 AM, mike segv <[EMAIL PROTECTED]> wrote:
>>
>> That fixed it.
>>
>> If I'm inserting millions of documents, how do I control docs/update? E.g.
>> if there are 50K docs per file, I'm thinking that I should probably code up
>> my own DataSource that allows me to stipulate docs/update. Like say, 100
>> instead of 50K. Does this make sense?
>>
>> Mike
>>
>>
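
To make the docs/update idea in the question concrete, here is a sketch of
splitting one file's worth of documents into fixed-size batches. It is plain
Java, not a DataImportHandler API; UpdateBatcher and partition are
hypothetical names, and integers stand in for documents.

```java
import java.util.ArrayList;
import java.util.List;

class UpdateBatcher {
    // Split docs into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(docs.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 50_000; i++) {
            docs.add(i);
        }
        List<List<Integer>> batches = partition(docs, 100);
        System.out.println(batches.size()); // prints 500
    }
}
```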
>> Noble Paul നോബിള് नोब्ळ् wrote:
>>>
>>> Hi,
>>> You have not registered any data sources. The second entity needs a
>>> data source.
>>> Remove dataSource="null" from the second entity and give it a name
>>> (good practice). There is no need for a baseDir attribute on the second
>>> entity. See the modified XML below.
>>> --Noble
>>>
>>> <dataConfig>
>>> <dataSource type="FileDataSource"/>
>>> <document>
>>> <entity name="f" processor="FileListEntityProcessor" fileName=".*xml"
>>> newerThan="'NOW-10DAYS'" recursive="true" rootEntity="false"
>>> dataSource="null" baseDir="/san/tomcat-services/solr-medline">
>>> <entity name="x" processor="XPathEntityProcessor"
>>> forEach="/MedlineCitation"
>>> url="${f.fileAbsolutePath}" >
>>> <field column="pmid" xpath="/MedlineCitation/PMID"/>
>>> </entity>
>>> </entity>
>>> </document>
>>> </dataConfig>
>>>
>>> On Tue, Jun 24, 2008 at 6:39 AM, mike segv <[EMAIL PROTECTED]> wrote:
>>>>
>>>> I'm trying to use the FileListEntityProcessor to add some XML documents
>>>> to a Solr index. I'm running a nightly version of solr-1.3 with SOLR-469
>>>> and SOLR-563. I've been able to successfully run the slashdot
>>>> HttpDataSource example. My data-config.xml file loads without errors.
>>>> When I attempt the full-import command I get the exception below.
>>>> Thanks for any help.
>>>>
>>>> Mike
>>>>
>>>> WARNING: No lockType configured for /san/tomcat-services/solr-medline/solr/data/index/ assuming 'simple'
>>>> Jun 23, 2008 7:59:49 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
>>>> SEVERE: Full Import failed
>>>> java.lang.RuntimeException: java.lang.NullPointerException
>>>>     at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:97)
>>>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:212)
>>>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:166)
>>>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:149)
>>>>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:286)
>>>>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:312)
>>>>     at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:179)
>>>>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:140)
>>>>     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:335)
>>>>     at org.apache.solr.handler.dataimport.DataImporter.rumCmd(DataImporter.java:386)
>>>>     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>>> Caused by: java.lang.NullPointerException
>>>>     at java.io.Reader.<init>(Reader.java:61)
>>>>     at java.io.BufferedReader.<init>(BufferedReader.java:76)
>>>>     at com.bea.xml.stream.MXParser.checkForXMLDecl(MXParser.java:775)
>>>>     at com.bea.xml.stream.MXParser.setInput(MXParser.java:806)
>>>>     at com.bea.xml.stream.MXParserFactory.createXMLStreamReader(MXParserFactory.java:261)
>>>>     at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:93)
>>>>     ... 10 more
>>>>
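
The "Caused by" frames show java.io.Reader.<init> being reached from
java.io.BufferedReader.<init>, which is what happens when a BufferedReader is
built around a null Reader. The sketch below reproduces just that step outside
Solr. NullReaderDemo and wrapThrowsNpe are hypothetical names, and the link
back to the data-config (no usable data source for the XPath entity) is an
inference from the reply in this thread, not something this code proves.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;

class NullReaderDemo {
    // Returns true if wrapping the given Reader in a BufferedReader
    // throws a NullPointerException.
    static boolean wrapThrowsNpe(Reader in) {
        try {
            new BufferedReader(in);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrapThrowsNpe(null));                     // prints true
        System.out.println(wrapThrowsNpe(new StringReader("<a/>"))); // prints false
    }
}
```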
>>>> Here is my data-config:
>>>>
>>>> <dataConfig>
>>>> <document>
>>>> <entity name="f" processor="FileListEntityProcessor" fileName=".*xml"
>>>> newerThan="'NOW-10DAYS'" recursive="true" rootEntity="false"
>>>> dataSource="null" baseDir="/san/tomcat-services/solr-medline">
>>>> <entity processor="XPathEntityProcessor" forEach="/MedlineCitation"
>>>> url="${f.fileAbsolutePath}" dataSource="null">
>>>> <field column="pmid" xpath="/MedlineCitation/PMID"/>
>>>> </entity>
>>>> </entity>
>>>> </document>
>>>> </dataConfig>
>>>>
>>>> And a snippet from an xml file:
>>>> <MedlineCitation Owner="PIP" Status="MEDLINE">
>>>> <PMID>12236137</PMID>
>>>> <DateCreated>
>>>> <Year>1980</Year>
>>>> <Month>01</Month>
>>>> <Day>03</Day>
>>>> </DateCreated>
>>>>
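
To illustrate what xpath="/MedlineCitation/PMID" is meant to pick out of a
record like the one above, here is a small standalone sketch. It uses the
JDK's javax.xml.xpath rather than the import handler's own streaming reader,
which as far as I know supports only a subset of XPath, so treat it as a way
to sanity-check the expression and not as what Solr runs internally.
PmidXPathDemo and evaluate are hypothetical names, and the record is trimmed
to the PMID element and closed so that it parses.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class PmidXPathDemo {
    // Evaluate an XPath expression against an XML string and return the
    // string value of the first matching node.
    static String evaluate(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Trimmed copy of the record from the message, closed for well-formedness.
        String xml = "<MedlineCitation Owner=\"PIP\" Status=\"MEDLINE\">"
                   + "<PMID>12236137</PMID>"
                   + "</MedlineCitation>";
        System.out.println(evaluate(xml, "/MedlineCitation/PMID")); // prints 12236137
    }
}
```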
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Attempting-dataimport-using-FileListEntityProcessor-tp18081671p18081671.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
--
--Noble Paul