Hi,

I am trying to read documents from a file system into Solr using the 
DataImportHandler, but I keep getting the following errors:

Exception while processing: files document : null:org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.ClassCastException: java.io.InputStreamReader cannot be cast to java.io.InputStream
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:61)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:270)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:517)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
    at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassCastException: java.io.InputStreamReader cannot be cast to java.io.InputStream
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:132)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
    ... 9 more



Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.ClassCastException: java.io.InputStreamReader cannot be cast to java.io.InputStream
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
    at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.ClassCastException: java.io.InputStreamReader cannot be cast to java.io.InputStream
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
    ... 4 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.ClassCastException: java.io.InputStreamReader cannot be cast to java.io.InputStream
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:61)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:270)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:517)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
    ... 6 more
Caused by: java.lang.ClassCastException: java.io.InputStreamReader cannot be cast to java.io.InputStream
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:132)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
    ... 9 more


My data-config file looks as follows:

<dataConfig>
  <dataSource name="bin" type="BinFileDataSource" />
  <document>
      <entity name="files" processor="FileListEntityProcessor"
              baseDir="D:/CAPTIA/docs/19107" fileName=".*DOC" recursive="true"
              rootEntity="false" dataSource="bin" onError="skip">
        <field column="fileAbsolutePath" name="id" />

        <entity
         name="read_file"
         processor="TikaEntityProcessor"
         url="${files.fileAbsolutePath}"
         >
          <field column="text" name="content" />
        </entity>
      </entity>
  </document>
</dataConfig>
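
For comparison, here is an untested variation of the same config, offered only as a sketch. It assumes the cast failure arises because the inner TikaEntityProcessor entity is not handed the binary data source, so it sets dataSource="bin" on that entity and dataSource="null" on the FileListEntityProcessor entity, which lists files itself and should not need a data source. Both attribute changes are assumptions rather than a confirmed fix; everything else is copied from the config above.

```xml
<dataConfig>
  <dataSource name="bin" type="BinFileDataSource" />
  <document>
    <!-- Assumption: the file-listing entity reads no stream itself,
         so dataSource="null" replaces dataSource="bin" here. -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="D:/CAPTIA/docs/19107" fileName=".*DOC" recursive="true"
            rootEntity="false" dataSource="null" onError="skip">
      <field column="fileAbsolutePath" name="id" />

      <!-- Assumption: naming the binary data source explicitly makes
           TikaEntityProcessor receive an InputStream, not a Reader. -->
      <entity name="read_file"
              processor="TikaEntityProcessor"
              dataSource="bin"
              url="${files.fileAbsolutePath}">
        <field column="text" name="content" />
      </entity>
    </entity>
  </document>
</dataConfig>
```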

In the schema I have just two fields:

<field name="Id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

Any help is appreciated.


Martin Frank Hansen

