Thank you, Rick. The Logging section of the Solr Admin UI shows only the
error name and its "Caused by" chain:

Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 1
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:417)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:481)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:462)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 1
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
        ... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 1
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:70)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:165)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:515)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
        ... 5 more
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@88ee82
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:159)
        ... 9 more
Caused by: org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file: 'D:\solr-5.4.1\server\tmp\apache-tika-417176949707403825.tmp'
        at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:112)
        at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:225)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:69)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:87)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        ... 12 more
Caused by: java.util.zip.ZipException: invalid END header (bad central directory offset)
        at java.util.zip.ZipFile.open(Native Method)
        at java.util.zip.ZipFile.<init>(ZipFile.java:220)
        at java.util.zip.ZipFile.<init>(ZipFile.java:150)
        at java.util.zip.ZipFile.<init>(ZipFile.java:164)
        at org.apache.poi.openxml4j.util.ZipSecureFile.<init>(ZipSecureFile.java:105)
        at org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:175)
        at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:110)


There is no trace or name of the bad files that failed to index; the only
path shown is Tika's temporary copy.
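
Since the root cause in the trace is a java.util.zip.ZipException raised
while opening an OOXML file, one possible workaround is to scan the source
directory before the import and list the files whose ZIP container cannot be
read. Below is a minimal Python sketch, assuming the documents sit on a local
disk. The function name find_damaged_ooxml and the extension set are my own
choices for illustration, not part of Solr or Tika, and the check does not
cover PDF or legacy .doc files, which are not ZIP-based.

```python
import zipfile
import zlib
from pathlib import Path

# ZIP-based Office formats to check (assumed set; adjust to your data).
OOXML_SUFFIXES = {".docx", ".xlsx", ".pptx"}


def find_damaged_ooxml(root):
    """Return sorted paths of OOXML files under root that fail a ZIP integrity check."""
    damaged = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in OOXML_SUFFIXES:
            continue
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the name of the first member with a bad
                # CRC or header, or None when every member reads cleanly.
                if archive.testzip() is not None:
                    damaged.append(path)
        except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
            # The container itself is unreadable, e.g. a broken central directory.
            damaged.append(path)
    return sorted(damaged)
```

Calling find_damaged_ooxml on the directory that the import reads from and
printing each returned path would give a list of files to repair or exclude
before re-indexing.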


2016-06-24 12:39 GMT+00:00 Rick Leir <[email protected]>:

> Do you mean that some of your PDFs are corrupt and Tika cannot index
> them? There should be some mention in the log file, so you can know which
> pdf is a problem. Fix it somehow and re-index.
>
>
> On June 22, 2016 9:44:01 PM EDT, kostali hassan <[email protected]>
> wrote:
>>
>> ---------- Forwarded message ----------
>> From: "kostali hassan" <[email protected]>
>> Date: 22 June 2016 14:00
>> Subject: how collect a list of damaged file they can not be indexed
>> To: <[email protected]>
>> Cc:
>>
>> I am running Solr 5.4.1 to index rich documents (PDF and MS Word) using
>> the Data Import Handler.
>> In the file tika-config.xml I set: onError="skip"
>>
>> I want to retrieve the list of corrupted files.
>>
>>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
