[ https://issues.apache.org/jira/browse/TIKA-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Palsulich closed TIKA-1460.
---------------------------------
    Resolution: Cannot Reproduce

Closing as Cannot Reproduce, since it's been a month since my last comment and 
we don't have the file which reproduces the issue. Please reopen if you're 
still running into this!

> Could not parse predefined CMAP file for 'Adobe-GBK1-UCS2'
> ----------------------------------------------------------
>
>                 Key: TIKA-1460
>                 URL: https://issues.apache.org/jira/browse/TIKA-1460
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.3
>         Environment: win7,myeclipse8.5
>            Reporter: onyas
>            Priority: Critical
>
> For some reason I could not upload the file, so here is the information instead.
> I also checked the \org\apache\pdfbox\resources\cmap directory in every version
> and did not find an 'Adobe-GBK1-UCS2' file.
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@d640af
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> Caused by: java.lang.IllegalArgumentException: Position 66048 past the end of the file
>       at org.apache.poi.poifs.nio.FileBackedDataSource.read(FileBackedDataSource.java:50)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getBlockAt(NPOIFSFileSystem.java:420)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readBAT(NPOIFSFileSystem.java:397)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readCoreContents(NPOIFSFileSystem.java:356)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:202)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:184)
>       at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:156)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>       ... 21 more
> The main code is:
>     Parser parser = new AutoDetectParser();
>     ContentHandler handler = new BodyContentHandler(getNum());
>     Metadata metadata = new Metadata();
>     ParseContext context = new ParseContext();
>     InputStream stream = null;
>     StringBuffer content = new StringBuffer();
>     try {
>         stream = new FileInputStream(file);
>         parser.parse(stream, handler, metadata, context);
>         // Appends handler.toString(), i.e. the extracted body text.
>         content = content.append(handler);
>         if (StringUtils.isNotBlank(content.toString())) {
>             hasContent = true;
>             handler = null;
>             metadata = null;
>             context = null;
>         }
>     } finally {
>         // Close the stream even when parse() throws.
>         if (stream != null) {
>             stream.close();
>         }
>     }
> The exception is thrown at this line:
> parser.parse(stream, handler, metadata, context);
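
The offset in the message above, 66048, equals 129 * 512. In an OLE2 compound file with 512-byte sectors, that is where sector 128 would start, because the header occupies the first 512 bytes. One plausible reading, which cannot be confirmed without the file, is that the document was truncated before a FAT (BAT) sector listed in its header. The following is a minimal sketch of a pre-check using only the JDK. The class and method names (Ole2TruncationCheck, sectorOffset, fatSectorsPastEnd) are hypothetical and not part of Tika or POI, and the check only looks at the 109 FAT sector numbers stored in the header itself.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Tika or POI.
class Ole2TruncationCheck {
    // Magic bytes D0 CF 11 E0 A1 B1 1A E1, read as a little-endian long.
    static final long SIGNATURE = 0xE11AB1A1E011CFD0L;
    static final int HEADER_SIZE = 512;
    static final int SECTOR_SHIFT_OFFSET = 30; // uint16: 9 => 512-byte sectors, 12 => 4096
    static final int DIFAT_OFFSET = 76;        // start of the in-header FAT sector list
    static final int DIFAT_ENTRIES = 109;
    // Values from here up are markers (for example FREESECT), not sector numbers.
    static final long FIRST_SPECIAL_VALUE = 0xFFFFFFFAL;

    /** Byte offset of a sector; the header takes up the first sector-sized slot. */
    static long sectorOffset(long sector, int sectorSize) {
        return (sector + 1) * sectorSize;
    }

    /** Offsets of header-listed FAT sectors that do not fit inside fileLength bytes. */
    static List<Long> fatSectorsPastEnd(byte[] header, long fileLength) {
        if (header.length < HEADER_SIZE) {
            throw new IllegalArgumentException("Header shorter than 512 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getLong(0) != SIGNATURE) {
            throw new IllegalArgumentException("Not an OLE2 compound file");
        }
        int shift = buf.getShort(SECTOR_SHIFT_OFFSET);
        if (shift != 9 && shift != 12) {
            throw new IllegalArgumentException("Unexpected sector shift: " + shift);
        }
        int sectorSize = 1 << shift;
        List<Long> pastEnd = new ArrayList<>();
        for (int i = 0; i < DIFAT_ENTRIES; i++) {
            long sector = Integer.toUnsignedLong(buf.getInt(DIFAT_OFFSET + 4 * i));
            if (sector >= FIRST_SPECIAL_VALUE) {
                continue; // unused slot or marker value
            }
            long offset = sectorOffset(sector, sectorSize);
            if (offset + sectorSize > fileLength) {
                pastEnd.add(offset);
            }
        }
        return pastEnd;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Ole2TruncationCheck <file>");
            return;
        }
        try (RandomAccessFile file = new RandomAccessFile(args[0], "r")) {
            byte[] header = new byte[HEADER_SIZE];
            file.readFully(header);
            System.out.println("FAT sector offsets past end of file: "
                    + fatSectorsPastEnd(header, file.length()));
        }
    }
}
```

Running this against the failing document before calling parser.parse(...) would show whether any listed FAT sector lies beyond the end of the file. A non-empty list would suggest a truncated or corrupt upload rather than a parser bug.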



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
