[
https://issues.apache.org/jira/browse/TIKA-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hardik updated TIKA-3249:
--------------------------
Description:
We are using the EmbeddedDocumentExtractor class to parse RTF files.
Consider an RTF file that contains an embedded Excel file. When we extract
that Excel file after parsing, the generated workbook is hidden by default:
its data is readable only after going to View and clicking Unhide.
Is there any way to overcome this issue?
The sample code we are using is below.
// "extractor" is our EmbeddedDocumentExtractor implementation (defined elsewhere).
ParseContext pcontext = new ParseContext();
pcontext.set(EmbeddedDocumentExtractor.class, extractor);
Parser rtfParser = new AutoDetectParser();
BodyContentHandler handler = new BodyContentHandler();
Metadata metadata = new Metadata();
// "filepath" points to the RTF file that contains the embedded Excel file.
try (InputStream inputstream = new FileInputStream(new File(filepath))) {
    rtfParser.parse(inputstream, handler, metadata, pcontext);
}
was:
We are using the EmbeddedDocumentExtractor class to parse RTF files.
Consider an RTF file that contains an embedded Excel file. When we extract
that Excel file after parsing, the generated workbook is hidden by default:
its data is readable only after going to View and clicking Unhide.
Is there any way to overcome this issue?
> Excel type of files are generated hidden during RTF file parsing
> ----------------------------------------------------------------
>
> Key: TIKA-3249
> URL: https://issues.apache.org/jira/browse/TIKA-3249
> Project: Tika
> Issue Type: Bug
> Reporter: Hardik
> Priority: Critical