[jira] [Updated] (TIKA-3249) Excel type of files are generated hidden during RTF file parsing

Hardik (Jira) Wed, 16 Dec 2020 08:53:04 -0800


     [ https://issues.apache.org/jira/browse/TIKA-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Hardik  updated TIKA-3249:
--------------------------
    Description: 
We are using the EmbeddedDocumentExtractor class to parse RTF files.

Consider an RTF file that contains an embedded Excel file. If we extract that 
Excel file during parsing, the generated workbook is hidden by default: its 
data is readable only after going to View and clicking Unhide.

Is there any way to overcome this issue?

 

The sample code we are using is below.

    // extractor and filepath are defined elsewhere (not shown)
    ParseContext pcontext = new ParseContext();
    pcontext.set(EmbeddedDocumentExtractor.class, extractor);
    Parser rtfParser = new AutoDetectParser();
    BodyContentHandler handler = new BodyContentHandler();
    Metadata metadata = new Metadata();
    try (InputStream inputstream = new FileInputStream(new File(filepath))) {
        rtfParser.parse(inputstream, handler, metadata, pcontext);
    }
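
The sample passes an extractor object whose implementation is not shown. For 
reference, a minimal sketch of what such an extractor might look like follows. 
The class name FileSavingExtractor, the outputDir field, and the fallback file 
name are hypothetical and not taken from the report; the sketch assumes the 
two-method EmbeddedDocumentExtractor interface from tika-core and that the 
embedded file name, when present, is stored under the "resourceName" metadata 
key. It only copies the embedded bytes to disk unchanged, so it does not by 
itself alter the hidden state of the extracted workbook.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

import org.apache.tika.extractor.EmbeddedDocumentExtractor;
import org.apache.tika.metadata.Metadata;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

/** Hypothetical extractor that saves every embedded document into outputDir. */
public class FileSavingExtractor implements EmbeddedDocumentExtractor {
    private final Path outputDir;
    private int count = 0;

    public FileSavingExtractor(Path outputDir) {
        this.outputDir = outputDir;
    }

    @Override
    public boolean shouldParseEmbedded(Metadata metadata) {
        // Hand every embedded document to parseEmbedded.
        return true;
    }

    @Override
    public void parseEmbedded(InputStream stream, ContentHandler handler,
                              Metadata metadata, boolean outputHtml)
            throws SAXException, IOException {
        // Assumption: the embedded file name, if any, is under "resourceName".
        String name = metadata.get("resourceName");
        String safe = "";
        if (name != null) {
            // Keep only the last path segment so the name cannot escape outputDir.
            String normalized = name.replace('\\', '/');
            safe = normalized.substring(normalized.lastIndexOf('/') + 1);
        }
        if (safe.isEmpty() || safe.equals(".") || safe.equals("..")) {
            safe = "embedded-" + count; // hypothetical fallback name
        }
        count++;
        // Copy the raw bytes as-is; the caller owns the stream, so it is not closed here.
        Files.copy(stream, outputDir.resolve(safe), StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With this sketch, the extractor in the sample would be created as 
new FileSavingExtractor(someDirectory) before being set on the ParseContext.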

 

  was:
We are using the EmbeddedDocumentExtractor class to parse RTF files.

Consider an RTF file that contains an embedded Excel file. If we extract that 
Excel file during parsing, the generated workbook is hidden by default: its 
data is readable only after going to View and clicking Unhide.

Is there any way to overcome this issue?


> Excel type of files are generated hidden during RTF file parsing
> ----------------------------------------------------------------
>
>                 Key: TIKA-3249
>                 URL: https://issues.apache.org/jira/browse/TIKA-3249
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Hardik 
>            Priority: Critical



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
