Yahav Amsalem created TIKA-3257:
-----------------------------------
Summary: RAR files extracted content is not separated from the
inner file names
Key: TIKA-3257
URL: https://issues.apache.org/jira/browse/TIKA-3257
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.23
Reporter: Yahav Amsalem
Attachments: test.rar
Attached is a RAR file containing a PPT file ("test.ppt") with a single line of text:
"Here the PPT content starts".
However, the text extracted by Tika does *not separate the file name from its
content*; the output is:
"test.pptHere the PPT content starts"
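
To make the difference concrete, here is a minimal, self-contained sketch of the reported output next to a separated one. It is an illustration only and not Tika's parser code: the class and method names are hypothetical, and the newline separator is an assumption, since the report does not say which separator is expected.

```java
// Hypothetical illustration only; this does not call Tika or reproduce its internals.
public class EntryTextDemo {

    // Mimics the reported output: entry name and extracted body written back to back.
    public static String joinWithoutSeparator(String entryName, String body) {
        return entryName + body;
    }

    // Mimics a separated output. The newline is an assumed separator,
    // not something the report specifies.
    public static String joinWithSeparator(String entryName, String body) {
        return entryName + "\n" + body;
    }

    public static void main(String[] args) {
        String name = "test.ppt";
        String body = "Here the PPT content starts";
        System.out.println(joinWithoutSeparator(name, body));
        System.out.println(joinWithSeparator(name, body));
    }
}
```

With the first method, the inner file name runs straight into the first word of the content, which is the symptom described above.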