Hi Shay,

As mentioned earlier, I have not used it yet. According to the comments in the 
commit, unrar needs to be available on the PATH. We don't need to provide its 
path in the config, but we do need to enable the unrar parser, as shown in the 
config below:

https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pkg-module/src/test/resources/org/apache/tika/parser/pkg/tika-unrar-config.xml
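
Before switching to that config, it may be worth confirming that the unrar binary really is visible on the PATH of the user running Tika server. A minimal sketch in Python, assuming the executable is named "unrar" (the helper name is_on_path is only for illustration and is not part of Tika):

```python
import shutil


def is_on_path(executable: str) -> bool:
    """Return True if `executable` can be resolved through the PATH variable."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    # "unrar" is assumed to be the binary name; adjust if yours differs.
    if is_on_path("unrar"):
        print("unrar found at", shutil.which("unrar"))
    else:
        print("unrar not found on PATH; install it or extend PATH first")
```

This only checks that the binary can be found; it does not verify that Tika is actually configured to use the unrar parser.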

Regards,
Sandeep Kulkarni

From: שי ברק <[email protected]>
Sent: Tuesday, April 25, 2023 5:42 PM
To: Sandeep Kulkarni <[email protected]>; [email protected]
Subject: Re: [External] Re: Tika server does not support RAR version 5

Just to make sure I got it right:
I need to have the unrar library on my machine and then add some lines to 
tika-config.xml (probably the path to that library) so that Tika uses that 
parser?
Thanks,
Shay.

On Tue, 25 Apr 2023 at 14:59 Sandeep Kulkarni via user 
<[email protected]> wrote:
Hi Shay,

In Tika 2.5.0, support for unrar was added as an external dependency. Take a 
look at it. I have not used it personally, but I would like to try it out in 
the next few months.

Add unrar as an optional parser (TIKA-3800).

https://issues.apache.org/jira/browse/TIKA-3800

Regards,
Sandeep Kulkarni

From: שי ברק <[email protected]>
Sent: Tuesday, April 25, 2023 4:14 PM
To: [email protected]
Subject: [External] Re: Tika server does not support RAR version 5

Thanks for the quick response.
It seems this issue has been unresolved for a long time.
Is there any chance that Tika would use a different library instead of junrar, 
or are there other ideas for how to solve this?

On Mon, 24 Apr 2023 at 18:52 Rob McCoy 
<[email protected]> wrote:
Hey Shay,

If I'm not mistaken, the RAR5 support for Tika is blocked until an underlying 
library (junrar) adds support for this, tracked here: 
https://issues.apache.org/jira/browse/TIKA-3211.

My understanding is that RAR5 drastically changed the format, so supporting it 
requires a nearly complete rewrite of how archives are handled, as discussed 
here: https://github.com/junrar/junrar/issues/23

On Mon, Apr 24, 2023 at 10:17 AM שי ברק 
<[email protected]> wrote:
Hey everyone,
I tried to extract a RAR archive by calling the ‘unpack/all’
endpoint on Tika server and got an exception that says:
“tika does not yet support rar version 5”.
Are there any plans to support this format in upcoming Tika server versions?

Thanks,
Shay.



--
Rob McCoy

Lead Software Engineer

Onna
[email protected]
www.onna.com

