Hi Shay,

As mentioned earlier, I have not used it yet. According to the comments in the commit, unrar needs to be available on the PATH. We don't need to provide its path in the config, but we do need to enable the unrar parser, as shown in the config below:
https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pkg-module/src/test/resources/org/apache/tika/parser/pkg/tika-unrar-config.xml

Regards,
Sandeep Kulkarni

From: שי ברק <[email protected]>
Sent: Tuesday, April 25, 2023 5:42 PM
To: Sandeep Kulkarni <[email protected]>; [email protected]
Subject: Re: [External] Re: Tika server does not support RAR version 5

Just to make sure I got it right: I need to have the unrar library on my machine and then add some lines to tika-config.xml (probably the path to that library) so that Tika uses that parser?

Thanks,
Shay.

On Tue, 25 Apr 2023 at 14:59 Sandeep Kulkarni via user <[email protected]> wrote:

Hi Shay,

In Tika 2.5.0, support for unrar was added as an external dependency. Take a look at it. I have not used it personally, but I would like to try it out in the next few months.

Add unrar as an optional parser (TIKA-3800).
https://issues.apache.org/jira/browse/TIKA-3800

Regards,
Sandeep Kulkarni

From: שי ברק <[email protected]>
Sent: Tuesday, April 25, 2023 4:14 PM
To: [email protected]
Subject: [External] Re: Tika server does not support RAR version 5

Thanks for the quick response.
It seems this issue has been unresolved for a long time. Is there any chance that Tika would use a different library instead of junrar, or are there other ideas for how to solve this?

On Mon, 24 Apr 2023 at 18:52 Rob McCoy <[email protected]> wrote:

Hey Shay,

If I'm not mistaken, RAR5 support in Tika is blocked until an underlying library (junrar) adds support for it, tracked here: https://issues.apache.org/jira/browse/TIKA-3211.
My understanding is that RAR5 drastically changed how the format works, so supporting it requires a nearly complete rewrite of how archives are handled, as discussed here: https://github.com/junrar/junrar/issues/23

On Mon, Apr 24, 2023 at 10:17 AM שי ברק <[email protected]> wrote:

Hey everyone,

I tried to extract a RAR file by calling the ‘unpack/all’ endpoint on Tika server and got an exception that says: “tika does not yet support rar version 5”.
Are there any plans to support this format in upcoming Tika server versions?

Thanks,
Shay.

--
Rob McCoy
Lead Software Engineer
Onna
www.onna.com
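
The most recent reply in this thread says that unrar only needs to be available on the PATH and that no path is set in the Tika config. A minimal sketch of how that prerequisite could be checked before enabling the unrar parser, assuming Python 3 is installed on the machine running Tika. The helper name `unrar_on_path` is hypothetical and not part of Tika; it only checks that the binary can be found and does not verify that Tika is configured to use it.

```python
import shutil
from typing import Optional


def unrar_on_path(executable: str = "unrar") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None.

    Hypothetical helper for illustration; it is not part of Apache Tika.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    location = unrar_on_path()
    if location is None:
        print("unrar was not found on PATH; install it before enabling the unrar parser")
    else:
        print(f"unrar found at {location}")
```

If the check reports that unrar is missing, the parser configuration linked at the top of the thread would presumably have nothing to call, so installing unrar comes first.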
