Hi Manish, I think you should ask this one upstream on the Tika Dev lists. I’ve cc’ed them for you.
From: manish mathur <[email protected]>
Date: Monday, March 15, 2021 at 4:41 AM
To: <[email protected]>
Subject: Re: Python-tika: issues related to memory consumption

Hi Chris,

I am using the python-tika library to extract content from PDFs, but a lot of junk comes through because of tables, graphs, etc. Is there any way to ignore these while parsing a PDF to get the content?

Thanks in advance.

Thanks,
Manish Mathur

On Mon, Feb 1, 2021 at 4:18 PM manish mathur <[email protected]> wrote:

Hi Chris,

I am using the python-tika library to read PDF URLs, but memory consumption keeps increasing over time. Is there any way to release the memory after reading one PDF URL? Please let me know.

Thanks in advance.

Thanks,
Manish Mathur
