Hi Shashank,

ManifoldCF's memory consumption is bounded, but it scales with the number of worker threads you allow. If you have 100 worker threads and each document can consume 50 MB, then you need at least 5 GB right there for Solr output. Tika is also quite expensive memory-wise, so I'd allocate at least 10 GB for ManifoldCF to support the pipeline you have set up.
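
As a rough sketch of that arithmetic: the class and method names below are made up for illustration, the thread counts are example values, and the estimate only counts one maximum-size document per worker thread. It does not include Tika or any other overhead, so treat it as a lower bound rather than a sizing formula.

```java
// Hypothetical helper, not part of ManifoldCF: worst-case memory held by
// in-flight documents if every worker thread holds one maximum-size document.
class WorkerMemoryEstimate {

    // Returns the worst-case document memory in MB.
    public static long worstCaseDocumentMb(int workerThreads, long maxDocMb) {
        return (long) workerThreads * maxDocMb;
    }

    public static void main(String[] args) {
        long maxDocMb = 50;                    // max document size from this thread
        int[] threadCounts = {100, 50, 25};    // illustrative values only

        for (int threads : threadCounts) {
            long mb = worstCaseDocumentMb(threads, maxDocMb);
            System.out.println(threads + " worker threads -> " + mb + " MB for documents alone");
        }
    }
}
```

With 100 threads this gives 5000 MB, which is the roughly 5 GB figure above; halving the thread count halves the estimate.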

The best way to control memory, therefore, is probably to reduce the number of worker threads. (I assume you are using the combined war here; otherwise Tomcat would not be involved.)

Karl

On Thu, Jan 18, 2018 at 6:44 AM, Shashank Raj <[email protected]> wrote:
> Hello Karl,
> GC Overhead heap error occurs each time and Tomcat closes. Heap allocated
> is 7 GB (Xmx). Is there any other reason this issue is coming up? I am using
> ManifoldCF's Tika.
> I have unchecked "Use Update Extract" and set max doc size to 50 MB.
