Hi,

Are there any gotchas one should be aware of when configuring the property "org.apache.manifoldcf.crawler.stuffamountfactor"?

At times, the ManifoldCF nodes in my cluster (and the PostgreSQL box) do not utilise all the resources they have. I have configured 30 worker threads, which tend to sit idle waiting for documents (continuous crawl). This led me to tweak the batch size of the Stuffer thread indirectly by setting "org.apache.manifoldcf.crawler.stuffamountfactor" to 20 (I believe the default is 2).

I understand that increasing the batch size results in a bigger result set coming back from the database. If the size is in the thousands, I doubt it would cause problems. My hope is that a bigger stuffer batch would allow the worker threads to operate more efficiently and handle more documents where possible.

Please let me know if there are any particular concerns or guidelines around tweaking this property, or if there are better ways of increasing the width of the processing pipeline for each ManifoldCF instance.

Thanks,
Aeham
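
For reference, the settings described in this message would look roughly like the fragment below in properties.xml. This is an illustrative sketch, not a copy of the actual file: the worker thread property name, org.apache.manifoldcf.crawler.threads, is assumed from the ManifoldCF deployment documentation, and the values are the ones quoted in the message.

```xml
<configuration>
  <!-- Worker threads per ManifoldCF instance (value from this message;
       property name assumed from the ManifoldCF documentation) -->
  <property name="org.apache.manifoldcf.crawler.threads" value="30"/>

  <!-- Stuffer batch multiplier, raised from what is believed to be the default of 2 -->
  <property name="org.apache.manifoldcf.crawler.stuffamountfactor" value="20"/>
</configuration>
```

Assuming the stuffer batch is the factor multiplied by the worker thread count, which is an unconfirmed assumption, these values would give 20 x 30 = 600 documents per batch.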
