Hi,

An important variable to know or measure behind the 150 docs/minute figure: how large are these documents?
-Marshall

On 5/18/2020 8:47 AM, Eddie Epstein wrote:
> Hi,
>
> Removing the AE from the pipeline was a good idea to help isolate the
> bottleneck. The other two most likely possibilities are the collection
> reader pulling from elastic search or the CAS consumer writing the
> processing output.
>
> DUCC Jobs are a simple way to scale out compute bottlenecks across a
> cluster. Scaleout may be of limited or no value for I/O bound jobs.
> Please give a more complete picture of the processing scenario on DUCC.
>
> Regards,
> Eddie
>
>
> On Sat, May 16, 2020 at 1:29 AM Raja Muhammad Suleman <
> sulem...@edgehill.ac.uk> wrote:
>
>> Hi,
>> I've been trying to run a very small UIMA DUCC cluster with 2 slave nodes
>> having 32GB of RAM each. I wrote a custom Collection Reader to read data
>> from an Elasticsearch index and dump it into a new index after certain
>> analysis engine processing. The Analysis Engine is a simple sentiment
>> analysis code. The performance I'm getting is very slow as it is only able
>> to process ~150 documents/minute.
>> To test the performance without the analysis engine, I removed the AE from
>> the pipeline but still I did not get any improvement in the processing
>> speeds. Can you please guide me as to where I might be going wrong or what
>> I can do to improve the processing speeds?
>>
>> Thank you.
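
One way to act on the suggestions in this thread is to time the read side and the write side separately, and to convert the reported rate into a per-document budget: 150 docs/minute is 2.5 docs/second, or 400 ms per document. The sketch below is only an illustration under stated assumptions. `StageTimer`, `reader`, and `writer` are hypothetical names, and the `Supplier`/`Consumer` stand-ins are not UIMA, DUCC, or Elasticsearch APIs; they would need to be replaced with the real index read and index write. The average document size is also a placeholder, since the thread does not state it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class StageTimer {

    /** Average wall-clock milliseconds per document at a given docs/minute rate. */
    public static double millisPerDoc(double docsPerMinute) {
        return 60_000.0 / docsPerMinute;
    }

    /** Data rate in KiB/s implied by a docs/minute rate and an average document size. */
    public static double kibPerSecond(double docsPerMinute, double avgDocBytes) {
        return (docsPerMinute / 60.0) * avgDocBytes / 1024.0;
    }

    /** Pulls count documents from the reader stand-in; returns elapsed nanoseconds. */
    public static long timeReads(Supplier<String> reader, int count, List<String> out) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            out.add(reader.get());
        }
        return System.nanoTime() - start;
    }

    /** Pushes every document to the writer stand-in; returns elapsed nanoseconds. */
    public static long timeWrites(Consumer<String> writer, List<String> docs) {
        long start = System.nanoTime();
        for (String doc : docs) {
            writer.accept(doc);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins: swap in the real index read and index write.
        Supplier<String> reader = () -> "example document text";
        List<String> written = new ArrayList<>();
        Consumer<String> writer = written::add;

        List<String> docs = new ArrayList<>();
        long readNanos = timeReads(reader, 1000, docs);
        long writeNanos = timeWrites(writer, docs);

        // Assumed average size of 2048 bytes; the real value is unknown here.
        System.out.printf("budget at 150 docs/min: %.0f ms/doc%n", millisPerDoc(150));
        System.out.printf("implied rate at 2048 B/doc: %.1f KiB/s%n", kibPerSecond(150, 2048));
        System.out.printf("read:  %.4f ms/doc%n", readNanos / 1e6 / docs.size());
        System.out.printf("write: %.4f ms/doc%n", writeNanos / 1e6 / docs.size());
    }
}
```

If either stage alone comes close to 400 ms per document, that stage is the likely bottleneck, and adding DUCC nodes may not help much if it is I/O bound.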