Hi,

An important variable to know or measure behind the 150 docs/minute figure:
how large are these documents?
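
To show why size matters, here is a rough back-of-the-envelope sketch in
Python. It converts a document rate plus an average document size into
bytes per second. Only the 150 docs/minute figure comes from the report
below; the function name and the three sizes are made up for illustration
and are not measurements of this pipeline.

```python
def throughput_bytes_per_sec(docs_per_minute: float, avg_doc_bytes: float) -> float:
    """Convert a document rate and an average document size to bytes/second."""
    return docs_per_minute * avg_doc_bytes / 60.0


# 150 docs/minute is from the original report; the sizes below are
# hypothetical, chosen only to show how much the answer depends on them.
for label, size in [("2 KB", 2_000), ("200 KB", 200_000), ("20 MB", 20_000_000)]:
    rate = throughput_bytes_per_sec(150, size)
    print(f"{label:>6} docs -> {rate / 1e6:.3f} MB/s")
```

If the documents are small, the same 150 docs/minute is a trickle of data
and per-document overhead is the more likely suspect; if they are large,
it may already be close to what the I/O path can sustain.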

-Marshall

On 5/18/2020 8:47 AM, Eddie Epstein wrote:
> Hi,
>
> Removing the AE from the pipeline was a good idea to help isolate the
> bottleneck. The other two most likely possibilities are the collection
> reader pulling from Elasticsearch or the CAS consumer writing the
> processing output.
>
> DUCC Jobs are a simple way to scale out compute bottlenecks across a
> cluster. Scale-out may be of limited or no value for I/O-bound jobs.
> Please give a more complete picture of the processing scenario on DUCC.
>
> Regards,
> Eddie
>
>
> On Sat, May 16, 2020 at 1:29 AM Raja Muhammad Suleman <
> sulem...@edgehill.ac.uk> wrote:
>
>> Hi,
>> I've been trying to run a very small UIMA DUCC cluster with 2 slave nodes
>> having 32GB of RAM each. I wrote a custom Collection Reader to read data
>> from an Elasticsearch index and dump it into a new index after certain
>> analysis engine processing. The Analysis Engine is simple sentiment
>> analysis code. Performance is very slow: the pipeline processes only
>> ~150 documents/minute.
>> To test the performance without the analysis engine, I removed the AE from
>> the pipeline but still I did not get any improvement in the processing
>> speeds. Can you please guide me as to where I might be going wrong or what
>> I can do to improve the processing speeds?
>>
>> Thank you.
