Utilizing all available cores

Poojit Sharma Mon, 24 Nov 2014 13:48:03 -0800

Hi everyone,

I am using OODT Radix v0.7 and need some help fine-tuning my system. Let
me give you an overview of my setup.


I am crawling files in a directory using 'crawler' and ingesting them into
the 'file manager'. I have a PGE task set up, which is triggered after
successful ingestion into the 'file manager'. The PGE task then posts the
file to Solr.

Everything works great, but I would like to get the most out of the available
resources. Currently, I am running this on a c3.8xlarge AWS EC2 instance,
which has 32 vCPUs. Since I have 2 million files, I have divided them
into 32 folders and am running 32 instances of 'crawler_launcher'.
When I monitor the system using 'htop', I don't see maximum CPU utilization. I
also notice in PCS Status via OPSUI that a number of files are queued. I
also tried setting org.apache.oodt.cas.workflow.engine.minPoolSize and
maxPoolSize to 32, as well as Solr's maxIndexingThreads to 32, but I think
there is still a bottleneck somewhere.
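
For illustration only, here is a minimal Python sketch of the kind of split
described above: file paths are assigned to 32 groups, one per crawler
instance. The helper name partition_round_robin and the round-robin
assignment are assumptions made for this example; they are not part of OODT
and may not match how the folders were actually populated.

```python
from typing import Dict, Iterable, List


def partition_round_robin(paths: Iterable[str], n_buckets: int) -> Dict[int, List[str]]:
    """Assign each path to one of n_buckets groups in round-robin order.

    Hypothetical helper for illustration; not an OODT API.
    """
    if n_buckets < 1:
        raise ValueError("n_buckets must be at least 1")
    # One (initially empty) group per crawler instance.
    buckets: Dict[int, List[str]] = {i: [] for i in range(n_buckets)}
    # Sorting makes the assignment deterministic across runs.
    for index, path in enumerate(sorted(paths)):
        buckets[index % n_buckets].append(path)
    return buckets


if __name__ == "__main__":
    # Example with made-up file names: 100 files spread over 32 groups.
    example = [f"file_{i:03d}.dat" for i in range(100)]
    groups = partition_round_robin(example, 32)
    sizes = [len(members) for members in groups.values()]
    print(min(sizes), max(sizes))
```

A round-robin split keeps the group sizes within one file of each other,
which only balances the crawlers if the files are of roughly similar size.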

Is there an option to set the number of threads used by the 'file manager'?
Any help would be appreciated.

Thanks,

Poojit Sharma.
