Hi Steve,

Thanks for the idea! I have seen ‘Docker containers’ mentioned a few times in my reading so far, but I’m unfamiliar with the concept. I’ll look into it further.

Leah

From: Steve Evans <stephen.ev...@duke.edu>
Reply-To: "user@ctakes.apache.org" <user@ctakes.apache.org>
Date: Tuesday, January 29, 2019 at 12:22 PM
To: "user@ctakes.apache.org" <user@ctakes.apache.org>
Subject: [EXTERNAL] RE: Processing large batches of files in cTAKES

Leah,

I run my cTAKES workload using Docker containers. I have built a container that serves cTAKES requests via Tomcat web services. That’s not for the faint of heart and not for non-programmer types. But you might be able to install the cTAKES software in a container with the input/output directories on the host, and then run several containers in parallel using file input/output.

I run 10 containers to get the throughput we need (5/second). This is on a 16-CPU, 64 GB host (each container consumes about 2 GB of RAM).

Not a slam-dunk answer, but I thought it might help generate ideas.

Steve

From: Baas,Leah <leah.b...@sanfordhealth.org>
Sent: Tuesday, January 29, 2019 12:59 PM
To: user@ctakes.apache.org
Subject: Processing large batches of files in cTAKES

Hi all,

I would like to process a batch of 13,414 files (avg file size 6.2 KB) using the default clinical pipeline. I am new to cTAKES and computer programming, and I’m looking for guidance on how to process these files with maximum time/CPU efficiency. I am currently running my program on an Ubuntu VM with 3 CPUs. It takes me 28 seconds (real time) to process one 6.0 KB file. I’m reading up on parallel processing strategies, but would be grateful for any suggestions, tips, etc. that you might have!

Thanks,
Leah

-----------------------------------------------------------------------
Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain privileged and confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
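
For anyone following the thread, here is a rough Python sketch of the file-based parallel approach Steve describes: split the input files into one directory per worker, so that each container (or process) can read its own directory independently. The function names `shard_files` and `estimate_hours` and the `shard_NN` directory layout are made up for illustration and are not part of cTAKES or Docker. How each shard is then fed to cTAKES is deliberately left out. The time estimate simply multiplies the numbers from the original question and assumes the 28 seconds per file stays constant, which may not hold in practice.

```python
import shutil
import tempfile
from pathlib import Path


def shard_files(input_dir, work_dir, n_shards):
    """Copy files from input_dir into n_shards subdirectories of work_dir.

    Files are assigned round-robin, so shard sizes differ by at most one file.
    The shard_NN naming is an illustrative choice, not a cTAKES convention.
    """
    files = sorted(p for p in Path(input_dir).iterdir() if p.is_file())
    shard_dirs = [Path(work_dir) / f"shard_{i:02d}" for i in range(n_shards)]
    for shard_dir in shard_dirs:
        shard_dir.mkdir(parents=True, exist_ok=True)
    for index, src in enumerate(files):
        shutil.copy2(src, shard_dirs[index % n_shards] / src.name)
    return shard_dirs


def estimate_hours(n_files, seconds_per_file, n_workers=1):
    """Naive wall-clock estimate, assuming perfect scaling across workers."""
    return n_files * seconds_per_file / n_workers / 3600


if __name__ == "__main__":
    # Figures from the original question: 13,414 files at 28 seconds each.
    print(f"1 worker:  {estimate_hours(13414, 28):.1f} hours")
    print(f"3 workers: {estimate_hours(13414, 28, 3):.1f} hours")

    # Small self-contained demo using temporary directories.
    with tempfile.TemporaryDirectory() as tmp:
        notes = Path(tmp) / "notes"
        notes.mkdir()
        for i in range(7):
            (notes / f"note_{i}.txt").write_text("example text")
        for shard in shard_files(notes, Path(tmp) / "work", 3):
            print(shard.name, len(list(shard.iterdir())))
```

Each shard directory could then be mounted into its own container as the input directory, with a matching output directory on the host, as Steve suggests. Copying keeps the original files untouched; moving or symlinking would avoid the extra disk space.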