cTAKES is a Java project, so it should work “out of the box” with the Java 
Spark libraries.  If you’re not used to using Spark + Java, then I would not 
recommend starting with cTAKES.  I suggest you start by using cTAKES as a Maven 
dependency alongside the Spark Maven dependencies.

If you want to use pySpark, then you are in the business of using Java libs 
from Python, like in 
http://stackoverflow.com/questions/476968/using-a-java-library-from-python 
and there is nothing special about cTAKES.

cTAKES uses UIMA on the backend, and this can be extremely confusing to new 
users.  Maybe you should isolate your problems and tackle them one at a time:

1. Use Spark + Java libs
2. Use Python + Java libs
3. Learn cTAKES on its own turf.  Namely, Java
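
For step 1, here is a rough sketch of the shape such code usually takes, written in plain Java so it runs without Spark or cTAKES on the classpath. Everything in it is a stand-in I made up for illustration: `Annotator`, `buildEngine` and `annotatePartition` are hypothetical names, not part of the cTAKES, UIMA or Spark APIs. The idea it shows is building one expensive engine per partition and reusing it for every record, which is the shape of function that Spark's `JavaRDD.mapPartitions` accepts. Whether this is the right fit for a cTAKES pipeline is an assumption on my part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.function.Supplier;

// Hypothetical stand-in for a heavyweight engine (think of something like a
// UIMA analysis engine). It is NOT a cTAKES, UIMA or Spark type.
interface Annotator {
    String annotate(String note);
}

final class PerPartitionSketch {
    // Counts how many engines were constructed, to make the reuse visible.
    static int enginesBuilt = 0;

    // Stand-in for an expensive factory call (loading models, dictionaries).
    // Here the "annotation" is just trimming and upper-casing the text.
    static Annotator buildEngine() {
        enginesBuilt++;
        return note -> note.trim().toUpperCase(Locale.ROOT);
    }

    // Same shape as a per-partition map function: an iterator of records in,
    // an iterator of results out.
    static Iterator<String> annotatePartition(Iterator<String> notes,
                                              Supplier<Annotator> factory) {
        Annotator engine = factory.get(); // built once for the whole partition
        List<String> out = new ArrayList<>();
        while (notes.hasNext()) {
            out.add(engine.annotate(notes.next()));
        }
        return out.iterator();
    }

    public static void main(String[] args) {
        // One pretend partition holding two pretend clinical notes.
        List<String> partition = Arrays.asList(" chest pain ", "no fever");
        Iterator<String> results =
            annotatePartition(partition.iterator(), PerPartitionSketch::buildEngine);
        while (results.hasNext()) {
            System.out.println(results.next());
        }
        System.out.println("engines built: " + enginesBuilt);
    }
}
```

Once this shape makes sense on its own, swapping the stand-in for a real engine and the plain iterator for a real Spark partition is a separate, smaller problem.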

Apache projects notoriously have dependency problems, and Spark is no 
exception.  HA!  “Exception”-- I’m funny.  Anyway, don’t expect the two to play 
together nicely at first.

b

~~~~~
May All Your Sequences Converge

> On Aug 12, 2016, at 10:05 AM, Bandeep Singh <[email protected]> wrote:
> 
> Hi Team,
> 
> I am very new to cTAKES and just started learning how to use it.
> I am wondering how to use cTakes API with SPARk (pyspark preferably) for Big 
> data.
> Can somebody point me in the right direction.
> 
> Till now I downloaded cTakes jars and tried building it with SPARK, but it 
> threw me some resource allocation exception.
> 
> Any response will be highly appreciated.
> 
> Thanks,
> Bandeep
