Can you use JNI to call the C++ functionality directly from Java?
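
For example, a minimal JNI sketch (the library name "processor" and the
method are hypothetical; the C++ side would implement it via the generated
JNI header, and the shared library must be installed on every executor node):

public class NativeProcessor {
    static {
        // Loads libprocessor.so (or processor.dll) from java.library.path.
        System.loadLibrary("processor");
    }

    // Declared in Java, implemented in C++.
    public static native byte[] process(byte[] input);
}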

Or could you wrap this in a MapReduce step outside Spark and use Hadoop
Streaming (which allows you to use shell scripts as mapper and reducer)?
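
The streaming invocation could look roughly like this (all paths, and the
run_tool.sh wrapper that would copy stdin to a local file, call the C++
binary, and print its output, are made up for illustration):

hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -files run_tool.sh,my_cpp_tool \
  -input /user/me/input \
  -output /user/me/output \
  -mapper run_tool.sh \
  -reducer /bin/cat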

You can also write temporary files for each partition and execute the software 
within a map step.
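
A rough sketch of that approach in Java (the binary path
/opt/tool/my_cpp_tool and its calling convention of input file then output
file are assumptions):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class ExternalToolExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("external-tool-demo");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> records = sc.parallelize(Arrays.asList("a", "b", "c"), 2);

            JavaRDD<String> results = records.mapPartitions(it -> {
                // One temp input/output pair per partition on the worker's local disk.
                Path in = Files.createTempFile("tool-in-", ".txt");
                Path out = Files.createTempFile("tool-out-", ".txt");
                try {
                    // Write this partition's records to the local input file.
                    try (BufferedWriter w = Files.newBufferedWriter(in)) {
                        while (it.hasNext()) { w.write(it.next()); w.newLine(); }
                    }
                    // Run the external C++ tool (hypothetical path); it reads
                    // the first file and writes its result to the second.
                    Process p = new ProcessBuilder("/opt/tool/my_cpp_tool",
                            in.toString(), out.toString())
                            .inheritIO()   // tool's stderr ends up in the executor logs
                            .start();
                    int code = p.waitFor();
                    if (code != 0) throw new IOException("tool exited with " + code);
                    // Materialize the output before the files are deleted.
                    return Files.readAllLines(out).iterator();
                } finally {
                    Files.deleteIfExists(in);
                    Files.deleteIfExists(out);
                }
            });

            results.collect().forEach(System.out::println);
        }
    }
}

mapPartitions keeps it to one external process per partition instead of one
per record, and the finally block addresses the cleanup concern: the temp
files live on the worker's local disk and can be removed as soon as the
output has been read back.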

Generally, though, you should not call external applications from Spark.

> On 11.11.2018 at 23:13, Steve Lewis <lordjoe2...@gmail.com> wrote:
> 
> I have a problem where a critical step needs to be performed by a third-party 
> C++ application. I can send or install this program on the worker nodes. I 
> can construct a function holding all the data this program needs to process. 
> The problem is that the program is designed to read and write from the local 
> file system. I can call the program from Java and read its output as a local 
> file, then delete all temporary files, but I doubt that it is possible to get 
> the program to read from HDFS or any shared file system. 
> My question is: can a function running on a worker node create temporary 
> files and pass the names of these to a local process, assuming everything is 
> cleaned up after the call?
> 
> -- 
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
> 
