Hello all,

I want to use Apex to execute R scripts, with the script parameters arriving as tuples. I have a few questions:

 * I presume that the R dependencies must be installed on all of the
   Hadoop nodes and that the R script must be placed on the classpath?
   The R script refers to a few R libraries as part of its code.
 * Is it fair to say that YARN container allocation does not apply
   exactly, because the ScriptOperator (named Rscript in Malhar) uses
   the REngine, which is present locally as a binary? This matters
   especially if the R script itself uses parallelism in its code. I
   am asking in order to plan the resources required for such an
   implementation.
 * Is there good documentation, or a pointer to best practices, for
   developing applications that use the ScriptOperator or equivalent
   constructs, where external code might be executed?

Regards,

Ananth
