Hi Ananth,

Please see my answers inline.

Regards,
Sandeep

On Thu, Oct 27, 2016 at 11:52 PM, ananth <ananthg.a...@gmail.com> wrote:

> Hello all,
>
> I want to use Apex for executing R scripts wherein the parameters for the
> script are coming in as tuples. In this regard, I have a few questions:
>
>  * I am presuming that the R dependencies are to be installed on all of
>    the Hadoop nodes and the R script is to be put in the classpath?
>    The R script will be referring to a few R libraries as part of its code.
>

[Sandeep] Yes, all R dependencies, including the R libraries, should be
installed on all Hadoop nodes, and the R script should be on the classpath.
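
As a minimal sketch of the classpath part only: the snippet below is plain
Java, not the Malhar operator API, and the resource name r/myscript.R is a
hypothetical placeholder. It checks whether a script is visible on the
classpath, which can help catch a packaging mistake before the operator
tries to load the script.

```java
import java.io.IOException;
import java.io.InputStream;

public class ScriptOnClasspath {

    // Returns true if the named resource can be opened from the classpath.
    public static boolean isOnClasspath(String resourceName) {
        ClassLoader loader = ScriptOnClasspath.class.getClassLoader();
        // getResourceAsStream returns null when the resource is not found;
        // try-with-resources skips closing a null resource.
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            return in != null;
        } catch (IOException e) {
            // Closing the stream failed; treat the script as unusable.
            return false;
        }
    }

    public static void main(String[] args) {
        // "r/myscript.R" is a hypothetical name used for illustration.
        String script = args.length > 0 ? args[0] : "r/myscript.R";
        System.out.println(script + " on classpath: " + isOnClasspath(script));
    }
}
```

This only confirms that the script file is packaged correctly. It says
nothing about whether R itself or the required R libraries are installed on
a given node.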


>  * Is it fair to say that the YARN container allocation does not
>    work exactly, as the ScriptOperator (named Rscript in Malhar)
>    uses the REngine, which is present locally as a binary? Especially
>    if the R script itself uses parallelism in its code, etc. I
>    am asking this to plan out the resources required for such an
>    implementation.
>

[Sandeep] I might be wrong here, but I think Rscript would run inside
the YARN container.


>  * Is there any good documentation or pointer for best practices to be
>    followed when developing applications that use ScriptOperator-equivalent
>    constructs, wherein external code might be executed?
>

[Sandeep] As far as I know, there isn't any documentation as of now.


> Regards,
>
> Ananth
>
>
