Tom Howland created SPARK-34033:
-----------------------------------

             Summary: SparkR Daemon Initialization
                 Key: SPARK-34033
                 URL: https://issues.apache.org/jira/browse/SPARK-34033
             Project: Spark
          Issue Type: Improvement
          Components: R, SparkR
    Affects Versions: 3.2.0
         Environment: Tested on CentOS 7 with Spark 2.3.1, and on my Mac with Spark at master
            Reporter: Tom Howland


Provide a way for users to initialize the SparkR daemon before it forks.

Described in detail in 
[docs/sparkr.md|https://github.com/WamBamBoozle/spark/blob/daemon_init/docs/sparkr.md#daemon-initialization]

I'm a contractor to Target, where we have several projects doing ML with 
SparkR. The changes proposed here result in weeks of compute time saved with 
every run.

(40000 partitions) * (5 seconds to load our R libraries) * (2 calls to gapply 
in our app) / 60 / 60 = 111 hours.
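
For anyone who wants to check the estimate, here is a minimal Python sketch of the same arithmetic. The variable names are illustrative only. It assumes, as the formula above does, that every partition pays the full library load cost once per gapply call, so the result is aggregate compute time rather than wall-clock time.

```python
# Figures taken from the estimate above; names are illustrative.
partitions = 40_000      # number of partitions
load_seconds = 5         # seconds to load the R libraries in each worker
gapply_calls = 2         # calls to gapply in the app

# Assumption: each partition reloads the libraries once per gapply call.
total_seconds = partitions * load_seconds * gapply_calls
total_hours = total_seconds / 60 / 60

print(f"{total_hours:.0f} hours")  # prints "111 hours"
```

Changing any of the three inputs shows how the overhead scales linearly with partition count, load time, and number of gapply calls.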


