Idan Zalzberg created SPARK-1526:
------------------------------------

             Summary: Running spark driver program from my local machine
                 Key: SPARK-1526
                 URL: https://issues.apache.org/jira/browse/SPARK-1526
             Project: Spark
          Issue Type: Wish
          Components: Spark Core
            Reporter: Idan Zalzberg


Currently it seems the design choice is that the driver program should be 
network-close to the workers, with connections allowed to be created from 
either side.

This makes using Spark somewhat harder: when I develop locally, I need to 
package not only my program but also all its local dependencies.
Let's say I have a local DB holding the names of files in Hadoop that I want to 
process with Spark; now my local DB must be accessible from the cluster so 
that the file names can be fetched at runtime.
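To make the scenario concrete, here is a minimal sketch (all DB names, tables, and URLs are hypothetical) of what I'd like to be able to do: the driver queries the local DB for the file names, and only those plain strings are shipped to the cluster, so the workers never need to reach the DB themselves.

```scala
import java.sql.DriverManager
import org.apache.spark.{SparkConf, SparkContext}

object LocalDbDriver {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("local-db-driver"))

    // This JDBC query runs only on the driver; the DB stays on my machine.
    val conn = DriverManager.getConnection("jdbc:postgresql://localhost/files")
    val rs = conn.createStatement().executeQuery("SELECT path FROM hadoop_files")
    val paths = scala.collection.mutable.ListBuffer[String]()
    while (rs.next()) paths += rs.getString("path")
    conn.close()

    // Only the file-name strings are serialized out to the executors.
    val total = sc.union(paths.toList.map(p => sc.textFile(p))).count()
    println(s"total lines: $total")
  }
}
```

But for this to work today, the driver itself has to sit network-close to the cluster, which is exactly the constraint this issue is about.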

The driver program is an awesome thing, but it loses some of its strength if 
you can't really run it anywhere.

It seems to me that the problem is the DAGScheduler, which needs to be close 
to the workers; maybe it shouldn't be embedded in the driver, then?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
