[jira] [Commented] (MESOS-700) more efficient distribution of frameworks via HDFS

Du Li (JIRA) Thu, 19 Sep 2013 16:00:44 -0700

    [ https://issues.apache.org/jira/browse/MESOS-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772429#comment-13772429 ]


Du Li commented on MESOS-700:
-----------------------------

I browsed the related parts of the Mesos source code (e.g., launcher.cpp) but
didn't dig too deeply into it. However, in the slave logs, I did observe that
there was one Spark tgz file under each slaves/ZZZ/frameworks/XXX/.../runs/YYY/
along with its unzipped directory. For each Spark job instance I created, there
was one XXX, each in turn having multiple YYYs, depending on the number of
tasks assigned to this slave. Certainly, the tgz could be put under ZZZ (or
XXX) so that it isn't downloaded and unzipped again for each task.
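
As a rough illustration of the layout change suggested above, here is a
minimal Python sketch (not Mesos code; Mesos itself is written in C++). It
extracts the tgz once into a shared directory under frameworks/XXX and
symlinks that directory into each runs/YYY. The function name, the "package"
directory name, and the use of symlinks are assumptions made for illustration,
and concurrent task launches are not handled.

```python
import os
import tarfile


def ensure_shared_package(framework_dir, run_dir, tgz_path):
    """Extract tgz_path once under framework_dir and link it into run_dir.

    Returns True if the archive was extracted by this call, or False if an
    existing shared copy was reused.
    """
    # Hypothetical shared location, e.g. slaves/ZZZ/frameworks/XXX/package
    shared = os.path.join(framework_dir, "package")
    extracted = False
    if not os.path.isdir(shared):
        # Extract into a temporary directory first, then rename, so that a
        # half-extracted tree is never mistaken for a complete one.
        tmp = shared + ".tmp"
        with tarfile.open(tgz_path, "r:gz") as tar:
            tar.extractall(tmp)
        os.rename(tmp, shared)
        extracted = True

    # Each run directory (runs/YYY) only gets a symlink to the shared copy.
    os.makedirs(run_dir, exist_ok=True)
    link = os.path.join(run_dir, "package")
    if not os.path.lexists(link):
        os.symlink(shared, link)
    return extracted
```

With this layout, only the first task of a framework on a given slave pays
the extraction cost; later tasks only create a symlink.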
                
> more efficient distribution of frameworks via HDFS
> --------------------------------------------------
>
>                 Key: MESOS-700
>                 URL: https://issues.apache.org/jira/browse/MESOS-700
>             Project: Mesos
>          Issue Type: Improvement
>          Components: framework
>    Affects Versions: 0.13.0, 0.14.0, 0.15.0
>         Environment: general
>            Reporter: Du Li
>             Fix For: 0.13.0, 0.14.0, 0.15.0
>
>
> I was exploring the latest code (0.15.0) at https://github.com/apache/mesos
> to test the tgz distribution of frameworks. Take Spark for example. I created
> a tgz of the Spark binary and put it on HDFS. After a job is submitted, it is
> decomposed into many tasks. For each task, the assigned Mesos slave downloads
> the tgz from HDFS, unzips it, and executes a script to launch the task.
> This seems wasteful and unnecessary.
> Does the following suggestion make sense? When a Spark job is submitted, the
> Spark/Mesos master calculates a checksum (or something similar) for the tgz
> distribution. The checksum is then sent to the slaves when tasks are
> assigned. If the same file has already been downloaded and unzipped, a slave
> launches the task directly. This way the tgz is processed at most once per
> slave for each job (which may have thousands of tasks). The aggregate savings
> would be tremendous.
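
The checksum idea in the description could look roughly like the following
Python sketch. It is only an illustration under stated assumptions, not how
Mesos implements fetching: the PackageCache class, its cache_root layout, the
choice of SHA-256, and the download callback (which stands in for the copy
from HDFS) are all hypothetical.

```python
import hashlib
import os
import tarfile


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


class PackageCache:
    """Slave-side cache of extracted packages, keyed by checksum."""

    def __init__(self, cache_root):
        self.cache_root = cache_root
        self.downloads = 0  # number of times the tgz was actually fetched

    def get(self, checksum, download):
        """Return the extracted directory for checksum.

        download is a callable returning a local path to the tgz; it is
        only invoked when the checksum has not been seen on this slave.
        """
        target = os.path.join(self.cache_root, checksum)
        if os.path.isdir(target):
            return target  # already downloaded and extracted
        os.makedirs(self.cache_root, exist_ok=True)
        tgz = download()
        self.downloads += 1
        # Verify that the fetched file matches what the master announced.
        if file_sha256(tgz) != checksum:
            raise ValueError("checksum mismatch for " + tgz)
        # Extract to a temporary name, then rename into place.
        tmp = target + ".tmp"
        with tarfile.open(tgz, "r:gz") as tar:
            tar.extractall(tmp)
        os.rename(tmp, target)
        return target
```

In this sketch the master would compute file_sha256 once per job and send the
digest along with each task; every task on a slave then calls get with the
same digest, so only the first call triggers a download and extraction.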

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
