[
https://issues.apache.org/jira/browse/SPARK-26498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-26498:
----------------------------------
Affects Version/s: 3.0.0 (was: 2.4.0)
> Integrate barrier execution with MMLSpark's LightGBM
> ----------------------------------------------------
>
> Key: SPARK-26498
> URL: https://issues.apache.org/jira/browse/SPARK-26498
> Project: Spark
> Issue Type: New Feature
> Components: ML, MLlib
> Affects Versions: 3.0.0
> Reporter: Ilya Matiach
> Priority: Major
>
> I would like to use the new barrier execution mode introduced in Spark 2.4
> with LightGBM in the Spark package MMLSpark, but I ran into some issues.
> Currently, the LightGBM distributed learner tries to figure out the number of
> cores on the cluster, then does a coalesce and a mapPartitions; inside the
> mapPartitions we do a NetworkInit (where the address:port of every worker
> must be passed to the constructor) and pass the data in memory to the
> native layer of the distributed LightGBM learner.
> With barrier execution mode, I think the code would become much more robust.
> However, I am running into several issues when trying to move my code over
> to the new barrier execution mode scheduler:
> 1. It does not support dynamic allocation. However, I think it would be
> convenient if it restarted the job when the number of workers has decreased,
> and allowed the developer to decide whether to restart the job when the
> number of workers has increased.
> 2. It does not work with the DataFrame or Dataset API, but I think it would
> be much more convenient if it did.
> 3. How does barrier execution mode deal with #partitions > #tasks? If the
> number of partitions is larger than the number of "tasks" or workers, can
> barrier execution mode automatically coalesce the dataset so that
> #partitions == #tasks?
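
For reference, a minimal sketch of what the migration might look like with Spark 2.4's barrier API (`RDD.barrier()` and `BarrierTaskContext`). `NetworkInit` and `trainPartition` below are illustrative placeholders for MMLSpark's native-layer calls, and `lightGBMPort` is an assumed parameter; only the Spark calls are real API:

```scala
import org.apache.spark.BarrierTaskContext
import org.apache.spark.rdd.RDD

// Sketch: distributed training under barrier execution mode.
// All tasks in a barrier stage launch together, so the address list
// needed by LightGBM's NetworkInit can be built inside the stage
// instead of being pre-computed on the driver.
def trainDistributed(data: RDD[Array[Double]], lightGBMPort: Int): Unit = {
  data.barrier().mapPartitions { rows =>
    val ctx = BarrierTaskContext.get()
    // getTaskInfos() returns info for every task in this barrier stage,
    // including each executor's address.
    val workers = ctx.getTaskInfos().map(info => s"${info.address}:$lightGBMPort")
    ctx.barrier() // global sync: wait until all workers are up
    // NetworkInit(workers.mkString(","))  // placeholder: init native LightGBM ring
    // trainPartition(rows)                // placeholder: feed data to native layer
    Iterator.empty
  }.count() // an action is needed to trigger the barrier stage
}
```

Note that this sketch still assumes the RDD has been repartitioned so that the number of partitions equals the number of simultaneously schedulable tasks, since a barrier stage fails fast if all its tasks cannot be launched at once.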
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)