[jira] [Created] (SPARK-29148) Modify dynamic allocation manager for stage level scheduling

2019-09-18 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29148:
-

 Summary: Modify dynamic allocation manager for stage level 
scheduling
 Key: SPARK-29148
 URL: https://issues.apache.org/jira/browse/SPARK-29148
 Project: Spark
  Issue Type: Bug
  Components: Scheduler
Affects Versions: 3.0.0
Reporter: Thomas Graves


To support stage level scheduling, the dynamic allocation manager has to track 
the usage and need of executors per ResourceProfile.
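
A minimal, hedged sketch (illustrative only, not Spark code; class and method 
names are made up) of the kind of per-ResourceProfile bookkeeping this implies:

{code}
// Illustrative sketch only: track running executors and outstanding tasks keyed
// by ResourceProfile id, so a separate executor target can be computed per profile.
import scala.collection.mutable

class PerProfileExecutorTracker {
  private val running = mutable.Map.empty[Int, Int].withDefaultValue(0)       // rpId -> running executors
  private val pendingTasks = mutable.Map.empty[Int, Int].withDefaultValue(0)  // rpId -> tasks waiting on that profile

  def executorAdded(rpId: Int): Unit = running(rpId) += 1
  def executorRemoved(rpId: Int): Unit = running(rpId) = math.max(0, running(rpId) - 1)
  def taskSubmitted(rpId: Int): Unit = pendingTasks(rpId) += 1
  def taskFinished(rpId: Int): Unit = pendingTasks(rpId) = math.max(0, pendingTasks(rpId) - 1)

  // Executors needed for one profile, given how many tasks fit on an executor of that profile.
  def neededExecutors(rpId: Int, tasksPerExecutor: Int): Int =
    math.ceil(pendingTasks(rpId).toDouble / tasksPerExecutor).toInt

  def runningExecutors(rpId: Int): Int = running(rpId)
}
{code}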

We will have to figure out what to do with the metrics.






[jira] [Commented] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-09-16 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930600#comment-16930600
 ] 

Thomas Graves commented on SPARK-27495:
---

The SPIP vote passed - 
https://mail-archives.apache.org/mod_mbox/spark-dev/201909.mbox/%3ccactgibw44ggts5apnlkuqsurnwqdejeadeiwfvdwl7ic3eh...@mail.gmail.com%3e

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job. 
> Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
> extra Resources (GPU/FPGA/etc). It has the potential to allow for other 
> things like limiting the number of tasks per stage, specifying other 
> parameters for things like shuffle, etc. Initially I would propose we only 
> support resources as they are now. So Task resources would be cpu and other 
> resources (GPU, FPGA), that way we aren't adding in extra scheduling things 
> at this point.  Executor resources would be cpu, memory, and extra 
> resources(GPU,FPGA, etc). Changing the executor resources will rely on 
> dynamic allocation being enabled.
> Main use cases:
>  # ML use case where user does ETL and feeds it into an ML algorithm where 
> it’s using the RDD API. This should work with barrier scheduling as well once 
> it supports dynamic allocation.
>  # This adds the framework/api for Spark's own internal use.  In the future 
> (not covered by this SPIP), Catalyst could control the stage level resources 
> as it finds the need to change it between stages for different optimizations. 
> For instance, with the new columnar plugin to the query planner we can insert 
> stages into the plan that would change running something on the CPU in row 
> format to running it on the GPU in columnar format. This API would allow the 
> planner to make sure the stages that run on the GPU get the corresponding GPU 
> resources it needs to run. Another possible use case for catalyst is that it 
> would allow catalyst to add in more optimizations to where the user doesn’t 
> need to configure container sizes at all. If the optimizer/planner can handle 
> that for the user, everyone wins.
> This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I 
> think the DataSet API will require more changes because it specifically hides 
> the RDD from the users via the plans and catalyst can optimize the plan and 
> insert things into the plan. The only way I’ve found to make this work with 
> the Dataset API would be modifying all the plans to be able to get the 
> resource requirements down into where it creates the RDDs, which I believe 
> would be a lot of change.  If other people know better options, it would be 
> great to hear them.
> *Q2.* What problem is this proposal NOT designed to solve?
> The initial implementation is not going to add Dataset APIs.
> We are starting with allowing users to specify a specific set of 
> task/executor resources and plan to design it to be extendable, but the first 
> implementation will not support changing generic SparkConf configs and only 
> specific limited resources.
> This initial version will have a programmatic API for specifying the resource 
> requirements per stage; we can add the ability to have profiles in the configs 
> later if it's useful.
> *Q3.* How is it done today, and what are the limits of current practice?
> Currently this is either done by having multiple spark jobs or requesting 
> containers with the max resources needed for any part of the job.  To

[jira] [Resolved] (SPARK-27492) GPU scheduling - High level user documentation

2019-09-11 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27492.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> GPU scheduling - High level user documentation
> --
>
> Key: SPARK-27492
> URL: https://issues.apache.org/jira/browse/SPARK-27492
> Project: Spark
>  Issue Type: Story
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> For the SPIP - Accelerator-aware task scheduling for Spark, 
> https://issues.apache.org/jira/browse/SPARK-24615 Add some high level user 
> documentation about how this feature works together and point to things like 
> the example discovery script, etc.
>  
>  - make sure to document the discovery script and what permissions are needed 
> and any security implications
>  - Document standalone - local-cluster mode limitation of only a single 
> resource file or discovery script so you have to have coordination on for it 
> to work right.






[jira] [Commented] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-09-11 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927558#comment-16927558
 ] 

Thomas Graves commented on SPARK-27495:
---

[~felixcheung]  [~jiangxb1987]  I put this up for vote on the dev mailing list. 
 Could you please take a look and comment there?

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job. 
> Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
> extra Resources (GPU/FPGA/etc). It has the potential to allow for other 
> things like limiting the number of tasks per stage, specifying other 
> parameters for things like shuffle, etc. Initially I would propose we only 
> support resources as they are now. So Task resources would be cpu and other 
> resources (GPU, FPGA), that way we aren't adding in extra scheduling things 
> at this point.  Executor resources would be cpu, memory, and extra 
> resources(GPU,FPGA, etc). Changing the executor resources will rely on 
> dynamic allocation being enabled.
> Main use cases:
>  # ML use case where user does ETL and feeds it into an ML algorithm where 
> it’s using the RDD API. This should work with barrier scheduling as well once 
> it supports dynamic allocation.
>  # This adds the framework/api for Spark's own internal use.  In the future 
> (not covered by this SPIP), Catalyst could control the stage level resources 
> as it finds the need to change it between stages for different optimizations. 
> For instance, with the new columnar plugin to the query planner we can insert 
> stages into the plan that would change running something on the CPU in row 
> format to running it on the GPU in columnar format. This API would allow the 
> planner to make sure the stages that run on the GPU get the corresponding GPU 
> resources it needs to run. Another possible use case for catalyst is that it 
> would allow catalyst to add in more optimizations to where the user doesn’t 
> need to configure container sizes at all. If the optimizer/planner can handle 
> that for the user, everyone wins.
> This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I 
> think the DataSet API will require more changes because it specifically hides 
> the RDD from the users via the plans and catalyst can optimize the plan and 
> insert things into the plan. The only way I’ve found to make this work with 
> the Dataset API would be modifying all the plans to be able to get the 
> resource requirements down into where it creates the RDDs, which I believe 
> would be a lot of change.  If other people know better options, it would be 
> great to hear them.
> *Q2.* What problem is this proposal NOT designed to solve?
> The initial implementation is not going to add Dataset APIs.
> We are starting with allowing users to specify a specific set of 
> task/executor resources and plan to design it to be extendable, but the first 
> implementation will not support changing generic SparkConf configs and only 
> specific limited resources.
> This initial version will have a programmatic API for specifying the resource 
> requirements per stage; we can add the ability to have profiles in the configs 
> later if it's useful.
> *Q3.* How is it done today, and what are the limits of current practice?
> Currently this is either done by having multiple spark jobs or requesting 
> containers with the max resources needed for any part of the job.  To do this 
> today, you can brea

[jira] [Commented] (SPARK-27492) GPU scheduling - High level user documentation

2019-09-05 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923528#comment-16923528
 ] 

Thomas Graves commented on SPARK-27492:
---

I think we talked about this a bit in the PRs for the main changes. I think it's 
better to leave it as "resource" since cpu/memory could really be rolled into 
this; they are just special because they are pre-existing. This could also be 
extended to any resource type that wouldn't necessarily be an accelerator. Also, 
if stage scheduling (https://issues.apache.org/jira/browse/SPARK-27495) gets 
approved, everything will be under a single API there and just be referred to as 
a resource.

> GPU scheduling - High level user documentation
> --
>
> Key: SPARK-27492
> URL: https://issues.apache.org/jira/browse/SPARK-27492
> Project: Spark
>  Issue Type: Story
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> For the SPIP - Accelerator-aware task scheduling for Spark, 
> https://issues.apache.org/jira/browse/SPARK-24615 Add some high level user 
> documentation about how this feature works together and point to things like 
> the example discovery script, etc.
>  
>  - make sure to document the discovery script and what permissions are needed 
> and any security implications
>  - Document standalone - local-cluster mode limitation of only a single 
> resource file or discovery script so you have to have coordination on for it 
> to work right.






[jira] [Assigned] (SPARK-27492) GPU scheduling - High level user documentation

2019-09-04 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27492:
-

Assignee: Thomas Graves

> GPU scheduling - High level user documentation
> --
>
> Key: SPARK-27492
> URL: https://issues.apache.org/jira/browse/SPARK-27492
> Project: Spark
>  Issue Type: Story
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> For the SPIP - Accelerator-aware task scheduling for Spark, 
> https://issues.apache.org/jira/browse/SPARK-24615 Add some high level user 
> documentation about how this feature works together and point to things like 
> the example discovery script, etc.
>  
>  - make sure to document the discovery script and what permissions are needed 
> and any security implications
>  - Document standalone - local-cluster mode limitation of only a single 
> resource file or discovery script so you have to have coordination on for it 
> to work right.






[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-09-04 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Description: 
*Q1.* What are you trying to do? Articulate your objectives using absolutely no 
jargon.

Objectives:
 # Allow users to specify task and executor resource requirements at the stage 
level. 
 # Spark will use the stage level requirements to acquire the necessary 
resources/executors and schedule tasks based on the per stage requirements.

Many times users have different resource requirements for different stages of 
their application so they want to be able to configure resources at the stage 
level. For instance, you have a single job that has 2 stages. The first stage 
does some  ETL which requires a lot of tasks, each with a small amount of 
memory and 1 core each. Then you have a second stage where you feed that ETL 
data into an ML algorithm. The second stage only requires a few executors but 
each executor needs a lot of memory, GPUs, and many cores.  This feature allows 
the user to specify the task and executor resource requirements for the ETL 
Stage and then change them for the ML stage of the job. 
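
As a concrete illustration of objective #1, a hedged sketch using the 
ResourceProfile API Spark later added for this; it is shown only to make the 
ETL-then-ML example above tangible and is not part of this SPIP's text. The 
input path is a placeholder, the mapPartitions body is a stand-in for the ML 
computation, and sc is assumed to be the SparkContext (e.g., in spark-shell):

{code}
import org.apache.spark.resource.{ExecutorResourceRequests, ResourceProfileBuilder, TaskResourceRequests}

// Stage 1: ETL with many small tasks on the default (small) executors.
val etl = sc.textFile("hdfs:///placeholder/input")
  .flatMap(_.split(" "))
  .map((_, 1))
  .reduceByKey(_ + _)

// Stage 2: a few large executors with lots of memory, many cores, and GPUs.
val execReqs = new ExecutorResourceRequests()
  .cores(16)
  .memory("48g")
  .resource("gpu", 4, "/opt/spark/getGpusResources.sh")
val taskReqs = new TaskResourceRequests().cpus(4).resource("gpu", 1)
val mlProfile = new ResourceProfileBuilder().require(execReqs).require(taskReqs).build()

// Only this stage runs with the GPU profile; dynamic allocation acquires the executors.
val mlStage = etl.withResources(mlProfile).mapPartitions(iter => iter.map { case (k, v) => (k, v * 2) })
mlStage.count()
{code}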

Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
extra resources (GPU/FPGA/etc). It has the potential to allow for other things 
like limiting the number of tasks per stage, specifying other parameters for 
things like shuffle, etc. Initially I would propose we only support resources 
as they are now, so task resources would be cpu and other resources (GPU, 
FPGA); that way we aren't adding in extra scheduling concerns at this point. 
Executor resources would be cpu, memory, and extra resources (GPU, FPGA, etc.). 
Changing the executor resources will rely on dynamic allocation being enabled.

Main use cases:
 # ML use case where user does ETL and feeds it into an ML algorithm where it’s 
using the RDD API. This should work with barrier scheduling as well once it 
supports dynamic allocation.
 # This adds the framework/api for Spark's own internal use.  In the future 
(not covered by this SPIP), Catalyst could control the stage level resources as 
it finds the need to change it between stages for different optimizations. For 
instance, with the new columnar plugin to the query planner we can insert 
stages into the plan that would change running something on the CPU in row 
format to running it on the GPU in columnar format. This API would allow the 
planner to make sure the stages that run on the GPU get the corresponding GPU 
resources it needs to run. Another possible use case for catalyst is that it 
would allow catalyst to add in more optimizations to where the user doesn’t 
need to configure container sizes at all. If the optimizer/planner can handle 
that for the user, everyone wins.

This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I think 
the DataSet API will require more changes because it specifically hides the RDD 
from the users via the plans and catalyst can optimize the plan and insert 
things into the plan. The only way I’ve found to make this work with the 
Dataset API would be modifying all the plans to be able to get the resource 
requirements down into where it creates the RDDs, which I believe would be a 
lot of change.  If other people know better options, it would be great to hear 
them.

*Q2.* What problem is this proposal NOT designed to solve?

The initial implementation is not going to add Dataset APIs.

We are starting with allowing users to specify a specific set of task/executor 
resources and plan to design it to be extendable, but the first implementation 
will not support changing generic SparkConf configs and only specific limited 
resources.

This initial version will have a programmatic API for specifying the resource 
requirements per stage; we can add the ability to have profiles in the configs 
later if it's useful.

*Q3.* How is it done today, and what are the limits of current practice?

Currently this is either done by having multiple spark jobs or requesting 
containers with the max resources needed for any part of the job.  To do this 
today, you can break it into separate jobs where each job requests the 
corresponding resources needed, but then you have to write the data out 
somewhere and then read it back in between jobs. This is going to take longer 
as well as require coordination between those jobs to make sure everything 
works smoothly. Another option would be to request executors with your largest 
need up front and potentially waste those resources when they aren't being 
used, which in turn wastes money. For instance, for an ML application where it 
does ETL first, many times people request containers with GPUs and the GPUs sit 
idle while the ETL is happening. This is wasting those GPU resources and in 
turn money because those GPUs could have been used by other applicatio

[jira] [Resolved] (SPARK-28577) Ensure executorMemoryOverhead requested value not less than MEMORY_OFFHEAP_SIZE when MEMORY_OFFHEAP_ENABLED is true

2019-09-04 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-28577.
---
Fix Version/s: 3.0.0
 Assignee: Yang Jie
   Resolution: Fixed

> Ensure executorMemoryOverhead requested value not less than MEMORY_OFFHEAP_SIZE 
> when MEMORY_OFFHEAP_ENABLED is true
> ---
>
> Key: SPARK-28577
> URL: https://issues.apache.org/jira/browse/SPARK-28577
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: release-notes
> Fix For: 3.0.0
>
>
> If MEMORY_OFFHEAP_ENABLED is true, we should ensure executorMemoryOverhead is 
> not less than MEMORY_OFFHEAP_SIZE; otherwise, the memory requested for the 
> executor may not be enough.






[jira] [Resolved] (SPARK-28960) maven 3.6.1 gone from apache maven repo

2019-09-03 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-28960.
---
Resolution: Duplicate

> maven 3.6.1 gone from apache maven repo
> ---
>
> Key: SPARK-28960
> URL: https://issues.apache.org/jira/browse/SPARK-28960
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> It looks like maven 3.6.1 is missing from the apache maven repo:
> [http://apache.claz.org/maven/maven-3/]
> This is causing PR build failures:
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110045/console]
>  
> exec: curl -s -L 
> [https://www.apache.org/dyn/closer.lua?action=download&filename=/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz]
> gzip: stdin: not in gzip format
> tar: Child returned status 1
> tar: Error is not recoverable: exiting now
> Using `mvn` from path: 
> /home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn
> build/mvn: line 163: 
> /home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn:
>  No such file or directory
> Error while getting version string from Maven:






[jira] [Updated] (SPARK-28961) Upgrade Maven to 3.6.2

2019-09-03 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-28961:
--
Description: 
It looks like maven 3.6.1 is missing from the apache maven repo:

[http://apache.claz.org/maven/maven-3/]

This is causing PR build failures:

[https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110045/console]

 
 exec: curl -s -L 
 
[https://www.apache.org/dyn/closer.lua?action=download&filename=/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz]
 gzip: stdin: not in gzip format
 tar: Child returned status 1
 tar: Error is not recoverable: exiting now
 Using `mvn` from path: 
/home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn
 build/mvn: line 163: 
/home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn:
 No such file or directory
 Error while getting version string from Maven:

> Upgrade Maven to 3.6.2
> --
>
> Key: SPARK-28961
> URL: https://issues.apache.org/jira/browse/SPARK-28961
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Xiao Li
>Priority: Major
>
> It looks like maven 3.6.1 is missing from the apache maven repo:
> [http://apache.claz.org/maven/maven-3/]
> This is causing PR build failures:
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110045/console]
>  
>  exec: curl -s -L 
>  
> [https://www.apache.org/dyn/closer.lua?action=download&filename=/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz]
>  gzip: stdin: not in gzip format
>  tar: Child returned status 1
>  tar: Error is not recoverable: exiting now
>  Using `mvn` from path: 
> /home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn
>  build/mvn: line 163: 
> /home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn:
>  No such file or directory
>  Error while getting version string from Maven:






[jira] [Commented] (SPARK-28960) maven 3.6.1 gone from apache maven repo

2019-09-03 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921611#comment-16921611
 ] 

Thomas Graves commented on SPARK-28960:
---

It's a bit odd that I don't see 3.6.2 on their page as released yet: 
[https://maven.apache.org/docs/history.html]

Anyone happen to know what the deal is?

> maven 3.6.1 gone from apache maven repo
> ---
>
> Key: SPARK-28960
> URL: https://issues.apache.org/jira/browse/SPARK-28960
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> It looks like maven 3.6.1 is missing from the apache maven repo:
> [http://apache.claz.org/maven/maven-3/]
> This is causing PR build failures:
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110045/console]
>  
> exec: curl -s -L 
> [https://www.apache.org/dyn/closer.lua?action=download&filename=/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz]
> gzip: stdin: not in gzip format
> tar: Child returned status 1
> tar: Error is not recoverable: exiting now
> Using `mvn` from path: 
> /home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn
> build/mvn: line 163: 
> /home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn:
>  No such file or directory
> Error while getting version string from Maven:






[jira] [Commented] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-09-03 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921608#comment-16921608
 ] 

Thomas Graves commented on SPARK-27495:
---

Thanks for the feedback [~felixcheung]

1. Yes, this is out of scope of this SPIP; my statement was intended to just 
say this would allow for that in the future if we decided to do it. I updated 
the description; hopefully that is clearer.

2. We could certainly fail if there is a conflict for now and ask the user to 
resolve it and fix their code. I am a little worried users will hit this fairly 
easily once it starts to get used, and in most cases I would think doing a 
simple max or sum would be sufficient. Perhaps we start with a config for this 
and have it fail by default, but allow them to turn that off so it does 
something static like a max (see the sketch after these points). Then we can 
see how things go and potentially add other options. Thoughts?

3. Yeah, I agree the strict mode makes more sense to me for ML use cases. I 
think the hint could come into play more on the ETL side, and I think even that 
would be unusual. If you are using those extra resources, it's more than likely 
to make things go faster, so I would think you would want everything to run on 
those; but I guess if you have a shared cluster and those resources are in high 
demand, it could be faster just to run on whatever is available. The API would 
allow us to add it in the future, but it's not included in the first go here.
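
A hedged sketch (illustrative only, not a proposed API) of the "simple max" 
merge mentioned in point 2 above, treating each conflicting per-stage 
requirement set as a plain map of resource name to amount:

{code}
// Merge two conflicting per-stage requirement sets by taking the max of each resource.
def mergeByMax(a: Map[String, Double], b: Map[String, Double]): Map[String, Double] =
  (a.keySet ++ b.keySet).map { name =>
    name -> math.max(a.getOrElse(name, 0.0), b.getOrElse(name, 0.0))
  }.toMap

// mergeByMax(Map("cores" -> 4.0, "gpu" -> 1.0), Map("cores" -> 2.0, "memoryGb" -> 32.0))
//   == Map("cores" -> 4.0, "gpu" -> 1.0, "memoryGb" -> 32.0)
{code}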

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job. 
> Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
> extra Resources (GPU/FPGA/etc). It has the potential to allow for other 
> things like limiting the number of tasks per stage, specifying other 
> parameters for things like shuffle, etc. Initially I would propose we only 
> support resources as they are now. So Task resources would be cpu and other 
> resources (GPU, FPGA), that way we aren't adding in extra scheduling things 
> at this point.  Executor resources would be cpu, memory, and extra 
> resources(GPU,FPGA, etc). Changing the executor resources will rely on 
> dynamic allocation being enabled.
> Main use cases:
>  # ML use case where user does ETL and feeds it into an ML algorithm where 
> it’s using the RDD API. This should work with barrier scheduling as well once 
> it supports dynamic allocation.
>  # This adds the framework/api for Spark's own internal use.  In the future 
> (not covered by this SPIP), Catalyst could control the stage level resources 
> as it finds the need to change it between stages for different optimizations. 
> For instance, with the new columnar plugin to the query planner we can insert 
> stages into the plan that would change running something on the CPU in row 
> format to running it on the GPU in columnar format. This API would allow the 
> planner to make sure the stages that run on the GPU get the corresponding GPU 
> resources it needs to run. Another possible use case for catalyst is that it 
> would allow catalyst to add in more optimizations to where the user doesn’t 
> need to configure container sizes at all. If the optimizer/planner can handle 
> that for the user, everyone wins.
> This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I 
> think the DataSet API will require more changes because it specifically hides 
> the RDD from the users via the plans and catalyst can optimize the p

[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-09-03 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Description: 
*Q1.* What are you trying to do? Articulate your objectives using absolutely no 
jargon.

Objectives:
 # Allow users to specify task and executor resource requirements at the stage 
level. 
 # Spark will use the stage level requirements to acquire the necessary 
resources/executors and schedule tasks based on the per stage requirements.

Many times users have different resource requirements for different stages of 
their application so they want to be able to configure resources at the stage 
level. For instance, you have a single job that has 2 stages. The first stage 
does some  ETL which requires a lot of tasks, each with a small amount of 
memory and 1 core each. Then you have a second stage where you feed that ETL 
data into an ML algorithm. The second stage only requires a few executors but 
each executor needs a lot of memory, GPUs, and many cores.  This feature allows 
the user to specify the task and executor resource requirements for the ETL 
Stage and then change them for the ML stage of the job. 

Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
extra resources (GPU/FPGA/etc). It has the potential to allow for other things 
like limiting the number of tasks per stage, specifying other parameters for 
things like shuffle, etc. Initially I would propose we only support resources 
as they are now, so task resources would be cpu and other resources (GPU, 
FPGA); that way we aren't adding in extra scheduling concerns at this point. 
Executor resources would be cpu, memory, and extra resources (GPU, FPGA, etc.). 
Changing the executor resources will rely on dynamic allocation being enabled.

Main use cases:
 # ML use case where user does ETL and feeds it into an ML algorithm where it’s 
using the RDD API. This should work with barrier scheduling as well once it 
supports dynamic allocation.
 # This adds the framework/api for Spark's own internal use.  In the future 
(not covered by this SPIP), Catalyst could control the stage level resources as 
it finds the need to change it between stages for different optimizations. For 
instance, with the new columnar plugin to the query planner we can insert 
stages into the plan that would change running something on the CPU in row 
format to running it on the GPU in columnar format. This API would allow the 
planner to make sure the stages that run on the GPU get the corresponding GPU 
resources it needs to run. Another possible use case for catalyst is that it 
would allow catalyst to add in more optimizations to where the user doesn’t 
need to configure container sizes at all. If the optimizer/planner can handle 
that for the user, everyone wins.

This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I think 
the DataSet API will require more changes because it specifically hides the RDD 
from the users via the plans and catalyst can optimize the plan and insert 
things into the plan. The only way I’ve found to make this work with the 
Dataset API would be modifying all the plans to be able to get the resource 
requirements down into where it creates the RDDs, which I believe would be a 
lot of change.  If other people know better options, it would be great to hear 
them.

*Q2.* What problem is this proposal NOT designed to solve?

The initial implementation is not going to add Dataset APIs.

We are starting with allowing users to specify a specific set of task/executor 
resources and plan to design it to be extendable, but the first implementation 
will not support changing generic SparkConf configs and only specific limited 
resources.

This initial version will have a programmatic API for specifying the resource 
requirements per stage; we can add the ability to have profiles in the configs 
later if it's useful.

*Q3.* How is it done today, and what are the limits of current practice?

Currently this is either done by having multiple spark jobs or requesting 
containers with the max resources needed for any part of the job.  To do this 
today, you can break it into separate jobs where each job requests the 
corresponding resources needed, but then you have to write the data out 
somewhere and then read it back in between jobs. This is going to take longer 
as well as require coordination between those jobs to make sure everything 
works smoothly. Another option would be to request executors with your largest 
need up front and potentially waste those resources when they aren't being 
used, which in turn wastes money. For instance, for an ML application where it 
does ETL first, many times people request containers with GPUs and the GPUs sit 
idle while the ETL is happening. This is wasting those GPU resources and in 
turn money because those GPUs could have been used by other applicatio

[jira] [Created] (SPARK-28960) maven 3.6.1 gone from apache maven repo

2019-09-03 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-28960:
-

 Summary: maven 3.6.1 gone from apache maven repo
 Key: SPARK-28960
 URL: https://issues.apache.org/jira/browse/SPARK-28960
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 3.0.0
Reporter: Thomas Graves


It looks like maven 3.6.1 is missing from the apache maven repo:

[http://apache.claz.org/maven/maven-3/]

This is causing PR build failures:

[https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110045/console]

 
exec: curl -s -L 
[https://www.apache.org/dyn/closer.lua?action=download&filename=/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz]
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
Using `mvn` from path: 
/home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn
build/mvn: line 163: 
/home/jenkins/workspace/SparkPullRequestBuilder/build/apache-maven-3.6.1/bin/mvn:
 No such file or directory
Error while getting version string from Maven:






[jira] [Commented] (SPARK-28917) Jobs can hang because of race of RDD.dependencies

2019-09-03 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921424#comment-16921424
 ] 

Thomas Graves commented on SPARK-28917:
---

So for this to happen, you essentially have to have that RDD referenced in 2 
separate threads, and the second thread has to start operating on it before it 
has been materialized in the first thread, correct?

I see how it can change with a checkpoint, but like you say I'm not sure how 
the cache would cause this. Any way you can see if they are checkpointing? 
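
For context, a minimal runnable sketch (not from the JIRA) of workaround (a) 
from the quoted issue below: force rdd.dependencies once while only one thread 
is touching the RDD, before submitting jobs against it from multiple threads. 
The job bodies here are placeholders.

{code}
import org.apache.spark.{SparkConf, SparkContext}

object DependencyRaceWorkaround {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("race-sketch").setMaster("local[4]"))
    val shared = sc.parallelize(1 to 100000).map(_ * 2).repartition(8)

    // Workaround (a): compute and cache the dependencies in a single thread first.
    shared.dependencies

    val threads = (1 to 2).map { i =>
      new Thread(() => println(s"job $i count = ${shared.count()}"))
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    sc.stop()
  }
}
{code}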

> Jobs can hang because of race of RDD.dependencies
> -
>
> Key: SPARK-28917
> URL: https://issues.apache.org/jira/browse/SPARK-28917
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.3.3, 2.4.3
>Reporter: Imran Rashid
>Priority: Major
>
> {{RDD.dependencies}} stores the precomputed cache value, but it is not 
> thread-safe.  This can lead to a race where the value gets overwritten, but 
> the DAGScheduler gets stuck in an inconsistent state.  In particular, this 
> can happen when there is a race between the DAGScheduler event loop, and 
> another thread (eg. a user thread, if there is multi-threaded job submission).
> First, a job is submitted by the user, which then computes the result Stage 
> and its parents:
> https://github.com/apache/spark/blob/24655583f1cb5dae2e80bb572604fb4a9761ec07/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L983
> Which eventually makes a call to {{rdd.dependencies}}:
> https://github.com/apache/spark/blob/24655583f1cb5dae2e80bb572604fb4a9761ec07/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L519
> At the same time, the user could also touch {{rdd.dependencies}} in another 
> thread, which could overwrite the stored value because of the race.
> Then the DAGScheduler checks the dependencies *again* later on in the job 
> submission, via {{getMissingParentStages}}
> https://github.com/apache/spark/blob/24655583f1cb5dae2e80bb572604fb4a9761ec07/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1025
> Because it will find new dependencies, it will create entirely different 
> stages.  Now the job has some orphaned stages which will never run.
> The symptoms of this are seeing disjoint sets of stages in the "Parents of 
> final stage" and the "Missing parents" messages on job submission, as well as 
> seeing repeated messages "Registered RDD X" for the same RDD id.  eg:
> {noformat}
> [INFO] 2019-08-15 23:22:31,570 org.apache.spark.SparkContext logInfo - 
> Starting job: count at XXX.scala:462
> ...
> [INFO] 2019-08-15 23:22:31,573 org.apache.spark.scheduler.DAGScheduler 
> logInfo - Registering RDD 14 (repartition at XXX.scala:421)
> ...
> ...
> [INFO] 2019-08-15 23:22:31,582 org.apache.spark.scheduler.DAGScheduler 
> logInfo - Got job 1 (count at XXX.scala:462) with 40 output partitions
> [INFO] 2019-08-15 23:22:31,582 org.apache.spark.scheduler.DAGScheduler 
> logInfo - Final stage: ResultStage 5 (count at XXX.scala:462)
> [INFO] 2019-08-15 23:22:31,582 org.apache.spark.scheduler.DAGScheduler 
> logInfo - Parents of final stage: List(ShuffleMapStage 4)
> [INFO] 2019-08-15 23:22:31,599 org.apache.spark.scheduler.DAGScheduler 
> logInfo - Registering RDD 14 (repartition at XXX.scala:421)
> [INFO] 2019-08-15 23:22:31,599 org.apache.spark.scheduler.DAGScheduler 
> logInfo - Missing parents: List(ShuffleMapStage 6)
> {noformat}
> Note that there is a similar issue w/ {{rdd.partitions}}. I don't see a way 
> it could mess up the scheduler (it seems it's only used for 
> {{rdd.partitions.length}}).  There is also an issue that {{rdd.storageLevel}} 
> is read and cached in the scheduler, but it could be modified simultaneously 
> by the user in another thread.  Similarly, I can't see a way it could affect 
> the scheduler.
> *WORKAROUND*:
> (a) call {{rdd.dependencies}} while you know that RDD is only getting touched 
> by one thread (eg. in the thread that created it, or before you submit 
> multiple jobs touching that RDD from other threads). Then that value will get 
> cached.
> (b) don't submit jobs from multiple threads.






[jira] [Resolved] (SPARK-27360) Standalone cluster mode support for GPU-aware scheduling

2019-08-27 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27360.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

all subtasks are finished so resolving

> Standalone cluster mode support for GPU-aware scheduling
> 
>
> Key: SPARK-27360
> URL: https://issues.apache.org/jira/browse/SPARK-27360
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Major
> Fix For: 3.0.0
>
>
> Design and implement standalone manager support for GPU-aware scheduling:
> 1. static conf to describe resources
> 2. spark-submit to request resources
> 3. auto discovery of GPUs
> 4. executor process isolation






[jira] [Resolved] (SPARK-28414) Standalone worker/master UI updates for Resource scheduling

2019-08-27 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-28414.
---
Fix Version/s: 3.0.0
 Assignee: wuyi
   Resolution: Fixed

> Standalone worker/master UI updates for Resource scheduling
> ---
>
> Key: SPARK-28414
> URL: https://issues.apache.org/jira/browse/SPARK-28414
> Project: Spark
>  Issue Type: Sub-task
>  Components: Deploy
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: wuyi
>Priority: Major
> Fix For: 3.0.0
>
>
> https://issues.apache.org/jira/browse/SPARK-27360 is adding Resource 
> scheduling to standalone mode. Update the UI with the resource information 
> for master/worker as appropriate.






[jira] [Updated] (SPARK-28577) Ensure executorMemoryOverhead requested value not less than MEMORY_OFFHEAP_SIZE when MEMORY_OFFHEAP_ENABLED is true

2019-08-22 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-28577:
--
Docs Text: On YARN, the off-heap memory size is now separately included in the 
container size Spark requests from YARN. Previously you had to add it into the 
overhead memory you requested; that is no longer needed.
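
A hedged illustration of the sizing change (the config values are made up; the 
container total is approximate):

{code}
spark.memory.offHeap.enabled=true
spark.memory.offHeap.size=2g
spark.executor.memory=4g
spark.executor.memoryOverhead=1g
# With this change the YARN container is requested at roughly 4g + 1g + 2g = 7g;
# previously the 2g of off-heap had to be folded into spark.executor.memoryOverhead by hand.
{code}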

> Ensure executorMemoryOverhead requested value not less than MEMORY_OFFHEAP_SIZE 
> when MEMORY_OFFHEAP_ENABLED is true
> ---
>
> Key: SPARK-28577
> URL: https://issues.apache.org/jira/browse/SPARK-28577
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: release-notes
>
> If MEMORY_OFFHEAP_ENABLED is true, we should ensure executorMemoryOverhead is 
> not less than MEMORY_OFFHEAP_SIZE; otherwise, the memory requested for the 
> executor may not be enough.






[jira] [Updated] (SPARK-28577) Ensure executorMemoryOverhead requested value not less than MEMORY_OFFHEAP_SIZE when MEMORY_OFFHEAP_ENABLED is true

2019-08-22 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-28577:
--
Labels: release-notes  (was: )

> Ensure executorMemoryOverhead requested value not less than MEMORY_OFFHEAP_SIZE 
> when MEMORY_OFFHEAP_ENABLED is true
> ---
>
> Key: SPARK-28577
> URL: https://issues.apache.org/jira/browse/SPARK-28577
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: release-notes
>
> If MEMORY_OFFHEAP_ENABLED is true, we should ensure executorMemoryOverhead is 
> not less than MEMORY_OFFHEAP_SIZE; otherwise, the memory requested for the 
> executor may not be enough.






[jira] [Commented] (SPARK-27361) YARN support for GPU-aware scheduling

2019-08-14 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907431#comment-16907431
 ] 

Thomas Graves commented on SPARK-27361:
---

Since that was done prior to this feature, I think it's OK to leave it alone. It 
worked just fine on its own to purely request that YARN allocate the resources 
(with no integration with the Spark scheduler, etc.). We did modify how that 
worked in https://issues.apache.org/jira/browse/SPARK-27959, so that one should 
be linked here, I think.

> YARN support for GPU-aware scheduling
> -
>
> Key: SPARK-27361
> URL: https://issues.apache.org/jira/browse/SPARK-27361
> Project: Spark
>  Issue Type: Story
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> Design and implement YARN support for GPU-aware scheduling:
>  * User can request GPU resources at Spark application level.
>  * How the Spark executor discovers GPU's when run on YARN
>  * Integrate with YARN 3.2 GPU support.






[jira] [Commented] (SPARK-23151) Provide a distribution of Spark with Hadoop 3.0

2019-08-13 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906218#comment-16906218
 ] 

Thomas Graves commented on SPARK-23151:
---

Hey, so for the hadoop-3.2 profile, the dependencies are quite a bit different 
since we are using the hive version of orc. The things it pulls in are quite 
different than those of the nohive version of orc. It would be really nice if we 
had a different classifier or something so users can actually tell the 
difference in what is published on Maven. Right now all we know is it's a Spark 
jar with version X.x and Scala version y.y, but we don't know which Hadoop 
profile and thus which dependencies got pulled in. Something to think about 
anyway.

> Provide a distribution of Spark with Hadoop 3.0
> ---
>
> Key: SPARK-23151
> URL: https://issues.apache.org/jira/browse/SPARK-23151
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 2.2.0, 2.2.1
>Reporter: Louis Burke
>Priority: Major
>
> Provide a Spark package that supports Hadoop 3.0.0. Currently the Spark 
> package
> only supports Hadoop 2.7 i.e. spark-2.2.1-bin-hadoop2.7.tgz. The implication 
> is
> that using up to date Kinesis libraries alongside s3 causes a clash w.r.t
> aws-java-sdk.






[jira] [Created] (SPARK-28679) Spark Yarn ResourceRequestHelper shouldn't look up setResourceInformation if no resources are specified

2019-08-09 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-28679:
-

 Summary: Spark Yarn ResourceRequestHelper shouldn't look up 
setResourceInformation if no resources are specified
 Key: SPARK-28679
 URL: https://issues.apache.org/jira/browse/SPARK-28679
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 3.0.0
Reporter: Thomas Graves


In the Spark YARN ResourceRequestHelper, reflection is used to look up 
setResourceInformation. We should skip that lookup if the resource map is empty.

[https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ResourceRequestHelper.scala#L154]
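
A hedged sketch of the proposed guard (illustrative only, not the actual patch; 
it assumes Hadoop 3's ResourceInformation.newInstance(String, long) and 
Resource.setResourceInformation(String, ResourceInformation)):

{code}
// Skip the reflective lookup of setResourceInformation entirely when no custom
// YARN resources were requested.
def setResourceRequests(resources: Map[String, Long], resource: AnyRef): Unit = {
  if (resources.isEmpty) {
    return  // nothing to set, so don't pay for (or fail) the reflection lookup
  }
  val riClass = Class.forName("org.apache.hadoop.yarn.api.records.ResourceInformation")
  val newInstance = riClass.getMethod("newInstance", classOf[String], classOf[Long])
  val setMethod = resource.getClass.getMethod("setResourceInformation", classOf[String], riClass)
  resources.foreach { case (name, amount) =>
    val info = newInstance.invoke(null, name, java.lang.Long.valueOf(amount))
    setMethod.invoke(resource, name, info)
  }
}
{code}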

 






[jira] [Assigned] (SPARK-27371) Standalone master receives resource info from worker and allocate driver/executor properly

2019-08-09 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27371:
-

Assignee: wuyi

> Standalone master receives resource info from worker and allocate 
> driver/executor properly
> --
>
> Key: SPARK-27371
> URL: https://issues.apache.org/jira/browse/SPARK-27371
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: wuyi
>Priority: Major
>
> As an admin, I can let Spark Standalone worker automatically discover GPUs 
> installed on worker nodes. So I don't need to manually configure them.






[jira] [Resolved] (SPARK-27368) Design: Standalone supports GPU scheduling

2019-08-09 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27368.
---
Resolution: Fixed

resolving since code committed

> Design: Standalone supports GPU scheduling
> --
>
> Key: SPARK-27368
> URL: https://issues.apache.org/jira/browse/SPARK-27368
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Major
>
> Design draft:
> Scenarios:
> * client-mode, worker might create one or more executor processes, from 
> different Spark applications.
> * cluster-mode, worker might create driver process as well.
> * local-cluster mode, there could be multiple worker processes on the same 
> node. This is an undocumented use of standalone mode, which is mainly for 
> tests.
> * Resource isolation is not considered here.
> Because executor and driver processes on the same node will share the 
> accelerator resources, the worker must take the role of allocating resources. 
> So we will add a spark.worker.resource.[resourceName].discoveryScript conf for 
> workers to discover resources. Users need to match the resourceName in driver 
> and executor requests. Besides CPU cores and memory, the worker now also 
> considers resources when creating new executors or drivers.
> Example conf:
> {code}
> # static worker conf
> spark.worker.resource.gpu.discoveryScript=/path/to/list-gpus.sh
> # application conf
> spark.driver.resource.gpu.amount=4
> spark.executor.resource.gpu.amount=2
> spark.task.resource.gpu.amount=1
> {code}
> In client mode, the driver process is not launched by the worker, so the user 
> can specify a driver resource discovery script. In cluster mode, if the user 
> still specifies a driver resource discovery script, it is ignored with a 
> warning.
> Supporting resource isolation is tricky because the Spark worker doesn't know 
> how to isolate resources unless we hardcode some resource names like GPU 
> support in YARN, which is less ideal. Supporting resource isolation for 
> multiple resource types is even harder. In the first version, we will 
> implement accelerator support without resource isolation.
> Timeline:
> 1. Worker starts.
> 2. Worker loads `spark.worker.resource.*` conf and runs discovery scripts to 
> discover resources.
> 3. Worker reports to master cores, memory, and resources (new) and registers.
> 4. An application starts.
> 5. Master finds workers with sufficient available resources and let worker 
> start executor or driver process.
> 6. Worker assigns executor / driver resources by passing the resource info 
> from command-line.
> 7. Application ends.
> 8. Master requests worker to kill driver/executor process.
> 9. Master updates available resources.






[jira] [Resolved] (SPARK-27371) Standalone master receives resource info from worker and allocate driver/executor properly

2019-08-09 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27371.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> Standalone master receives resource info from worker and allocate 
> driver/executor properly
> --
>
> Key: SPARK-27371
> URL: https://issues.apache.org/jira/browse/SPARK-27371
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: wuyi
>Priority: Major
> Fix For: 3.0.0
>
>
> As an admin, I can let Spark Standalone worker automatically discover GPUs 
> installed on worker nodes. So I don't need to manually configure them.






[jira] [Updated] (SPARK-27492) GPU scheduling - High level user documentation

2019-08-07 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27492:
--
Description: 
For the SPIP - Accelerator-aware task scheduling for Spark 
(https://issues.apache.org/jira/browse/SPARK-24615), add some high-level user 
documentation about how the pieces of this feature work together and point to 
things like the example discovery script, etc.

 
 - make sure to document the discovery script and what permissions are needed 
and any security implications
 - Document the standalone/local-cluster mode limitation of only a single 
resource file or discovery script, so users have to coordinate for it to 
work right.

  was:
For the SPIP - Accelerator-aware task scheduling for Spark, 
https://issues.apache.org/jira/browse/SPARK-24615 Add some high level user 
documentation about how this feature works together and point to things like 
the example discovery script, etc.

 

- make sure to document the discovery script and what permissions are needed 
and any security implications


> GPU scheduling - High level user documentation
> --
>
> Key: SPARK-27492
> URL: https://issues.apache.org/jira/browse/SPARK-27492
> Project: Spark
>  Issue Type: Story
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> For the SPIP - Accelerator-aware task scheduling for Spark 
> (https://issues.apache.org/jira/browse/SPARK-24615), add some high-level user 
> documentation about how the pieces of this feature work together and point to 
> things like the example discovery script, etc.
>  
>  - make sure to document the discovery script and what permissions are needed 
> and any security implications
>  - Document the standalone/local-cluster mode limitation of only a single 
> resource file or discovery script, so users have to coordinate for it 
> to work right.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-08-07 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Description: 
*Q1.* What are you trying to do? Articulate your objectives using absolutely no 
jargon.

Objectives:
 # Allow users to specify task and executor resource requirements at the stage 
level. 
 # Spark will use the stage level requirements to acquire the necessary 
resources/executors and schedule tasks based on the per stage requirements.

Many times users have different resource requirements for different stages of 
their application so they want to be able to configure resources at the stage 
level. For instance, you have a single job that has 2 stages. The first stage 
does some  ETL which requires a lot of tasks, each with a small amount of 
memory and 1 core each. Then you have a second stage where you feed that ETL 
data into an ML algorithm. The second stage only requires a few executors but 
each executor needs a lot of memory, GPUs, and many cores.  This feature allows 
the user to specify the task and executor resource requirements for the ETL 
Stage and then change them for the ML stage of the job.  

Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
extra Resources (GPU/FPGA/etc). It has the potential to allow for other things 
like limiting the number of tasks per stage, specifying other parameters for 
things like shuffle, etc. Initially I would propose we only support resources 
as they are now. So Task resources would be cpu and other resources (GPU, 
FPGA), that way we aren't adding in extra scheduling things at this point.  
Executor resources would be cpu, memory, and extra resources(GPU,FPGA, etc). 
Changing the executor resources will rely on dynamic allocation being enabled.

Main use cases:
 # ML use case where user does ETL and feeds it into an ML algorithm where it’s 
using the RDD API. This should work with barrier scheduling as well once it 
supports dynamic allocation.
 # Spark internal use by catalyst. Catalyst could control the stage level 
resources as it finds the need to change it between stages for different 
optimizations. For instance, with the new columnar plugin to the query planner 
we can insert stages into the plan that would change running something on the 
CPU in row format to running it on the GPU in columnar format. This API would 
allow the planner to make sure the stages that run on the GPU get the 
corresponding GPU resources it needs to run. Another possible use case for 
catalyst is that it would allow catalyst to add in more optimizations to where 
the user doesn’t need to configure container sizes at all. If the 
optimizer/planner can handle that for the user, everyone wins.

This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I think 
the DataSet API will require more changes because it specifically hides the RDD 
from the users via the plans and catalyst can optimize the plan and insert 
things into the plan. The only way I’ve found to make this work with the 
Dataset API would be modifying all the plans to be able to get the resource 
requirements down into where it creates the RDDs, which I believe would be a 
lot of change.  If other people know better options, it would be great to hear 
them.

*Q2.* What problem is this proposal NOT designed to solve?

The initial implementation is not going to add Dataset APIs.

We are starting with allowing users to specify a specific set of task/executor 
resources and plan to design it to be extendable, but the first implementation 
will not support changing generic SparkConf configs and only specific limited 
resources.

This initial version will have a programmatic API for specifying the resource 
requirements per stage; we can perhaps add the ability to have profiles in the 
configs later if it's useful.

*Q3.* How is it done today, and what are the limits of current practice?

Currently this is either done by having multiple spark jobs or requesting 
containers with the max resources needed for any part of the job.  To do this 
today, you can break it into separate jobs where each job requests the 
corresponding resources needed, but then you have to write the data out 
somewhere and then read it back in between jobs.  This is going to take longer 
as well as require that job coordination between those to make sure everything 
works smoothly. Another option would be to request executors with your largest 
need up front and potentially waste those resources when they aren't being 
used, which in turn wastes money. For instance, for an ML application where it 
does ETL first, many times people request containers with GPUs and the GPUs sit 
idle while the ETL is happening. This is wasting those GPU resources and in 
turn money because those GPUs could have been used by other applications until 
they were really needed.  

Note for the catalyst internal use, that can’t be done today.

[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-08-07 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Description: 
*Q1.* What are you trying to do? Articulate your objectives using absolutely no 
jargon.

Objectives:
 # Allow users to specify task and executor resource requirements at the stage 
level. 
 # Spark will use the stage level requirements to acquire the necessary 
resources/executors and schedule tasks based on the per stage requirements.

Many times users have different resource requirements for different stages of 
their application so they want to be able to configure resources at the stage 
level. For instance, you have a single job that has 2 stages. The first stage 
does some  ETL which requires a lot of tasks, each with a small amount of 
memory and 1 core each. Then you have a second stage where you feed that ETL 
data into an ML algorithm. The second stage only requires a few executors but 
each executor needs a lot of memory, GPUs, and many cores.  This feature allows 
the user to specify the task and executor resource requirements for the ETL 
Stage and then change them for the ML stage of the job.

Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
extra Resources (GPU/FPGA/etc). It has the potential to allow for other things 
like limiting the number of tasks per stage, specifying other parameters for 
things like shuffle, etc. Initially I would propose we only support resources 
as they are now. So Task resources would be cpu and other resources (GPU, 
FPGA), that way we aren't adding in extra scheduling things at this point.  
Executor resources would be cpu, memory, and extra resources(GPU,FPGA, etc). 
Changing the executor resources will rely on dynamic allocation being enabled.

Main use cases:
 # ML use case where user does ETL and feeds it into an ML algorithm where it’s 
using the RDD API. This should work with barrier scheduling as well once it 
supports dynamic allocation.
 # Spark internal use by catalyst. Catalyst could control the stage level 
resources as it finds the need to change it between stages for different 
optimizations. For instance, with the new columnar plugin to the query planner 
we can insert stages into the plan that would change running something on the 
CPU in row format to running it on the GPU in columnar format. This API would 
allow the planner to make sure the stages that run on the GPU get the 
corresponding GPU resources it needs to run. Another possible use case for 
catalyst is that it would allow catalyst to add in more optimizations to where 
the user doesn’t need to configure container sizes at all. If the 
optimizer/planner can handle that for the user, everyone wins.

This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I think 
the DataSet API will require more changes because it specifically hides the RDD 
from the users via the plans and catalyst can optimize the plan and insert 
things into the plan. The only way I’ve found to make this work with the 
Dataset API would be modifying all the plans to be able to get the resource 
requirements down into where it creates the RDDs, which I believe would be a 
lot of change.  If other people know better options, it would be great to hear 
them.

*Q2.* What problem is this proposal NOT designed to solve?

The initial implementation is not going to add Dataset APIs.

We are starting with allowing users to specify a specific set of task/executor 
resources and plan to design it to be extendable, but the first implementation 
will not support changing generic SparkConf configs and only specific limited 
resources.

This initial version will have a programmatic API for specifying the resource 
requirements per stage; we can perhaps add the ability to have profiles in the 
configs later if it's useful.

*Q3.* How is it done today, and what are the limits of current practice?

Currently this is either done by having multiple spark jobs or requesting 
containers with the max resources needed for any part of the job.  To do this 
today, you can break it into separate jobs where each job requests the 
corresponding resources needed, but then you have to write the data out 
somewhere and then read it back in between jobs.  This is going to take longer 
as well as require that job coordination between those to make sure everything 
works smoothly. Another option would be to request executors with your largest 
need up front and potentially waste those resources when they aren't being 
used, which in turn wastes money. For instance, for an ML application where it 
does ETL first, many times people request containers with GPUs and the GPUs sit 
idle while the ETL is happening. This is wasting those GPU resources and in 
turn money because those GPUs could have been used by other applications until 
they were really needed.  

Note for the catalyst internal use, that can’t be done today.

[jira] [Commented] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-08-07 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902076#comment-16902076
 ] 

Thomas Graves commented on SPARK-27495:
---

Thanks for the comments.
 # I was trying to leave the API open to allow both. I'm not sure you had a chance 
to look at the design/API doc, but the ResourceProfile has a .require function, 
which would be a must-have, and I mentioned there that we could add a .prefer 
function that would be more of a hint or a nice-to-have. I should also have 
clarified (and I'll update the description) that in this initial implementation I 
only intend to support scheduling on the same things we do now. So for tasks it 
would only be CPU and other resources (GPU, FPGA, etc.). For executor resources we 
would support requesting cpu, memory (offheap, onheap, overhead, pyspark), and 
other resources. I wouldn't add support for task memory per stage at this point; 
it would also take other changes to enforce task memory usage. But your point is 
valid if that gets added, so I think we need to be careful about what we allow for 
each.
 # I think you have brought up 2 issues here.
 ## One is how we do resource merging/conflict resolution. My proposal was to 
simply look at the RDDs within the stage and merge any resource profiles. There 
are multiple ways we could do the merge: one is to naively just use a max, but as 
you pointed out some operations might be better done as a sum, so I was hoping to 
make it configurable, perhaps with some defaults for different operations. I guess 
the question is whether we need that for the initial implementation or we just 
make sure to design for it. I think you are also proposing to use the parent RDDs 
(do you just mean parents within a stage, or always?), but I think that all 
depends on how we define it and what the user is going to expect. I might have a 
very large RDD that needs processing and shrinks down to a very small one. To 
process that large RDD I need a lot of resources, but once it's shrunk I don't 
need that many. So the child wouldn't need the same resources as its parent. That 
is why I was proposing to set the resources specifically for that RDD. We need to 
handle operations that have a shuffle, like groupBy, to make sure the profile is 
carried over.
 ## The other thing you brought up is, I think, more about the task resources, 
since you are saying per partition, correct? There are definitely still issues 
even within the same RDD where some partitions may be larger than others. I'm not 
really trying to solve that problem here. This feature may open it up so that 
Spark can do more intelligent things there, but in this initial round users would 
still have to configure the resources for the worst case. I'm essentially leaving 
the configs the same as they are now. Users are used to defining the executor 
resources based on the worst case any task within the worst stage will consume; 
this just allows them to go one step further and control it per stage. It also 
hopefully opens it up to do more of that configuration for them in the future. For 
now, the idea is users would only configure the resources for the stages they 
want; all other stages would default back to the global configs. I'll clarify this 
in the SPIP as well. We could change this, but I think it gets hard for the user 
to know the scoping.
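
To make the ".require" shape above a bit more concrete, here is a rough sketch of 
how a per-stage profile could be used from PySpark. The names 
(ResourceProfileBuilder, ExecutorResourceRequests, TaskResourceRequests, 
RDD.withResources) and the import path are assumptions based on this design 
discussion, not the final API:
{code:java}
# Hypothetical sketch only: the class and method names below are assumptions
# based on the design discussion (a ".require"-style ResourceProfile) and may
# not match the final API.
from pyspark import SparkContext
from pyspark.resource import (ResourceProfileBuilder,
                              ExecutorResourceRequests,
                              TaskResourceRequests)

sc = SparkContext(appName="stage-level-sketch")

# ETL stage: runs with the application's default, CPU-only resources.
etl = sc.parallelize(range(1000000), 200).map(lambda x: (x % 10, x))

# ML stage: a few executors, each with GPUs and more memory; one GPU per task.
execs = ExecutorResourceRequests().cores(8).memory("24g").resource("gpu", 2)
tasks = TaskResourceRequests().cpus(1).resource("gpu", 1)
gpu_profile = ResourceProfileBuilder().require(execs).require(tasks).build

# Only the stage computing the re-profiled RDD would ask for the GPU executors.
result = (etl.groupByKey()
             .withResources(gpu_profile)
             .mapValues(lambda vs: sum(1 for _ in vs))
             .collect())
{code}
With dynamic allocation enabled, the intent would be that Spark only acquires the 
GPU executors when the stage carrying the profile actually runs.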

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job.

[jira] [Assigned] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-08-06 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27495:
-

Assignee: Thomas Graves

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job.
> Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
> extra Resources (GPU/FPGA/etc). It has the potential to allow for other 
> things like limiting the number of tasks per stage, specifying other 
> parameters for things like shuffle, etc. Changing the executor resources will 
> rely on dynamic allocation being enabled.
> Main use cases:
>  # ML use case where user does ETL and feeds it into an ML algorithm where 
> it’s using the RDD API. This should work with barrier scheduling as well once 
> it supports dynamic allocation.
>  # Spark internal use by catalyst. Catalyst could control the stage level 
> resources as it finds the need to change it between stages for different 
> optimizations. For instance, with the new columnar plugin to the query 
> planner we can insert stages into the plan that would change running 
> something on the CPU in row format to running it on the GPU in columnar 
> format. This API would allow the planner to make sure the stages that run on 
> the GPU get the corresponding GPU resources it needs to run. Another possible 
> use case for catalyst is that it would allow catalyst to add in more 
> optimizations to where the user doesn’t need to configure container sizes at 
> all. If the optimizer/planner can handle that for the user, everyone wins.
> This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I 
> think the DataSet API will require more changes because it specifically hides 
> the RDD from the users via the plans and catalyst can optimize the plan and 
> insert things into the plan. The only way I’ve found to make this work with 
> the Dataset API would be modifying all the plans to be able to get the 
> resource requirements down into where it creates the RDDs, which I believe 
> would be a lot of change.  If other people know better options, it would be 
> great to hear them.
> *Q2.* What problem is this proposal NOT designed to solve?
> The initial implementation is not going to add Dataset APIs.
> We are starting with allowing users to specify a specific set of 
> task/executor resources and plan to design it to be extendable, but the first 
> implementation will not support changing generic SparkConf configs and only 
> specific limited resources.
> This initial version will have a programmatic API for specifying the resource 
> requirements per stage; we can perhaps add the ability to have profiles in 
> the configs later if it's useful.
> *Q3.* How is it done today, and what are the limits of current practice?
> Currently this is either done by having multiple spark jobs or requesting 
> containers with the max resources needed for any part of the job.  To do this 
> today, you can break it into separate jobs where each job requests the 
> corresponding resources needed, but then you have to write the data out 
> somewhere and then read it back in between jobs.  This is going to take 
> longer as well as require that job coordination between those to make sure 
> everything works smoothly. Another option would be to request executors with 
> your largest need up front and potentially waste those resources when they 
> aren't being used, which in turn wastes money. For instance, for an ML 
> application where it does ETL first, many times people request containers with 
> GPUs and the GPUs sit idle while the ETL is happening.

[jira] [Commented] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-08-06 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901253#comment-16901253
 ] 

Thomas Graves commented on SPARK-27495:
---

I updated the Jira to have the SPIP text in description. I split the appendices 
out into a google doc because it was becoming long and to allow inline comments 
in the doc.  I can pull them back in here if people prefer.

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job.
> Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
> extra Resources (GPU/FPGA/etc). It has the potential to allow for other 
> things like limiting the number of tasks per stage, specifying other 
> parameters for things like shuffle, etc. Changing the executor resources will 
> rely on dynamic allocation being enabled.
> Main use cases:
>  # ML use case where user does ETL and feeds it into an ML algorithm where 
> it’s using the RDD API. This should work with barrier scheduling as well once 
> it supports dynamic allocation.
>  # Spark internal use by catalyst. Catalyst could control the stage level 
> resources as it finds the need to change it between stages for different 
> optimizations. For instance, with the new columnar plugin to the query 
> planner we can insert stages into the plan that would change running 
> something on the CPU in row format to running it on the GPU in columnar 
> format. This API would allow the planner to make sure the stages that run on 
> the GPU get the corresponding GPU resources it needs to run. Another possible 
> use case for catalyst is that it would allow catalyst to add in more 
> optimizations to where the user doesn’t need to configure container sizes at 
> all. If the optimizer/planner can handle that for the user, everyone wins.
> This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I 
> think the DataSet API will require more changes because it specifically hides 
> the RDD from the users via the plans and catalyst can optimize the plan and 
> insert things into the plan. The only way I’ve found to make this work with 
> the Dataset API would be modifying all the plans to be able to get the 
> resource requirements down into where it creates the RDDs, which I believe 
> would be a lot of change.  If other people know better options, it would be 
> great to hear them.
> *Q2.* What problem is this proposal NOT designed to solve?
> The initial implementation is not going to add Dataset APIs.
> We are starting with allowing users to specify a specific set of 
> task/executor resources and plan to design it to be extendable, but the first 
> implementation will not support changing generic SparkConf configs and only 
> specific limited resources.
> This initial version will have a programmatic API for specifying the resource 
> requirements per stage; we can perhaps add the ability to have profiles in 
> the configs later if it's useful.
> *Q3.* How is it done today, and what are the limits of current practice?
> Currently this is either done by having multiple spark jobs or requesting 
> containers with the max resources needed for any part of the job.  To do this 
> today, you can break it into separate jobs where each job requests the 
> corresponding resources needed, but then you have to write the data out 
> somewhere and then read it back in between jobs.  This is going to take 
> longer as well as require that job coordination between those to make sure 
> everything works smoothly.

[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-08-06 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Description: 
*Q1.* What are you trying to do? Articulate your objectives using absolutely no 
jargon.

Objectives:
 # Allow users to specify task and executor resource requirements at the stage 
level. 
 # Spark will use the stage level requirements to acquire the necessary 
resources/executors and schedule tasks based on the per stage requirements.

Many times users have different resource requirements for different stages of 
their application so they want to be able to configure resources at the stage 
level. For instance, you have a single job that has 2 stages. The first stage 
does some  ETL which requires a lot of tasks, each with a small amount of 
memory and 1 core each. Then you have a second stage where you feed that ETL 
data into an ML algorithm. The second stage only requires a few executors but 
each executor needs a lot of memory, GPUs, and many cores.  This feature allows 
the user to specify the task and executor resource requirements for the ETL 
Stage and then change them for the ML stage of the job.

Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
extra Resources (GPU/FPGA/etc). It has the potential to allow for other things 
like limiting the number of tasks per stage, specifying other parameters for 
things like shuffle, etc. Changing the executor resources will rely on dynamic 
allocation being enabled.

Main use cases:
 # ML use case where user does ETL and feeds it into an ML algorithm where it’s 
using the RDD API. This should work with barrier scheduling as well once it 
supports dynamic allocation.
 # Spark internal use by catalyst. Catalyst could control the stage level 
resources as it finds the need to change it between stages for different 
optimizations. For instance, with the new columnar plugin to the query planner 
we can insert stages into the plan that would change running something on the 
CPU in row format to running it on the GPU in columnar format. This API would 
allow the planner to make sure the stages that run on the GPU get the 
corresponding GPU resources it needs to run. Another possible use case for 
catalyst is that it would allow catalyst to add in more optimizations to where 
the user doesn’t need to configure container sizes at all. If the 
optimizer/planner can handle that for the user, everyone wins.

This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I think 
the DataSet API will require more changes because it specifically hides the RDD 
from the users via the plans and catalyst can optimize the plan and insert 
things into the plan. The only way I’ve found to make this work with the 
Dataset API would be modifying all the plans to be able to get the resource 
requirements down into where it creates the RDDs, which I believe would be a 
lot of change.  If other people know better options, it would be great to hear 
them.

*Q2.* What problem is this proposal NOT designed to solve?

The initial implementation is not going to add Dataset APIs.

We are starting with allowing users to specify a specific set of task/executor 
resources and plan to design it to be extendable, but the first implementation 
will not support changing generic SparkConf configs and only specific limited 
resources.

This initial version will have a programmatic API for specifying the resource 
requirements per stage; we can perhaps add the ability to have profiles in the 
configs later if it's useful.

*Q3.* How is it done today, and what are the limits of current practice?

Currently this is either done by having multiple spark jobs or requesting 
containers with the max resources needed for any part of the job.  To do this 
today, you can break it into separate jobs where each job requests the 
corresponding resources needed, but then you have to write the data out 
somewhere and then read it back in between jobs.  This is going to take longer 
as well as require that job coordination between those to make sure everything 
works smoothly. Another option would be to request executors with your largest 
need up front and potentially waste those resources when they aren't being 
used, which in turn wastes money. For instance, for an ML application where it 
does ETL first, many times people request containers with GPUs and the GPUs sit 
idle while the ETL is happening. This is wasting those GPU resources and in 
turn money because those GPUs could have been used by other applications until 
they were really needed.  

Note for the catalyst internal use, that can’t be done today.

*Q4.* What is new in your approach and why do you think it will be successful?

This is a new way for users to specify the per stage resource requirements.  
This will give users and Spark a lot more flexibility within a job and get 
better utilization

[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-08-06 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Description: 
*Q1.* What are you trying to do? Articulate your objectives using absolutely no 
jargon.

Objectives:
 # Allow users to specify task and executor resource requirements at the stage 
level. 
 # Spark will use the stage level requirements to acquire the necessary 
resources/executors and schedule tasks based on the per stage requirements.

Many times users have different resource requirements for different stages of 
their application so they want to be able to configure resources at the stage 
level. For instance, you have a single job that has 2 stages. The first stage 
does some  ETL which requires a lot of tasks, each with a small amount of 
memory and 1 core each. Then you have a second stage where you feed that ETL 
data into an ML algorithm. The second stage only requires a few executors but 
each executor needs a lot of memory, GPUs, and many cores.  This feature allows 
the user to specify the task and executor resource requirements for the ETL 
Stage and then change them for the ML stage of the job. 


Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
extra Resources (GPU/FPGA/etc). It has the potential to allow for other things 
like limiting the number of tasks per stage, specifying other parameters for 
things like shuffle, etc. Changing the executor resources will rely on dynamic 
allocation being enabled.


Main use cases:
 # ML use case where user does ETL and feeds it into an ML algorithm where it’s 
using the RDD API. This should work with barrier scheduling as well once it 
supports dynamic allocation.
 # Spark internal use by catalyst. Catalyst could control the stage level 
resources as it finds the need to change it between stages for different 
optimizations. For instance, with the new columnar plugin to the query planner 
we can insert stages into the plan that would change running something on the 
CPU in row format to running it on the GPU in columnar format. This API would 
allow the planner to make sure the stages that run on the GPU get the 
corresponding GPU resources it needs to run. Another possible use case for 
catalyst is that it would allow catalyst to add in more optimizations to where 
the user doesn’t need to configure container sizes at all. If the 
optimizer/planner can handle that for the user, everyone wins. 


This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I think 
the DataSet API will require more changes because it specifically hides the RDD 
from the users via the plans and catalyst can optimize the plan and insert 
things into the plan. The only way I’ve found to make this work with the 
Dataset API would be modifying all the plans to be able to get the resource 
requirements down into where it creates the RDDs, which I believe would be a 
lot of change.  If other people know better options, it would be great to hear 
them.

*Q2.* What problem is this proposal NOT designed to solve?

The initial implementation is not going to add Dataset APIs.

We are starting with allowing users to specify a specific set of task/executor 
resources and plan to design it to be extendable, but the first implementation 
will not support changing generic SparkConf configs and only specific limited 
resources.

This initial version will have a programmatic API for specifying the resource 
requirements per stage; we can perhaps add the ability to have profiles in the 
configs later if it's useful.


*Q3.* How is it done today, and what are the limits of current practice?

Currently this is either done by having multiple spark jobs or requesting 
containers with the max resources needed for any part of the job.  To do this 
today, you can break it into separate jobs where each job requests the 
corresponding resources needed, but then you have to write the data out 
somewhere and then read it back in between jobs.  This is going to take longer 
as well as require that job coordination between those to make sure everything 
works smoothly. Another option would be to request executors with your largest 
need up front and potentially waste those resources when they aren't being 
used, which in turn wastes money. For instance, for an ML application where it 
does ETL first, many times people request containers with GPUs and the GPUs sit 
idle while the ETL is happening. This is wasting those GPU resources and in 
turn money because those GPUs could have been used by other applications until 
they were really needed.  

Note for the catalyst internal use, that can’t be done today.

*Q4.* What is new in your approach and why do you think it will be successful?

This is a new way for users to specify the per stage resource requirements.  
This will give users and Spark a lot more flexibility within a job and get 
better utilization

[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-08-06 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Issue Type: Epic  (was: Story)

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Currently Spark supports CPU level scheduling and we are adding in 
> accelerator aware scheduling with 
> https://issues.apache.org/jira/browse/SPARK-24615, but both of those are 
> scheduling via application level configurations.  Meaning there is one 
> configuration that is set for the entire lifetime of the application and the 
> user can't change it between Spark jobs/stages within that application.  
> Many times users have different requirements for different stages of their 
> application so they want to be able to configure at the stage level what 
> resources are required for that stage.
> For example, I might start a spark application which first does some ETL work 
> that needs lots of cores to run many tasks in parallel, then once that is 
> done I want to run some ML job and at that point I want GPU's, less CPU's, 
> and more memory.
> With this Jira we want to add the ability for users to specify the resources 
> for different stages.
> Note that https://issues.apache.org/jira/browse/SPARK-24615 had some 
> discussions on this but this part of it was removed from that.
> We should come up with a proposal on how to do this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-07-29 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Summary: SPIP: Support Stage level resource configuration and scheduling  
(was: Support Stage level resource configuration and scheduling)

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Currently Spark supports CPU level scheduling and we are adding in 
> accelerator aware scheduling with 
> https://issues.apache.org/jira/browse/SPARK-24615, but both of those are 
> scheduling via application level configurations.  Meaning there is one 
> configuration that is set for the entire lifetime of the application and the 
> user can't change it between Spark jobs/stages within that application.  
> Many times users have different requirements for different stages of their 
> application so they want to be able to configure at the stage level what 
> resources are required for that stage.
> For example, I might start a spark application which first does some ETL work 
> that needs lots of cores to run many tasks in parallel, then once that is 
> done I want to run some ML job and at that point I want GPU's, less CPU's, 
> and more memory.
> With this Jira we want to add the ability for users to specify the resources 
> for different stages.
> Note that https://issues.apache.org/jira/browse/SPARK-24615 had some 
> discussions on this but this part of it was removed from that.
> We should come up with a proposal on how to do this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28414) Standalone worker/master UI updates for Resource scheduling

2019-07-16 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-28414:
-

 Summary: Standalone worker/master UI updates for Resource 
scheduling
 Key: SPARK-28414
 URL: https://issues.apache.org/jira/browse/SPARK-28414
 Project: Spark
  Issue Type: Sub-task
  Components: Deploy
Affects Versions: 3.0.0
Reporter: Thomas Graves


https://issues.apache.org/jira/browse/SPARK-27360 is adding Resource scheduling 
to standalone mode. Update the UI with the resource information for 
master/worker as appropriate.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28403) Executor Allocation Manager can add an extra executor when speculative tasks

2019-07-15 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-28403:
-

 Summary: Executor Allocation Manager can add an extra executor 
when speculative tasks
 Key: SPARK-28403
 URL: https://issues.apache.org/jira/browse/SPARK-28403
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.3.0
Reporter: Thomas Graves


It looks like SPARK-19326 added a bug in the executor allocation manager where 
it adds an extra executor when it shouldn't: when we have pending speculative 
tasks but the target number didn't change. 
[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L377]

It doesn't look like this is necessary since it already added in the 
pendingSpeculative tasks.

See the questioning of this on the PR at:

https://github.com/apache/spark/pull/18492/files#diff-b096353602813e47074ace09a3890d56R379
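
A much-simplified sketch of the computation in question (illustrative only, not 
the actual ExecutorAllocationManager code): the speculative tasks are already 
counted in the total, so bumping the target by an additional executor on top of 
that looks like a double count.
{code:java}
# Illustrative only; not the real ExecutorAllocationManager code.
# Pending speculative tasks are already folded into the total here, so a
# separate "+1 executor when there are speculative tasks" would over-allocate.
def max_executors_needed(pending_tasks, pending_speculative_tasks,
                         running_tasks, tasks_per_executor):
    total = pending_tasks + pending_speculative_tasks + running_tasks
    # Round up to whole executors.
    return (total + tasks_per_executor - 1) // tasks_per_executor

print(max_executors_needed(pending_tasks=10, pending_speculative_tasks=2,
                           running_tasks=4, tasks_per_executor=4))  # -> 4
{code}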



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28213) Remove duplication between columnar and ColumnarBatchScan

2019-07-11 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-28213:
-

Assignee: Robert Joseph Evans

> Remove duplication between columnar and ColumnarBatchScan
> -
>
> Key: SPARK-28213
> URL: https://issues.apache.org/jira/browse/SPARK-28213
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
>Priority: Major
>
> There is a lot of duplicate code between Columnar.scala and 
> ColumnarBatchScan.  This should fix that.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28213) Remove duplication between columnar and ColumnarBatchScan

2019-07-11 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-28213.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> Remove duplication between columnar and ColumnarBatchScan
> -
>
> Key: SPARK-28213
> URL: https://issues.apache.org/jira/browse/SPARK-28213
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
>Priority: Major
> Fix For: 3.0.0
>
>
> There is a lot of duplicate code between Columnar.scala and 
> ColumnarBatchScan.  This should fix that.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24615) SPIP: Accelerator-aware task scheduling for Spark

2019-07-11 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882954#comment-16882954
 ] 

Thomas Graves commented on SPARK-24615:
---

Hey [~jomach], thanks for the offer. You can find the SPIP attached and the 
designs in the epic's JIRAs. So yes, we are working on the issues in the epic, and 
most of them are complete; see the links above. I think most of them are already 
assigned; currently we are waiting on the standalone mode implementation PR to 
be reviewed. I think the only one not assigned would be the Mesos one, but we 
were waiting on someone with Mesos experience who wanted it, so if that is 
your background you could take a look at that.

> SPIP: Accelerator-aware task scheduling for Spark
> -
>
> Key: SPARK-24615
> URL: https://issues.apache.org/jira/browse/SPARK-24615
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Saisai Shao
>Assignee: Thomas Graves
>Priority: Major
>  Labels: Hydrogen, SPIP
> Attachments: Accelerator-aware scheduling in Apache Spark 3.0.pdf, 
> SPIP_ Accelerator-aware scheduling.pdf
>
>
> (The JIRA received a major update on 2019/02/28. Some comments were based on 
> an earlier version. Please ignore them. New comments start at 
> [#comment-16778026].)
> h2. Background and Motivation
> GPUs and other accelerators have been widely used for accelerating special 
> workloads, e.g., deep learning and signal processing. While users from the AI 
> community use GPUs heavily, they often need Apache Spark to load and process 
> large datasets and to handle complex data scenarios like streaming. YARN and 
> Kubernetes already support GPUs in their recent releases. Although Spark 
> supports those two cluster managers, Spark itself is not aware of GPUs 
> exposed by them and hence Spark cannot properly request GPUs and schedule 
> them for users. This leaves a critical gap to unify big data and AI workloads 
> and make life simpler for end users.
> To make Spark be aware of GPUs, we shall make two major changes at high level:
> * At cluster manager level, we update or upgrade cluster managers to include 
> GPU support. Then we expose user interfaces for Spark to request GPUs from 
> them.
> * Within Spark, we update its scheduler to understand available GPUs 
> allocated to executors, user task requests, and assign GPUs to tasks properly.
> Based on the work done in YARN and Kubernetes to support GPUs and some 
> offline prototypes, we could have necessary features implemented in the next 
> major release of Spark. You can find a detailed scoping doc here, where we 
> listed user stories and their priorities.
> h2. Goals
> * Make Spark 3.0 GPU-aware in standalone, YARN, and Kubernetes.
> * No regression on scheduler performance for normal jobs.
> h2. Non-goals
> * Fine-grained scheduling within one GPU card.
> ** We treat one GPU card and its memory together as a non-divisible unit.
> * Support TPU.
> * Support Mesos.
> * Support Windows.
> h2. Target Personas
> * Admins who need to configure clusters to run Spark with GPU nodes.
> * Data scientists who need to build DL applications on Spark.
> * Developers who need to integrate DL features on Spark.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28234) Spark Resources - add python support to get resources

2019-07-10 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882092#comment-16882092
 ] 

Thomas Graves commented on SPARK-28234:
---

Testing driver side:
{code:java}
>>> sc.resources['gpu'].addresses
['0', '1']
>>> sc.resources['gpu'].name
'gpu'{code}
 

basic code for testing executor side:

 
{code:java}
from pyspark import TaskContext

def task_info(*_):
    # Return the GPU addresses visible to this task's executor.
    ctx = TaskContext()
    return ["addrs: {0}".format(ctx.resources()['gpu'].addresses)]

# Run one task per partition and print what each task saw.
for x in sc.parallelize([], 8).mapPartitions(task_info).collect():
    print(x)

{code}
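
For these snippets to return GPU addresses, the application also has to be started 
with GPU resource confs along these lines (paths and amounts are just examples):
{code}
spark.driver.resource.gpu.amount=2
spark.driver.resource.gpu.discoveryScript=/path/to/getGpusResources.sh
spark.executor.resource.gpu.amount=2
spark.executor.resource.gpu.discoveryScript=/path/to/getGpusResources.sh
spark.task.resource.gpu.amount=1
{code}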

> Spark Resources - add python support to get resources
> -
>
> Key: SPARK-28234
> URL: https://issues.apache.org/jira/browse/SPARK-28234
> Project: Spark
>  Issue Type: Story
>  Components: PySpark, Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> Add the equivalent python api for sc.resources and TaskContext.resources



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-28234) Spark Resources - add python support to get resources

2019-07-09 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-28234:
-

Assignee: Thomas Graves

> Spark Resources - add python support to get resources
> -
>
> Key: SPARK-28234
> URL: https://issues.apache.org/jira/browse/SPARK-28234
> Project: Spark
>  Issue Type: Story
>  Components: PySpark, Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> Add the equivalent python api for sc.resources and TaskContext.resources



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28234) Spark Resources - add python support to get resources

2019-07-02 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-28234:
-

 Summary: Spark Resources - add python support to get resources
 Key: SPARK-28234
 URL: https://issues.apache.org/jira/browse/SPARK-28234
 Project: Spark
  Issue Type: Story
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


Add the equivalent python api for sc.resources and TaskContext.resources



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27360) Standalone cluster mode support for GPU-aware scheduling

2019-07-02 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877086#comment-16877086
 ] 

Thomas Graves commented on SPARK-27360:
---

Are you going to handle updating the master/worker UI to include resources?

> Standalone cluster mode support for GPU-aware scheduling
> 
>
> Key: SPARK-27360
> URL: https://issues.apache.org/jira/browse/SPARK-27360
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Major
>
> Design and implement standalone manager support for GPU-aware scheduling:
> 1. static conf to describe resources
> 2. spark-submit to request resources
> 3. auto discovery of GPUs
> 4. executor process isolation



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27945) Make minimal changes to support columnar processing

2019-06-28 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27945:
-

Assignee: Robert Joseph Evans

> Make minimal changes to support columnar processing
> ---
>
> Key: SPARK-27945
> URL: https://issues.apache.org/jira/browse/SPARK-27945
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
>Priority: Major
>
> As the first step for SPARK-27396 this is to put in the minimum changes 
> needed to allow a plugin to support columnar processing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27945) Make minimal changes to support columnar processing

2019-06-28 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27945.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> Make minimal changes to support columnar processing
> ---
>
> Key: SPARK-27945
> URL: https://issues.apache.org/jira/browse/SPARK-27945
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
>Priority: Major
> Fix For: 3.0.0
>
>
> As the first step for SPARK-27396 this is to put in the minimum changes 
> needed to allow a plugin to support columnar processing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28116) Fix Flaky `SparkContextSuite.test resource scheduling under local-cluster mode`

2019-06-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869803#comment-16869803
 ] 

Thomas Graves commented on SPARK-28116:
---

Sorry for my delay on this, I'll take a look. I haven't seen that one before.

> Fix Flaky `SparkContextSuite.test resource scheduling under local-cluster 
> mode`
> ---
>
> Key: SPARK-28116
> URL: https://issues.apache.org/jira/browse/SPARK-28116
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>
> This test suite has two kind of failures.
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6486/testReport/org.apache.spark/SparkContextSuite/test_resource_scheduling_under_local_cluster_mode/history/
> {code}
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to 
> eventually never returned normally. Attempted 3921 times over 
> 1.88681567 minutes. Last failure message: 
> Array(org.apache.spark.SparkExecutorInfoImpl@533793be, 
> org.apache.spark.SparkExecutorInfoImpl@5018e57d, 
> org.apache.spark.SparkExecutorInfoImpl@dea2485, 
> org.apache.spark.SparkExecutorInfoImpl@9e63ecd) had size 4 instead of 
> expected size 3.
>   at 
> org.scalatest.concurrent.Eventually.tryTryAgain$1(Eventually.scala:432)
>   at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:439)
>   at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:391)
>   at 
> org.apache.spark.SparkContextSuite.eventually(SparkContextSuite.scala:50)
>   at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:337)
>   at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:336)
>   at 
> org.apache.spark.SparkContextSuite.eventually(SparkContextSuite.scala:50)
>   at 
> org.apache.spark.SparkContextSuite.$anonfun$new$93(SparkContextSuite.scala:885){code}
> {code}
> org.scalatest.exceptions.TestFailedException: Array("0", "0", "1", "1", "1", 
> "2", "2", "2", "2") did not equal List("0", "0", "0", "1", "1", "1", "2", 
> "2", "2")
>   at 
> org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528)
>   at 
> org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:527)
>   at 
> org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
>   at 
> org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:501)
>   at 
> org.apache.spark.SparkContextSuite.$anonfun$new$93(SparkContextSuite.scala:894)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27823) Add an abstraction layer for accelerator resource handling to avoid manipulating raw confs

2019-06-07 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27823:
-

Assignee: Thomas Graves

> Add an abstraction layer for accelerator resource handling to avoid 
> manipulating raw confs
> --
>
> Key: SPARK-27823
> URL: https://issues.apache.org/jira/browse/SPARK-27823
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>
> In SPARK-27488, we extract resource requests and allocation by parsing raw 
> Spark confs. This hurts readability because we didn't have the abstraction at 
> resource level. After we merge the core changes, we should do a refactoring 
> and make the code more readable.
> See https://github.com/apache/spark/pull/24615#issuecomment-494580663.
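
Purely as an illustration of the kind of abstraction being asked for (the names here are hypothetical, not the eventual Spark API):

{code:scala}
// Hypothetical sketch: centralize conf-key construction for one resource so
// call sites stop string-building raw keys such as "spark.executor.resource.gpu.amount".
case class ResourceID(componentName: String, resourceName: String) {
  private def confPrefix: String = s"$componentName.resource.$resourceName"
  def amountConf: String = s"$confPrefix.amount"
  def discoveryScriptConf: String = s"$confPrefix.discoveryScript"
}

// ResourceID("spark.executor", "gpu").amountConf == "spark.executor.resource.gpu.amount"
{code}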



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27760) Spark resources - user configs change .count to be .amount

2019-06-06 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27760.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> Spark resources - user configs change .count to be .amount
> --
>
> Key: SPARK-27760
> URL: https://issues.apache.org/jira/browse/SPARK-27760
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> For the Spark resources, we created the config
> spark.\{driver/executor}.resource.\{resourceName}.count
> I think we should change .count to be .amount. That more easily allows users 
> to specify things with suffix like memory in a single config and they can 
> combine the value and unit. Without this they would have to specify 2 
> separate configs (like .count and .unit) which seems more of a hassle for the 
> user.
> Note the yarn configs for resources use amount:  
> spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> is value and unit together. I think that makes a lot of sense. Filed a 
> separate Jira to add .amount to the yarn configs as well.
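
As a concrete sketch of the proposed naming, with "gpu" as an example resource name (configs shown as they would look after the rename):

{code:scala}
import org.apache.spark.SparkConf

// .amount instead of .count, so a value and an optional unit can live in one config.
val conf = new SparkConf()
  .set("spark.executor.resource.gpu.amount", "2")  // previously ...resource.gpu.count
  .set("spark.driver.resource.gpu.amount", "1")
{code}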



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-24615) SPIP: Accelerator-aware task scheduling for Spark

2019-06-05 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-24615:
-

Assignee: Thomas Graves  (was: Xingbo Jiang)

> SPIP: Accelerator-aware task scheduling for Spark
> -
>
> Key: SPARK-24615
> URL: https://issues.apache.org/jira/browse/SPARK-24615
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Saisai Shao
>Assignee: Thomas Graves
>Priority: Major
>  Labels: Hydrogen, SPIP
> Attachments: Accelerator-aware scheduling in Apache Spark 3.0.pdf, 
> SPIP_ Accelerator-aware scheduling.pdf
>
>
> (The JIRA received a major update on 2019/02/28. Some comments were based on 
> an earlier version. Please ignore them. New comments start at 
> [#comment-16778026].)
> h2. Background and Motivation
> GPUs and other accelerators have been widely used for accelerating special 
> workloads, e.g., deep learning and signal processing. While users from the AI 
> community use GPUs heavily, they often need Apache Spark to load and process 
> large datasets and to handle complex data scenarios like streaming. YARN and 
> Kubernetes already support GPUs in their recent releases. Although Spark 
> supports those two cluster managers, Spark itself is not aware of GPUs 
> exposed by them and hence Spark cannot properly request GPUs and schedule 
> them for users. This leaves a critical gap to unify big data and AI workloads 
> and make life simpler for end users.
> To make Spark be aware of GPUs, we shall make two major changes at high level:
> * At cluster manager level, we update or upgrade cluster managers to include 
> GPU support. Then we expose user interfaces for Spark to request GPUs from 
> them.
> * Within Spark, we update its scheduler to understand available GPUs 
> allocated to executors, user task requests, and assign GPUs to tasks properly.
> Based on the work done in YARN and Kubernetes to support GPUs and some 
> offline prototypes, we could have necessary features implemented in the next 
> major release of Spark. You can find a detailed scoping doc here, where we 
> listed user stories and their priorities.
> h2. Goals
> * Make Spark 3.0 GPU-aware in standalone, YARN, and Kubernetes.
> * No regression on scheduler performance for normal jobs.
> h2. Non-goals
> * Fine-grained scheduling within one GPU card.
> ** We treat one GPU card and its memory together as a non-divisible unit.
> * Support TPU.
> * Support Mesos.
> * Support Windows.
> h2. Target Personas
> * Admins who need to configure clusters to run Spark with GPU nodes.
> * Data scientists who need to build DL applications on Spark.
> * Developers who need to integrate DL features on Spark.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27760) Spark resources - user configs change .count to be .amount

2019-06-05 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27760:
--
Description: 
For the Spark resources, we created the config

spark.\{driver/executor}.resource.\{resourceName}.count

I think we should change .count to be .amount. That more easily allows users to 
specify things with suffix like memory in a single config and they can combine 
the value and unit. Without this they would have to specify 2 separate configs 
(like .count and .unit) which seems more of a hassle for the user.

Note the yarn configs for resources use amount:  
spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> 
is value and unit together. I think that makes a lot of sense. Filed a separate 
Jira to add .amount to the yarn configs as well.

  was:
For the Spark resources, we created the config

spark.\{driver/executor}.resource.\{resourceName}.count

I think we should change .count to be .amount. That more easily allows users to 
specify things with suffix like memory in a single config and they can combine 
the value and unit. Without this they would have to specify 2 separate configs 
which seems more of a hassle for the user.

Note the yarn configs for resources use amount:  
spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> 
is value and unit together. I think that makes a lot of sense. Filed a separate 
Jira to add .amount to the yarn configs as well.


> Spark resources - user configs change .count to be .amount
> --
>
> Key: SPARK-27760
> URL: https://issues.apache.org/jira/browse/SPARK-27760
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> For the Spark resources, we created the config
> spark.\{driver/executor}.resource.\{resourceName}.count
> I think we should change .count to be .amount. That more easily allows users 
> to specify things with suffix like memory in a single config and they can 
> combine the value and unit. Without this they would have to specify 2 
> separate configs (like .count and .unit) which seems more of a hassle for the 
> user.
> Note the yarn configs for resources use amount:  
> spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> is value and unit together. I think that makes a lot of sense. Filed a 
> separate Jira to add .amount to the yarn configs as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27760) Spark resources - user configs change .count to be .amount

2019-06-05 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27760:
--
Description: 
For the Spark resources, we created the config

spark.\{driver/executor}.resource.\{resourceName}.count

I think we should change .count to be .amount. That more easily allows users to 
specify things with suffix like memory in a single config and they can combine 
the value and unit. Without this they would have to specify 2 separate configs 
which seems more of a hassle for the user.

Note the yarn configs for resources use amount:  
spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> 
is value and unit together. I think that makes a lot of sense. Filed a separate 
Jira to add .amount to the yarn configs as well.

  was:
For the Spark resources, we created the config

spark.\{driver/executor}.resource.\{resourceName}.count

I think we should change .count to be .amount. That more easily allows users to 
specify things with suffix like memory in a single config and they can combine 
the value and unit. Without this they would have to specify 2 separate configs 
which seems more of a hassle for the user.


> Spark resources - user configs change .count to be .amount
> --
>
> Key: SPARK-27760
> URL: https://issues.apache.org/jira/browse/SPARK-27760
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> For the Spark resources, we created the config
> spark.\{driver/executor}.resource.\{resourceName}.count
> I think we should change .count to be .amount. That more easily allows users 
> to specify things with suffix like memory in a single config and they can 
> combine the value and unit. Without this they would have to specify 2 
> separate configs which seems more of a hassle for the user.
> Note the yarn configs for resources use amount:  
> spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> is value and unit together. I think that makes a lot of sense. Filed a 
> separate Jira to add .amount to the yarn configs as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27760) Spark resources - user configs change .count to be .amount

2019-06-05 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27760:
-

Assignee: Thomas Graves

> Spark resources - user configs change .count to be .amount
> --
>
> Key: SPARK-27760
> URL: https://issues.apache.org/jira/browse/SPARK-27760
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> For the Spark resources, we created the config
> spark.\{driver/executor}.resource.\{resourceName}.count
> I think we should change .count to be .amount. That more easily allows users 
> to specify things with suffix like memory in a single config and they can 
> combine the value and unit. Without this they would have to specify 2 
> separate configs which seems more of a hassle for the user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27760) Spark resources - user configs change .count to be .amount

2019-06-05 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27760:
--
Summary: Spark resources - user configs change .count to be .amount  (was: 
Spark resources - user configs change .count to be .amount, and yarn configs 
should match)

> Spark resources - user configs change .count to be .amount
> --
>
> Key: SPARK-27760
> URL: https://issues.apache.org/jira/browse/SPARK-27760
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> For the Spark resources, we created the config
> spark.\{driver/executor}.resource.\{resourceName}.count
> I think we should change .count to be .amount. That more easily allows users 
> to specify things with suffix like memory in a single config and they can 
> combine the value and unit. Without this they would have to specify 2 
> separate configs which seems more of a hassle for the user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27959) Change YARN resource configs to use .amount

2019-06-05 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-27959:
-

 Summary: Change YARN resource configs to use .amount
 Key: SPARK-27959
 URL: https://issues.apache.org/jira/browse/SPARK-27959
 Project: Spark
  Issue Type: Story
  Components: YARN
Affects Versions: 3.0.0
Reporter: Thomas Graves


we are adding in generic resource support into spark where we have suffix for 
the amount of the resource so that we could support other configs. 

Spark on YARN already added configs to request resources via 
spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> 
is value and unit together.  We should change those configs to have a .amount 
suffix on them to match the Spark configs and to allow future configs to be 
more easily added. YARN itself already supports tags and attributes, so if we 
want the user to be able to pass those from Spark at some point, having a suffix 
makes sense.
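
A sketch of what the YARN-side configs would look like with the .amount suffix ("gpu" is an illustrative resource name):

{code:scala}
import org.apache.spark.SparkConf

// YARN resource requests with the proposed .amount suffix, mirroring the
// spark.{driver/executor}.resource.{resourceName}.amount naming on the Spark side.
val conf = new SparkConf()
  .set("spark.yarn.executor.resource.gpu.amount", "2")
  .set("spark.yarn.driver.resource.gpu.amount", "1")
{code}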



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27760) Spark resources - user configs change .count to be .amount, and yarn configs should match

2019-06-05 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27760:
--
Description: 
For the Spark resources, we created the config

spark.\{driver/executor}.resource.\{resourceName}.count

I think we should change .count to be .amount. That more easily allows users to 
specify things with suffix like memory in a single config and they can combine 
the value and unit. Without this they would have to specify 2 separate configs 
which seems more of a hassle for the user.

  was:
For the Spark resources, we created the config

spark.\{driver/executor}.resource.\{resourceName}.count

I think we should change .count to be .amount. That more easily allows users to 
specify things with suffix like memory in a single config and they can combine 
the value and unit. Without this they would have to specify 2 separate configs 
which seems more of a hassle for the user.

Spark on YARN already added configs to request resources via 
spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> 
is value and unit together.  We should change those configs to have a .amount 
suffix on them to match the Spark configs and to allow future configs to be 
more easily added. YARN itself already supports tags and attributes, so if we 
want the user to be able to pass those from Spark at some point, having a suffix 
makes sense.


> Spark resources - user configs change .count to be .amount, and yarn configs 
> should match
> -
>
> Key: SPARK-27760
> URL: https://issues.apache.org/jira/browse/SPARK-27760
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> For the Spark resources, we created the config
> spark.\{driver/executor}.resource.\{resourceName}.count
> I think we should change .count to be .amount. That more easily allows users 
> to specify things with suffix like memory in a single config and they can 
> combine the value and unit. Without this they would have to specify 2 
> separate configs which seems more of a hassle for the user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27364) User-facing APIs for GPU-aware scheduling

2019-06-05 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27364.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> User-facing APIs for GPU-aware scheduling
> -
>
> Key: SPARK-27364
> URL: https://issues.apache.org/jira/browse/SPARK-27364
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> Design and implement:
> * General guidelines for cluster managers to understand resource requests at 
> application start. The concrete conf/param will be under the design of each 
> cluster manager.
> * APIs to fetch assigned resources from task context.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27364) User-facing APIs for GPU-aware scheduling

2019-06-05 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856895#comment-16856895
 ] 

Thomas Graves commented on SPARK-27364:
---

User-facing changes are all committed, so going to close this.

A few changes from the above: getResources ended up being called just 
resources(). The driver config for standalone mode takes a JSON file rather 
than individual address configs (spark.driver.resourcesFile).
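
A minimal sketch of the resulting user-facing call, assuming a cluster already configured with a "gpu" resource:

{code:scala}
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

// Each task reads the resource addresses assigned to it from its TaskContext.
val sc = new SparkContext(new SparkConf().setAppName("resources-demo"))
val gpuAddresses = sc.parallelize(1 to 4, 4).map { _ =>
  TaskContext.get().resources()("gpu").addresses.mkString(",")
}.collect()
{code}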

> User-facing APIs for GPU-aware scheduling
> -
>
> Key: SPARK-27364
> URL: https://issues.apache.org/jira/browse/SPARK-27364
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>
> Design and implement:
> * General guidelines for cluster managers to understand resource requests at 
> application start. The concrete conf/param will be under the design of each 
> cluster manager.
> * APIs to fetch assigned resources from task context.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27374) Fetch assigned resources from TaskContext

2019-05-31 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27374.
---
Resolution: Duplicate

> Fetch assigned resources from TaskContext
> -
>
> Key: SPARK-27374
> URL: https://issues.apache.org/jira/browse/SPARK-27374
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27362) Kubernetes support for GPU-aware scheduling

2019-05-31 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27362.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> Kubernetes support for GPU-aware scheduling
> ---
>
> Key: SPARK-27362
> URL: https://issues.apache.org/jira/browse/SPARK-27362
> Project: Spark
>  Issue Type: Story
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> Design and implement k8s support for GPU-aware scheduling.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27373) Design: Kubernetes support for GPU-aware scheduling

2019-05-31 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27373.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> Design: Kubernetes support for GPU-aware scheduling
> ---
>
> Key: SPARK-27373
> URL: https://issues.apache.org/jira/browse/SPARK-27373
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27897) GPU Scheduling - move example discovery Script to scripts directory

2019-05-31 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27897.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> GPU Scheduling - move example discovery Script to scripts directory
> ---
>
> Key: SPARK-27897
> URL: https://issues.apache.org/jira/browse/SPARK-27897
> Project: Spark
>  Issue Type: Story
>  Components: Examples
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Minor
> Fix For: 3.0.0
>
>
> SPARK-27725 (GPU Scheduling - add an example discovery Script) added a script 
> at 
> [https://github.com/apache/spark/blob/master/examples/src/main/resources/getGpusResources.sh].
> Instead of having it in the resources directory, let's move it to the scripts 
> directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27897) GPU Scheduling - move example discovery Script to scripts directory

2019-05-31 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-27897:
-

 Summary: GPU Scheduling - move example discovery Script to scripts 
directory
 Key: SPARK-27897
 URL: https://issues.apache.org/jira/browse/SPARK-27897
 Project: Spark
  Issue Type: Story
  Components: Examples
Affects Versions: 3.0.0
Reporter: Thomas Graves
Assignee: Thomas Graves


SPARK-27725 (GPU Scheduling - add an example discovery Script) added a script at 
[https://github.com/apache/spark/blob/master/examples/src/main/resources/getGpusResources.sh].

Instead of having it in the resources directory, let's move it to the scripts 
directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27361) YARN support for GPU-aware scheduling

2019-05-30 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27361.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

All the subtasks are finished and the parts of Spark on Hadoop 3.x that we 
need are also merged, so resolving this.

> YARN support for GPU-aware scheduling
> -
>
> Key: SPARK-27361
> URL: https://issues.apache.org/jira/browse/SPARK-27361
> Project: Spark
>  Issue Type: Story
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> Design and implement YARN support for GPU-aware scheduling:
>  * User can request GPU resources at Spark application level.
>  * How the Spark executor discovers GPUs when run on YARN
>  * Integrate with YARN 3.2 GPU support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27835) Resource Scheduling: change driver config from addresses to resourcesFile

2019-05-30 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27835.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

> Resource Scheduling: change driver config from addresses to resourcesFile
> -
>
> Key: SPARK-27835
> URL: https://issues.apache.org/jira/browse/SPARK-27835
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently we added a driver config 
> spark.driver.resources.\{resourceName}.addresses for standalone mode.
> We should change that to be consistent with the executor-side changes and 
> make it spark.driver.resourcesFile. This makes it more flexible in that we 
> wouldn't have to add more configs for different types of resources; 
> standalone mode would just need to output different JSON in that file.  It 
> will also make some of the parsing code common between the two.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27396) SPIP: Public APIs for extended Columnar Processing Support

2019-05-29 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851445#comment-16851445
 ] 

Thomas Graves commented on SPARK-27396:
---

The vote passed.

> SPIP: Public APIs for extended Columnar Processing Support
> --
>
> Key: SPARK-27396
> URL: https://issues.apache.org/jira/browse/SPARK-27396
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Priority: Major
>
> *SPIP: Columnar Processing Without Arrow Formatting Guarantees.*
>  
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> The Dataset/DataFrame API in Spark currently only exposes to users one row at 
> a time when processing data.  The goals of this are to
>  # Add to the current sql extensions mechanism so advanced users can have 
> access to the physical SparkPlan and manipulate it to provide columnar 
> processing for existing operators, including shuffle.  This will allow them 
> to implement their own cost based optimizers to decide when processing should 
> be columnar and when it should not.
>  # Make any transitions between the columnar memory layout and a row based 
> layout transparent to the users so operations that are not columnar see the 
> data as rows, and operations that are columnar see the data as columns.
>  
> Not Requirements, but things that would be nice to have.
>  # Transition the existing in memory columnar layouts to be compatible with 
> Apache Arrow.  This would make the transformations to Apache Arrow format a 
> no-op. The existing formats are already very close to those layouts in many 
> cases.  This would not be using the Apache Arrow java library, but instead 
> being compatible with the memory 
> [layout|https://arrow.apache.org/docs/format/Layout.html] and possibly only a 
> subset of that layout.
>  
> *Q2.* What problem is this proposal NOT designed to solve? 
> The goal of this is not for ML/AI but to provide APIs for accelerated 
> computing in Spark primarily targeting SQL/ETL like workloads.  ML/AI already 
> have several mechanisms to get data into/out of them. These can be improved 
> but will be covered in a separate SPIP.
> This is not trying to implement any of the processing itself in a columnar 
> way, with the exception of examples for documentation.
> This does not cover exposing the underlying format of the data.  The only way 
> to get at the data in a ColumnVector is through the public APIs.  Exposing 
> the underlying format to improve efficiency will be covered in a separate 
> SPIP.
> This is not trying to implement new ways of transferring data to external 
> ML/AI applications.  That is covered by separate SPIPs already.
> This is not trying to add in generic code generation for columnar processing. 
>  Currently code generation for columnar processing is only supported when 
> translating columns to rows.  We will continue to support this, but will not 
> extend it as a general solution. That will be covered in a separate SPIP if 
> we find it is helpful.  For now columnar processing will be interpreted.
> This is not trying to expose a way to get columnar data into Spark through 
> DataSource V2 or any other similar API.  That would be covered by a separate 
> SPIP if we find it is needed.
>  
> *Q3.* How is it done today, and what are the limits of current practice?
> The current columnar support is limited to 3 areas.
>  # Internal implementations of FileFormats, optionally can return a 
> ColumnarBatch instead of rows.  The code generation phase knows how to take 
> that columnar data and iterate through it as rows for stages that want rows, 
> which currently is almost everything.  The limitations here are mostly 
> implementation specific. The current standard is to abuse Scala’s type 
> erasure to return ColumnarBatches as the elements of an RDD[InternalRow]. The 
> code generation can handle this because it is generating java code, so it 
> bypasses scala’s type checking and just casts the InternalRow to the desired 
> ColumnarBatch.  This makes it difficult for others to implement the same 
> functionality for different processing because they can only do it through 
> code generation. There really is no clean separate path in the code 
> generation for columnar vs row based. Additionally, because it is only 
> supported through code generation, if for any reason code generation would 
> fail there is no backup.  This is typically fine for input formats but can be 
> problematic when we get into more extensive processing.
>  # When caching data it can optionally be cached in a columnar format if the 
> input is also columnar.  This is similar to the first area and has the same 
> limitations because the 

[jira] [Resolved] (SPARK-27376) Design: YARN supports Spark GPU-aware scheduling

2019-05-29 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27376.
---
Resolution: Fixed

Resolving this since there seem to be no objections.

> Design: YARN supports Spark GPU-aware scheduling
> 
>
> Key: SPARK-27376
> URL: https://issues.apache.org/jira/browse/SPARK-27376
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27361) YARN support for GPU-aware scheduling

2019-05-29 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851298#comment-16851298
 ] 

Thomas Graves commented on SPARK-27361:
---

Note the design Jira covers more details.

> YARN support for GPU-aware scheduling
> -
>
> Key: SPARK-27361
> URL: https://issues.apache.org/jira/browse/SPARK-27361
> Project: Spark
>  Issue Type: Story
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>
> Design and implement YARN support for GPU-aware scheduling:
>  * User can request GPU resources at Spark application level.
>  * How the Spark executor discovers GPUs when run on YARN
>  * Integrate with YARN 3.2 GPU support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27361) YARN support for GPU-aware scheduling

2019-05-29 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27361:
--
Description: 
Design and implement YARN support for GPU-aware scheduling:
 * User can request GPU resources at Spark application level.
 * How the Spark executor discovers GPUs when run on YARN
 * Integrate with YARN 3.2 GPU support.

  was:
Design and implement YARN support for GPU-aware scheduling:

* User can request GPU resources at Spark application level.
* YARN can pass GPU info to Spark executor.
* Integrate with YARN 3.2 GPU support.


> YARN support for GPU-aware scheduling
> -
>
> Key: SPARK-27361
> URL: https://issues.apache.org/jira/browse/SPARK-27361
> Project: Spark
>  Issue Type: Story
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>
> Design and implement YARN support for GPU-aware scheduling:
>  * User can request GPU resources at Spark application level.
>  * How the Spark executor discovers GPUs when run on YARN
>  * Integrate with YARN 3.2 GPU support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27362) Kubernetes support for GPU-aware scheduling

2019-05-28 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27362:
-

Assignee: Thomas Graves

> Kubernetes support for GPU-aware scheduling
> ---
>
> Key: SPARK-27362
> URL: https://issues.apache.org/jira/browse/SPARK-27362
> Project: Spark
>  Issue Type: Story
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>
> Design and implement k8s support for GPU-aware scheduling.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27725) GPU Scheduling - add an example discovery Script

2019-05-28 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27725:
-

Assignee: Thomas Graves

> GPU Scheduling - add an example discovery Script
> 
>
> Key: SPARK-27725
> URL: https://issues.apache.org/jira/browse/SPARK-27725
> Project: Spark
>  Issue Type: Story
>  Components: Examples
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> We should add an example script that can be used to discover GPUs and 
> output the correctly formatted JSON.
> Something like below, but it needs to be tested on various systems with more 
> than 2 GPUs:
> ADDRS=`nvidia-smi --query-gpu=index --format=csv,noheader | sed 
> 'N;s/\n/\",\"/'`
> COUNT=`echo $ADDRS | tr -cd , | wc -c`
> ALLCOUNT=`expr $COUNT + 1`
> #echo \{\"name\": \"gpu\", \"addresses\":[\"$ADDRS\"]}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27835) Resource Scheduling: change driver config from addresses to resourcesFile

2019-05-28 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27835:
-

Assignee: Thomas Graves

> Resource Scheduling: change driver config from addresses to resourcesFile
> -
>
> Key: SPARK-27835
> URL: https://issues.apache.org/jira/browse/SPARK-27835
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> Currently we added a driver config 
> spark.driver.resources.\{resourceName}.addresses for standalone mode.
> We should change that to be consistent with the executor-side changes and 
> make it spark.driver.resourcesFile. This makes it more flexible in that we 
> wouldn't have to add more configs for different types of resources; 
> standalone mode would just need to output different JSON in that file.  It 
> will also make some of the parsing code common between the two.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27835) Resource Scheduling: change driver config from addresses to resourcesFile

2019-05-24 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-27835:
-

 Summary: Resource Scheduling: change driver config from addresses 
to resourcesFile
 Key: SPARK-27835
 URL: https://issues.apache.org/jira/browse/SPARK-27835
 Project: Spark
  Issue Type: Story
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


Currently we added a driver config 
spark.driver.resources.\{resourceName}.addresses for standalone mode.

We should change that to be consistent with the executor-side changes and make 
it spark.driver.resourcesFile. This makes it more flexible in that we wouldn't 
have to add more configs for different types of resources; standalone mode 
would just need to output different JSON in that file.  It will also make some 
of the parsing code common between the two.
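
A rough sketch of the proposed setup; the path is illustrative, and the JSON is assumed to mirror the name/addresses format the discovery scripts already emit:

{code:scala}
import org.apache.spark.SparkConf

// Point the driver at a resources file instead of per-resource .addresses configs.
// /tmp/driverResources.json would contain something like:
//   [{"name": "gpu", "addresses": ["0", "1"]}]
val conf = new SparkConf()
  .set("spark.driver.resourcesFile", "/tmp/driverResources.json")
{code}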



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27495) Support Stage level resource configuration and scheduling

2019-05-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16845219#comment-16845219
 ] 

Thomas Graves commented on SPARK-27495:
---

I'm working through the design of this and there are definitely a lot of things 
to think about here.  I would like to get other people's input before going much 
further.

I think a few main points we need to decide on:

1) What resources can a user specify per stage - the more I think about this, the 
more things I can think of people wanting to change.  For instance, normally in 
Spark you don't specify the task requirements, you specify the executor 
requirements and then possibly the cores per task.  So if someone is requesting 
resources per stage, I think we need a way to specify both task and executor 
requirements.  You could specify executor requirements based on task 
requirements, like say I want 4 tasks per executor and then multiply the task 
requirements, but then you have things like overhead memory, and users aren't 
used to specifying at task level, so I think it's better to separate those out.  
Then going beyond that, we know people want to limit the # of total running 
tasks per stage; looking at memory, there is off-heap, heap, and overhead memory. I 
can envision people wanting to change retries or shuffle parameters per stage.  
It's definitely more stuff than just a few resources.

Basically it's coming down to a lot of things in SparkConf.  Now whether that is 
the interface we show to users or not is another question.  You could for 
instance let them pass an entire SparkConf in and set the configs they are used 
to setting and deal with that.  But then you have to error out on or ignore the 
configs we don't support dynamically changing, and you have to deal with 
resolving conflicts on an unknown set of things if they have specified 
different confs for multiple operations that get combined into a single stage 
(i.e. like map.filter.groupby where they specified conflicting resources for map 
and groupby).  Or you could make an interface that only gives them specific 
options and keep adding to that as people request more things.  The latter I 
think is cleaner in some ways but is also less flexible and requires a new API 
vs possibly using the configs users are already used to.

2) API.  I think ideally each of the operators (RDD.*, Dataset.*, where * is 
map, filter, groupby, sort, join, etc.) would have an optional parameter to 
specify the resources you want to use.  I think this would make it clear to the 
user that for at least that operation these will be applied.  It also helps 
with cases where you don't have an RDD yet, like on the initial read of files. This 
however could mean a lot of API changes.

Another way, which was originally proposed in SPARK-24615, is adding something 
like a .withResource API, but then you have to deal with whether you prefix it, 
postfix it, etc.  If you postfix it, what about things like eager execution mode?  
Prefix seems to make more sense. But then you still don't have an option for 
the readers.  I think this also makes the scoping less clear, although you 
still have some of that with adding it to the individual operators.

3) Scoping.  The scoping could be confusing to the users.  Ideally I want to do 
the RDD/Dataset/DataFrame APIs (I realize the Jira was initially more in the 
scope of the barrier scheduling, but if we are going to do it, it seems like we 
should make it generic). With the RDD it is a bit more obvious where the stage 
boundaries might be, but Catalyst can do any sort of optimizations that 
could lead to stage boundaries the user doesn't expect.  In either case you 
also have cases where things are in 2 stages, like groupby where it does the 
partial aggregation, the shuffle, then the full aggregation.  The withResources 
would have to apply to both stages.  Then you have questions like: do you keep 
using that resource profile until they change it, or is it just those stages and 
then it goes back to the default application-level configs?  You could also go 
back to what I mentioned on the Jira where the withResources would be like a 
function scope {}...  withResources() \{ everything that should be done 
within that resource profile }, but that syntax isn't like anything we have 
now that I'm aware of.

4) How to deal with multiple, possibly conflicting resource requirements.  You can 
go with the max in some cases, but for some cases that might not make sense. For 
instance, if you are doing memory you might actually want them to be the sum: 
say you know this operation needs x memory, then Catalyst combines 
that with another operation that needs y memory.  You would want to sum those, 
or you would have to have the user again realize those will get combined and 
have them do it.  The latter isn't ideal either.

Really, the dataframe/dataset api shouldn't need to have any api to specify 

[jira] [Commented] (SPARK-26989) Flaky test:DAGSchedulerSuite.Barrier task failures from the same stage attempt don't trigger multiple stage retries

2019-05-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844845#comment-16844845
 ] 

Thomas Graves commented on SPARK-26989:
---

Seeing the same error intermittently. The weird thing is I sometimes also see 
it fail with the error below, like it's not converting the collection, or maybe 
it's still a timing issue, but I had increased the timeout from 10 seconds to 
30 seconds as well.

org.scalatest.exceptions.TestFailedException: ArrayBuffer(0) did not equal List(0)
  at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528)
  at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:527)
  at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
  at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:501)
  at org.apache.spark.scheduler.DAGSchedulerSuite.$anonfun$new$144(DAGSchedulerSuite.scala:2656)

> Flaky test:DAGSchedulerSuite.Barrier task failures from the same stage 
> attempt don't trigger multiple stage retries
> ---
>
> Key: SPARK-26989
> URL: https://issues.apache.org/jira/browse/SPARK-26989
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Marcelo Vanzin
>Priority: Major
>
> https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102761/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/Barrier_task_failures_from_the_same_stage_attempt_don_t_trigger_multiple_stage_retries/
> {noformat}
> org.apache.spark.scheduler.DAGSchedulerSuite.Barrier task failures from the 
> same stage attempt don't trigger multiple stage retries
> Error Message
> org.scalatest.exceptions.TestFailedException: ArrayBuffer() did not equal 
> List(0)
> Stacktrace
> sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedException: 
> ArrayBuffer() did not equal List(0)
>   at 
> org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528)
>   at 
> org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:527)
>   at 
> org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
>   at 
> org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:501)
>   at 
> org.apache.spark.scheduler.DAGSchedulerSuite.$anonfun$new$144(DAGSchedulerSuite.scala:2644)
>   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:104)
>   at 
> org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
>   at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
>   at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
>   at 
> org.apache.spark.scheduler.DAGSchedulerSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(DAGSchedulerSuite.scala:122)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27374) Fetch assigned resources from TaskContext

2019-05-20 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844067#comment-16844067
 ] 

Thomas Graves commented on SPARK-27374:
---

I believe this is being done under: 
https://issues.apache.org/jira/browse/SPARK-27366

> Fetch assigned resources from TaskContext
> -
>
> Key: SPARK-27374
> URL: https://issues.apache.org/jira/browse/SPARK-27374
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27725) GPU Scheduling - add an example discovery Script

2019-05-20 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844066#comment-16844066
 ] 

Thomas Graves commented on SPARK-27725:
---

The above sed command doesn't quite handle it; here is a command that works for 
many GPUs:

# Join all GPU indices reported by nvidia-smi into a quoted, comma-separated list.
ADDRS=`nvidia-smi --query-gpu=index --format=csv,noheader | sed -e ':a' -e 'N' -e'$!ba' -e 's/\n/","/g'`
# Emit the JSON document expected from a resource discovery script.
echo \{\"name\": \"gpu\", \"addresses\":[\"$ADDRS\"]}

> GPU Scheduling - add an example discovery Script
> 
>
> Key: SPARK-27725
> URL: https://issues.apache.org/jira/browse/SPARK-27725
> Project: Spark
>  Issue Type: Story
>  Components: Examples
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> We should add an example script that can be used to discover GPUs and 
> output the correctly formatted JSON.
> Something like below, but it needs to be tested on various systems with more 
> than 2 GPUs:
> ADDRS=`nvidia-smi --query-gpu=index --format=csv,noheader | sed 
> 'N;s/\n/\",\"/'`
> COUNT=`echo $ADDRS | tr -cd , | wc -c`
> ALLCOUNT=`expr $COUNT + 1`
> #echo \{\"name\": \"gpu\", \"addresses\":[\"$ADDRS\"]}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27376) Design: YARN supports Spark GPU-aware scheduling

2019-05-17 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842262#comment-16842262
 ] 

Thomas Graves commented on SPARK-27376:
---

I filed https://issues.apache.org/jira/browse/SPARK-27760 to change the config 
from .count to .amount and have yarn configs match.  If you have other thoughts 
we can document there.

> Design: YARN supports Spark GPU-aware scheduling
> 
>
> Key: SPARK-27376
> URL: https://issues.apache.org/jira/browse/SPARK-27376
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27760) Spark resources - user configs change .count to be .amount, and yarn configs should match

2019-05-17 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-27760:
-

 Summary: Spark resources - user configs change .count to be 
.amount, and yarn configs should match
 Key: SPARK-27760
 URL: https://issues.apache.org/jira/browse/SPARK-27760
 Project: Spark
  Issue Type: Story
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


For the Spark resources, we created the config

spark.\{driver/executor}.resource.\{resourceName}.count

I think we should change .count to be .amount. That more easily allows users to 
specify things with suffix like memory in a single config and they can combine 
the value and unit. Without this they would have to specify 2 separate configs 
which seems more of a hassle for the user.

Spark on YARN already added configs to request resources via 
spark.yarn.\{executor/driver/am}.resource=<amount>, where the <amount> 
is value and unit together.  We should change those configs to have a .amount 
suffix on them to match the Spark configs and to allow future configs to be 
more easily added. YARN itself already supports tags and attributes, so if we 
want the user to be able to pass those from Spark at some point, having a suffix 
makes sense.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27736) Improve handling of FetchFailures caused by ExternalShuffleService losing track of executor registrations

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841614#comment-16841614
 ] 

Thomas Graves commented on SPARK-27736:
---

To clarify my last suggestion: I mean each executor reports back to the driver 
about the fetch failure, and the driver could see that multiple fetch failures 
happened for that same host across different executors' output and then choose to 
invalidate all the output on that host if X number have already failed to 
fetch.  There are other things the driver could use the information for.

 

> Improve handling of FetchFailures caused by ExternalShuffleService losing 
> track of executor registrations
> -
>
> Key: SPARK-27736
> URL: https://issues.apache.org/jira/browse/SPARK-27736
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle
>Affects Versions: 2.4.0
>Reporter: Josh Rosen
>Priority: Minor
>
> This ticket describes a fault-tolerance edge-case which can cause Spark jobs 
> to fail if a single external shuffle service process reboots and fails to 
> recover the list of registered executors (something which can happen when 
> using YARN if NodeManager recovery is disabled) _and_ the Spark job has a 
> large number of executors per host.
> I believe this problem can be worked around today via a change of 
> configurations, but I'm filing this issue to (a) better document this 
> problem, and (b) propose either a change of default configurations or 
> additional DAGScheduler logic to better handle this failure mode.
> h2. Problem description
> The external shuffle service process is _mostly_ stateless except for a map 
> tracking the set of registered applications and executors.
> When processing a shuffle fetch request, the shuffle services first checks 
> whether the requested block ID's executor is registered; if it's not 
> registered then the shuffle service throws an exception like 
> {code:java}
> java.lang.RuntimeException: Executor is not registered 
> (appId=application_1557557221330_6891, execId=428){code}
> and this exception becomes a {{FetchFailed}} error in the executor requesting 
> the shuffle block.
> In normal operation this error should not occur because executors shouldn't 
> be mis-routing shuffle fetch requests. However, this _can_ happen if the 
> shuffle service crashes and restarts, causing it to lose its in-memory 
> executor registration state. With YARN this state can be recovered from disk 
> if YARN NodeManager recovery is enabled (using the mechanism added in 
> SPARK-9439), but I don't believe that we perform state recovery in Standalone 
> and Mesos modes (see SPARK-24223).
> If state cannot be recovered then map outputs cannot be served (even though 
> the files probably still exist on disk). In theory, this shouldn't cause 
> Spark jobs to fail because we can always redundantly recompute lost / 
> unfetchable map outputs.
> However, in practice this can cause total job failures in deployments where 
> the node with the failed shuffle service was running a large number of 
> executors: by default, the DAGScheduler unregisters map outputs _only from 
> individual executor whose shuffle blocks could not be fetched_ (see 
> [code|https://github.com/apache/spark/blame/bfb3ffe9b33a403a1f3b6f5407d34a477ce62c85/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1643]),
>  so it can take several rounds of failed stage attempts to fail and clear 
> output from all executors on the faulty host. If the number of executors on a 
> host is greater than the stage retry limit then this can exhaust stage retry 
> attempts and cause job failures.
> This "multiple rounds of recomputation to discover all failed executors on a 
> host" problem was addressed by SPARK-19753, which added a 
> {{spark.files.fetchFailure.unRegisterOutputOnHost}} configuration which 
> promotes executor fetch failures into host-wide fetch failures (clearing 
> output from all neighboring executors upon a single failure). However, that 
> configuration is {{false}} by default.
> h2. Potential solutions
> I have a few ideas about how we can improve this situation:
>  - Update the [YARN external shuffle service 
> documentation|https://spark.apache.org/docs/latest/running-on-yarn.html#configuring-the-external-shuffle-service]
>  to recommend enabling node manager recovery.
>  - Consider defaulting {{spark.files.fetchFailure.unRegisterOutputOnHost}} to 
> {{true}}. This would improve out-of-the-box resiliency for large clusters. 
> The trade-off here is a reduction of efficiency in case there are transient 
> "false positive" fetch failures, but I suspect this case may be unlikely in 
> practice (so the change of default could be an acceptable trade-off). See

[jira] [Commented] (SPARK-27736) Improve handling of FetchFailures caused by ExternalShuffleService losing track of executor registrations

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841593#comment-16841593
 ] 

Thomas Graves commented on SPARK-27736:
---

Yeah, we always ran YARN with NodeManager recovery on, but that doesn't help 
standalone mode unless you implement something similar. Either way, I think 
documenting it for YARN is a good idea.

We used to see transient fetch failures all the time because of temporary 
spikes in disk usage, so I would be hesitant to turn on 
spark.files.fetchFailure.unRegisterOutputOnHost by default; on the other hand 
users could turn it back off, so it depends on what people think is most common.

I don't think you can assume the death of the shuffle service (the NM on YARN) 
implies the death of the executor. We have seen NodeManagers go down with an 
OOM while the executors stay up. Without the NM there, there isn't really 
anything to clean up the containers on that node. You will obviously get fetch 
failures from that node if it does go down.

Your last option seems like the best of those, but as you mention it could get 
a bit ugly with the string matching.

The other thing you can do is start tracking those fetch failures and have the 
driver make a more informed decision from them. This is work we had started at 
my previous employer but never had time to finish. It's a much bigger change, 
but really what we should be doing. It would allow us to make better decisions 
about blacklisting and to see whether it was the map or the reduce node that 
had issues, etc.

 

> Improve handling of FetchFailures caused by ExternalShuffleService losing 
> track of executor registrations
> -
>
> Key: SPARK-27736
> URL: https://issues.apache.org/jira/browse/SPARK-27736
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle
>Affects Versions: 2.4.0
>Reporter: Josh Rosen
>Priority: Minor
>
> This ticket describes a fault-tolerance edge-case which can cause Spark jobs 
> to fail if a single external shuffle service process reboots and fails to 
> recover the list of registered executors (something which can happen when 
> using YARN if NodeManager recovery is disabled) _and_ the Spark job has a 
> large number of executors per host.
> I believe this problem can be worked around today via a change of 
> configurations, but I'm filing this issue to (a) better document this 
> problem, and (b) propose either a change of default configurations or 
> additional DAGScheduler logic to better handle this failure mode.
> h2. Problem description
> The external shuffle service process is _mostly_ stateless except for a map 
> tracking the set of registered applications and executors.
> When processing a shuffle fetch request, the shuffle services first checks 
> whether the requested block ID's executor is registered; if it's not 
> registered then the shuffle service throws an exception like 
> {code:java}
> java.lang.RuntimeException: Executor is not registered 
> (appId=application_1557557221330_6891, execId=428){code}
> and this exception becomes a {{FetchFailed}} error in the executor requesting 
> the shuffle block.
> In normal operation this error should not occur because executors shouldn't 
> be mis-routing shuffle fetch requests. However, this _can_ happen if the 
> shuffle service crashes and restarts, causing it to lose its in-memory 
> executor registration state. With YARN this state can be recovered from disk 
> if YARN NodeManager recovery is enabled (using the mechanism added in 
> SPARK-9439), but I don't believe that we perform state recovery in Standalone 
> and Mesos modes (see SPARK-24223).
> If state cannot be recovered then map outputs cannot be served (even though 
> the files probably still exist on disk). In theory, this shouldn't cause 
> Spark jobs to fail because we can always redundantly recompute lost / 
> unfetchable map outputs.
> However, in practice this can cause total job failures in deployments where 
> the node with the failed shuffle service was running a large number of 
> executors: by default, the DAGScheduler unregisters map outputs _only from 
> individual executor whose shuffle blocks could not be fetched_ (see 
> [code|https://github.com/apache/spark/blame/bfb3ffe9b33a403a1f3b6f5407d34a477ce62c85/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1643]),
>  so it can take several rounds of failed stage attempts to fail and clear 
> output from all executors on the faulty host. If the number of executors on a 
> host is greater than the stage retry limit then this can exhaust stage retry 
> attempts and cause job failures.
> This "multiple rounds of recomputation to discover all failed executors on a 
> host" problem was addressed by SPARK-19753, which added a 

[jira] [Comment Edited] (SPARK-27373) Design: Kubernetes support for GPU-aware scheduling

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841506#comment-16841506
 ] 

Thomas Graves edited comment on SPARK-27373 at 5/16/19 4:21 PM:


For the Kubernetes side, there are 2 options for requesting containers: 1) pod 
templates, 2) the normal spark and spark.kubernetes configs.

For adding in the Spark resource support, we can take the Spark configs 
spark.\{driver/executor}.resource.\{resourceName}.count and combine them with a 
new config for the vendor name, like 
spark.\{driver/executor}.resource.\{resourceName}.vendor, to match the device 
plugin support from k8s ( 
[https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)]
 and add that to the PodBuilder.

We could make the vendor config Kubernetes specific, but I'm thinking we leave 
it generic and just state that it's only supported on Kubernetes right now. 
Depending on the setup, I could see this being useful for, say, YARN, since 
YARN supports attributes and the vendor could be an attribute.

Spark already has functionality to override and add certain things in the pod 
templates, so we can use similar functionality with the resources. That way we 
can support both the pod templates and the configs the same way.


was (Author: tgraves):
for the kubernetes side, it has 2 options for requesting containers: 1) pod 
templates, 2) through normal spark and spark.kubernetes configs

For adding in the spark resource support, we can take the spark configs 
spark.\{driver/executor}.resource.\{resourceName}.count and combine this with a 
new config for the vendor name like 
spark.\{driver/executor}.resource.\{resourceName}.vendor to match the device 
plugin support from k8s ( 
[https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)]
 and add that to the PodBuilder.

spark already has functionality to override and add certain things in the pod 
templates so we can use similar functionality with the resources. So we can 
support both the pod templates and the configs the same way.

> Design: Kubernetes support for GPU-aware scheduling
> ---
>
> Key: SPARK-27373
> URL: https://issues.apache.org/jira/browse/SPARK-27373
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27373) Design: Kubernetes support for GPU-aware scheduling

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841506#comment-16841506
 ] 

Thomas Graves commented on SPARK-27373:
---

For the Kubernetes side, there are 2 options for requesting containers: 1) pod 
templates, 2) the normal spark and spark.kubernetes configs.

For adding in the Spark resource support, we can take the Spark configs 
spark.\{driver/executor}.resource.\{resourceName}.count and combine them with a 
new config for the vendor name, like 
spark.\{driver/executor}.resource.\{resourceName}.vendor, to match the device 
plugin support from k8s ( 
[https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)]
 and add that to the PodBuilder.

Spark already has functionality to override and add certain things in the pod 
templates, so we can use similar functionality with the resources. That way we 
can support both the pod templates and the configs the same way.
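Just to make that concrete, a sketch of what the configs might look like (the .vendor config is the new piece being proposed here, and the resulting resource name assumes the NVIDIA device plugin):
{code:java}
import org.apache.spark.SparkConf

// Sketch only: .vendor does not exist yet, it is the config being proposed.
val conf = new SparkConf()
  .set("spark.executor.resource.gpu.count", "2")
  .set("spark.executor.resource.gpu.vendor", "nvidia.com")

// On Kubernetes the two would be combined into the device-plugin resource
// name "nvidia.com/gpu" and added by the pod builder as a limit on the
// executor pod, i.e. resources.limits["nvidia.com/gpu"] = 2.
{code}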

> Design: Kubernetes support for GPU-aware scheduling
> ---
>
> Key: SPARK-27373
> URL: https://issues.apache.org/jira/browse/SPARK-27373
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27373) Design: Kubernetes support for GPU-aware scheduling

2019-05-16 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27373:
-

Assignee: Thomas Graves

> Design: Kubernetes support for GPU-aware scheduling
> ---
>
> Key: SPARK-27373
> URL: https://issues.apache.org/jira/browse/SPARK-27373
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27377) Upgrade YARN to 3.1.2+ to support GPU

2019-05-16 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27377.
---
Resolution: Fixed

> Upgrade YARN to 3.1.2+ to support GPU
> -
>
> Key: SPARK-27377
> URL: https://issues.apache.org/jira/browse/SPARK-27377
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Priority: Major
>
> This task should be covered by SPARK-23710. Just a placeholder here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27376) Design: YARN supports Spark GPU-aware scheduling

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841404#comment-16841404
 ] 

Thomas Graves commented on SPARK-27376:
---

[~mengxr] [~jiangxb]

Thoughts on my proposal above to rename the user-facing resource config from 
.count to .amount and also add it to the existing YARN configs?

> Design: YARN supports Spark GPU-aware scheduling
> 
>
> Key: SPARK-27376
> URL: https://issues.apache.org/jira/browse/SPARK-27376
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27378) spark-submit requests GPUs in YARN mode

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841412#comment-16841412
 ] 

Thomas Graves commented on SPARK-27378:
---

Spark 3.0 already added support for requesting any resource from YARN via the 
configs spark.yarn.\{executor/driver/am}.resource, so the changes required for 
this Jira are simply to map the new Spark configs 
spark.\{executor/driver}.resource.\{fpga/gpu}.count into the corresponding YARN 
configs. Other resource types can't be mapped, though, because we don't know 
what they are called on the YARN side. So for any other resource the user will 
have to specify both configs: spark.yarn.\{executor/driver/am}.resource and 
spark.\{executor/driver}.resource.\{resourceName}. That isn't ideal, but the 
only other option would be some sort of mapping the user would pass in. We can 
always map more YARN resource types if YARN adds them. The main two that people 
are interested in seem to be GPU and FPGA anyway, so I think this is fine for now.
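For example, from the user side it would look roughly like the sketch below (the custom resource name is made up purely for illustration):
{code:java}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  // GPU (or FPGA): Spark can map this onto the built-in YARN resource type
  // (yarn.io/gpu), so only the Spark-side config is needed.
  .set("spark.executor.resource.gpu.count", "1")
  // Any other resource type: the YARN-side request has to be given as well,
  // since Spark can't guess what YARN calls it ("acceleratorX" is made up).
  .set("spark.yarn.executor.resource.acceleratorX", "2")
  .set("spark.executor.resource.acceleratorX.count", "2")
{code}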

> spark-submit requests GPUs in YARN mode
> ---
>
> Key: SPARK-27378
> URL: https://issues.apache.org/jira/browse/SPARK-27378
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Submit, YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27379) YARN passes GPU info to Spark executor

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841409#comment-16841409
 ] 

Thomas Graves commented on SPARK-27379:
---

The way YARN works, it actually doesn't tell the application any info about 
what it was allocated. If you have Hadoop 3.1+ and it is set up for Docker and 
isolation, then it's up to the user to discover what the container has.

So based on that, I'm going to close this.

> YARN passes GPU info to Spark executor
> --
>
> Key: SPARK-27379
> URL: https://issues.apache.org/jira/browse/SPARK-27379
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27379) YARN passes GPU info to Spark executor

2019-05-16 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27379.
---
Resolution: Invalid
  Assignee: Thomas Graves

> YARN passes GPU info to Spark executor
> --
>
> Key: SPARK-27379
> URL: https://issues.apache.org/jira/browse/SPARK-27379
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27377) Upgrade YARN to 3.1.2+ to support GPU

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841407#comment-16841407
 ] 

Thomas Graves commented on SPARK-27377:
---

There are enough pieces of the Hadoop 3.2 support implemented that this is no 
longer blocking us, so I'm going to close this.

> Upgrade YARN to 3.1.2+ to support GPU
> -
>
> Key: SPARK-27377
> URL: https://issues.apache.org/jira/browse/SPARK-27377
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Priority: Major
>
> This task should be covered by SPARK-23710. Just a placeholder here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27376) Design: YARN supports Spark GPU-aware scheduling

2019-05-16 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-27376:
-

Assignee: Thomas Graves

> Design: YARN supports Spark GPU-aware scheduling
> 
>
> Key: SPARK-27376
> URL: https://issues.apache.org/jira/browse/SPARK-27376
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Assignee: Thomas Graves
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27376) Design: YARN supports Spark GPU-aware scheduling

2019-05-16 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841402#comment-16841402
 ] 

Thomas Graves commented on SPARK-27376:
---

The design is pretty straightforward; there is really only one question, which 
is consistency between the YARN resource configs and the new Spark resource 
configs. See the last paragraph for more details.

Hadoop 3.1 and later is required for official GPU support. Hadoop can be 
configured to use Docker with isolation so that the containers YARN hands back 
have the requested GPUs and other resources. YARN does not give you information 
about what it allocated for GPUs; you have to discover it. YARN has hardcoded 
resource types for FPGA and GPU; anything else is a user-defined type. Spark 
3.0 already added support for requesting any resource from YARN via the configs 
spark.yarn.\{executor/driver/am}.resource, so the changes required for this 
Jira are simply to map the new Spark configs 
spark.\{executor/driver}.resource.\{fpga/gpu}.count into the corresponding YARN 
configs. Other resource types can't be mapped, though, because we don't know 
what they are called on the YARN side. So for any other resource the user will 
have to specify both configs: spark.yarn.\{executor/driver/am}.resource and 
spark.\{executor/driver}.resource.\{resourceName}. That isn't ideal, but the 
only other option would be some sort of mapping the user would pass in. We can 
always map more YARN resource types if YARN adds them. The main two that people 
are interested in seem to be GPU and FPGA anyway, so I think this is fine for now.

For versions earlier than Hadoop 3.1, YARN won't allocate based on GPU. If 
users are on Hadoop 2.7, 2.8, etc., they could still allocate nodes with GPUs 
using YARN node labels or other hacks, tell Spark the count, and have it 
auto-discover them; Spark will pick up whatever it sees in the container, or 
really whatever the discoveryScript returns, so people could write that script 
to match whatever hacks they have for sharing GPU nodes today.

The flow from the user's point of view would be:

For GPU and FPGA: the user specifies 
spark.\{executor/driver}.resource.\{gpu/fpga}.count and 
spark.\{executor/driver}.resource.\{gpu/fpga}.discoveryScript. The Spark YARN 
code maps these into the corresponding YARN resource config and asks YARN for 
the containers. YARN allocates the containers and Spark runs the discovery 
script to figure out what it was actually allocated.

For other resource types the user will have to specify 
spark.yarn.\{executor/driver/am}.resource as well as 
spark.\{executor/driver}.resource.\{resourceName}.count and 
spark.\{executor/driver}.resource.\{resourceName}.discoveryScript.

The only other thing that is inconsistent is that the 
spark.yarn.\{executor/driver/am}.resource configs don't have a .count on the 
end. Right now that config takes a string as a value and splits it into an 
actual count and a unit. The YARN resource configs were only just added for 3.0 
and haven't been released, so we could still change them. We could change the 
Spark user-facing configs (spark.\{executor/driver}.resource.\{gpu/fpga}.count) 
to be similar, making it easier for the user to specify both a count and a unit 
in one config instead of two, though I like the ability to keep them separate 
on the discovery side as well. We took the .unit support out in the executor 
pull request, so it isn't there right now anyway. We could do the opposite and 
change the YARN ones to have a .count and .unit just to make things consistent, 
but that makes the user specify two configs instead of one. The third option 
would be to have .count and .unit and then eventually a third config that lets 
the user specify them together if we add resources that actually use units.

My thought is that for the user-facing configs we change .count to .amount and 
let the user specify units on it. This makes it easier for the user and allows 
us to extend it later if we want. I think we should also change the spark.yarn 
configs to have an .amount suffix, because YARN has already added other things 
like tags and attributes, so if we want to extend the Spark support for those 
it makes more sense to have them as another suffix option, e.g. 
spark.yarn...resource.tags=

We can leave everything else that is internal as separate count and unit, and 
since GPU/FPGA don't need units we don't actually need to add a unit back to 
our ResourceInformation, since we already removed it.
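Concretely, with that proposal the user-facing configs would look something like the sketch below (these are just the proposed names, nothing final; the discovery script path and the custom resource name are placeholders):
{code:java}
import org.apache.spark.SparkConf

// Sketch of the proposed rename: .count becomes .amount so a unit can ride
// along in the same value, and the spark.yarn resource configs pick up a
// matching .amount suffix, leaving room for other suffixes (tags, attributes).
val conf = new SparkConf()
  .set("spark.executor.resource.gpu.amount", "2")
  .set("spark.executor.resource.gpu.discoveryScript", "/path/to/getGpus.sh")
  .set("spark.yarn.executor.resource.acceleratorX.amount", "4G")
{code}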

 

> Design: YARN supports Spark GPU-aware scheduling
> 
>
> Key: SPARK-27376
> URL: https://issues.apache.org/jira/browse/SPARK-27376
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Xiangrui Meng
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (SPARK-27725) GPU Scheduling - add an example discovery Script

2019-05-15 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-27725:
-

 Summary: GPU Scheduling - add an example discovery Script
 Key: SPARK-27725
 URL: https://issues.apache.org/jira/browse/SPARK-27725
 Project: Spark
  Issue Type: Story
  Components: Examples
Affects Versions: 3.0.0
Reporter: Thomas Graves


We should add an example script that can be used to discover GPUs and output 
the correctly formatted JSON.

Something like the script below, but it needs to be tested on various systems 
with more than 2 GPUs:

# Join all GPU indexes reported by nvidia-smi into a quoted, comma-separated list.
ADDRS=`nvidia-smi --query-gpu=index --format=csv,noheader | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/\",\"/g'`
COUNT=`echo $ADDRS | tr -cd , | wc -c`
ALLCOUNT=`expr $COUNT + 1`
# Emit the JSON a discovery script is expected to produce, e.g. {"name": "gpu", "addresses": ["0","1"]}
echo \{\"name\": \"gpu\", \"addresses\":[\"$ADDRS\"]}

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27024) Executor interface for cluster managers to support GPU resources

2019-05-14 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27024.
---
Resolution: Fixed

> Executor interface for cluster managers to support GPU resources
> 
>
> Key: SPARK-27024
> URL: https://issues.apache.org/jira/browse/SPARK-27024
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xingbo Jiang
>Assignee: Thomas Graves
>Priority: Major
>
> The executor interface shall deal with the resources allocated to the 
> executor by cluster managers(Standalone, YARN, Kubernetes).  The Executor 
> either needs to be told the resources it was given or it needs to discover 
> them in order for the executor to sync with the driver to expose available 
> resources to support task scheduling.
> Note this is part of a bigger feature for gpu-aware scheduling and is just 
> how the executor find the resources. The general flow :
>  * users ask for a certain set of resources, for instance number of gpus - 
> each cluster manager has a specific way to do this.
>  * cluster manager allocates a container or set of resources (standalone mode)
>  * When spark launches the executor in that container, the executor either 
> has to be told what resources it has or it has to auto discover them.
>  * Executor has to register with Driver and tell the driver the set of 
> resources it has so the scheduler can use that to schedule tasks that 
> requires a certain amount of each of those resources



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27488) Driver interface to support GPU resources

2019-05-14 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839402#comment-16839402
 ] 

Thomas Graves commented on SPARK-27488:
---

With this Jira we need to move the function that does the checks for the 
various configs to a common place, and we need to address the comment here, 
which is about making the variable names clearer: 
[https://github.com/apache/spark/pull/24406/files/b9dacef1d7d47d19df300628d3841b3a13c03547#r283617243]

> Driver interface to support GPU resources 
> --
>
> Key: SPARK-27488
> URL: https://issues.apache.org/jira/browse/SPARK-27488
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> We want to have an interface to allow the users on the driver to get what 
> resources are allocated to them.  This is mostly to handle the case the 
> cluster manager does not launch the driver in an isolated environment and 
> where users could be sharing hosts.  For instance in standalone mode it 
> doesn't support container isolation so a host may have 4 gpu's, but only 2 of 
> them could be assigned to the driver.  In this case we need an interface for 
> the cluster manager to specify what gpu's for the driver to use and an 
> interface for the user to get the resource information
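A minimal sketch of the kind of user-facing interface being described here (class and method names are assumptions for illustration, not a committed API):
{code:java}
// Sketch only: illustrative names, not a committed API.
class ResourceInformation(val name: String, val addresses: Array[String])

// The cluster manager (or a discovery mechanism) tells the driver which
// addresses it may use, and the user reads them back on the driver, e.g.:
//   val gpu: ResourceInformation = sc.resources("gpu")
//   val firstGpuAddress: String = gpu.addresses.head
{code}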



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27520) Introduce a global config system to replace hadoopConfiguration

2019-05-09 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836730#comment-16836730
 ] 

Thomas Graves commented on SPARK-27520:
---

Can we add more to the description to explain why we are doing this? I'm 
assuming it is to allow users to more easily change the Hadoop configuration.

> Introduce a global config system to replace hadoopConfiguration
> ---
>
> Key: SPARK-27520
> URL: https://issues.apache.org/jira/browse/SPARK-27520
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Xingbo Jiang
>Priority: Major
>
> hadoopConf can be accessed via `SparkContext.hadoopConfiguration` from both 
> user code and Spark internal. The configuration is mainly used to read files 
> from hadoop-supported file system(eg. get URI/get FileSystem/add security 
> credentials/get metastore connect url/etc.)
> We shall keep a global config that users can set and use that to track the 
> hadoop configurations, and avoid using `SparkContext.hadoopConfiguration`, 
> maybe we shall mark it as deprecate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27492) GPU scheduling - High level user documentation

2019-05-09 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27492:
--
Description: 
For the SPIP - Accelerator-aware task scheduling for Spark, 
https://issues.apache.org/jira/browse/SPARK-24615 Add some high level user 
documentation about how this feature works together and point to things like 
the example discovery script, etc.

 

- make sure to document the discovery script and what permissions are needed 
and any security implications

  was:For the SPIP - Accelerator-aware task scheduling for Spark, 
https://issues.apache.org/jira/browse/SPARK-24615 Add some high level user 
documentation about how this feature works together and point to things like 
the example discovery script, etc.


> GPU scheduling - High level user documentation
> --
>
> Key: SPARK-27492
> URL: https://issues.apache.org/jira/browse/SPARK-27492
> Project: Spark
>  Issue Type: Story
>  Components: Documentation
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> For the SPIP - Accelerator-aware task scheduling for Spark, 
> https://issues.apache.org/jira/browse/SPARK-24615 Add some high level user 
> documentation about how this feature works together and point to things like 
> the example discovery script, etc.
>  
> - make sure to document the discovery script and what permissions are needed 
> and any security implications



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27396) SPIP: Public APIs for extended Columnar Processing Support

2019-05-07 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835014#comment-16835014
 ] 

Thomas Graves commented on SPARK-27396:
---

I just started a vote thread on this SPIP; please take a look and vote.

> SPIP: Public APIs for extended Columnar Processing Support
> --
>
> Key: SPARK-27396
> URL: https://issues.apache.org/jira/browse/SPARK-27396
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Priority: Major
>
> *SPIP: Columnar Processing Without Arrow Formatting Guarantees.*
>  
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> The Dataset/DataFrame API in Spark currently only exposes to users one row at 
> a time when processing data.  The goals of this are to
>  # Add to the current sql extensions mechanism so advanced users can have 
> access to the physical SparkPlan and manipulate it to provide columnar 
> processing for existing operators, including shuffle.  This will allow them 
> to implement their own cost based optimizers to decide when processing should 
> be columnar and when it should not.
>  # Make any transitions between the columnar memory layout and a row based 
> layout transparent to the users so operations that are not columnar see the 
> data as rows, and operations that are columnar see the data as columns.
>  
> Not Requirements, but things that would be nice to have.
>  # Transition the existing in memory columnar layouts to be compatible with 
> Apache Arrow.  This would make the transformations to Apache Arrow format a 
> no-op. The existing formats are already very close to those layouts in many 
> cases.  This would not be using the Apache Arrow java library, but instead 
> being compatible with the memory 
> [layout|https://arrow.apache.org/docs/format/Layout.html] and possibly only a 
> subset of that layout.
>  
> *Q2.* What problem is this proposal NOT designed to solve? 
> The goal of this is not for ML/AI but to provide APIs for accelerated 
> computing in Spark primarily targeting SQL/ETL like workloads.  ML/AI already 
> have several mechanisms to get data into/out of them. These can be improved 
> but will be covered in a separate SPIP.
> This is not trying to implement any of the processing itself in a columnar 
> way, with the exception of examples for documentation.
> This does not cover exposing the underlying format of the data.  The only way 
> to get at the data in a ColumnVector is through the public APIs.  Exposing 
> the underlying format to improve efficiency will be covered in a separate 
> SPIP.
> This is not trying to implement new ways of transferring data to external 
> ML/AI applications.  That is covered by separate SPIPs already.
> This is not trying to add in generic code generation for columnar processing. 
>  Currently code generation for columnar processing is only supported when 
> translating columns to rows.  We will continue to support this, but will not 
> extend it as a general solution. That will be covered in a separate SPIP if 
> we find it is helpful.  For now columnar processing will be interpreted.
> This is not trying to expose a way to get columnar data into Spark through 
> DataSource V2 or any other similar API.  That would be covered by a separate 
> SPIP if we find it is needed.
>  
> *Q3.* How is it done today, and what are the limits of current practice?
> The current columnar support is limited to 3 areas.
>  # Internal implementations of FileFormats, optionally can return a 
> ColumnarBatch instead of rows.  The code generation phase knows how to take 
> that columnar data and iterate through it as rows for stages that wants rows, 
> which currently is almost everything.  The limitations here are mostly 
> implementation specific. The current standard is to abuse Scala’s type 
> erasure to return ColumnarBatches as the elements of an RDD[InternalRow]. The 
> code generation can handle this because it is generating java code, so it 
> bypasses scala’s type checking and just casts the InternalRow to the desired 
> ColumnarBatch.  This makes it difficult for others to implement the same 
> functionality for different processing because they can only do it through 
> code generation. There really is no clean separate path in the code 
> generation for columnar vs row based. Additionally, because it is only 
> supported through code generation if for any reason code generation would 
> fail there is no backup.  This is typically fine for input formats but can be 
> problematic when we get into more extensive processing.
>  # When caching data it can optionally be cached in a columnar format if the 
> input is also columnar.  This is similar to the f

[jira] [Commented] (SPARK-27396) SPIP: Public APIs for extended Columnar Processing Support

2019-04-22 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823097#comment-16823097
 ] 

Thomas Graves commented on SPARK-27396:
---

Thanks for the questions and comments; please also vote on the dev list email 
thread with the subject:

[VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

I'm going to extend that vote by a few days to give more people time to 
comment, as I know it's a busy time of year.

> SPIP: Public APIs for extended Columnar Processing Support
> --
>
> Key: SPARK-27396
> URL: https://issues.apache.org/jira/browse/SPARK-27396
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Priority: Major
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
>  
> The Dataset/DataFrame API in Spark currently only exposes to users one row at 
> a time when processing data.  The goals of this are to 
>  
>  # Expose to end users a new option of processing the data in a columnar 
> format, multiple rows at a time, with the data organized into contiguous 
> arrays in memory. 
>  # Make any transitions between the columnar memory layout and a row based 
> layout transparent to the end user.
>  # Allow for simple data exchange with other systems, DL/ML libraries, 
> pandas, etc. by having clean APIs to transform the columnar data into an 
> Apache Arrow compatible layout.
>  # Provide a plugin mechanism for columnar processing support so an advanced 
> user could avoid data transition between columnar and row based processing 
> even through shuffles. This means we should at least support pluggable APIs 
> so an advanced end user can implement the columnar partitioning themselves, 
> and provide the glue necessary to shuffle the data still in a columnar format.
>  # Expose new APIs that allow advanced users or frameworks to implement 
> columnar processing either as UDFs, or by adjusting the physical plan to do 
> columnar processing.  If the latter is too controversial we can move it to 
> another SPIP, but we plan to implement some accelerated computing in parallel 
> with this feature to be sure the APIs work, and without this feature it makes 
> that impossible.
>  
> Not Requirements, but things that would be nice to have.
>  # Provide default implementations for partitioning columnar data, so users 
> don’t have to.
>  # Transition the existing in memory columnar layouts to be compatible with 
> Apache Arrow.  This would make the transformations to Apache Arrow format a 
> no-op. The existing formats are already very close to those layouts in many 
> cases.  This would not be using the Apache Arrow java library, but instead 
> being compatible with the memory 
> [layout|https://arrow.apache.org/docs/format/Layout.html] and possibly only a 
> subset of that layout.
>  # Provide a clean transition from the existing code to the new one.  The 
> existing APIs which are public but evolving are not that far off from what is 
> being proposed.  We should be able to create a new parallel API that can wrap 
> the existing one. This means any file format that is trying to support 
> columnar can still do so until we make a conscious decision to deprecate and 
> then turn off the old APIs.
>  
> *Q2.* What problem is this proposal NOT designed to solve?
> This is not trying to implement any of the processing itself in a columnar 
> way, with the exception of examples for documentation, and possibly default 
> implementations for partitioning of columnar shuffle. 
>  
> *Q3.* How is it done today, and what are the limits of current practice?
> The current columnar support is limited to 3 areas.
>  # Input formats, optionally can return a ColumnarBatch instead of rows.  The 
> code generation phase knows how to take that columnar data and iterate 
> through it as rows for stages that wants rows, which currently is almost 
> everything.  The limitations here are mostly implementation specific. The 
> current standard is to abuse Scala’s type erasure to return ColumnarBatches 
> as the elements of an RDD[InternalRow]. The code generation can handle this 
> because it is generating java code, so it bypasses scala’s type checking and 
> just casts the InternalRow to the desired ColumnarBatch.  This makes it 
> difficult for others to implement the same functionality for different 
> processing because they can only do it through code generation. There really 
> is no clean separate path in the code generation for columnar vs row based. 
> Additionally because it is only supported through code generation if for any 
> reason code generation would fail there is no backup.  This is typically fine 
> for input formats but can

[jira] [Comment Edited] (SPARK-27495) Support Stage level resource configuration and scheduling

2019-04-19 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16822166#comment-16822166
 ] 

Thomas Graves edited comment on SPARK-27495 at 4/19/19 8:29 PM:


Unfortunately the link to the original design doc was removed from 
https://issues.apache.org/jira/browse/SPARK-24615, but for reference there were 
some good discussions about this in the comments, from 
https://issues.apache.org/jira/browse/SPARK-24615?focusedCommentId=16528293&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16528293
 all the way through 
https://issues.apache.org/jira/browse/SPARK-24615?focusedCommentId=16567393&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16567393

Just to summarize the proposal there after the discussions, it was something 
like:
{code:java}
val rdd2 = rdd1.withResources().mapPartitions(){code}
Possibly getting more detailed:
{code:java}
rdd.withResources
  .prefer("/gpu/k80", 2) // prefix of resource logical name, amount
  .require("/cpu", 1)
  .require("/memory", 819200)
  .require("/disk", 1){code}

The withResources would apply to just that stage. If there were conflicting 
resources, say in a join:
{code:java}
val rddA = rdd.withResources.mapPartitions()
val rddB = rdd.withResources.mapPartitions()
val rddC = rddA.join(rddB)
{code}

then you would have to resolve that conflict by either picking the largest or 
merging the requirements, which obviously means choosing the larger of the two.

There are also a lot of corner cases we need to handle, such as the ones mentioned earlier:

 
{code:java}
So which means RDDs with different resources requirements in one stage may have 
conflicts. For example: rdd1.withResources.mapPartitions { xxx 
}.withResources.mapPartitions { xxx }.collect, resources in rdd1 may be 
different from map rdd, so currently what I can think is that:
1. always pick the latter with warning log to say that multiple different 
resources in one stage is illegal.
 2. fail the stage with warning log to say that multiple different resources in 
one stage is illegal.
 3. merge conflicts with maximum resources needs. For example rdd1 requires 3 
gpus per task, rdd2 requires 4 gpus per task, then the merged requirement would 
be 4 gpus per task. (This is the high level description, details will be per 
partition based merging) [chosen].
 
{code}
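As a tiny sketch of option 3 (the one chosen), merging just takes the per-resource maximum:
{code:java}
// Illustrative only: merge two per-task resource requirement maps by taking
// the maximum amount for each resource (option 3 above).
def mergeResourceRequirements(
    a: Map[String, Double],
    b: Map[String, Double]): Map[String, Double] = {
  (a.keySet ++ b.keySet).map { r =>
    r -> math.max(a.getOrElse(r, 0.0), b.getOrElse(r, 0.0))
  }.toMap
}

// mergeResourceRequirements(Map("gpu" -> 3.0), Map("gpu" -> 4.0, "cpu" -> 1.0))
// => Map(gpu -> 4.0, cpu -> 1.0)
{code}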
 

One thing that was brought up multiple times is how a user will really know 
what stage it applies to.  Users aren't necessarily going to realize where the 
stage boundaries are.  I don't think anyone has a good solution to this.

Also, I think for this to really be useful it has to be tied into dynamic 
allocation. Without that, users can just use the application-level task configs 
we are adding in SPARK-24615.

Of course, the original proposal was only for RDDs. That was because it goes 
along with barrier scheduling, and I think with the Dataset/DataFrame API it is 
even harder to know where the stage boundaries are, because Catalyst can 
optimize a bunch of things.

 


was (Author: tgraves):
Unfortunately the link to the original design doc was removed from  
https://issues.apache.org/jira/browse/SPARK-24615 but just for reference there 
was some good discussions about this in comments    
https://issues.apache.org/jira/browse/SPARK-24615?focusedCommentId=16528293&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16528293
 all the way through 
https://issues.apache.org/jira/browse/SPARK-24615?focusedCommentId=16567393&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16567393

Just to summarize the proposal there after the discussions it was something 
like:
{code:java}
val rdd2 = rdd1.withResources().mapPartitions(){code}
Possibly getting more detailed:
 rdd.withResources
 .prefer("/gpu/k80", 2) // prefix of resource logical name, amount 
.require("/cpu", 1)
 .require("/memory", 819200)
 .require("/disk", 1)
 The withResources would apply to just that stage.  If there were conflicting 
resources in say like a join
 val rddA = rdd.withResources.mapPartitions()

val rddB = rdd.withResources.mapPartitions()

val rddC = rddA.join(rddB)
 Then you would have to resolve that conflict by either choosing largest or 
merging the requirements obviously choosing the largest of the two.

There are also a lot of corner cases we need to handle such as mentioned:

 

 

 
{code:java}
So which means RDDs with different resources requirements in one stage may have 
conflicts. For example: rdd1.withResources.mapPartitions { xxx 
}.withResources.mapPartitions { xxx }.collect, resources in rdd1 may be 
different from map rdd, so currently what I can think is that:
1. always pick the latter with warning log to say that multiple different 
resources in one stage is illegal.
 2. fail the stage with warning lo
