[jira] [Comment Edited] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519445#comment-16519445
 ] 

Thomas Graves edited comment on SPARK-24552 at 6/21/18 3:02 PM:


More details on the Hadoop committer side:

I think the commit/delete problem is also an issue for the existing v1 and Hadoop
committers, so this doesn't fully solve the problem. Spark uses a file name
format like the following (see HadoopMapReduceWriteConfigUtil/HadoopMapRedWriteConfigUtil):
{code:java}
{date}_{rddid}_{m/r}_{partitionid}_{task attempt number}
{code}
I believe the same fix as for v2 would work here: use the taskAttemptId instead of
the attemptNumber.

If a stage fails and a second stage attempt is started, the task attempt number
can be the same in both attempts, so both tasks write to the same place. If one
of them fails or is told not to commit, it could delete output that is being
used by both.

We need to think through all the scenarios to make sure this is covered.
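To illustrate the suggestion, here is a minimal sketch; the helper name and the
jobTrackerId parameter are hypothetical stand-ins, not the actual Spark writer code:
{code:scala}
import org.apache.spark.TaskContext

// Sketch only: buildOutputName and jobTrackerId are hypothetical stand-ins for
// the pieces used in the real write path.
def buildOutputName(jobTrackerId: String, rddId: Int, ctx: TaskContext): String = {
  // Old, collision-prone piece: ctx.attemptNumber() restarts at 0 for every
  // stage attempt, so a retried stage can reuse the name for the same partition.
  // Suggested piece: ctx.taskAttemptId() (the TID) is unique across all task
  // attempts in the application, so two stage attempts can never collide.
  s"${jobTrackerId}_${rddId}_m_${ctx.partitionId()}_${ctx.taskAttemptId()}"
}
{code}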


was (Author: tgraves):
More details on the Hadoop committer side:

I think the commit/delete problem is also an issue for the existing v1 and Hadoop
committers, so this doesn't fully solve the problem. Spark uses a file name
format like the following (see HadoopMapReduceWriteConfigUtil/HadoopMapRedWriteConfigUtil):
{quote}{date}__{rddid}__{m/r}__{partitionid}__{task attempt number}
{quote}
I believe the same fix as for v2 would work here: use the taskAttemptId instead of
the attemptNumber.

If a stage fails and a second stage attempt is started, the task attempt number
can be the same in both attempts, so both tasks write to the same place. If one
of them fails or is told not to commit, it could delete output that is being
used by both.

We need to think through all the scenarios to make sure this is covered.

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1, 2.2.0, 2.2.1, 2.3.0, 2.3.1
>Reporter: Ryan Blue
>Priority: Blocker
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.
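As a concrete illustration of the collision described above, the following sketch
(the file naming is hypothetical, not the reporter's actual data source) shows how
two stage attempts can target the same in-place file when only the
(partition, attempt number) pair is used:
{code:scala}
// Hypothetical in-place naming keyed only on (partition, attempt number).
def inPlaceFile(partition: Int, attemptNumber: Int): String =
  s"part-$partition-$attemptNumber.parquet"

// Stage attempt 0: the task for partition 3 runs with attempt number 0.
val fromFirstStageAttempt  = inPlaceFile(3, 0)
// Stage attempt 1 (after a fetch failure): attempt numbers start at 0 again.
val fromSecondStageAttempt = inPlaceFile(3, 0)

// Both attempts target the same file, so when the losing attempt is aborted
// and cleans up, it deletes the file the committed attempt just produced.
assert(fromFirstStageAttempt == fromSecondStageAttempt)
{code}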






[jira] [Comment Edited] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519445#comment-16519445
 ] 

Thomas Graves edited comment on SPARK-24552 at 6/21/18 3:01 PM:


More details on the Hadoop committer side:

I think the commit/delete problem is also an issue for the existing v1 and Hadoop
committers, so this doesn't fully solve the problem. Spark uses a file name
format like the following (see HadoopMapReduceWriteConfigUtil/HadoopMapRedWriteConfigUtil):
{quote}{date}__{rddid}__{m/r}__{partitionid}__{task attempt number}
{quote}
I believe the same fix as for v2 would work here: use the taskAttemptId instead of
the attemptNumber.

If a stage fails and a second stage attempt is started, the task attempt number
can be the same in both attempts, so both tasks write to the same place. If one
of them fails or is told not to commit, it could delete output that is being
used by both.

We need to think through all the scenarios to make sure this is covered.


was (Author: tgraves):
More details on the Hadoop committer side:

I think the commit/delete problem is also an issue for the existing v1 and Hadoop
committers, so this doesn't fully solve the problem. Spark uses a file name
format like the following (see HadoopMapReduceWriteConfigUtil/HadoopMapRedWriteConfigUtil):
{quote}{{{date}_\{rddid}_\{m/r}_\{partitionid}_\{task attempt number}}}
{quote}
I believe the same fix as for v2 would work here: use the taskAttemptId instead of
the attemptNumber.

If a stage fails and a second stage attempt is started, the task attempt number
can be the same in both attempts, so both tasks write to the same place. If one
of them fails or is told not to commit, it could delete output that is being
used by both.

We need to think through all the scenarios to make sure this is covered.

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1, 2.2.0, 2.2.1, 2.3.0, 2.3.1
>Reporter: Ryan Blue
>Priority: Blocker
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519445#comment-16519445
 ] 

Thomas Graves commented on SPARK-24552:
---

More details on the Hadoop committer side:

I think the commit/delete problem is also an issue for the existing v1 and Hadoop
committers, so this doesn't fully solve the problem. Spark uses a file name
format like the following (see HadoopMapReduceWriteConfigUtil/HadoopMapRedWriteConfigUtil):
{quote}{{{date}_\{rddid}_\{m/r}_\{partitionid}_\{task attempt number}}}
{quote}
I believe the same fix as for v2 would work here: use the taskAttemptId instead of
the attemptNumber.

If a stage fails and a second stage attempt is started, the task attempt number
can be the same in both attempts, so both tasks write to the same place. If one
of them fails or is told not to commit, it could delete output that is being
used by both.

We need to think through all the scenarios to make sure this is covered.

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1, 2.2.0, 2.2.1, 2.3.0, 2.3.1
>Reporter: Ryan Blue
>Priority: Blocker
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519430#comment-16519430
 ] 

Thomas Graves commented on SPARK-24552:
---

this is actually a problem with hadoop committers, v1 and v2

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1, 2.2.0, 2.2.1, 2.3.0, 2.3.1
>Reporter: Ryan Blue
>Priority: Blocker
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Commented] (SPARK-24611) Clean up OutputCommitCoordinator

2018-06-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519420#comment-16519420
 ] 

Thomas Graves commented on SPARK-24611:
---

[~joshrosen]  I just noticed you were the last one to modify the DAGScheduler for
the output commit coordinator:

[https://github.com/apache/spark/commit/d0b56339625727744e2c30fc2167bc6a457d37f7]

where it split the ShuffleMapStage handling from the ResultStage handling.

Do you know of any case where a ShuffleMapStage actually calls into canCommit?

> Clean up OutputCommitCoordinator
> 
>
> Key: SPARK-24611
> URL: https://issues.apache.org/jira/browse/SPARK-24611
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Marcelo Vanzin
>Priority: Major
>
> This is a follow-up to SPARK-24589, to address some issues brought up during 
> review of the change:
> - the DAGScheduler registers all stages with the coordinator, though at first 
> glance only result stages need to be registered. That would save memory in the driver.
> - the coordinator can track task IDs instead of the internal "TaskIdentifier" 
> type it uses; that would also save some memory and be more accurate.
> - {{TaskCommitDenied}} currently has a "job ID" when it's really a stage ID, 
> and it contains the task attempt number, when it should probably have the 
> task ID instead (like above; see the sketch below).
> The latter is an API breakage (in a class tagged as developer API, but 
> still), and it also affects data written to event logs.
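For reference, a rough sketch of the shape of the change described in the last
bullet; the "current" definition is only approximate and the "proposed" one is an
illustration of the idea, not the actual patch:
{code:scala}
// Approximate current shape: the "jobID" field actually carries a stage id,
// and attemptNumber is the per-stage task attempt counter.
case class TaskCommitDenied(jobID: Int, partitionID: Int, attemptNumber: Int)

// Illustration of the proposal: carry the stage id explicitly and the globally
// unique task id instead of the per-stage attempt number.
case class TaskCommitDeniedProposed(stageID: Int, partitionID: Int, taskID: Long)
{code}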






[jira] [Commented] (SPARK-24622) Task attempts in other stage attempts not killed when one task attempt succeeds

2018-06-21 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519416#comment-16519416
 ] 

Thomas Graves commented on SPARK-24622:
---

I need to investigate further and test to make sure I am not missing anything.

> Task attempts in other stage attempts not killed when one task attempt 
> succeeds
> ---
>
> Key: SPARK-24622
> URL: https://issues.apache.org/jira/browse/SPARK-24622
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.1.0
>Reporter: Thomas Graves
>Priority: Major
>
> Looking through the code handling for 
> [https://github.com/apache/spark/pull/21577], I was looking to see how we 
> kill task attempts.  I don't see anywhere that we actually kill task attempts 
> for stage attempts other than the one that completed successfully.
>  
> For instance:
> stage 0.0 (stage id 0, attempt 0)
>   - task 1.0 (task 1, attempt 0)
> Stage 0.1 (stage id 0, attempt 1), started due to a fetch failure for instance
>   - task 1.0 (task 1, attempt 0). This is the equivalent of task 1.0 in stage 0.0, 
> because task 1.0 in stage 0.0 didn't finish and didn't fail.
>  
> Now if task 1.0 in stage 0.0 succeeds, it gets committed and marked as 
> successful.  We will mark the task in stage 0.1 as completed, but nowhere in 
> the code do I see it actually kill task 1.0 in stage 0.1.
> Note that the scheduler does handle the case where we have 2 attempts 
> (speculation) within a single stage attempt: it will kill the other attempt when 
> one of them succeeds.  See TaskSetManager.handleSuccessfulTask.
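For contrast, a minimal sketch of the existing per-task-set behavior referenced
above (simplified, not the actual TaskSetManager code; the kill callback is a
hypothetical stand-in): when one attempt of a task succeeds, the other running
attempts of that task in the same task set are killed. The point of this issue is
that nothing equivalent happens for attempts living in a different (zombie) stage
attempt.
{code:scala}
// Simplified model of one task's attempts inside a single task set.
case class AttemptInfo(tid: Long, running: Boolean)

// killAttempt is a hypothetical stand-in for the scheduler's kill mechanism.
def killOtherAttempts(attempts: Seq[AttemptInfo],
                      successfulTid: Long,
                      killAttempt: Long => Unit): Unit = {
  attempts
    .filter(a => a.running && a.tid != successfulTid)
    .foreach(a => killAttempt(a.tid))
}
{code}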






[jira] [Created] (SPARK-24622) Task attempts in other stage attempts not killed when one task attempt succeeds

2018-06-21 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-24622:
-

 Summary: Task attempts in other stage attempts not killed when one 
task attempt succeeds
 Key: SPARK-24622
 URL: https://issues.apache.org/jira/browse/SPARK-24622
 Project: Spark
  Issue Type: Bug
  Components: Scheduler
Affects Versions: 2.1.0
Reporter: Thomas Graves


Looking through the code handling for 
[https://github.com/apache/spark/pull/21577], I was looking to see how we 
kill task attempts.  I don't see anywhere that we actually kill task attempts 
for stage attempts other than the one that completed successfully.

For instance:

stage 0.0 (stage id 0, attempt 0)

  - task 1.0 (task 1, attempt 0)

Stage 0.1 (stage id 0, attempt 1), started due to a fetch failure for instance

  - task 1.0 (task 1, attempt 0). This is the equivalent of task 1.0 in stage 0.0, 
because task 1.0 in stage 0.0 didn't finish and didn't fail.

Now if task 1.0 in stage 0.0 succeeds, it gets committed and marked as 
successful.  We will mark the task in stage 0.1 as completed, but nowhere in 
the code do I see it actually kill task 1.0 in stage 0.1.

Note that the scheduler does handle the case where we have 2 attempts 
(speculation) within a single stage attempt: it will kill the other attempt when 
one of them succeeds.  See TaskSetManager.handleSuccessfulTask.






[jira] [Updated] (SPARK-24519) MapStatus has 2000 hardcoded

2018-06-19 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24519:
--
Description: 
MapStatus uses a hardcoded value of 2000 partitions to determine whether it should
use the highly compressed map status. We should make it configurable so that users
can more easily tune their jobs with respect to this without having to modify their
code to change the number of partitions.  Note we can leave this as an
internal/undocumented config for now until we have more advice for users on how to
set it.

Some of my reasoning:

The config gives you a way to easily change something without the user having 
to change code, redeploy the jar, and then run again. You can simply change the 
config and rerun. It also allows for easier experimentation. Changing the number of 
partitions has other side effects, whether good or bad is situation dependent: 
it could increase the number of output files when you don't want to, it affects the 
number of tasks needed and thus the executors required to run in parallel, etc.

There have been various talks about this number at Spark Summits where people 
have told customers to increase it to 2001 partitions. If you just do a search for 
"spark 2000 partitions" you will find various things all talking about this number.  
This shows that people are modifying their code to take this into account, so it 
seems to me that having this configurable would be better.

Once we have more advice for users we could expose this and document 
information on it.

 

  was:MapStatus uses a hardcoded value of 2000 partitions to determine if it 
should use the highly compressed map status. We should make it configurable.


> MapStatus has 2000 hardcoded
> 
>
> Key: SPARK-24519
> URL: https://issues.apache.org/jira/browse/SPARK-24519
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Hieu Tri Huynh
>Priority: Minor
>
> MapStatus uses a hardcoded value of 2000 partitions to determine whether it should 
> use the highly compressed map status. We should make it configurable so that 
> users can more easily tune their jobs with respect to this without having to 
> modify their code to change the number of partitions.  Note we can 
> leave this as an internal/undocumented config for now until we have more 
> advice for users on how to set it.
> Some of my reasoning:
> The config gives you a way to easily change something without the user having 
> to change code, redeploy the jar, and then run again. You can simply change the 
> config and rerun. It also allows for easier experimentation. Changing the number 
> of partitions has other side effects, whether good or bad is situation 
> dependent: it could increase the number of output files when 
> you don't want to, it affects the number of tasks needed and thus the executors 
> required to run in parallel, etc.
> There have been various talks about this number at Spark Summits where people 
> have told customers to increase it to 2001 partitions. If you just do 
> a search for "spark 2000 partitions" you will find various things all talking 
> about this number.  This shows that people are modifying their code to take 
> this into account, so it seems to me that having this configurable would be better.
> Once we have more advice for users we could expose this and document 
> information on it.
>  
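As an illustration of the hardcoded threshold and how a setting could replace it,
here is a minimal sketch; the config key below is hypothetical and this is not the
actual MapStatus code:
{code:scala}
import org.apache.spark.SparkConf

// Sketch only: decide between the two map status encodings based on a config
// value instead of the hardcoded 2000.
def useHighlyCompressedMapStatus(conf: SparkConf, numPartitions: Int): Boolean = {
  val threshold =
    conf.getInt("spark.shuffle.minNumPartitionsToHighlyCompress", 2000)
  numPartitions > threshold
}
{code}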






[jira] [Updated] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-15 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24552:
--
Affects Version/s: 2.2.0
   2.2.1
   2.3.0
   2.3.1

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1, 2.2.0, 2.2.1, 2.3.0, 2.3.1
>Reporter: Ryan Blue
>Priority: Blocker
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Updated] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-15 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24552:
--
Priority: Blocker  (was: Critical)

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1, 2.2.0, 2.2.1, 2.3.0, 2.3.1
>Reporter: Ryan Blue
>Priority: Blocker
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Commented] (SPARK-22148) TaskSetManager.abortIfCompletelyBlacklisted should not abort when all current executors are blacklisted but dynamic allocation is enabled

2018-06-14 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-22148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512691#comment-16512691
 ] 

Thomas Graves commented on SPARK-22148:
---

OK, just post an update here if you start working on it. Thanks.

> TaskSetManager.abortIfCompletelyBlacklisted should not abort when all current 
> executors are blacklisted but dynamic allocation is enabled
> -
>
> Key: SPARK-22148
> URL: https://issues.apache.org/jira/browse/SPARK-22148
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.2.0
>Reporter: Juan Rodríguez Hortalá
>Priority: Major
> Attachments: SPARK-22148_WIP.diff
>
>
> Currently TaskSetManager.abortIfCompletelyBlacklisted aborts the TaskSet and 
> the whole Spark job with `task X (partition Y) cannot run anywhere due to 
> node and executor blacklist. Blacklisting behavior can be configured via 
> spark.blacklist.*.` when all the available executors are blacklisted for a 
> pending Task or TaskSet. This makes sense for static allocation, where the 
> set of executors is fixed for the duration of the application, but this might 
> lead to unnecessary job failures when dynamic allocation is enabled. For 
> example, in a Spark application with a single job at a time, when a node 
> fails at the end of a stage attempt, all other executors will complete their 
> tasks, but the tasks running in the executors of the failing node will be 
> pending. Spark will keep waiting for those tasks for 2 minutes by default 
> (spark.network.timeout) until the heartbeat timeout is triggered, and then it 
> will blacklist those executors for that stage. At that point in time, other 
> executors would have been released after being idle for 1 minute by default 
> (spark.dynamicAllocation.executorIdleTimeout), because the next stage hasn't 
> started yet and so there are no more tasks available (assuming the default of 
> spark.speculation = false). So Spark will fail because the only executors 
> available are blacklisted for that stage. 
> An alternative is requesting more executors to the cluster manager in this 
> situation. This could be retried a configurable number of times after a 
> configurable wait time between request attempts, so if the cluster manager 
> fails to provide a suitable executor then the job is aborted like in the 
> previous case. 
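A minimal sketch of the alternative described above, assuming hypothetical
callbacks for the blacklist check and the executor request; this illustrates the
retry idea, not the attached WIP patch:
{code:scala}
import scala.annotation.tailrec

// Returns true if scheduling became possible, false if we should abort as before.
def retryBeforeAborting(allExecutorsBlacklisted: () => Boolean,
                        requestMoreExecutors: () => Unit,
                        maxRetries: Int,
                        waitMillis: Long): Boolean = {
  @tailrec
  def loop(remaining: Int): Boolean = {
    if (!allExecutorsBlacklisted()) true        // something is schedulable again
    else if (remaining <= 0) false              // give up and abort the task set
    else {
      requestMoreExecutors()                    // ask the cluster manager for capacity
      Thread.sleep(waitMillis)                  // configurable wait between attempts
      loop(remaining - 1)
    }
  }
  loop(maxRetries)
}
{code}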






[jira] [Commented] (SPARK-24539) HistoryServer does not display metrics from tasks that complete after stage failure

2018-06-14 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512643#comment-16512643
 ] 

Thomas Graves commented on SPARK-24539:
---

It's possible. When I checked the history server I thought I was actually seeing 
them aggregated properly, but I don't know if I checked the specific task 
events.  Probably all related.

> HistoryServer does not display metrics from tasks that complete after stage 
> failure
> ---
>
> Key: SPARK-24539
> URL: https://issues.apache.org/jira/browse/SPARK-24539
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.1
>Reporter: Imran Rashid
>Priority: Major
>
> I noticed that task metrics for completed tasks with a stage failure do not 
> show up in the new history server.  I have a feeling this is because all of 
> the tasks succeeded *after* the stage had been failed (so they were 
> completions from a "zombie" taskset).  The task metrics (eg. the shuffle read 
> size & shuffle write size) do not show up at all, either in the task table, 
> the executor table, or the overall stage summary metrics.  (they might not 
> show up in the job summary page either, but in the event logs I have, there 
> is another successful stage attempt after this one, and that is the only 
> thing which shows up in the jobs page.)  If you get task details from the api 
> endpoint (eg. 
> http://[host]:[port]/api/v1/applications/[app-id]/stages/[stage-id]/[stage-attempt])
>  then you can see the successful tasks and all the metrics
> Unfortunately the event logs I have are huge and I don't have a small repro 
> handy, but I hope that description is enough to go on.
> I loaded the event logs I have in the SHS from spark 2.2 and they appear fine.






[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512519#comment-16512519
 ] 

Thomas Graves commented on SPARK-24552:
---

Sorry, I just realized the v2 API is still marked experimental, so I am 
downgrading this to critical.

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Ryan Blue
>Priority: Critical
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Updated] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24552:
--
Priority: Critical  (was: Blocker)

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Ryan Blue
>Priority: Critical
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512504#comment-16512504
 ] 

Thomas Graves commented on SPARK-24552:
---

Note: if this is a correctness bug and can cause data loss/corruption, it needs 
to be a blocker, so I changed it to blocker. If I am misunderstanding, please update.

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Ryan Blue
>Priority: Blocker
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Updated] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24552:
--
Priority: Blocker  (was: Major)

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Ryan Blue
>Priority: Blocker
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512500#comment-16512500
 ] 

Thomas Graves commented on SPARK-24552:
---

I agree. I don't think changing the attempt number at this point helps much, 
and it could cause confusion.  I would rather see a change like this happen as 
part of a major reworking of the scheduler.

> Task attempt numbers are reused when stages are retried
> ---
>
> Key: SPARK-24552
> URL: https://issues.apache.org/jira/browse/SPARK-24552
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Ryan Blue
>Priority: Major
>
> When stages are retried due to shuffle failures, task attempt numbers are 
> reused. This causes a correctness bug in the v2 data sources write path.
> Data sources (both the original and v2) pass the task attempt to writers so 
> that writers can use the attempt number to track and clean up data from 
> failed or speculative attempts. In the v2 docs for DataWriterFactory, the 
> attempt number's javadoc states that "Implementations can use this attempt 
> number to distinguish writers of different task attempts."
> When two attempts of a stage use the same (partition, attempt) pair, two 
> tasks can create the same data and attempt to commit. The commit coordinator 
> prevents both from committing and will abort the attempt that finishes last. 
> When using the (partition, attempt) pair to track data, the aborted task may 
> delete data associated with the (partition, attempt) pair. If that happens, 
> the data for the task that committed is also deleted, which is a 
> correctness bug.
> For a concrete example, I have a data source that creates files in place 
> named with {{part---.}}. Because these 
> files are written in place, both tasks create the same file and the one that 
> is aborted deletes the file, leading to data corruption when the file is 
> added to the table.






[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24415:
--
Priority: Critical  (was: Major)

> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Critical
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with Spark 2.3 on YARN with task failures and blacklisting, the 
> aggregated metrics by executor are not correct.  In my example it should have 
> 2 failed tasks but it only shows one.  Note I tested with the master branch to 
> verify it's not fixed.
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
> import org.apache.spark.SparkEnv 
> sc.parallelize(1 to 1, 10).map \{ x => if (SparkEnv.get.executorId.toInt 
> >= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
> executor") else (x % 3, x) }.reduceByKey((a, b) => a + b).collect()






[jira] [Commented] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495224#comment-16495224
 ] 

Thomas Graves commented on SPARK-24415:
---

OK, so the issue here is in AppStatusListener, where it only updates the 
task metrics for liveStages.  It gets the second taskEnd event after it has 
cancelled stage 2, so the stage is no longer in the live stages array.
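A simplified illustration of the pattern described above (a sketch, not the actual
AppStatusListener code): per-stage aggregates are only updated for stages still
tracked as live, so a taskEnd that arrives after the stage was cancelled never
reaches the stage-level metrics.
{code:scala}
import scala.collection.mutable

case class StageKey(stageId: Int, attemptId: Int)
class StageAgg { var failedTasks: Int = 0 }

val liveStages = mutable.Map[StageKey, StageAgg]()

def onTaskEnd(stageId: Int, attemptId: Int, failed: Boolean): Unit = {
  // Only stages still in liveStages are updated; a late event for a stage that
  // was already cancelled/removed is dropped, so its failure count stays low.
  liveStages.get(StageKey(stageId, attemptId)).foreach { agg =>
    if (failed) agg.failedTasks += 1
  }
}
{code}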

> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with Spark 2.3 on YARN with task failures and blacklisting, the 
> aggregated metrics by executor are not correct.  In my example it should have 
> 2 failed tasks but it only shows one.  Note I tested with the master branch to 
> verify it's not fixed.
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
> import org.apache.spark.SparkEnv 
> sc.parallelize(1 to 1, 10).map \{ x => if (SparkEnv.get.executorId.toInt 
> >= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
> executor") else (x % 3, x) }.reduceByKey((a, b) => a + b).collect()






[jira] [Commented] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495204#comment-16495204
 ] 

Thomas Graves commented on SPARK-24415:
---

It also looks like they show up properly in the aggregated metrics in the 
history server, although if you look at the Tasks (for all stages) column 
on the jobs page, it only lists a single task failure where it should list 2.

> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with Spark 2.3 on YARN with task failures and blacklisting, the 
> aggregated metrics by executor are not correct.  In my example it should have 
> 2 failed tasks but it only shows one.  Note I tested with the master branch to 
> verify it's not fixed.
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
> import org.apache.spark.SparkEnv 
> sc.parallelize(1 to 1, 10).map \{ x => if (SparkEnv.get.executorId.toInt 
> >= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
> executor") else (x % 3, x) }.reduceByKey((a, b) => a + b).collect()






[jira] [Commented] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495200#comment-16495200
 ] 

Thomas Graves commented on SPARK-24415:
---

This might actually be an ordering-of-events issue.  Note that the config I 
have is stage.maxFailedTasksPerExecutor=1, so it should really only 
have 1 failed task, but looking at the log it seems it starts the second task 
before fully handling the blacklist from the first failure:

 

18/05/30 13:57:20 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, 
gsrd259n13.red.ygrid.yahoo.com, executor 2, partition 0, PROCESS_LOCAL, 7746 
bytes)
[Stage 2:> (0 + 1) / 10]18/05/30 13:57:20 INFO BlockManagerMasterEndpoint: 
Registering block manager gsrd259n13.red.ygrid.yahoo.com:43203 with 912.3 MB 
RAM, BlockManagerId(2, gsrd259n13.red.ygrid.yahoo.com, 43203, None)
18/05/30 13:57:21 INFO Client: Application report for 
application_1526529576371_25524 (state: RUNNING)
18/05/30 13:57:21 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 
gsrd259n13.red.ygrid.yahoo.com:43203 (size: 1941.0 B, free: 912.3 MB)
18/05/30 13:57:21 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 3, 
gsrd259n13.red.ygrid.yahoo.com, executor 2, partition 1, PROCESS_LOCAL, 7747 
bytes)
18/05/30 13:57:21 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, 
gsrd259n13.red.ygrid.yahoo.com, executor 2): java.lang.RuntimeException: Bad 
executor



18/05/30 13:57:21 INFO TaskSetBlacklist: Blacklisting executor 2 for stage 2

18/05/30 13:57:21 INFO YarnScheduler: Cancelling stage 2
18/05/30 13:57:21 INFO YarnScheduler: Stage 2 was cancelled
18/05/30 13:57:21 INFO DAGScheduler: ShuffleMapStage 2 (map at :26) 
failed in 12.063 s due to Job aborted due to stage failure:

18/05/30 13:57:21 INFO DAGScheduler: Job 1 failed: collect at :26, 
took 12.069052 s

 

The thing is, though, that the executors page shows 2 task failures on that 
node; it's just the aggregated metrics for that stage that don't include them.

> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with Spark 2.3 on YARN with task failures and blacklisting, the 
> aggregated metrics by executor are not correct.  In my example it should have 
> 2 failed tasks but it only shows one.  Note I tested with the master branch to 
> verify it's not fixed.
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
> import org.apache.spark.SparkEnv 
> sc.parallelize(1 to 1, 10).map \{ x => if (SparkEnv.get.executorId.toInt 
> >= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
> executor") else (x % 3, x) }.reduceByKey((a, b) => a + b).collect()






[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24415:
--
Description: 
Running with Spark 2.3 on YARN with task failures and blacklisting, the 
aggregated metrics by executor are not correct.  In my example it should have 2 
failed tasks but it only shows one.  Note I tested with the master branch to 
verify it's not fixed.

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

import org.apache.spark.SparkEnv 

sc.parallelize(1 to 1, 10).map \{ x => if (SparkEnv.get.executorId.toInt >= 
1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
executor") else (x % 3, x) }.reduceByKey((a, b) => a + b).collect()

  was:
Running with Spark 2.3 on YARN with task failures and blacklisting, the 
aggregated metrics by executor are not correct.  In my example it should have 2 
failed tasks but it only shows one.  Note I tested with the master branch to 
verify it's not fixed.

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

 

 

sc.parallelize(1 to 1, 10).map \{ x => if (SparkEnv.get.executorId.toInt >= 
1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
executor") else (x % 3, x)  }.reduceByKey((a, b) => a + b).collect()


> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with Spark 2.3 on YARN with task failures and blacklisting, the 
> aggregated metrics by executor are not correct.  In my example it should have 
> 2 failed tasks but it only shows one.  Note I tested with the master branch to 
> verify it's not fixed.
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
> import org.apache.spark.SparkEnv 
> sc.parallelize(1 to 1, 10).map \{ x => if (SparkEnv.get.executorId.toInt 
> >= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
> executor") else (x % 3, x) }.reduceByKey((a, b) => a + b).collect()






[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24415:
--
Description: 
Running with Spark 2.3 on YARN with task failures and blacklisting, the 
aggregated metrics by executor are not correct.  In my example it should have 2 
failed tasks but it only shows one.  Note I tested with the master branch to 
verify it's not fixed.

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

 

 

sc.parallelize(1 to 1, 10).map \{ x => if (SparkEnv.get.executorId.toInt >= 
1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
executor") else (x % 3, x)  }.reduceByKey((a, b) => a + b).collect()

  was:
Running with Spark 2.3 on YARN with task failures and blacklisting, the 
aggregated metrics by executor are not correct.  In my example it should have 2 
failed tasks but it only shows one.  Note I tested with the master branch to 
verify it's not fixed.

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

 

sc.parallelize(1 to 1, 10).map { x => | if (SparkEnv.get.executorId.toInt 
>= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
executor") | else (x % 3, x) | }.reduceByKey((a, b) => a + b).collect()


> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with Spark 2.3 on YARN and having task failures and blacklisting, the 
> aggregated metrics by executor are not correct. In my example it should show 
> 2 failed tasks but it only shows one. Note I tested with the master branch to 
> verify it's not fixed.
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
>  
>  
> sc.parallelize(1 to 1, 10).map { x => if (SparkEnv.get.executorId.toInt 
> >= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
> executor") else (x % 3, x) }.reduceByKey((a, b) => a + b).collect()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24415:
--
Description: 
Running with Spark 2.3 on YARN and having task failures and blacklisting, the 
aggregated metrics by executor are not correct. In my example it should show 2 
failed tasks but it only shows one. Note I tested with the master branch to 
verify it's not fixed.

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

 

sc.parallelize(1 to 1, 10).map { x => | if (SparkEnv.get.executorId.toInt 
>= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
executor") | else (x % 3, x) | }.reduceByKey((a, b) => a + b).collect()

  was:
Running with Spark 2.3 on YARN and having task failures and blacklisting, the 
aggregated metrics by executor are not correct. In my example it should show 2 
failed tasks but it only shows one. Note I tested with the master branch to 
verify it's not fixed.

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

 

sc.parallelize(1 to 1, 10).map

{ x => | if (SparkEnv.get.executorId.toInt >= 1 && 
SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad executor") 
| else (x % 3, x) | }

.reduceByKey((a, b) => a + b).collect()


> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with Spark 2.3 on YARN and having task failures and blacklisting, the 
> aggregated metrics by executor are not correct. In my example it should show 
> 2 failed tasks but it only shows one. Note I tested with the master branch to 
> verify it's not fixed.
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
>  
> sc.parallelize(1 to 1, 10).map { x => | if (SparkEnv.get.executorId.toInt 
> >= 1 && SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
> executor") | else (x % 3, x) | }.reduceByKey((a, b) => a + b).collect()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-24413) Executor Blacklisting shouldn't immediately fail the application if dynamic allocation is enabled and no active executors

2018-05-29 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-24413.
---
Resolution: Duplicate

> Executor Blacklisting shouldn't immediately fail the application if dynamic 
> allocation is enabled and no active executors
> -
>
> Key: SPARK-24413
> URL: https://issues.apache.org/jira/browse/SPARK-24413
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
>
> Currently with executor blacklisting enabled, dynamic allocation on, and you 
> only have 1 active executor (spark.blacklist.killBlacklistedExecutors setting 
> doesn't matter in this case, can be on or off), if you have a task fail that 
> results in the 1 executor you have getting blacklisted, then your entire 
> application will fail.  The error you get is something like:
> Aborting TaskSet 0.0 because task 9 (partition 9)
> cannot run anywhere due to node and executor blacklist.
> This is very undesirable behavior because you may have a huge job where one 
> task is the long tail, and if it happens to hit a bad node that would 
> blacklist it, the entire job fails.
> Ideally, since dynamic allocation is on, the scheduler should not immediately 
> fail but should let dynamic allocation try to get more executors. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24413) Executor Blacklisting shouldn't immediately fail the application if dynamic allocation is enabled and no active executors

2018-05-29 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494279#comment-16494279
 ] 

Thomas Graves commented on SPARK-24413:
---

Thanks for linking those; we can just dup this to SPARK-22148.

> Executor Blacklisting shouldn't immediately fail the application if dynamic 
> allocation is enabled and no active executors
> -
>
> Key: SPARK-24413
> URL: https://issues.apache.org/jira/browse/SPARK-24413
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
>
> Currently with executor blacklisting enabled, dynamic allocation on, and you 
> only have 1 active executor (spark.blacklist.killBlacklistedExecutors setting 
> doesn't matter in this case, can be on or off), if you have a task fail that 
> results in the 1 executor you have getting blacklisted, then your entire 
> application will fail.  The error you get is something like:
> Aborting TaskSet 0.0 because task 9 (partition 9)
> cannot run anywhere due to node and executor blacklist.
> This is very undesirable behavior because you may have a huge job where one 
> task is the long tail, and if it happens to hit a bad node that would 
> blacklist it, the entire job fails.
> Ideally, since dynamic allocation is on, the scheduler should not immediately 
> fail but should let dynamic allocation try to get more executors. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24414) Stages page doesn't show all task attempts when failures

2018-05-29 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494265#comment-16494265
 ] 

Thomas Graves commented on SPARK-24414:
---

Also, just an FYI, I filed SPARK-24415; not sure if they are related as I 
haven't dug into that one yet.

> Stages page doesn't show all task attempts when failures
> 
>
> Key: SPARK-24414
> URL: https://issues.apache.org/jira/browse/SPARK-24414
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Critical
>
> If you have task failures, the StagePage doesn't render all the task attempts 
> properly. It seems to make the table the size of the total number of 
> successful tasks rather than including all the failed tasks.
> Even though the table size is smaller, if you sort by various columns you can 
> see all the tasks are actually there, it just seems the size of the table is 
> wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24414) Stages page doesn't show all task attempts when failures

2018-05-29 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494237#comment-16494237
 ] 

Thomas Graves commented on SPARK-24414:
---

I am looking to see if we can just return an empty table in the case the tasks 
aren't initialized yet. If you get to it first that's fine, or if you had 
something else in mind.

> Stages page doesn't show all task attempts when failures
> 
>
> Key: SPARK-24414
> URL: https://issues.apache.org/jira/browse/SPARK-24414
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Critical
>
> If you have task failures, the StagePage doesn't render all the task attempts 
> properly. It seems to make the table the size of the total number of 
> successful tasks rather than including all the failed tasks.
> Even though the table size is smaller, if you sort by various columns you can 
> see all the tasks are actually there, it just seems the size of the table is 
> wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24414) Stages page doesn't show all task attempts when failures

2018-05-29 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494199#comment-16494199
 ] 

Thomas Graves commented on SPARK-24414:
---

looks like this was broken by SPARK-23147, so we probably need to find a 
different solution.

[~vanzin] [~jerryshao]

> Stages page doesn't show all task attempts when failures
> 
>
> Key: SPARK-24414
> URL: https://issues.apache.org/jira/browse/SPARK-24414
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Critical
>
> If you have task failures, the StagePage doesn't render all the task attempts 
> properly. It seems to make the table the size of the total number of 
> successful tasks rather than including all the failed tasks.
> Even though the table size is smaller, if you sort by various columns you can 
> see all the tasks are actually there, it just seems the size of the table is 
> wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-29 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24415:
--
Description: 
Running with Spark 2.3 on YARN and having task failures and blacklisting, the 
aggregated metrics by executor are not correct. In my example it should show 2 
failed tasks but it only shows one. Note I tested with the master branch to 
verify it's not fixed.

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

 

sc.parallelize(1 to 1, 10).map

{ x => | if (SparkEnv.get.executorId.toInt >= 1 && 
SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad executor") 
| else (x % 3, x) | }

.reduceByKey((a, b) => a + b).collect()

  was:
Running with spark 2.3 on yarn and having task failures and blacklisting, the 
aggregated metrics by executor are not correct.  In my example it should have 2 
failed tasks but it only shows one.  

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

 

sc.parallelize(1 to 1, 10).map { x =>
 | if (SparkEnv.get.executorId.toInt >= 1 && SparkEnv.get.executorId.toInt <= 
4) throw new RuntimeException("Bad executor")
 | else (x % 3, x)
 | }.reduceByKey((a, b) => a + b).collect()


> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with Spark 2.3 on YARN and having task failures and blacklisting, the 
> aggregated metrics by executor are not correct. In my example it should show 
> 2 failed tasks but it only shows one. Note I tested with the master branch to 
> verify it's not fixed.
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
>  
> sc.parallelize(1 to 1, 10).map
> { x => | if (SparkEnv.get.executorId.toInt >= 1 && 
> SparkEnv.get.executorId.toInt <= 4) throw new RuntimeException("Bad 
> executor") | else (x % 3, x) | }
> .reduceByKey((a, b) => a + b).collect()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-29 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-24415:
-

 Summary: Stage page aggregated executor metrics wrong when 
failures 
 Key: SPARK-24415
 URL: https://issues.apache.org/jira/browse/SPARK-24415
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 2.3.0
Reporter: Thomas Graves
 Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png

Running with spark 2.3 on yarn and having task failures and blacklisting, the 
aggregated metrics by executor are not correct.  In my example it should have 2 
failed tasks but it only shows one.  

I will attach screen shot.

To reproduce:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
--executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
--conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
"spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
"spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
"spark.blacklist.killBlacklistedExecutors=true"

 

sc.parallelize(1 to 1, 10).map { x =>
 | if (SparkEnv.get.executorId.toInt >= 1 && SparkEnv.get.executorId.toInt <= 
4) throw new RuntimeException("Bad executor")
 | else (x % 3, x)
 | }.reduceByKey((a, b) => a + b).collect()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-29 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24415:
--
Attachment: Screen Shot 2018-05-29 at 2.15.38 PM.png

> Stage page aggregated executor metrics wrong when failures 
> ---
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> Running with spark 2.3 on yarn and having task failures and blacklisting, the 
> aggregated metrics by executor are not correct.  In my example it should have 
> 2 failed tasks but it only shows one.  
> I will attach screen shot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client 
> --executor-memory=2G --num-executors=1 --conf "spark.blacklist.enabled=true" 
> --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" --conf 
> "spark.blacklist.stage.maxFailedExecutorsPerNode=1"  --conf 
> "spark.blacklist.application.maxFailedTasksPerExecutor=2" --conf 
> "spark.blacklist.killBlacklistedExecutors=true"
>  
> sc.parallelize(1 to 1, 10).map { x =>
>  | if (SparkEnv.get.executorId.toInt >= 1 && SparkEnv.get.executorId.toInt <= 
> 4) throw new RuntimeException("Bad executor")
>  | else (x % 3, x)
>  | }.reduceByKey((a, b) => a + b).collect()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24414) Stages page doesn't show all task attempts when failures

2018-05-29 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494083#comment-16494083
 ] 

Thomas Graves commented on SPARK-24414:
---

To reproduce this, simply start a shell:

$SPARK_HOME/bin/spark-shell --num-executors 5  --master yarn --deploy-mode 
client

Run something that gets some task failures but not all:

sc.parallelize(1 to 1, 10).map { x =>
 | if (SparkEnv.get.executorId.toInt >= 1 && SparkEnv.get.executorId.toInt <= 
4) throw new RuntimeException("Bad executor")
 | else (x % 3, x)
 | }.reduceByKey((a, b) => a + b).collect()

 

Go to the stages page and you will only see 10 tasks rendered when it should 
show 21 total between succeeded and failed.
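
For reference, here is a self-contained form of the snippet above that can be 
pasted into spark-shell in one go (a sketch based on the transcript above, with 
the import added; the `1 to 1` range is kept exactly as archived, it was likely 
a larger number in the original report):

{code:scala}
import org.apache.spark.SparkEnv

// Records processed on executors 1-4 throw, so the stage records failed task
// attempts alongside the successful retries that land on other executors.
sc.parallelize(1 to 1, 10).map { x =>
  if (SparkEnv.get.executorId.toInt >= 1 && SparkEnv.get.executorId.toInt <= 4) {
    throw new RuntimeException("Bad executor")
  } else {
    (x % 3, x)
  }
}.reduceByKey((a, b) => a + b).collect()
{code}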

> Stages page doesn't show all task attempts when failures
> 
>
> Key: SPARK-24414
> URL: https://issues.apache.org/jira/browse/SPARK-24414
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Critical
>
> If you have task failures, the StagePage doesn't render all the task attempts 
> properly. It seems to make the table the size of the total number of 
> successful tasks rather than including all the failed tasks.
> Even though the table size is smaller, if you sort by various columns you can 
> see all the tasks are actually there, it just seems the size of the table is 
> wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24414) Stages page doesn't show all task attempts when failures

2018-05-29 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-24414:
-

 Summary: Stages page doesn't show all task attempts when failures
 Key: SPARK-24414
 URL: https://issues.apache.org/jira/browse/SPARK-24414
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 2.3.0
Reporter: Thomas Graves


If you have task failures, the StagePage doesn't render all the task attempts 
properly. It seems to make the table the size of the total number of 
successful tasks rather than including all the failed tasks.

Even though the table size is smaller, if you sort by various columns you can 
see all the tasks are actually there, it just seems the size of the table is 
wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24413) Executor Blacklisting shouldn't immediately fail the application if dynamic allocation is enabled and no active executors

2018-05-29 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493978#comment-16493978
 ] 

Thomas Graves commented on SPARK-24413:
---

[~imranr]  thoughts on this?

> Executor Blacklisting shouldn't immediately fail the application if dynamic 
> allocation is enabled and no active executors
> -
>
> Key: SPARK-24413
> URL: https://issues.apache.org/jira/browse/SPARK-24413
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
>
> Currently with executor blacklisting enabled, dynamic allocation on, and you 
> only have 1 active executor (spark.blacklist.killBlacklistedExecutors setting 
> doesn't matter in this case, can be on or off), if you have a task fail that 
> results in the 1 executor you have getting blacklisted, then your entire 
> application will fail.  The error you get is something like:
> Aborting TaskSet 0.0 because task 9 (partition 9)
> cannot run anywhere due to node and executor blacklist.
> This is very undesirable behavior because you may have a huge job where one 
> task is the long tail, and if it happens to hit a bad node that would 
> blacklist it, the entire job fails.
> Ideally, since dynamic allocation is on, the scheduler should not immediately 
> fail but should let dynamic allocation try to get more executors. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24413) Executor Blacklisting shouldn't immediately fail the application if dynamic allocation is enabled and no active executors

2018-05-29 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-24413:
--
Summary: Executor Blacklisting shouldn't immediately fail the application 
if dynamic allocation is enabled and no active executors  (was: Executor 
Blacklisting shouldn't immediately fail the application if dynamic allocation 
is enabled and it doesn't have any other active executors )

> Executor Blacklisting shouldn't immediately fail the application if dynamic 
> allocation is enabled and no active executors
> -
>
> Key: SPARK-24413
> URL: https://issues.apache.org/jira/browse/SPARK-24413
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
>
> Currently with executor blacklisting enabled, dynamic allocation on, and you 
> only have 1 active executor (spark.blacklist.killBlacklistedExecutors setting 
> doesn't matter in this case, can be on or off), if you have a task fail that 
> results in the 1 executor you have getting blacklisted, then your entire 
> application will fail.  The error you get is something like:
> Aborting TaskSet 0.0 because task 9 (partition 9)
> cannot run anywhere due to node and executor blacklist.
> This is very undesirable behavior because you may have a huge job where one 
> task is the long tail, and if it happens to hit a bad node that would 
> blacklist it, the entire job fails.
> Ideally, since dynamic allocation is on, the scheduler should not immediately 
> fail but should let dynamic allocation try to get more executors. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24413) Executor Blacklisting shouldn't immediately fail the application if dynamic allocation is enabled and it doesn't have any other active executors

2018-05-29 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-24413:
-

 Summary: Executor Blacklisting shouldn't immediately fail the 
application if dynamic allocation is enabled and it doesn't have any other 
active executors 
 Key: SPARK-24413
 URL: https://issues.apache.org/jira/browse/SPARK-24413
 Project: Spark
  Issue Type: Improvement
  Components: Scheduler
Affects Versions: 2.3.0
Reporter: Thomas Graves


Currently with executor blacklisting enabled, dynamic allocation on, and you 
only have 1 active executor (spark.blacklist.killBlacklistedExecutors setting 
doesn't matter in this case, can be on or off), if you have a task fail that 
results in the 1 executor you have getting blacklisted, then your entire 
application will fail.  The error you get is something like:

Aborting TaskSet 0.0 because task 9 (partition 9)
cannot run anywhere due to node and executor blacklist.

This is very undesirable behavior because you may have a huge job where one task 
is the long tail, and if it happens to hit a bad node that would blacklist it, 
the entire job fails.

Ideally, since dynamic allocation is on, the scheduler should not immediately 
fail but should let dynamic allocation try to get more executors. 
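
For illustration only (not from the original report), the scenario above can be 
set up with something like the following, assuming an external shuffle service is 
available for dynamic allocation; once the cluster has idled down to a single 
executor, a task failure that blacklists it hits the abort described above:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client \
  --conf "spark.shuffle.service.enabled=true" \
  --conf "spark.dynamicAllocation.enabled=true" \
  --conf "spark.dynamicAllocation.minExecutors=0" \
  --conf "spark.dynamicAllocation.executorIdleTimeout=30s" \
  --conf "spark.blacklist.enabled=true" \
  --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1"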

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6235) Address various 2G limits

2018-05-21 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482746#comment-16482746
 ] 

Thomas Graves commented on SPARK-6235:
--

>> Still unsupported:
 * large task results
 * large blocks in the WAL
 * individual records larger than 2 GB

Can you clarify what the WAL is?

I have seen individual records larger than 2 GB; I don't think it's as common 
though.

Also, can you clarify large task results? 

> Address various 2G limits
> -
>
> Key: SPARK-6235
> URL: https://issues.apache.org/jira/browse/SPARK-6235
> Project: Spark
>  Issue Type: Umbrella
>  Components: Shuffle, Spark Core
>Reporter: Reynold Xin
>Priority: Major
> Attachments: SPARK-6235_Design_V0.02.pdf
>
>
> An umbrella ticket to track the various 2G limit we have in Spark, due to the 
> use of byte arrays and ByteBuffers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21033) fix the potential OOM in UnsafeExternalSorter

2018-05-08 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467803#comment-16467803
 ] 

Thomas Graves commented on SPARK-21033:
---

[~cloud_fan] the follow-up PR [https://github.com/apache/spark/pull/21077] 
didn't go into Spark 2.3.0; this should have had its own Jira and we need to 
update the fix version. Can you please fix it so we properly track what version 
this is in? Also, does this need to be backported to 2.3.1?

> fix the potential OOM in UnsafeExternalSorter
> -
>
> Key: SPARK-21033
> URL: https://issues.apache.org/jira/browse/SPARK-21033
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
> Fix For: 2.3.0
>
>
> In `UnsafeInMemorySorter`, one record may take 32 bytes: 1 `long` for 
> pointer, 1 `long` for key-prefix, and another 2 `long`s as the temporary 
> buffer for radix sort.
> In `UnsafeExternalSorter`, we set the 
> `DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD` to be `1024 * 1024 * 1024 / 2`, 
> and hoping the max size of the point array to be 8 GB. However, this is wrong: 
> `1024 * 1024 * 1024 / 2 * 32` is actually 16 GB, and if we grow the point 
> array before reaching this limitation, we may hit the max-page-size error.
> Users may see exception like this on large dataset:
> {code}
> Caused by: java.lang.IllegalArgumentException: Cannot allocate a page with 
> more than 17179869176 bytes
> at 
> org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:241)
> at 
> org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:121)
> at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPageIfNecessary(UnsafeExternalSorter.java:374)
> at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:396)
> at 
> org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:94)
> ...
> {code}
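
As a quick back-of-the-envelope check of the arithmetic described above (a 
sketch, not Spark code; the 32 bytes per record follows the breakdown given in 
the description):

{code:scala}
val spillThreshold = 1024L * 1024L * 1024L / 2L  // DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD
val bytesPerRecord = 32L                         // pointer + key-prefix + 2-long radix sort buffer
val pointArrayBytes = spillThreshold * bytesPerRecord
// Prints 16.0 GiB, not the intended 8 GiB, which is why the allocator can be
// asked for a page larger than the ~17179869176-byte maximum in the stack trace.
println(s"point array at the threshold: ${pointArrayBytes.toDouble / (1L << 30)} GiB")
{code}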



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24124) Spark history server should create spark.history.store.path and set permissions properly

2018-04-30 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458835#comment-16458835
 ] 

Thomas Graves commented on SPARK-24124:
---

[~vanzin]  any objections to this?

> Spark history server should create spark.history.store.path and set 
> permissions properly
> 
>
> Key: SPARK-24124
> URL: https://issues.apache.org/jira/browse/SPARK-24124
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
>
> Currently, with the new Spark history server, you can set 
> spark.history.store.path to a location to store the LevelDB files. The 
> directory has to be created before the history server can use that path.
> We should just have the history server create it and set the file permissions 
> on the LevelDB files to be restrictive -> new FsPermission((short) 0700).
> The shuffle service already does this; it would be much more convenient to 
> use and would prevent people from making mistakes with the permissions on the 
> directory and files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24124) Spark history server should create spark.history.store.path and set permissions properly

2018-04-30 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-24124:
-

 Summary: Spark history server should create 
spark.history.store.path and set permissions properly
 Key: SPARK-24124
 URL: https://issues.apache.org/jira/browse/SPARK-24124
 Project: Spark
  Issue Type: Story
  Components: Spark Core
Affects Versions: 2.3.0
Reporter: Thomas Graves


Currently, with the new Spark history server, you can set spark.history.store.path 
to a location to store the LevelDB files. The directory has to be created before 
the history server can use that path.

We should just have the history server create it and set the file permissions 
on the LevelDB files to be restrictive -> new FsPermission((short) 0700).

The shuffle service already does this; it would be much more convenient to 
use and would prevent people from making mistakes with the permissions on the 
directory and files.
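
A minimal sketch of the idea (illustrative only; the path handling and use of the 
local filesystem are assumptions, not the history server's actual code, and it 
assumes the Hadoop client jars are on the classpath, e.g. inside spark-shell):

{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.fs.permission.{FsAction, FsPermission}

// Stand-in for the configured spark.history.store.path value.
val storePath = new Path("/tmp/spark-history-store")
val fs = FileSystem.getLocal(new Configuration())
// Owner-only access, equivalent to 0700.
val ownerOnly = new FsPermission(FsAction.ALL, FsAction.NONE, FsAction.NONE)
if (!fs.exists(storePath)) {
  fs.mkdirs(storePath, ownerOnly)
}
// Re-apply explicitly so the umask doesn't loosen what mkdirs created.
fs.setPermission(storePath, ownerOnly)
{code}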



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-22683:
-

Assignee: Julien Cuquemelle

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Assignee: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.0
>
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependent on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such old executors could become contention 
> points for other exes trying to remotely access blocks in the old exes (not 
> witnessed in the jobs I'm talking about, but we did see this behavior in 
> other jobs)
> Proposal: 
> Simply add a tasksPerExecutorSlot parameter, which makes it possible to 
> specify how many tasks a single taskSlot should ideally execute to mitigate 
> the overhead of executor allocation.
> PR: https://github.com/apache/spark/pull/19881



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450131#comment-16450131
 ] 

Thomas Graves commented on SPARK-22683:
---

Note this added a new config, spark.dynamicAllocation.executorAllocationRatio, 
defaulting to 1.0, which is the same behavior as existing releases.
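
For example (illustrative values only), asking dynamic allocation to target 
roughly two pending tasks per task slot instead of one would look like:

$SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client \
  --conf "spark.shuffle.service.enabled=true" \
  --conf "spark.dynamicAllocation.enabled=true" \
  --conf "spark.dynamicAllocation.executorAllocationRatio=0.5"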

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.0
>
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependent on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such old executors could become contention 
> points for other exes trying to remotely access blocks in the old exes (not 
> witnessed in the jobs I'm talking about, but we did see this behavior in 
> other jobs)
> Proposal: 
> Simply add a tasksPerExecutorSlot parameter, which makes it possible to 
> specify how many tasks a single taskSlot should ideally execute to mitigate 
> the overhead of executor allocation.
> PR: https://github.com/apache/spark/pull/19881



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Resolved] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-22683.
---
   Resolution: Fixed
Fix Version/s: 2.4.0

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.0
>
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependent on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such old executors could become contention 
> points for other exes trying to remotely access blocks in the old exes (not 
> witnessed in the jobs I'm talking about, but we did see this behavior in 
> other jobs)
> Proposal: 
> Simply add a tasksPerExecutorSlot parameter, which makes it possible to 
> specify how many tasks a single taskSlot should ideally execute to mitigate 
> the overhead of executor allocation.
> PR: https://github.com/apache/spark/pull/19881



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23850) We should not redact username|user|url from UI by default

2018-04-23 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449054#comment-16449054
 ] 

Thomas Graves commented on SPARK-23850:
---

The url redaction seems somewhat silly to me too; look at the environment page on 
YARN, at least in our environment it shows "redacted" in a bunch of places that 
don't make sense. If it's an issue with the thriftserver and certain urls we 
should fix those separately.
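
For example (illustrative regex, not an exact recommendation), a deployment that 
does want these values hidden could opt back in through the existing redaction 
config:

$SPARK_HOME/bin/spark-shell --conf \
  "spark.redaction.regex=(?i)secret|password|token|user|username|url"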

> We should not redact username|user|url from UI by default
> -
>
> Key: SPARK-23850
> URL: https://issues.apache.org/jira/browse/SPARK-23850
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> SPARK-22479 was filed to not print the jdbc credentials in the logs, but in there 
> they also added the username and url to be redacted. I'm not sure why these 
> were added, and to me by default these do not have security concerns. It 
> makes it more usable by default to be able to see these things. Users with 
> high security concerns can simply add them in their configs.
> Also, on yarn just redacting url doesn't secure anything, because if you go to 
> the environment ui page you see all sorts of paths, and really it's just 
> confusing that some of it is redacted and other parts aren't. If this was 
> specifically for jdbc I think it needs to be applied just there and not 
> broadly.
> If we remove these we need to test what the jdbc driver is going to log from 
> SPARK-22479.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23964) why does Spillable wait for 32 elements?

2018-04-19 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444043#comment-16444043
 ] 

Thomas Graves commented on SPARK-23964:
---

So far in my testing I haven't seen any performance regressions. Doing the 
accounting to acquire more memory takes no time at all. Obviously, if you have a 
small heap and it can't acquire more memory, it will spill, but that is what you 
want so you don't OOM.

> why does Spillable wait for 32 elements?
> 
>
> Key: SPARK-23964
> URL: https://issues.apache.org/jira/browse/SPARK-23964
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> The spillable class has a check in maybeSpill as to when it tries to acquire 
> more memory and determine if it should spill:
> if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {
> Before it looks to see if it should spill.  
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]
> I'm wondering why it has the elementsRead %32  in it?  If I have a small 
> number of elements that are huge this can easily cause OOM before we actually 
> spill.  
> I saw a few conversations on this and one Jira related: 
> https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
> answer to this.
> anyone have history on this?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-15703) Make ListenerBus event queue size configurable

2018-04-17 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440855#comment-16440855
 ] 

Thomas Graves commented on SPARK-15703:
---

This Jira is purely about making the size of the event queue configurable, which 
would allow you to increase it as long as you have sufficient driver memory. There 
is no current fix for it dropping events. There is a fix that went into 2.3 
that makes it so the critical services aren't affected:

https://issues.apache.org/jira/browse/SPARK-18838
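
For example (illustrative value; the property this Jira added is 
spark.scheduler.listenerbus.eventqueue.size in the releases where it first 
shipped, renamed to spark.scheduler.listenerbus.eventqueue.capacity in 2.3+, so 
check the docs for your version):

$SPARK_HOME/bin/spark-shell --conf "spark.scheduler.listenerbus.eventqueue.capacity=20000"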

> Make ListenerBus event queue size configurable
> --
>
> Key: SPARK-15703
> URL: https://issues.apache.org/jira/browse/SPARK-15703
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler, Web UI
>Affects Versions: 2.0.0
>Reporter: Thomas Graves
>Assignee: Dhruve Ashar
>Priority: Minor
> Fix For: 2.0.1, 2.1.0
>
> Attachments: Screen Shot 2016-06-01 at 11.21.32 AM.png, Screen Shot 
> 2016-06-01 at 11.23.48 AM.png, SparkListenerBus .png, 
> spark-dynamic-executor-allocation.png
>
>
> The Spark UI doesn't seem to be showing all the tasks and metrics.
> I ran a job with 10 tasks but the Detail stage page says it completed 93029:
> Summary Metrics for 93029 Completed Tasks
> The Stages for all jobs page lists that only 89519/10 tasks finished but 
> it's completed. The metrics for shuffle write and input are also incorrect.
> I will attach screen shots.
> I checked the logs and it does show that all the tasks actually finished.
> 16/06/01 16:15:42 INFO TaskSetManager: Finished task 59880.0 in stage 2.0 
> (TID 54038) in 265309 ms on 10.213.45.51 (10/10)
> 16/06/01 16:15:42 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks 
> have all completed, from pool



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-12 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436366#comment-16436366
 ] 

Thomas Graves commented on YARN-8149:
-

Thinking about this a little more, even with the current preemption on, I don't 
think preemption is smart enough to keep starvation from happening. If 
preemption were smart enough to kill enough containers on a reserved node to 
make it so the big container actually gets scheduled there, that might be ok. 
But last time I checked it doesn't do that.

Without that, or another way to prevent starvation, I wouldn't want to 
remove this. I think adding a config would be alright, but if anyone finds it 
useful you can't remove it, and it would just be an extra config.

If we have other ideas to simplify this or make it better, great, we should look 
at them. Or if there is a way for us to get stats on whether this is useful, we 
could add those, run, and determine if we should remove it.

> Revisit behavior of Re-Reservation in Capacity Scheduler
> 
>
> Key: YARN-8149
> URL: https://issues.apache.org/jira/browse/YARN-8149
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
>
> Frankly speaking, I'm not sure why we need the re-reservation. The formula is 
> not that easy to understand:
> Inside: 
> {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator#shouldAllocOrReserveNewContainer}}
> {code:java}
> starvation = re-reservation / (#reserved-container * 
>  (1 - min(requested-resource / max-alloc, 
>   max-alloc - min-alloc / max-alloc))
> should_allocate = starvation + requiredContainers - reservedContainers > 
> 0{code}
> I think we should be able to remove the starvation computation, just to check 
> requiredContainers > reservedContainers should be enough.
> In a large cluster, we can easily overflow re-reservation to MAX_INT, see 
> YARN-7636. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-12 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436295#comment-16436295
 ] 

Thomas Graves commented on YARN-8149:
-

Are you going to do anything with starvation then, or allocate a certain % 
more than what is required? I am hesitant to remove this without doing some 
major testing. I haven't had a chance to look at the latest code to 
investigate.

It might be more fine now that we do continue looking at other nodes after 
reservation, whereas originally that didn't happen. Is in-queue preemption on 
by default?

> Revisit behavior of Re-Reservation in Capacity Scheduler
> 
>
> Key: YARN-8149
> URL: https://issues.apache.org/jira/browse/YARN-8149
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
>
> Frankly speaking, I'm not sure why we need the re-reservation. The formula is 
> not that easy to understand:
> Inside: 
> {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator#shouldAllocOrReserveNewContainer}}
> {code:java}
> starvation = re-reservation / (#reserved-container * 
>  (1 - min(requested-resource / max-alloc, 
>   max-alloc - min-alloc / max-alloc))
> should_allocate = starvation + requiredContainers - reservedContainers > 
> 0{code}
> I think we should be able to remove the starvation computation, just to check 
> requiredContainers > reservedContainers should be enough.
> In a large cluster, we can easily overflow re-reservation to MAX_INT, see 
> YARN-7636. 
>  






[jira] [Commented] (SPARK-23964) why does Spillable wait for 32 elements?

2018-04-11 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434479#comment-16434479
 ] 

Thomas Graves commented on SPARK-23964:
---

I'm not sure. I'm trying to figure out whether there are performance 
implications here; perhaps there are, but they come at the cost of not being 
accurate on memory usage. In deployments with fixed-size containers this is 
very important. If you wait 32 elements, it may cause you to acquire a bigger 
chunk of memory at once versus making smaller (and thus more frequent) 
allocations.

I would think the only check you need is currentMemory >= myMemoryThreshold. 
The initial threshold is 5MB right now, but all that does is ask for more 
memory; only when it can't get memory does it spill. And the initial threshold 
is configurable, so you can always make it bigger.

I'm going to try to run some performance tests to see what happens, but would 
like to know if anyone has more background.
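To make that concrete, here is a tiny, self-contained sketch (hypothetical 
names, not Spark code) of just the condition being discussed, comparing the 
% 32 gate against checking the threshold on every record:
{code:java}
// Sketch: with the %32 gate, a handful of huge records never trigger the memory
// check before usage has already blown far past the threshold.
object SpillCheckSketch {
  val initialThreshold: Long = 5L * 1024 * 1024  // the 5MB initial threshold mentioned above

  def shouldCheck(elementsRead: Long, currentMemory: Long, threshold: Long, everyN: Int): Boolean =
    elementsRead % everyN == 0 && currentMemory >= threshold

  def main(args: Array[String]): Unit = {
    val hugeRecord = 512L * 1024 * 1024            // 512MB per record
    var mem = 0L
    (1 to 5).foreach { i =>
      mem += hugeRecord
      val gated   = shouldCheck(i, mem, initialThreshold, everyN = 32)  // current behaviour
      val ungated = shouldCheck(i, mem, initialThreshold, everyN = 1)   // proposed behaviour
      println(s"record $i: memory=${mem >> 20}MB gatedCheck=$gated ungatedCheck=$ungated")
    }
  }
}
{code}
With everyN = 32 the check never fires for the first handful of huge records, 
which is exactly the OOM scenario described in this issue.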

> why does Spillable wait for 32 elements?
> 
>
> Key: SPARK-23964
> URL: https://issues.apache.org/jira/browse/SPARK-23964
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> The spillable class has a check in maybeSpill as to when it tries to acquire 
> more memory and determine if it should spill:
> if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {
> Before it looks to see if it should spill.  
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]
> I'm wondering why it has the elementsRead %32  in it?  If I have a small 
> number of elements that are huge this can easily cause OOM before we actually 
> spill.  
> I saw a few conversations on this and one Jira related: 
> https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
> answer to this.
> anyone have history on this?






[jira] [Updated] (SPARK-23964) why does Spillable wait for 32 elements?

2018-04-11 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-23964:
--
Description: 
The spillable class has a check in maybeSpill as to when it tries to acquire 
more memory and determine if it should spill:

if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {

Before it looks to see if it should spill.  

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]

I'm wondering why it has the elementsRead %32  in it?  If I have a small number 
of elements that are huge this can easily cause OOM before we actually spill.  

I saw a few conversations on this and one Jira related: 
https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
answer to this.

anyone have history on this?

  was:
The spillable class has a check:

if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {

Before it looks to see if it should spill.  

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]

I'm wondering why it has the elementsRead %32  in it?  If I have a small number 
of elements that are huge this can easily cause OOM before we actually spill.  

I saw a few conversations on this and one Jira related: 
https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
answer to this.

anyone have history on this?


> why does Spillable wait for 32 elements?
> 
>
> Key: SPARK-23964
> URL: https://issues.apache.org/jira/browse/SPARK-23964
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> The spillable class has a check in maybeSpill as to when it tries to acquire 
> more memory and determine if it should spill:
> if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {
> Before it looks to see if it should spill.  
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]
> I'm wondering why it has the elementsRead %32  in it?  If I have a small 
> number of elements that are huge this can easily cause OOM before we actually 
> spill.  
> I saw a few conversations on this and one Jira related: 
> https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
> answer to this.
> anyone have history on this?






[jira] [Commented] (SPARK-23964) why does Spillable wait for 32 elements?

2018-04-11 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434454#comment-16434454
 ] 

Thomas Graves commented on SPARK-23964:
---

[~andrewor14]  [~matei] [~r...@databricks.com]

 

A few related threads:

[https://github.com/apache/spark/pull/3302]

[https://github.com/apache/spark/pull/3656]

https://github.com/apache/spark/commit/3be92cdac30cf488e09dbdaaa70e5c4cdaa9a099

> why does Spillable wait for 32 elements?
> 
>
> Key: SPARK-23964
> URL: https://issues.apache.org/jira/browse/SPARK-23964
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> The spillable class has a check:
> if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {
> Before it looks to see if it should spill.  
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]
> I'm wondering why it has the elementsRead %32  in it?  If I have a small 
> number of elements that are huge this can easily cause OOM before we actually 
> spill.  
> I saw a few conversations on this and one Jira related: 
> https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
> answer to this.
> anyone have history on this?






[jira] [Updated] (SPARK-23964) why does Spillable wait for 32 elements?

2018-04-11 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-23964:
--
Description: 
The spillable class has a check:

if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {

Before it looks to see if it should spill.  

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]

I'm wondering why it has the elementsRead %32  in it?  If I have a small number 
of elements that are huge this can easily cause OOM before we actually spill.  

I saw a few conversations on this and one Jira related: 
https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
answer to this.

anyone have history on this?

> why does Spillable wait for 32 elements?
> 
>
> Key: SPARK-23964
> URL: https://issues.apache.org/jira/browse/SPARK-23964
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> The spillable class has a check:
> if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {
> Before it looks to see if it should spill.  
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]
> I'm wondering why it has the elementsRead %32  in it?  If I have a small 
> number of elements that are huge this can easily cause OOM before we actually 
> spill.  
> I saw a few conversations on this and one Jira related: 
> https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
> answer to this.
> anyone have history on this?






[jira] [Updated] (SPARK-23964) why does Spillable wait for 32 elements?

2018-04-11 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-23964:
--
Environment: (was: The spillable class has a check:

if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {

Before it looks to see if it should spill.  

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]

I'm wondering why it has the elementsRead %32  in it?  If I have a small number 
of elements that are huge this can easily cause OOM before we actually spill.  

I saw a few conversations on this and one Jira related: 
https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
answer to this.

anyone have history on this?)

> why does Spillable wait for 32 elements?
> 
>
> Key: SPARK-23964
> URL: https://issues.apache.org/jira/browse/SPARK-23964
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>







[jira] [Created] (SPARK-23964) why does Spillable wait for 32 elements?

2018-04-11 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-23964:
-

 Summary: why does Spillable wait for 32 elements?
 Key: SPARK-23964
 URL: https://issues.apache.org/jira/browse/SPARK-23964
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.2.1
 Environment: The spillable class has a check:

if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {

Before it looks to see if it should spill.  

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L83]

I'm wondering why it has the elementsRead %32  in it?  If I have a small number 
of elements that are huge this can easily cause OOM before we actually spill.  

I saw a few conversations on this and one Jira related: 
https://issues.apache.org/jira/browse/SPARK-4456 . but I've never seen an 
answer to this.

anyone have history on this?
Reporter: Thomas Graves









[jira] [Commented] (SPARK-16630) Blacklist a node if executors won't launch on it.

2018-04-10 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432467#comment-16432467
 ] 

Thomas Graves commented on SPARK-16630:
---

Sorry, I don't follow. The list we get from the blacklist tracker is all nodes 
that are currently blacklisted and haven't yet hit the expiry to be 
unblacklisted. You just union them with the YARN allocator list. There is 
obviously a race condition if one of the nodes is just about to be 
unblacklisted, but I don't see that as a major issue; the next allocation will 
not include it. Is there something I'm missing?
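In other words, something like this sketch (hypothetical names, not the actual 
YarnAllocator/BlacklistTracker API):
{code:java}
// Sketch: the scheduler-side blacklist (expiry already applied by the blacklist
// tracker) is unioned with the allocator-side launch-failure blacklist before
// being sent to YARN.
def nodesToBlacklist(
    schedulerBlacklist: Set[String],      // from the blacklist tracker
    allocatorFailures: Map[String, Int],  // launch failures seen by the YARN allocator
    maxLaunchFailures: Int): Set[String] = {
  val allocatorBlacklist = allocatorFailures.filter(_._2 >= maxLaunchFailures).keySet
  schedulerBlacklist union allocatorBlacklist
}
{code}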

> Blacklist a node if executors won't launch on it.
> -
>
> Key: SPARK-16630
> URL: https://issues.apache.org/jira/browse/SPARK-16630
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 1.6.2
>Reporter: Thomas Graves
>Priority: Major
>
> On YARN, its possible that a node is messed or misconfigured such that a 
> container won't launch on it.  For instance if the Spark external shuffle 
> handler didn't get loaded on it , maybe its just some other hardware issue or 
> hadoop configuration issue. 
> It would be nice we could recognize this happening and stop trying to launch 
> executors on it since that could end up causing us to hit our max number of 
> executor failures and then kill the job.






[jira] [Commented] (SPARK-16630) Blacklist a node if executors won't launch on it.

2018-04-10 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432244#comment-16432244
 ] 

Thomas Graves commented on SPARK-16630:
---

Yes, I think it would make sense as the union of all blacklisted nodes.

I'm not sure what you mean by your last question. The expiry is currently all 
handled in the BlacklistTracker, and I wouldn't want to move that out into the 
YARN allocator. Just use the information passed to it, unless there is a case 
it doesn't cover?

> Blacklist a node if executors won't launch on it.
> -
>
> Key: SPARK-16630
> URL: https://issues.apache.org/jira/browse/SPARK-16630
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 1.6.2
>Reporter: Thomas Graves
>Priority: Major
>
> On YARN, its possible that a node is messed or misconfigured such that a 
> container won't launch on it.  For instance if the Spark external shuffle 
> handler didn't get loaded on it , maybe its just some other hardware issue or 
> hadoop configuration issue. 
> It would be nice we could recognize this happening and stop trying to launch 
> executors on it since that could end up causing us to hit our max number of 
> executor failures and then kill the job.






[jira] [Commented] (SPARK-16630) Blacklist a node if executors won't launch on it.

2018-04-05 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427377#comment-16427377
 ] 

Thomas Graves commented on SPARK-16630:
---

The problem is that spark.executor.instances (or dynamic allocation) doesn't 
necessarily represent the # of nodes in the cluster, especially with dynamic 
allocation. Depending on the size of your nodes you can have a lot more 
executors than nodes, so this could easily end up blacklisting the entire 
cluster. I would rather look at the actual # of nodes in the cluster. Is that 
turning out to be hard?

> Blacklist a node if executors won't launch on it.
> -
>
> Key: SPARK-16630
> URL: https://issues.apache.org/jira/browse/SPARK-16630
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 1.6.2
>Reporter: Thomas Graves
>Priority: Major
>
> On YARN, its possible that a node is messed or misconfigured such that a 
> container won't launch on it.  For instance if the Spark external shuffle 
> handler didn't get loaded on it , maybe its just some other hardware issue or 
> hadoop configuration issue. 
> It would be nice we could recognize this happening and stop trying to launch 
> executors on it since that could end up causing us to hit our max number of 
> executor failures and then kill the job.






[jira] [Resolved] (SPARK-23567) spark.redaction.regex should not include user by default, docs not updated

2018-04-04 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-23567.
---
Resolution: Duplicate

> spark.redaction.regex should not include user by default, docs not updated
> --
>
> Key: SPARK-23567
> URL: https://issues.apache.org/jira/browse/SPARK-23567
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> SPARK-22479 changed to redact the user name by default.  I would argue 
> username isn't something that should be redacted by default and its very 
> useful for debugging and other things. If people are running super secure and 
> want to turn it on they can but I don't see the user name as a default 
> security setting.  There are also other ways on the UI to see the user name, 
> for instance on yarn you can go to the Environment page and looking at the 
> resources and see the username in the paths.
> Also the Jira did not update the default setting in the docs, so the docs are 
> out of date:
> http://spark.apache.org/docs/2.2.1/configuration.html






[jira] [Commented] (SPARK-23039) Fix the bug in alter table set location.

2018-04-03 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423996#comment-16423996
 ] 

Thomas Graves commented on SPARK-23039:
---

Seems to be a dup of SPARK-23057.

>  Fix the bug in alter table set location.
> -
>
> Key: SPARK-23039
> URL: https://issues.apache.org/jira/browse/SPARK-23039
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.1
>Reporter: xubo245
>Priority: Critical
>
> TOBO work: Fix the bug in alter table set location.   
> org.apache.spark.sql.execution.command.DDLSuite#testSetLocation
> {code:java}
> // TODO(gatorsmile): fix the bug in alter table set location.
>//if (isUsingHiveMetastore) {
> //assert(storageFormat.properties.get("path") === expected)
> //   }
> {code}
> Analysis:
> because user add locationUri and erase path by 
> {code:java}
>  newPath = None
> {code}
> in org.apache.spark.sql.hive.HiveExternalCatalog#restoreDataSourceTable:
> {code:java}
> val storageWithLocation = {
>   val tableLocation = getLocationFromStorageProps(table)
>   // We pass None as `newPath` here, to remove the path option in storage 
> properties.
>   updateLocationInStorageProps(table, newPath = None).copy(
> locationUri = tableLocation.map(CatalogUtils.stringToURI(_)))
> }
> {code}
> =>
> newPath = None






[jira] [Commented] (SPARK-23850) We should not redact username|user|url from UI by default

2018-04-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422965#comment-16422965
 ] 

Thomas Graves commented on SPARK-23850:
---

Ping [~ash...@gmail.com] [~onursatici] [~LI,Xiao] [~jiangxb1987] [~cloud_fan], 
who were on the code review: is there more background on why these were added?

> We should not redact username|user|url from UI by default
> -
>
> Key: SPARK-23850
> URL: https://issues.apache.org/jira/browse/SPARK-23850
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> SPARK-22479 was filed to not print the log jdbc credentials, but in there 
> they also added  the username and url to be redacted.  I'm not sure why these 
> were added and to me by default these do not have security concerns.  It 
> makes it more usable by default to be able to see these things.  Users with 
> high security concerns can simply add them in their configs.
> Also on yarn just redacting url doesn't secure anything because if you go to 
> the environment ui page you see all sorts of paths and really its just 
> confusing that some of its redacted and other parts aren't.  If this was 
> specifically for jdbc I think it needs to be just applied there and not 
> broadly.
> If we remove these we need to test what the jdbc driver is going to log from 
> SPARK-22479.






[jira] [Created] (SPARK-23850) We should not redact username|user|url from UI by default

2018-04-02 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-23850:
-

 Summary: We should not redact username|user|url from UI by default
 Key: SPARK-23850
 URL: https://issues.apache.org/jira/browse/SPARK-23850
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 2.2.1
Reporter: Thomas Graves


SPARK-22479 was filed to not print the log jdbc credentials, but in there they 
also added  the username and url to be redacted.  I'm not sure why these were 
added and to me by default these do not have security concerns.  It makes it 
more usable by default to be able to see these things.  Users with high 
security concerns can simply add them in their configs.

Also on yarn just redacting url doesn't secure anything because if you go to 
the environment ui page you see all sorts of paths and really its just 
confusing that some of its redacted and other parts aren't.  If this was 
specifically for jdbc I think it needs to be just applied there and not broadly.

If we remove these we need to test what the jdbc driver is going to log from 
SPARK-22479.






[jira] [Updated] (SPARK-23806) Broadcast. unpersist can cause fatal exception when used with dynamic allocation

2018-03-28 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-23806:
--
Description: 
Very similar to https://issues.apache.org/jira/browse/SPARK-22618 . But this 
could also apply to Broadcast.unpersist.

 

2018-03-24 05:29:17,836 [Spark Context Cleaner] ERROR 
org.apache.spark.ContextCleaner - Error cleaning broadcast 85710 
org.apache.spark.SparkException: Exception thrown in awaitResult: at 
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205) at 
org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75) at 
org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:152)
 at 
org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:306)
 at 
org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
 at 
org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:60)
 at 
org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:238) at 
org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$1.apply(ContextCleaner.scala:194)
 at 
org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$1.apply(ContextCleaner.scala:185)
 at scala.Option.foreach(Option.scala:257) at 
org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:185)
 at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1286) at 
org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:178)
 at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:73) Caused 
by: java.io.IOException: Failed to send RPC 7228115282075984867 to 
/10.10.10.10:53804: java.nio.channels.ClosedChannelException at 
org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
 at 
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
 at 
io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
 at 
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
 at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122) 
at 
io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:852)
 at 
io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:738) 
at 
io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1251)
 at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:733)
 at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:725)
 at 
io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:35)
 at 
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1062)
 at 
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
 at 
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1051)
 at 
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446) at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
 at 
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
 at java.lang.Thread.run(Thread.java:745) Caused by: 
java.nio.channels.ClosedChannelException

  was:
Very similar to https://issues.apache.org/jira/browse/SPARK-2261 . But this 
could also apply to Broadcast.unpersist.

 

2018-03-24 05:29:17,836 [Spark Context Cleaner] ERROR 
org.apache.spark.ContextCleaner - Error cleaning broadcast 85710 
org.apache.spark.SparkException: Exception thrown in awaitResult: at 
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205) at 
org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75) at 
org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:152)
 at 
org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:306)
 at 
org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
 at 
org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:60)
 at 
org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:238) at 
org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$1.apply(ContextCleaner.scala:194)
 at 

[jira] [Created] (SPARK-23806) Broadcast. unpersist can cause fatal exception when used with dynamic allocation

2018-03-28 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-23806:
-

 Summary: Broadcast. unpersist can cause fatal exception when used 
with dynamic allocation
 Key: SPARK-23806
 URL: https://issues.apache.org/jira/browse/SPARK-23806
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.2.0
Reporter: Thomas Graves


Very similar to https://issues.apache.org/jira/browse/SPARK-2261 . But this 
could also apply to Broadcast.unpersist.

 

2018-03-24 05:29:17,836 [Spark Context Cleaner] ERROR 
org.apache.spark.ContextCleaner - Error cleaning broadcast 85710 
org.apache.spark.SparkException: Exception thrown in awaitResult: at 
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205) at 
org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75) at 
org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:152)
 at 
org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:306)
 at 
org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
 at 
org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:60)
 at 
org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:238) at 
org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$1.apply(ContextCleaner.scala:194)
 at 
org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$1.apply(ContextCleaner.scala:185)
 at scala.Option.foreach(Option.scala:257) at 
org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:185)
 at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1286) at 
org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:178)
 at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:73) Caused 
by: java.io.IOException: Failed to send RPC 7228115282075984867 to 
/10.10.10.10:53804: java.nio.channels.ClosedChannelException at 
org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
 at 
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
 at 
io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
 at 
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
 at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122) 
at 
io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:852)
 at 
io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:738) 
at 
io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1251)
 at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:733)
 at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:725)
 at 
io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:35)
 at 
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1062)
 at 
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
 at 
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1051)
 at 
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446) at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
 at 
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
 at java.lang.Thread.run(Thread.java:745) Caused by: 
java.nio.channels.ClosedChannelException






[jira] [Commented] (SPARK-22618) RDD.unpersist can cause fatal exception when used with dynamic allocation

2018-03-28 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417466#comment-16417466
 ] 

Thomas Graves commented on SPARK-22618:
---

I'll file a separate Jira for it and put up a PR.

> RDD.unpersist can cause fatal exception when used with dynamic allocation
> -
>
> Key: SPARK-22618
> URL: https://issues.apache.org/jira/browse/SPARK-22618
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Brad
>Assignee: Brad
>Priority: Minor
> Fix For: 2.3.0
>
>
> If you use rdd.unpersist() with dynamic allocation, then an executor can be 
> deallocated while your rdd is being removed, which will throw an uncaught 
> exception killing your job. 
> I looked into different ways of preventing this error from occurring but 
> couldn't come up with anything that wouldn't require a big change. I propose 
> the best fix is just to catch and log IOExceptions in unpersist() so they 
> don't kill your job. This will match the effective behavior when executors 
> are lost from dynamic allocation in other parts of the code.
> In the worst case scenario I think this could lead to RDD partitions getting 
> left on executors after they were unpersisted, but this is probably better 
> than the whole job failing. I think in most cases the IOException would be 
> due to the executor dieing for some reason, which is effectively the same 
> result as unpersisting the rdd from that executor anyway.
> I noticed this exception in a job that loads a 100GB dataset on a cluster 
> where we use dynamic allocation heavily. Here is the relevant stack trace
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
> at 
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
> at 
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:276)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
> at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
> at java.lang.Thread.run(Thread.java:748)
> Exception in thread "main" org.apache.spark.SparkException: Exception thrown 
> in awaitResult:
> at 
> org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
> at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
> at 
> org.apache.spark.storage.BlockManagerMaster.removeRdd(BlockManagerMaster.scala:131)
> at org.apache.spark.SparkContext.unpersistRDD(SparkContext.scala:1806)
> at org.apache.spark.rdd.RDD.unpersist(RDD.scala:217)
> at 
> com.ibm.sparktc.sparkbench.workload.exercise.CacheTest.doWorkload(CacheTest.scala:62)
> at 
> com.ibm.sparktc.sparkbench.workload.Workload$class.run(Workload.scala:40)
> at 
> com.ibm.sparktc.sparkbench.workload.exercise.CacheTest.run(CacheTest.scala:33)
> at 
> com.ibm.sparktc.sparkbench.workload.SuiteKickoff$$anonfun$com$ibm$sparktc$sparkbench$workload$SuiteKickoff$$runSerially$1.apply(SuiteKickoff.scala:78)
> at 
> com.ibm.sparktc.sparkbench.workload.SuiteKickoff$$anonfun$com$ibm$sparktc$sparkbench$workload$SuiteKickoff$$runSerially$1.apply(SuiteKickoff.scala:78)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.immutable.List.map(List.scala:285)
> at 
> com.ibm.sparktc.sparkbench.workload.SuiteKickoff$.com$ibm$sparktc$sparkbench$workload$SuiteKickoff$$runSerially(SuiteKickoff.scala:78)
>  

[jira] [Commented] (SPARK-22618) RDD.unpersist can cause fatal exception when used with dynamic allocation

2018-03-27 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416309#comment-16416309
 ] 

Thomas Graves commented on SPARK-22618:
---

Thanks for fixing this; we're hitting it now in Spark 2.2. I think this same 
issue can happen with broadcast variables if it's told to wait. Did you happen 
to look at that at the same time?
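For context, the broadcast-side equivalent of the catch-and-log approach 
described above would look roughly like this sketch (illustrative only; the 
wrapper name and the exact exception types that surface are assumptions):
{code:java}
import java.io.IOException
import org.apache.spark.SparkException
import org.apache.spark.broadcast.Broadcast
import org.slf4j.LoggerFactory

object SafeBroadcastCleanup {
  private val log = LoggerFactory.getLogger(getClass)

  // Sketch: swallow and log the failures seen when the target executor is already
  // gone, instead of letting them propagate and kill the job.
  def unpersistQuietly[T](bc: Broadcast[T], blocking: Boolean = false): Unit = {
    try {
      bc.unpersist(blocking)
    } catch {
      case e @ (_: IOException | _: SparkException) =>
        log.warn(s"Ignoring failure while unpersisting broadcast ${bc.id}", e)
    }
  }
}
{code}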

> RDD.unpersist can cause fatal exception when used with dynamic allocation
> -
>
> Key: SPARK-22618
> URL: https://issues.apache.org/jira/browse/SPARK-22618
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Brad
>Assignee: Brad
>Priority: Minor
> Fix For: 2.3.0
>
>
> If you use rdd.unpersist() with dynamic allocation, then an executor can be 
> deallocated while your rdd is being removed, which will throw an uncaught 
> exception killing your job. 
> I looked into different ways of preventing this error from occurring but 
> couldn't come up with anything that wouldn't require a big change. I propose 
> the best fix is just to catch and log IOExceptions in unpersist() so they 
> don't kill your job. This will match the effective behavior when executors 
> are lost from dynamic allocation in other parts of the code.
> In the worst case scenario I think this could lead to RDD partitions getting 
> left on executors after they were unpersisted, but this is probably better 
> than the whole job failing. I think in most cases the IOException would be 
> due to the executor dieing for some reason, which is effectively the same 
> result as unpersisting the rdd from that executor anyway.
> I noticed this exception in a job that loads a 100GB dataset on a cluster 
> where we use dynamic allocation heavily. Here is the relevant stack trace
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
> at 
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
> at 
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:276)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
> at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
> at java.lang.Thread.run(Thread.java:748)
> Exception in thread "main" org.apache.spark.SparkException: Exception thrown 
> in awaitResult:
> at 
> org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
> at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
> at 
> org.apache.spark.storage.BlockManagerMaster.removeRdd(BlockManagerMaster.scala:131)
> at org.apache.spark.SparkContext.unpersistRDD(SparkContext.scala:1806)
> at org.apache.spark.rdd.RDD.unpersist(RDD.scala:217)
> at 
> com.ibm.sparktc.sparkbench.workload.exercise.CacheTest.doWorkload(CacheTest.scala:62)
> at 
> com.ibm.sparktc.sparkbench.workload.Workload$class.run(Workload.scala:40)
> at 
> com.ibm.sparktc.sparkbench.workload.exercise.CacheTest.run(CacheTest.scala:33)
> at 
> com.ibm.sparktc.sparkbench.workload.SuiteKickoff$$anonfun$com$ibm$sparktc$sparkbench$workload$SuiteKickoff$$runSerially$1.apply(SuiteKickoff.scala:78)
> at 
> com.ibm.sparktc.sparkbench.workload.SuiteKickoff$$anonfun$com$ibm$sparktc$sparkbench$workload$SuiteKickoff$$runSerially$1.apply(SuiteKickoff.scala:78)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.immutable.List.map(List.scala:285)
> at 
> 

[jira] [Commented] (SPARK-16630) Blacklist a node if executors won't launch on it.

2018-03-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390232#comment-16390232
 ] 

Thomas Graves commented on SPARK-16630:
---

Yes, YARN tells you the # of NodeManagers: allocateResponse -> getNumClusterNodes.
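A minimal sketch of how an allocator could track that (hypothetical wrapper; 
AllocateResponse#getNumClusterNodes is the YARN API referred to above):
{code:java}
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse

// Sketch with hypothetical names: keep the NodeManager count that YARN reports on
// each allocate heartbeat, so blacklisting decisions can be sized against the cluster.
class ClusterSizeTracker {
  @volatile private var numClusterNodes: Int = 0

  def update(response: AllocateResponse): Unit = {
    numClusterNodes = response.getNumClusterNodes
  }

  def currentSize: Int = numClusterNodes
}
{code}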

 

 

> Blacklist a node if executors won't launch on it.
> -
>
> Key: SPARK-16630
> URL: https://issues.apache.org/jira/browse/SPARK-16630
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 1.6.2
>Reporter: Thomas Graves
>Priority: Major
>
> On YARN, its possible that a node is messed or misconfigured such that a 
> container won't launch on it.  For instance if the Spark external shuffle 
> handler didn't get loaded on it , maybe its just some other hardware issue or 
> hadoop configuration issue. 
> It would be nice we could recognize this happening and stop trying to launch 
> executors on it since that could end up causing us to hit our max number of 
> executor failures and then kill the job.






[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-03-06 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387963#comment-16387963
 ] 

Thomas Graves commented on SPARK-22683:
---

I left comments on the open PR already; let's move the discussion there.

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependant on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such old executors could become contention 
> points for other exes trying to remotely access blocks in the old exes (not 
> witnessed in the jobs I'm talking about, but we did see this behavior in 
> other jobs)
> Proposal: 
> Simply add a tasksPerExecutorSlot parameter, which makes it possible to 
> specify how many tasks a single taskSlot should ideally execute to mitigate 
> the overhead of executor allocation.
> PR: https://github.com/apache/spark/pull/19881






[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-03-06 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387938#comment-16387938
 ] 

Thomas Graves commented on SPARK-22683:
---

[~jcuquemelle] do you have time to update the PR? Otherwise we should close it 
for now.

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependant on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such old executors could become contention 
> points for other exes trying to remotely access blocks in the old exes (not 
> witnessed in the jobs I'm talking about, but we did see this behavior in 
> other jobs)
> Proposal: 
> Simply add a tasksPerExecutorSlot parameter, which makes it possible to 
> specify how many tasks a single taskSlot should ideally execute to mitigate 
> the overhead of executor allocation.
> PR: https://github.com/apache/spark/pull/19881






[jira] [Commented] (SPARK-16630) Blacklist a node if executors won't launch on it.

2018-03-06 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387846#comment-16387846
 ] 

Thomas Graves commented on SPARK-16630:
---

Yes, something along these lines is what I was thinking. We would want a 
configurable number of failures (perhaps we can reuse one of the existing 
settings, but that would need more thought) at which point we would blacklist 
the node due to executor launch failures, and we could have a timeout after 
which we retry it. We also want to take small clusters into account and perhaps 
stop blacklisting if a certain percent of the cluster is already blacklisted.
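A rough sketch of that shape (all names and thresholds are hypothetical, not 
existing Spark configs):
{code:java}
// Sketch: allocator-side blacklist for executor launch failures, with a retry
// timeout and a guard so small clusters don't get entirely blacklisted.
class LaunchFailureBlacklist(
    maxFailuresPerNode: Int,          // configurable failure count before blacklisting
    blacklistTimeoutMs: Long,         // retry the node after this long
    maxBlacklistedFraction: Double) { // e.g. never blacklist more than 50% of the cluster

  private val failures = collection.mutable.Map.empty[String, Int]
  private val expiry = collection.mutable.Map.empty[String, Long]

  def recordLaunchFailure(host: String, now: Long): Unit = {
    val count = failures.getOrElse(host, 0) + 1
    failures(host) = count
    if (count >= maxFailuresPerNode) expiry(host) = now + blacklistTimeoutMs
  }

  def blacklistedNodes(now: Long, numClusterNodes: Int): Set[String] = {
    val active = expiry.collect { case (host, until) if until > now => host }.toSet
    // small-cluster guard: back off rather than blacklist most of the cluster
    if (numClusterNodes > 0 && active.size > maxBlacklistedFraction * numClusterNodes) Set.empty
    else active
  }
}
{code}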

> Blacklist a node if executors won't launch on it.
> -
>
> Key: SPARK-16630
> URL: https://issues.apache.org/jira/browse/SPARK-16630
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 1.6.2
>Reporter: Thomas Graves
>Priority: Major
>
> On YARN, its possible that a node is messed or misconfigured such that a 
> container won't launch on it.  For instance if the Spark external shuffle 
> handler didn't get loaded on it , maybe its just some other hardware issue or 
> hadoop configuration issue. 
> It would be nice we could recognize this happening and stop trying to launch 
> executors on it since that could end up causing us to hit our max number of 
> executor failures and then kill the job.






[jira] [Comment Edited] (SPARK-23567) spark.redaction.regex should not include user by default, docs not updated

2018-03-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383723#comment-16383723
 ] 

Thomas Graves edited comment on SPARK-23567 at 3/2/18 3:57 PM:
---

I also question whether the url should be redacted by default. Given the 
example below, I'm not sure how the url here is a security issue.
{code:java}
SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
ErrorIfExists{code}


was (Author: tgraves):
I also question whether the url should be redacted by default, but I would want 
to look more at SPARK-22479 to understand what url was hidden since the Jira 
doesn't have an example.

> spark.redaction.regex should not include user by default, docs not updated
> --
>
> Key: SPARK-23567
> URL: https://issues.apache.org/jira/browse/SPARK-23567
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> SPARK-22479 changed to redact the user name by default.  I would argue 
> username isn't something that should be redacted by default and its very 
> useful for debugging and other things. If people are running super secure and 
> want to turn it on they can but I don't see the user name as a default 
> security setting.  There are also other ways on the UI to see the user name, 
> for instance on yarn you can go to the Environment page and looking at the 
> resources and see the username in the paths.
> Also the Jira did not update the default setting in the docs, so the docs are 
> out of date:
> http://spark.apache.org/docs/2.2.1/configuration.html






[jira] [Commented] (SPARK-22479) SaveIntoDataSourceCommand logs jdbc credentials

2018-03-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383733#comment-16383733
 ] 

Thomas Graves commented on SPARK-22479:
---

Also, the example above shows the password, but the password should already have 
been redacted; this PR excluded url, user, and username. Was the password 
not being redacted for some reason?

> SaveIntoDataSourceCommand logs jdbc credentials
> ---
>
> Key: SPARK-22479
> URL: https://issues.apache.org/jira/browse/SPARK-22479
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Onur Satici
>Assignee: Onur Satici
>Priority: Major
> Fix For: 2.2.1, 2.3.0
>
>
> JDBC credentials are not redacted in plans including a 
> 'SaveIntoDataSourceCommand'.
> Steps to reproduce:
> {code}
> spark-shell --packages org.postgresql:postgresql:42.1.1
> {code}
> {code}
> import org.apache.spark.sql.execution.QueryExecution
> import org.apache.spark.sql.util.QueryExecutionListener
> val listener = new QueryExecutionListener {
>   override def onFailure(funcName: String, qe: QueryExecution, exception: 
> Exception): Unit = {}
>   override def onSuccess(funcName: String, qe: QueryExecution, duration: 
> Long): Unit = {
> System.out.println(qe.toString())
>   }
> }
> spark.listenerManager.register(listener)
> spark.range(100).write.format("jdbc").option("url", 
> "jdbc:postgresql:sparkdb").option("password", "pass").option("driver", 
> "org.postgresql.Driver").option("dbtable", "test").save()
> {code}
> The above will yield the following plan:
> {code}
> == Parsed Logical Plan ==
> SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
> org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
> ErrorIfExists
>+- Range (0, 100, step=1, splits=Some(8))
> == Analyzed Logical Plan ==
> SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
> org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
> ErrorIfExists
>+- Range (0, 100, step=1, splits=Some(8))
> == Optimized Logical Plan ==
> SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
> org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
> ErrorIfExists
>+- Range (0, 100, step=1, splits=Some(8))
> == Physical Plan ==
> ExecutedCommand
>+- SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
> org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
> ErrorIfExists
>  +- Range (0, 100, step=1, splits=Some(8))
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23567) spark.redaction.regex should not include user by default, docs not updated

2018-03-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383723#comment-16383723
 ] 

Thomas Graves commented on SPARK-23567:
---

I also question whether the url should be redacted by default, but I would want 
to look more at SPARK-22479 to understand what url was hidden since the Jira 
doesn't have an example.

> spark.redaction.regex should not include user by default, docs not updated
> --
>
> Key: SPARK-23567
> URL: https://issues.apache.org/jira/browse/SPARK-23567
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Thomas Graves
>Priority: Major
>
> SPARK-22479 changed to redact the user name by default.  I would argue the 
> username isn't something that should be redacted by default, and it's very 
> useful for debugging and other things. If people are running super secure and 
> want to turn it on they can, but I don't see the user name as a default 
> security setting.  There are also other ways on the UI to see the user name; 
> for instance, on yarn you can go to the Environment page, look at the 
> resources, and see the username in the paths.
> Also the Jira did not update the default setting in the docs, so the docs are 
> out of date:
> http://spark.apache.org/docs/2.2.1/configuration.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-23567) spark.redaction.regex should not include user by default, docs not updated

2018-03-02 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-23567:
-

 Summary: spark.redaction.regex should not include user by default, 
docs not updated
 Key: SPARK-23567
 URL: https://issues.apache.org/jira/browse/SPARK-23567
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.2.1
Reporter: Thomas Graves


SPARK-22479 changed to redact the user name by default.  I would argue the username 
isn't something that should be redacted by default, and it's very useful for 
debugging and other things. If people are running super secure and want to turn 
it on they can, but I don't see the user name as a default security setting.  
There are also other ways on the UI to see the user name; for instance, on yarn 
you can go to the Environment page, look at the resources, and see the 
username in the paths.

Also the Jira did not update the default setting in the docs, so the docs are 
out of date:

http://spark.apache.org/docs/2.2.1/configuration.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22479) SaveIntoDataSourceCommand logs jdbc credentials

2018-03-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383709#comment-16383709
 ] 

Thomas Graves commented on SPARK-22479:
---

[~aash] [~onursatici] this seems to have redacted user names as well as the 
passwords.   We specifically added the User: field to the UI and now it's being 
blocked, which makes debugging harder.  The user name does not seem like 
something that needs to be redacted by default.  What is the reasoning behind 
that? 

Note that at least on yarn there are other ways to easily see the username on 
the UI (like the Resource Paths), so it's definitely not a complete solution 
anyway.

> SaveIntoDataSourceCommand logs jdbc credentials
> ---
>
> Key: SPARK-22479
> URL: https://issues.apache.org/jira/browse/SPARK-22479
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Onur Satici
>Assignee: Onur Satici
>Priority: Major
> Fix For: 2.2.1, 2.3.0
>
>
> JDBC credentials are not redacted in plans including a 
> 'SaveIntoDataSourceCommand'.
> Steps to reproduce:
> {code}
> spark-shell --packages org.postgresql:postgresql:42.1.1
> {code}
> {code}
> import org.apache.spark.sql.execution.QueryExecution
> import org.apache.spark.sql.util.QueryExecutionListener
> val listener = new QueryExecutionListener {
>   override def onFailure(funcName: String, qe: QueryExecution, exception: 
> Exception): Unit = {}
>   override def onSuccess(funcName: String, qe: QueryExecution, duration: 
> Long): Unit = {
> System.out.println(qe.toString())
>   }
> }
> spark.listenerManager.register(listener)
> spark.range(100).write.format("jdbc").option("url", 
> "jdbc:postgresql:sparkdb").option("password", "pass").option("driver", 
> "org.postgresql.Driver").option("dbtable", "test").save()
> {code}
> The above will yield the following plan:
> {code}
> == Parsed Logical Plan ==
> SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
> org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
> ErrorIfExists
>+- Range (0, 100, step=1, splits=Some(8))
> == Analyzed Logical Plan ==
> SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
> org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
> ErrorIfExists
>+- Range (0, 100, step=1, splits=Some(8))
> == Optimized Logical Plan ==
> SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
> org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
> ErrorIfExists
>+- Range (0, 100, step=1, splits=Some(8))
> == Physical Plan ==
> ExecutedCommand
>+- SaveIntoDataSourceCommand jdbc, Map(dbtable -> test10, driver -> 
> org.postgresql.Driver, url -> jdbc:postgresql:sparkdb, password -> pass), 
> ErrorIfExists
>  +- Range (0, 100, step=1, splits=Some(8))
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (YARN-7935) Expose container's hostname to applications running within the docker container

2018-02-23 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374598#comment-16374598
 ] 

Thomas Graves commented on YARN-7935:
-

Thanks for the explanation, Mridul. I'm fine with waiting on the Spark Jira until 
you know the scope better. I'm currently not doing anything with bridge mode, so I 
won't be able to help there at this point.

> Expose container's hostname to applications running within the docker 
> container
> ---
>
> Key: YARN-7935
> URL: https://issues.apache.org/jira/browse/YARN-7935
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-7935.1.patch, YARN-7935.2.patch
>
>
> Some applications have a need to bind to the container's hostname (like 
> Spark) which is different from the NodeManager's hostname(NM_HOST which is 
> available as an env during container launch) when launched through Docker 
> runtime. The container's hostname can be exposed to applications via an env 
> CONTAINER_HOSTNAME. Another potential candidate is the container's IP but 
> this can be addressed in a separate jira.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7935) Expose container's hostname to applications running within the docker container

2018-02-22 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373039#comment-16373039
 ] 

Thomas Graves commented on YARN-7935:
-

[~mridulm80] what is the Spark Jira for this?  If this goes in, Spark will still 
have to grab this from the env to pass it in to the ExecutorRunnable.
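
For illustration, a minimal sketch of what that env lookup might look like on the 
Spark side (CONTAINER_HOSTNAME is the variable proposed in this patch; the fallback 
chain below is an assumption, and the wiring into ExecutorRunnable is omitted):
{code}
// Prefer the Docker container's hostname if YARN exposes it, otherwise fall back
// to the NodeManager host, then to the local hostname.
val executorHostname: String =
  sys.env.get("CONTAINER_HOSTNAME")
    .orElse(sys.env.get("NM_HOST"))
    .getOrElse(java.net.InetAddress.getLocalHost.getHostName)
{code}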

> Expose container's hostname to applications running within the docker 
> container
> ---
>
> Key: YARN-7935
> URL: https://issues.apache.org/jira/browse/YARN-7935
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-7935.1.patch, YARN-7935.2.patch
>
>
> Some applications have a need to bind to the container's hostname (like 
> Spark) which is different from the NodeManager's hostname(NM_HOST which is 
> available as an env during container launch) when launched through Docker 
> runtime. The container's hostname can be exposed to applications via an env 
> CONTAINER_HOSTNAME. Another potential candidate is the container's IP but 
> this can be addressed in a separate jira.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-08 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357346#comment-16357346
 ] 

Thomas Graves commented on SPARK-23309:
---

Sorry, I haven't had time to make a query/dataset to reproduce that.  I'm OK 
with this not being a blocker for 2.3.

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-02-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356168#comment-16356168
 ] 

Thomas Graves commented on SPARK-22683:
---

I agree, I think the default behavior stays 1. 

I ran a few tests with this patch. I definitely see an improvement in resource 
usage across all the jobs I ran. The jobs had similar run times, and a few were 
actually faster.  I used the default 60 second timeout.

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependant on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such old executors could become contention 
> points for other exes trying to remotely access blocks in the old exes (not 
> witnessed in the jobs I'm talking about, but we did see this behavior in 
> other jobs)
> Proposal: 
> Simply add a tasksPerExecutorSlot parameter, which makes it possible to 
> specify how many tasks a single taskSlot should ideally execute to mitigate 
> the overhead of executor allocation.
> PR: https://github.com/apache/spark/pull/19881



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-02-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356168#comment-16356168
 ] 

Thomas Graves edited comment on SPARK-22683 at 2/7/18 10:24 PM:


I agree, I think the default behavior stays 1. 

I ran a few tests with this patch. I definitely see an improvement in resource 
usage across all the jobs I ran. The jobs had similar run times, and a few were 
actually faster.  I used the default 60 second timeout.

Note that none of those jobs were really long running; they had small to medium 
size tasks.


was (Author: tgraves):
I agree, I think default behavior stays 1. 

I ran a few tests with this patch. I definitely see an improvement in resource 
usage across all the jobs I ran. The jobs were similar job run time or actually 
faster on a few.  I used default 60 second timeout.

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependant on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such old executors could become contention 
> points for other exes trying to remotely access blocks in the old exes (not 
> witnessed in the jobs I'm 

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-02-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355920#comment-16355920
 ] 

Thomas Graves commented on SPARK-22683:
---

If the config is set to 1, which keeps the current behavior, the job server 
pattern and really any other application won't be affected by default. I don't 
see this as any different than me tuning max executors, for example.  Really 
this is just a more dynamic max executors.  

I agree with you that this isn't optimal in some ways.  For instance, it applies 
across the entire application, where you could run multiple jobs and stages. 
Each of those might not want this config, but that is a different problem where 
we would need to support per-stage configuration, for example. If it's a single 
application then you should be able to set this between jobs programmatically 
if they are serial jobs (although I haven't tested this), but if that doesn't 
work, all the dynamic allocation configs would have the same issue.
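
As an illustration of the per-application usage being discussed, a hedged sketch 
(the tasksPerExecutorSlot key is the name proposed in the PR and may change; the 
other settings are just the usual dynamic allocation prerequisites):
{code}
import org.apache.spark.SparkConf

// Minimal sketch of per-application tuning, analogous to setting maxExecutors today.
// "spark.dynamicAllocation.tasksPerExecutorSlot" is hypothetical here; the final
// config name may differ from the proposal.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.executor.cores", "5")
  .set("spark.dynamicAllocation.tasksPerExecutorSlot", "6")
{code}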

 

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependant on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such 

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-02-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1637#comment-1637
 ] 

Thomas Graves commented on SPARK-22683:
---

OK, thanks.  I would like to try this out myself on a few jobs, but my opinion 
is that we should put this config in. If others strongly disagree please speak 
up; otherwise I think we can move the discussion to the PR.  I do think we need 
to change the name of the config.

 

> DynamicAllocation wastes resources by allocating containers that will barely 
> be used
> 
>
> Key: SPARK-22683
> URL: https://issues.apache.org/jira/browse/SPARK-22683
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Julien Cuquemelle
>Priority: Major
>  Labels: pull-request-available
>
> While migrating a series of jobs from MR to Spark using dynamicAllocation, 
> I've noticed almost a doubling (+114% exactly) of resource consumption of 
> Spark w.r.t MR, for a wall clock time gain of 43%
> About the context: 
> - resource usage stands for vcore-hours allocation for the whole job, as seen 
> by YARN
> - I'm talking about a series of jobs because we provide our users with a way 
> to define experiments (via UI / DSL) that automatically get translated to 
> Spark / MR jobs and submitted on the cluster
> - we submit around 500 of such jobs each day
> - these jobs are usually one shot, and the amount of processing can vary a 
> lot between jobs, and as such finding an efficient number of executors for 
> each job is difficult to get right, which is the reason I took the path of 
> dynamic allocation.  
> - Some of the tests have been scheduled on an idle queue, some on a full 
> queue.
> - experiments have been conducted with spark.executor-cores = 5 and 10, only 
> results for 5 cores have been reported because efficiency was overall better 
> than with 10 cores
> - the figures I give are averaged over a representative sample of those jobs 
> (about 600 jobs) ranging from tens to thousands splits in the data 
> partitioning and between 400 to 9000 seconds of wall clock time.
> - executor idle timeout is set to 30s;
>  
> Definition: 
> - let's say an executor has spark.executor.cores / spark.task.cpus taskSlots, 
> which represent the max number of tasks an executor will process in parallel.
> - the current behaviour of the dynamic allocation is to allocate enough 
> containers to have one taskSlot per task, which minimizes latency, but wastes 
> resources when tasks are small regarding executor allocation and idling 
> overhead. 
> The results using the proposal (described below) over the job sample (600 
> jobs):
> - by using 2 tasks per taskSlot, we get a 5% (against -114%) reduction in 
> resource usage, for a 37% (against 43%) reduction in wall clock time for 
> Spark w.r.t MR
> - by trying to minimize the average resource consumption, I ended up with 6 
> tasks per core, with a 30% resource usage reduction, for a similar wall clock 
> time w.r.t. MR
> What did I try to solve the issue with existing parameters (summing up a few 
> points mentioned in the comments) ?
> - change dynamicAllocation.maxExecutors: this would need to be adapted for 
> each job (tens to thousands splits can occur), and essentially remove the 
> interest of using the dynamic allocation.
> - use dynamicAllocation.backlogTimeout: 
> - setting this parameter right to avoid creating unused executors is very 
> dependant on wall clock time. One basically needs to solve the exponential 
> ramp up for the target time. So this is not an option for my use case where I 
> don't want a per-job tuning. 
> - I've still done a series of experiments, details in the comments. 
> Result is that after manual tuning, the best I could get was a similar 
> resource consumption at the expense of 20% more wall clock time, or a similar 
> wall clock time at the expense of 60% more resource consumption than what I 
> got using my proposal @ 6 tasks per slot (this value being optimized over a 
> much larger range of jobs as already stated)
> - as mentioned in another comment, tampering with the exponential ramp up 
> might yield task imbalance and such old executors could become contention 
> points for other exes trying to remotely access blocks in the old exes (not 
> witnessed in the jobs I'm talking about, but we did see this behavior in 
> other jobs)
> Proposal: 
> Simply add a tasksPerExecutorSlot parameter, which makes it possible to 
> specify how many tasks a single taskSlot should ideally execute to mitigate 
> the overhead of executor allocation.
> PR: https://github.com/apache/spark/pull/19881



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-06 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354336#comment-16354336
 ] 

Thomas Graves commented on SPARK-23309:
---

I pulled in that patch ([https://github.com/apache/spark/pull/20513]) and the 
numbers got better, but I am still seeing 2.3 about 10% slower (down from 15%).

This is using the configs: --conf spark.sql.orc.impl=hive --conf 
spark.sql.orc.filterPushdown=true --conf 
spark.sql.hive.convertMetastoreOrc=false --conf 
spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false

Has anyone else reproduced this, or is it only me seeing it?

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350901#comment-16350901
 ] 

Thomas Graves edited comment on SPARK-23309 at 2/2/18 8:29 PM:
---

I should ask: is there a log statement or query plan I can dump out just to make 
sure spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false was applied 
properly?

Note that I did verify the symbol CACHE_VECTORIZED_READER_ENABLED was present in 
the jar I ran with, so the config should have been set properly.
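
One way to sanity-check it from the shell, as a sketch using standard SparkSession 
APIs (the query and view name here are just taken from the issue description):
{code}
// Confirm the running session actually picked up the setting, then dump the plans
// of the cached query to inspect which scan/columnar operators are used.
println(spark.conf.get("spark.sql.inMemoryColumnarStorage.enableVectorizedReader"))

val q = spark.sql("SELECT COUNT(DISTINCT(something)) FROM dailycached")
q.explain(true)   // prints parsed/analyzed/optimized/physical plans
{code}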


was (Author: tgraves):
I should ask is there a log statement or query plan I can dump out just to make 
sure spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false was applied 
properly?

 

Note I did verify the symbol CACHE_VECTORIZED_READER_ENABLED was present in the 
jar I ran with so the config should have worked.

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350901#comment-16350901
 ] 

Thomas Graves edited comment on SPARK-23309 at 2/2/18 8:29 PM:
---

I should ask is there a log statement or query plan I can dump out just to make 
sure spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false was applied 
properly?

 

Note I did verify the symbol CACHE_VECTORIZED_READER_ENABLED was present in the 
jar I ran with so the config should have worked.


was (Author: tgraves):
I should ask is there a log statement or query plan I can dump out just to make 
sure spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false was applied 
properly?

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350813#comment-16350813
 ] 

Thomas Graves edited comment on SPARK-23309 at 2/2/18 8:15 PM:
---

I'm still seeing spark 2.3 slower by about 15% for the larger dataset (times 
here are 301 seconds on 2.2 vs 346 seconds on 2.3).  I tried 

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false

and then also tried setting the vectorized reader to false

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false --conf 
spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false

Note the # of partitions it's processing is now the same since turning off the 
native orc impl.
 


was (Author: tgraves):
I'm still seeing spark 2.3 slower by about 15% for the larger dataset. (times 
here are 301 seconds on 2.2 vs 346 seconds on 2.3)  I tried 

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false

and then also tried setting the vectorized reader to false

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false- -conf 
spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false

Note the # of partitions its processing is now the same since turning off the 
native orc impl.

 

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350901#comment-16350901
 ] 

Thomas Graves commented on SPARK-23309:
---

I should ask: is there a log statement or query plan I can dump out just to make 
sure spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false was applied 
properly?

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350900#comment-16350900
 ] 

Thomas Graves commented on SPARK-23309:
---

So the last test I did compared spark 2.3 with the old hive path against spark 2.2.  
Spark 2.3 is slower than spark 2.2 reading the cached data.

[~smilegator] I already tried the patch; see the last config I tested with, 
where --conf spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false was set.

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350813#comment-16350813
 ] 

Thomas Graves edited comment on SPARK-23309 at 2/2/18 7:04 PM:
---

I'm still seeing spark 2.3 slower by about 15% for the larger dataset (times 
here are 301 seconds on 2.2 vs 346 seconds on 2.3).  I tried 

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false

and then also tried setting the vectorized reader to false

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false --conf 
spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false

Note the # of partitions it's processing is now the same since turning off the 
native orc impl.

 


was (Author: tgraves):
I'm still seeing spark 2.3 slower by about 15% for the larger dataset.  I tried 

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false

and then also tried setting the vectorized reader to false

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false --conf 
spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false

Note the # of partitions its processing is now the same since turning off the 
native orc impl.

 

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350813#comment-16350813
 ] 

Thomas Graves commented on SPARK-23309:
---

I'm still seeing spark 2.3 slower by about 15% for the larger dataset.  I tried 

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false

and then also tried setting the vectorized reader to false

=> --conf spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=true 
--conf spark.sql.hive.convertMetastoreOrc=false --conf 
spark.sql.inMemoryColumnarStorage.enableVectorizedReader=false

Note the # of partitions it's processing is now the same since turning off the 
native orc impl.

 

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350533#comment-16350533
 ] 

Thomas Graves commented on SPARK-23309:
---

Note the schema of "something" here is a "string".

I'll try with the changes in SPARK-23312 and turn off the vectorized cache 
reader.

I'm also running 2.3 with the configs --conf spark.sql.orc.impl=hive --conf 
spark.sql.orc.filterPushdown=true --conf 
spark.sql.hive.convertMetastoreOrc=false, which should be the same as 2.2, and it 
gives me the same # of partitions.

> Spark 2.3 cached query performance 20-30% worse then spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see queries times average 13 seconds
> On the same nodes I see spark 2.3 queries times average 17 seconds
> Note these are times of queries after the initial caching.  so just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by like 20%.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-02 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-23304.
---
Resolution: Invalid

> Spark SQL coalesce() against hive not working
> -
>
> Key: SPARK-23304
> URL: https://issues.apache.org/jira/browse/SPARK-23304
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.1, 2.3.0
>Reporter: Thomas Graves
>Assignee: Xiao Li
>Priority: Major
> Attachments: spark22_oldorc_explain.txt, spark23_oldorc_explain.txt, 
> spark23_oldorc_explain_convermetastoreorcfalse.txt
>
>
> The query below seems to ignore the coalesce. This is running spark 2.2 or 
> spark 2.3 against hive, which is reading orc:
>  
>  Query:
>  spark.sql("SELECT COUNT(DISTINCT(something)) FROM sometable WHERE dt >= 
> '20170301' AND dt <= '20170331' AND something IS NOT 
> NULL").coalesce(16).show()
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350440#comment-16350440
 ] 

Thomas Graves commented on SPARK-23304:
---

OK, so I guess by that logic the coalesce won't ever work with the 
COUNT(DISTINCT()), since it's the intermediate query I want it to apply to; it 
will work on the plain select of bcookie.

I tested that and verified it.

spark.sql("SELECT something FROM sometable WHERE dt >= '20170301' AND dt <= 
'20170331' AND something IS NOT NULL").coalesce(8).show()

That actually works.

So I guess we can close this; it was my misunderstanding.
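
A sketch of what that looks like in practice (same placeholder names as above; this is just illustrating where the coalesce has to go, not new behavior):

{code:scala}
// Sketch: coalesce the un-aggregated projection, then aggregate.
// Applying .coalesce(16) to the already-aggregated COUNT(DISTINCT ...) result
// does nothing useful, because that result is a single row in a single partition.
val rows = spark.sql(
  "SELECT something FROM sometable " +
  "WHERE dt >= '20170301' AND dt <= '20170331' AND something IS NOT NULL")

// coalesce(16) caps the number of scan tasks; the distinct/count after it
// still shuffles into spark.sql.shuffle.partitions partitions.
val distinctCount = rows.coalesce(16).distinct().count()
println(distinctCount)
{code}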

> Spark SQL coalesce() against hive not working
> -
>
> Key: SPARK-23304
> URL: https://issues.apache.org/jira/browse/SPARK-23304
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.1, 2.3.0
>Reporter: Thomas Graves
>Assignee: Xiao Li
>Priority: Major
> Attachments: spark22_oldorc_explain.txt, spark23_oldorc_explain.txt, 
> spark23_oldorc_explain_convermetastoreorcfalse.txt
>
>
> The query below seems to ignore the coalesce. This is running spark 2.2 or 
> spark 2.3 against hive, which is reading orc:
>  
>  Query:
>  spark.sql("SELECT COUNT(DISTINCT(something)) FROM sometable WHERE dt >= 
> '20170301' AND dt <= '20170331' AND something IS NOT 
> NULL").coalesce(16).show()
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350423#comment-16350423
 ] 

Thomas Graves commented on SPARK-23304:
---

It doesn't look like sql("xyz").rdd.partitions.length comes back correct in 
either spark 2.2 or 2.3.

But if I change the query from SELECT COUNT(DISTINCT(bcookie)) to just SELECT 
bcookie, then the partitions.length works. So perhaps it is something with the 
count.

 

spark 2.3 SELECT COUNT(DISTINCT(bcookie))

scala> query.rdd.partitions.length
res4: Int = 1

scala> query.count()
[Stage 5:===> (15420 + 619) / 16039]

 

spark 2.2 SELECT COUNT(DISTINCT(bcookie)):

scala> query.rdd.partitions.length
res0: Int = 1

scala> query.count()
[Stage 0:==> (1136 + 1600) / 5346]

 

spark 2.2 Query with just select bcookie:

scala> query.rdd.partitions.length
res1: Int = 5346

spark 2.3 Query with just select bcookie:

scala> query.rdd.partitions.length
res9: Int = 16039

 

Note that if I change it to just SELECT DISTINCT(bcookie) then I get 200:

scala> query.rdd.partitions.length
res10: Int = 200
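
Those numbers line up with where each query's parallelism actually lives; a sketch (placeholder names again, and the specific counts quoted are the ones from the output above, not re-derived) of what each partitions.length is reporting:

{code:scala}
// Sketch: what the partition counts above correspond to.

// Plain projection: partitions == number of read splits produced by the scan
// (5346 on 2.2 vs 16039 on 2.3 in the output above).
val base = spark.sql(
  "SELECT bcookie FROM sometable WHERE dt >= '20170301' AND dt <= '20170331'")
println(base.rdd.getNumPartitions)

// COUNT(DISTINCT ...): the final aggregation collapses to a single partition,
// so .rdd.partitions.length is 1 regardless of how many splits were scanned.
val agg = spark.sql(
  "SELECT COUNT(DISTINCT(bcookie)) FROM sometable " +
  "WHERE dt >= '20170301' AND dt <= '20170331'")
println(agg.rdd.getNumPartitions)

// SELECT DISTINCT(bcookie): ends in a shuffle, so it reports
// spark.sql.shuffle.partitions, which defaults to 200.
val dist = spark.sql(
  "SELECT DISTINCT(bcookie) FROM sometable WHERE dt >= '20170301' AND dt <= '20170331'")
println(dist.rdd.getNumPartitions)
{code}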

 

> Spark SQL coalesce() against hive not working
> -
>
> Key: SPARK-23304
> URL: https://issues.apache.org/jira/browse/SPARK-23304
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.1, 2.3.0
>Reporter: Thomas Graves
>Assignee: Xiao Li
>Priority: Major
> Attachments: spark22_oldorc_explain.txt, spark23_oldorc_explain.txt, 
> spark23_oldorc_explain_convermetastoreorcfalse.txt
>
>
> The query below seems to ignore the coalesce. This is running spark 2.2 or 
> spark 2.3 against hive, which is reading orc:
>  
>  Query:
>  spark.sql("SELECT COUNT(DISTINCT(something)) FROM sometable WHERE dt >= 
> '20170301' AND dt <= '20170331' AND something IS NOT 
> NULL").coalesce(16).show()
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-02 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350428#comment-16350428
 ] 

Thomas Graves commented on SPARK-23304:
---

Well, I guess that gives you the final number of partitions and not the number it 
will initially be reading.

> Spark SQL coalesce() against hive not working
> -
>
> Key: SPARK-23304
> URL: https://issues.apache.org/jira/browse/SPARK-23304
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.1, 2.3.0
>Reporter: Thomas Graves
>Assignee: Xiao Li
>Priority: Major
> Attachments: spark22_oldorc_explain.txt, spark23_oldorc_explain.txt, 
> spark23_oldorc_explain_convermetastoreorcfalse.txt
>
>
> The query below seems to ignore the coalesce. This is running spark 2.2 or 
> spark 2.3 against hive, which is reading orc:
>  
>  Query:
>  spark.sql("SELECT COUNT(DISTINCT(something)) FROM sometable WHERE dt >= 
> '20170301' AND dt <= '20170331' AND something IS NOT 
> NULL").coalesce(16).show()
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349555#comment-16349555
 ] 

Thomas Graves commented on SPARK-23304:
---

I don't have any hive tables backed by parquet to compare to.

> Spark SQL coalesce() against hive not working
> -
>
> Key: SPARK-23304
> URL: https://issues.apache.org/jira/browse/SPARK-23304
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.1, 2.3.0
>Reporter: Thomas Graves
>Assignee: Xiao Li
>Priority: Major
> Attachments: spark22_oldorc_explain.txt, spark23_oldorc_explain.txt, 
> spark23_oldorc_explain_convermetastoreorcfalse.txt
>
>
> The query below seems to ignore the coalesce. This is running spark 2.2 or 
> spark 2.3 against hive, which is reading orc:
>  
>  Query:
>  spark.sql("SELECT COUNT(DISTINCT(something)) FROM sometable WHERE dt >= 
> '20170301' AND dt <= '20170331' AND something IS NOT 
> NULL").coalesce(16).show()
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse than spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349553#comment-16349553
 ] 

Thomas Graves commented on SPARK-23309:
---

[~dongjoon] is there any way with the native ORC reader to control the number of 
partitions (like hive.exec.orc.split.strategy)?  Or do you have to do the 
coalesce?
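
Not an answer to the question, but one knob I believe applies to the file-based (native) read path is spark.sql.files.maxPartitionBytes, which bounds the split size and hence the number of read partitions. A hedged sketch (placeholder names; this is not claimed to be an equivalent of hive.exec.orc.split.strategy):

{code:scala}
// Hedged sketch: spark.sql.files.maxPartitionBytes influences split sizes for
// the file-based readers (default 128 MB). Larger splits => fewer read partitions.
spark.conf.set("spark.sql.files.maxPartitionBytes", 256L * 1024 * 1024)  // 256 MB splits

val df = spark.sql(
  "select something from sometable where dt = '20170301' AND something IS NOT NULL")
println(df.rdd.getNumPartitions)
{code}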

> Spark 2.3 cached query performance 20-30% worse than spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see query times averaging 13 seconds.
> On the same nodes I see spark 2.3 query times averaging 17 seconds.
> Note these are times of queries after the initial caching, i.e. just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by about 20%.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse than spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349551#comment-16349551
 ] 

Thomas Graves commented on SPARK-23309:
---

Seeing the same time difference after adding in the 
spark.table("dailyCached").count().

[~dongjoon] Correct, this is only when reading from cached data. Without caching, 
spark 2.3 is quite a bit faster (1.5-2x+) than spark 2.2 when reading from hive 
using orc (which is awesome, thanks for all the work!).

 

I'm running now with --conf spark.sql.orc.impl=hive --conf 
spark.sql.hive.convertMetastoreOrc=false.  For the smaller data set it did get 
closer, only a 1 second diff on average between spark 2.2 and spark 2.3.  Trying 
to run on the larger dataset now.  I'm wondering if much of the difference is 
the larger number of partitions you get with the native ORC reader in spark 2.3.
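
A quick way to check that hypothesis (a sketch with the ticket's placeholder names; the reader choice is assumed to be fixed by the configs above at session startup):

{code:scala}
// Sketch: compare scan parallelism under the two readers. More partitions at scan
// time means more, smaller cached blocks once the table is cached, which is the
// suspected source of the timing difference discussed above.
val scan = spark.sql(
  "select something from sometable where dt = '20170301' AND something IS NOT NULL")
println(scan.rdd.getNumPartitions)  // compare this count between the 2.2-style and native ORC runs
{code}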

> Spark 2.3 cached query performance 20-30% worse than spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see query times averaging 13 seconds.
> On the same nodes I see spark 2.3 query times averaging 17 seconds.
> Note these are times of queries after the initial caching, i.e. just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by about 20%.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse than spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349483#comment-16349483
 ] 

Thomas Graves commented on SPARK-23309:
---

Sure, I can also run with the --conf spark.sql.orc.impl=hive --conf 
spark.sql.orc.filterPushdown=false --conf 
spark.sql.hive.convertMetastoreOrc=false configs to make sure it doesn't just 
have to do with the number of partitions.

 

> Spark 2.3 cached query performance 20-30% worse than spark 2.2
> --
>
> Key: SPARK-23309
> URL: https://issues.apache.org/jira/browse/SPARK-23309
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was testing spark 2.3 rc2 and I am seeing a performance regression in sql 
> queries on cached data.
> The size of the data: 10.4GB input from hive orc files /18.8 GB cached/5592 
> partitions
> Here is the example query:
> val dailycached = spark.sql("select something from table where dt = 
> '20170301' AND something IS NOT NULL")
> dailycached.createOrReplaceTempView("dailycached") 
> spark.catalog.cacheTable("dailyCached")
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show()
>  
> On spark 2.2 I see query times averaging 13 seconds.
> On the same nodes I see spark 2.3 query times averaging 17 seconds.
> Note these are times of queries after the initial caching, i.e. just running 
> the last line again: 
> spark.sql("SELECT COUNT(DISTINCT(something)) from dailycached").show() 
> multiple times.
>  
> I also ran a query over more data (335GB input/587.5 GB cached) and saw a 
> similar discrepancy in the performance of querying cached data between spark 
> 2.3 and spark 2.2, where 2.2 was better by about 20%.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349480#comment-16349480
 ] 

Thomas Graves commented on SPARK-23304:
---

I just ran the query (show()) and looked at the number of partitions. 

spark23_oldorc_explain_convermetastoreorcfalse.txt is the explain output with --conf 
spark.sql.orc.impl=hive --conf spark.sql.orc.filterPushdown=false --conf 
spark.sql.hive.convertMetastoreOrc=false.
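
For reference, a sketch of how an explain dump like those attachments can be produced (placeholder query; not claiming this is exactly how the attached files were generated):

{code:scala}
// Sketch: print the extended plan (parsed, analyzed, optimized, physical) for the
// ticket's placeholder query under whatever ORC configs the session was started with.
val q = spark.sql(
  "SELECT COUNT(DISTINCT(something)) FROM sometable " +
  "WHERE dt >= '20170301' AND dt <= '20170331' AND something IS NOT NULL")
q.explain(true)
{code}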

 

 

> Spark SQL coalesce() against hive not working
> -
>
> Key: SPARK-23304
> URL: https://issues.apache.org/jira/browse/SPARK-23304
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Assignee: Xiao Li
>Priority: Major
> Attachments: spark22_oldorc_explain.txt, spark23_oldorc_explain.txt, 
> spark23_oldorc_explain_convermetastoreorcfalse.txt
>
>
> The query below seems to ignore the coalesce. This is running spark 2.2 or 
> spark 2.3 against hive, which is reading orc:
>  
>  Query:
>  spark.sql("SELECT COUNT(DISTINCT(something)) FROM sometable WHERE dt >= 
> '20170301' AND dt <= '20170331' AND something IS NOT 
> NULL").coalesce(16).show()
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


