[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261384#comment-14261384
 ] 

Sandy Ryza commented on SPARK-4921:
---

Ah, makes sense.  In the query, are some splits NODE_LOCAL and others NO_PREF?  
Or all NO_PREF?

Looking deeper into the issue, as far as I can tell, changing the return value 
to NO_PREF as described above should have no effect at all in any scenario.

 Performance issue caused by TaskSetManager returning  PROCESS_LOCAL for 
 NO_PREF tasks
 -

 Key: SPARK-4921
 URL: https://issues.apache.org/jira/browse/SPARK-4921
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.2.0
Reporter: Xuefu Zhang
 Attachments: NO_PREF.patch


 During research for HIVE-9153, we found that TaskSetManager returns 
 PROCESS_LOCAL for NO_PREF tasks, which may have caused performance degradation. 
 Changing the return value to NO_PREF, as demonstrated in the attached patch, 
 appears to improve performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-30 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261436#comment-14261436
 ] 

Xuefu Zhang commented on SPARK-4921:


Some will be NODE_LOCAL, but others will be NO_PREF. Returning PROCESS_LOCAL 
seems confusing at the very least. As for the performance implications, maybe 
[~lirui] can confirm further.






[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-29 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260322#comment-14260322
 ] 

Sandy Ryza commented on SPARK-4921:
---

Offline, [~xuefuz] and [~lirui] mentioned to me that the query they noticed 
this with was:
{code}
select count(*) from store_sales where ss_sold_date_sk is not null;
{code}




[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-29 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260325#comment-14260325
 ] 

Sandy Ryza commented on SPARK-4921:
---

[~xuefuz] [~lirui] Was that query run against data that is not in HDFS?  If the 
data is in HDFS, we'd expect NODE_LOCAL, not NO_PREF, right?




[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-27 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259562#comment-14259562
 ] 

Apache Spark commented on SPARK-4921:
-

User 'sryza' has created a pull request for this issue:
https://github.com/apache/spark/pull/3816




[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257940#comment-14257940
 ] 

Sandy Ryza commented on SPARK-4921:
---

Is there a barebones Spark program that I could use to reproduce this?




[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256015#comment-14256015
 ] 

Xuefu Zhang commented on SPARK-4921:


cc: [~lirui], [~sandyr]




[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-22 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256425#comment-14256425
 ] 

Rui Li commented on SPARK-4921:
---

I'm not sure if this is intended, but returning PROCESS_LOCAL for NO_PREF tasks 
may reset {{currentLocalityIndex}} to 0, which may cause more delay later. There 
seems to be a check to avoid this, but I doubt it's sufficient:
{code}
// Update our locality level for delay scheduling
// NO_PREF will not affect the variables related to delay scheduling
if (maxLocality != TaskLocality.NO_PREF) {
  currentLocalityIndex = getLocalityIndex(taskLocality)
  lastLaunchTime = curTime
}
{code}
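
To make the concern concrete, here is a minimal toy model of the guard quoted above. It is a sketch, not Spark's actual TaskSetManager: the {{State}} case class, the {{afterLaunch}} helper, the string-valued levels and their ordering are all assumptions made for illustration. It also assumes that a task without preferences can be launched from an offer at a less strict level such as ANY, in which case the check on {{maxLocality}} does not apply.

```scala
// Toy model of the guard quoted above; NOT Spark's actual TaskSetManager.
object LocalityGuardSketch {
  // Assumed ordering of locality levels, most local first.
  val levels: Vector[String] =
    Vector("PROCESS_LOCAL", "NODE_LOCAL", "NO_PREF", "RACK_LOCAL", "ANY")

  // Hypothetical stand-in for the mutable delay-scheduling state.
  final case class State(currentLocalityIndex: Int, lastLaunchTime: Long)

  // Position of a level in the assumed ordering above.
  def getLocalityIndex(level: String): Int = levels.indexOf(level)

  // Mirrors the quoted check: the state is updated unless the offer's
  // maxLocality is NO_PREF. The locality reported for the task
  // (taskLocality) is not consulted by the guard itself.
  def afterLaunch(state: State, maxLocality: String,
                  taskLocality: String, curTime: Long): State =
    if (maxLocality != "NO_PREF") State(getLocalityIndex(taskLocality), curTime)
    else state

  def main(args: Array[String]): Unit = {
    // Start from a task set that has already relaxed to RACK_LOCAL.
    val before = State(getLocalityIndex("RACK_LOCAL"), lastLaunchTime = 100L)

    // A no-preference task reported as PROCESS_LOCAL, launched from a NO_PREF offer.
    val guarded = afterLaunch(before, "NO_PREF", "PROCESS_LOCAL", 200L)

    // The same task, launched from an ANY offer instead.
    val reset = afterLaunch(before, "ANY", "PROCESS_LOCAL", 200L)

    println(s"offer NO_PREF: $guarded")
    println(s"offer ANY:     $reset")
  }
}
```

In this model the state survives only when the offer itself is NO_PREF. Under the ANY offer, the reported PROCESS_LOCAL level sends the index back to 0, which is the reset described above. Whether that path is reachable in practice depends on how the real scheduler matches offers to tasks.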
