[jira] [Created] (SPARK-1937) Tasks can be submitted before executors are registered

2014-05-27 Thread Rui Li (JIRA)
Rui Li created SPARK-1937: - Summary: Tasks can be submitted before executors are registered Key: SPARK-1937 URL: https://issues.apache.org/jira/browse/SPARK-1937 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-1937) Tasks can be submitted before executors are registered

2014-05-27 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated SPARK-1937: -- Attachment: Before-patch.png Tasks can be submitted before executors are registered

[jira] [Updated] (SPARK-1937) Tasks can be submitted before executors are registered

2014-05-27 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated SPARK-1937: -- Attachment: RSBTest.scala The program that triggers the problem. With the patch, the whole execution time of

[jira] [Created] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-06-25 Thread Rui Li (JIRA)
Rui Li created SPARK-2277: - Summary: Make TaskScheduler track whether there's host on a rack Key: SPARK-2277 URL: https://issues.apache.org/jira/browse/SPARK-2277 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-02 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050951#comment-14050951 ] Rui Li commented on SPARK-2277: --- Suppose task1 prefers node1 but node1 is not available at

[jira] [Commented] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-02 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050952#comment-14050952 ] Rui Li commented on SPARK-2277: --- PR created at: https://github.com/apache/spark/pull/1212

[jira] [Commented] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-03 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052103#comment-14052103 ] Rui Li commented on SPARK-2277: --- With [PR #892|https://github.com/apache/spark/pull/892],

[jira] [Created] (SPARK-2387) Remove the stage barrier for better resource utilization

2014-07-07 Thread Rui Li (JIRA)
Rui Li created SPARK-2387: - Summary: Remove the stage barrier for better resource utilization Key: SPARK-2387 URL: https://issues.apache.org/jira/browse/SPARK-2387 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-2387) Remove the stage barrier for better resource utilization

2014-07-08 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054862#comment-14054862 ] Rui Li commented on SPARK-2387: --- PR created at https://github.com/apache/spark/pull/1328

[jira] [Created] (SPARK-2740) In JavaPairRdd, allow user to specify ascending and numPartitions for sortByKey

2014-07-29 Thread Rui Li (JIRA)
Rui Li created SPARK-2740: - Summary: In JavaPairRdd, allow user to specify ascending and numPartitions for sortByKey Key: SPARK-2740 URL: https://issues.apache.org/jira/browse/SPARK-2740 Project: Spark

[jira] [Commented] (SPARK-2387) Remove the stage barrier for better resource utilization

2014-07-29 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078798#comment-14078798 ] Rui Li commented on SPARK-2387: --- Right, thanks [~joshrosen] for pointing out. This is just

[jira] [Commented] (SPARK-2636) no where to get job identifier while submit spark job through spark API

2014-08-25 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110172#comment-14110172 ] Rui Li commented on SPARK-2636: --- Just want to make sure I understand everything correctly:

[jira] [Commented] (SPARK-2321) Design a proper progress reporting event listener API

2014-11-14 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212000#comment-14212000 ] Rui Li commented on SPARK-2321: --- Hi [~joshrosen], The new API is quite useful. But the

[jira] [Created] (SPARK-4440) Enhance the job progress API to expose more information

2014-11-16 Thread Rui Li (JIRA)
Rui Li created SPARK-4440: - Summary: Enhance the job progress API to expose more information Key: SPARK-4440 URL: https://issues.apache.org/jira/browse/SPARK-4440 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2321) Design a proper progress reporting event listener API

2014-11-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214223#comment-14214223 ] Rui Li commented on SPARK-2321: --- Hey [~joshrosen], Thanks a lot for the update! I created

[jira] [Commented] (SPARK-2321) Design a proper progress reporting event listener API

2014-11-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217485#comment-14217485 ] Rui Li commented on SPARK-2321: --- Hi [~joshrosen], Shall we make {{SparkJobInfo}} and

[jira] [Commented] (SPARK-4921) Performance issue caused by TaskSetManager returning PROCESS_LOCAL for NO_PREF tasks

2014-12-22 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256425#comment-14256425 ] Rui Li commented on SPARK-4921: --- I'm not sure if this is intended, but returning

[jira] [Commented] (SPARK-2621) Update task InputMetrics incrementally

2015-01-08 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270714#comment-14270714 ] Rui Li commented on SPARK-2621: --- Hey [~sandyr], it seems after this change we require the

[jira] [Created] (SPARK-5080) Expose more cluster resource information to user

2015-01-04 Thread Rui Li (JIRA)
Rui Li created SPARK-5080: - Summary: Expose more cluster resource information to user Key: SPARK-5080 URL: https://issues.apache.org/jira/browse/SPARK-5080 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-7081) Faster sort-based shuffle path using binary processing cache-aware sort

2015-05-28 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562386#comment-14562386 ] Rui Li commented on SPARK-7081: --- Hi [~joshrosen], requiring the dependency having no

[jira] [Commented] (SPARK-4440) Enhance the job progress API to expose more information

2015-09-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791732#comment-14791732 ] Rui Li commented on SPARK-4440: --- For Hive on Spark, we want completion time for each stage so we can compute

[jira] [Created] (SPARK-14958) Failed task hangs if error is encountered when getting task result

2016-04-27 Thread Rui Li (JIRA)
Rui Li created SPARK-14958: -- Summary: Failed task hangs if error is encountered when getting task result Key: SPARK-14958 URL: https://issues.apache.org/jira/browse/SPARK-14958 Project: Spark

[jira] [Commented] (SPARK-14958) Failed task hangs if error is encountered when getting task result

2016-10-09 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561094#comment-15561094 ] Rui Li commented on SPARK-14958: I hit the issue with Spark 1.6.1 > Failed task hangs if error is

[jira] [Created] (SPARK-24116) SparkSQL inserting overwrite table has inconsistent behavior regarding HDFS trash

2018-04-27 Thread Rui Li (JIRA)
Rui Li created SPARK-24116: -- Summary: SparkSQL inserting overwrite table has inconsistent behavior regarding HDFS trash Key: SPARK-24116 URL: https://issues.apache.org/jira/browse/SPARK-24116 Project: Spark

[jira] [Created] (SPARK-24387) Heartbeat-timeout executor is added back and used again

2018-05-25 Thread Rui Li (JIRA)
Rui Li created SPARK-24387: -- Summary: Heartbeat-timeout executor is added back and used again Key: SPARK-24387 URL: https://issues.apache.org/jira/browse/SPARK-24387 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-24387) Heartbeat-timeout executor is added back and used again

2018-05-25 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490293#comment-16490293 ] Rui Li commented on SPARK-24387: A snippet of the log w/ some fields masked: {noformat} [Stage

[jira] [Commented] (SPARK-24387) Heartbeat-timeout executor is added back and used again

2018-05-25 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490309#comment-16490309 ] Rui Li commented on SPARK-24387: When HeartbeatReceiver finds the executor's heartbeat has timed out, it

[jira] [Commented] (SPARK-24387) Heartbeat-timeout executor is added back and used again

2018-06-11 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509214#comment-16509214 ] Rui Li commented on SPARK-24387: Yes, blacklisting can be used to avoid the issue. But blacklist can be

[jira] [Commented] (SPARK-24387) Heartbeat-timeout executor is added back and used again

2018-05-28 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492563#comment-16492563 ] Rui Li commented on SPARK-24387: Instead of letting HeartbeatReceiver tell TaskScheduler the executor is

[jira] [Commented] (SPARK-24116) SparkSQL inserting overwrite table has inconsistent behavior regarding HDFS trash

2018-05-01 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460472#comment-16460472 ] Rui Li commented on SPARK-24116: [~hyukjin.kwon], sorry for the late response. For example, assume we

[jira] [Commented] (SPARK-24116) SparkSQL inserting overwrite table has inconsistent behavior regarding HDFS trash

2018-05-03 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462040#comment-16462040 ] Rui Li commented on SPARK-24116: To reproduce: {code} create table test_text(x int); insert overwrite

[jira] [Created] (SPARK-25085) Insert overwrite a non-partitioned table can delete table folder

2018-08-10 Thread Rui Li (JIRA)
Rui Li created SPARK-25085: -- Summary: Insert overwrite a non-partitioned table can delete table folder Key: SPARK-25085 URL: https://issues.apache.org/jira/browse/SPARK-25085 Project: Spark Issue

[jira] [Created] (SPARK-24010) Select from table needs read access on DB folder when storage based auth is enabled

2018-04-18 Thread Rui Li (JIRA)
Rui Li created SPARK-24010: -- Summary: Select from table needs read access on DB folder when storage based auth is enabled Key: SPARK-24010 URL: https://issues.apache.org/jira/browse/SPARK-24010 Project:

[jira] [Commented] (SPARK-24010) Select from table needs read access on DB folder when storage based auth is enabled

2018-04-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442146#comment-16442146 ] Rui Li commented on SPARK-24010: Hi [~rxin], I think checking databaseExists is added in SPARK-14869. Do

[jira] [Commented] (SPARK-24010) Select from table needs read access on DB folder when storage based auth is enabled

2018-04-21 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446761#comment-16446761 ] Rui Li commented on SPARK-24010: [~rxin], thanks for your reply. I noted that InMemoryCatalog even