[GitHub] spark pull request #16440: [SPARK-18857][SQL] Don't use `Iterator.duplicate`...

2017-01-05 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16440#discussion_r94826910 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -50,8 +50,8 @@ private[hive

[GitHub] spark pull request #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter...

2017-01-05 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16481#discussion_r94886105 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionedTablePerfStatsSuite.scala --- @@ -88,17 +83,12 @@ class PartitionedTablePerfStatsSuite

[GitHub] spark pull request #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter...

2017-01-05 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16481#discussion_r94886317 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -494,8 +500,13 @@ case class DataSource

[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94456738 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -473,22 +473,26 @@ case class DataSource

[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94457115 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -393,7 +393,9 @@ final class DataFrameWriter[T] private[sql](ds: Dataset

[GitHub] spark issue #16460: [SPARK-19058][SQL] fix partition related behaviors with ...

2017-01-04 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16460 looks good --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-04 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94542321 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala --- @@ -74,12 +69,29 @@ case class

[GitHub] spark pull request #15539: [SPARK-17994] [SQL] Add back a file status cache ...

2017-01-08 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/15539#discussion_r95074015 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileStatusCache.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed

[GitHub] spark pull request #15539: [SPARK-17994] [SQL] Add back a file status cache ...

2017-01-07 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/15539#discussion_r95073488 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileStatusCache.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed

[GitHub] spark pull request #16424: [SPARK-19016][SQL][DOC] Document scalable partiti...

2016-12-29 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16424#discussion_r94170820 --- Diff: docs/sql-programming-guide.md --- @@ -526,11 +526,18 @@ By default `saveAsTable` will create a "managed table", meaning that t

[GitHub] spark pull request #16424: [SPARK-19016][SQL][DOC] Document scalable partiti...

2016-12-29 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16424#discussion_r94170722 --- Diff: docs/sql-programming-guide.md --- @@ -515,7 +515,7 @@ new data. ### Saving to Persistent Tables `DataFrames` can also be saved

[GitHub] spark pull request #16424: [SPARK-19016][SQL][DOC] Document scalable partiti...

2016-12-29 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16424#discussion_r94170967 --- Diff: docs/sql-programming-guide.md --- @@ -526,11 +526,18 @@ By default `saveAsTable` will create a "managed table", meaning that t

[GitHub] spark pull request #16424: [SPARK-19016][SQL][DOC] Document scalable partiti...

2016-12-29 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16424#discussion_r94187848 --- Diff: docs/sql-programming-guide.md --- @@ -526,11 +526,18 @@ By default `saveAsTable` will create a "managed table", meaning that t

[GitHub] spark issue #16424: [SPARK-19016][SQL][DOC] Document scalable partition hand...

2016-12-29 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16424 LGTM, just one comment

[GitHub] spark pull request #16460: [SPARK-19058][SQL] fix partition related behavior...

2017-01-03 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16460#discussion_r94536423 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala --- @@ -74,12 +69,29 @@ case class

[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-08 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/15539 That one is safe to make global but mutable, right? It will take effect after a table is refreshed. Most of these anomalies seem OK to me provided we document them -- it seems to solve

[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-08 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/15539 Hm, what use cases are we trying to address? As I understand, the worst that can happen if the cache size flag is toggled at runtime is that the old settings might still apply. And when

[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-09 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/15539 Hmm, I don't think fileStatusCache can ever return incorrect results, only stale ones. Furthermore, it's scoped by client-id to particular instances of tables, so refresh table is guaranteed to wipe

[GitHub] spark issue #16514: [SPARK-19128] [SQL] Refresh Cache after Set Location

2017-01-09 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16514 Do you know why this check in the relation cache that the root paths have not changed is not sufficient? https://github.com/apache/spark/blob/24482858e05bea84cacb41c62be0a9aaa33897ee/sql/hive/src

[GitHub] spark issue #16514: [SPARK-19128] [SQL] Refresh Cache after Set Location

2017-01-09 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16514 I see. What do you think about adding that check in the caching code rather than require invalidation calls? After all, the SET LOCATION may be issued by a separate Spark cluster connecting

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16350 yeah, i don't think we need the unit test for 2.0

[GitHub] spark issue #16326: [SPARK-18915] [SQL] Automatic Table Repair when Creating...

2016-12-19 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16326 Oh I see, you're saying if there are old files for the partition, the INSERT INTO will cause those to become visible. That is a little confusing.

[GitHub] spark pull request #16341: [SQL] [WIP] Switch internal catalog types to use ...

2016-12-19 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/16341 [SQL] [WIP] Switch internal catalog types to use URI instead of string for locationUri ## What changes were proposed in this pull request? This should help prevent accidental incorrect

[GitHub] spark issue #16326: [SPARK-18915] [SQL] Automatic Table Repair when Creating...

2016-12-19 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16326 > hive> select * from test; >OK >ddda >c a Isn't this showing that hive is appending to the table (ddd, a) as expected with INSERT INTO? For the (

[GitHub] spark issue #16122: [SPARK-18681][SQL] Fix filtering to compatible with part...

2016-12-07 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16122 I see. In that case I think manual testing may be sufficient. On Wed, Dec 7, 2016, 5:00 PM Michael Allman <notificati...@github.com> wrote: > I think that's exactly wha

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add ReadWriteLock for each table's re...

2016-12-07 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16135 Isn't it sufficient to lock around the `catalog.filterPartitions(Nil)`? Why do we need reader locks?

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-23 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r10651 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -467,7 +474,7 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-23 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107780463 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -467,7 +474,7 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-21 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 jenkins retest this please

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107534830 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -467,7 +474,7 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107559290 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -296,12 +298,13 @@ private[spark] class Executor

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107559342 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -239,14 +239,26 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107542763 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -160,15 +160,20 @@ private[spark] abstract class Task[T]( // A flag

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107542944 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -239,14 +239,21 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107543449 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -239,14 +239,21 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107541631 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -59,8 +59,8 @@ private[spark] class TaskContextImpl( /** List of callback

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107542020 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala --- @@ -215,7 +215,7 @@ private[spark] class PythonRunner

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107542655 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -160,15 +160,20 @@ private[spark] abstract class Task[T]( // A flag

[GitHub] spark pull request #17475: [SPARK-20148] [SQL] Extend the file commit API to...

2017-03-29 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/17475 [SPARK-20148] [SQL] Extend the file commit API to allow subscribing to task commit messages ## What changes were proposed in this pull request? The internal FileCommitProtocol interface

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-16 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 Rebased

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-17 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 jenkins retest this please

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107070379 --- Diff: core/src/main/scala/org/apache/spark/ui/UIUtils.scala --- @@ -354,7 +354,7 @@ private[spark] object UIUtils extends Logging

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107070457 --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala --- @@ -540,6 +540,39 @@ class SparkContextSuite extends SparkFunSuite

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107070597 --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala --- @@ -540,6 +540,39 @@ class SparkContextSuite extends SparkFunSuite

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107070165 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -467,7 +474,7 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107073269 --- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala --- @@ -212,8 +212,8 @@ case object TaskResultLost extends TaskFailedReason { * Task

[GitHub] spark pull request #16341: [SQL] [WIP] Switch internal catalog types to use ...

2017-03-15 Thread ericl
Github user ericl closed the pull request at: https://github.com/apache/spark/pull/16341

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-16 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 Test failure seems unrelated. jenkins retest this please

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-21 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107274054 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -467,7 +474,7 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-21 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107271262 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -140,16 +140,22 @@ private[spark] class TaskContextImpl

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-21 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107273498 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -160,15 +160,20 @@ private[spark] abstract class Task[T]( // A flag

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-21 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107273296 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -302,12 +298,12 @@ private[spark] class Executor

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-21 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107272852 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala --- @@ -215,7 +215,8 @@ private[spark] class PythonRunner

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-21 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r107271185 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -59,8 +59,8 @@ private[spark] class TaskContextImpl( /** List of callback

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106066177 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -710,7 +710,11 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106051633 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2250,6 +2250,25 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106051490 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2250,6 +2250,25 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106051697 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2250,6 +2250,25 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106060305 --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala --- @@ -538,10 +538,37 @@ class SparkContextSuite extends SparkFunSuite

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106054370 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala --- @@ -64,7 +64,7 @@ private[spark] object UIData { var numCompletedTasks: Int

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106060636 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/executor/MesosExecutorBackend.scala --- @@ -104,7 +104,8 @@ private[spark] class

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106053942 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala --- @@ -54,6 +54,9 @@ private[spark] trait TaskScheduler { // Cancel

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106053125 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -239,8 +244,9 @@ private[spark] class Executor( */ @volatile

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106054145 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -710,7 +710,11 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106052824 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -168,7 +168,8 @@ private[spark] class Executor( case Some

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106060313 --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala --- @@ -538,10 +538,37 @@ class SparkContextSuite extends SparkFunSuite

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106053726 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -732,6 +732,13 @@ class DAGScheduler

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-14 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 Drilling down into the detail view is kind of cumbersome -- I think it's most useful to have a good summary at the progress bar, and then the user can refer to logs for detailed per-task debugging

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-16 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106555729 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2250,6 +2250,22 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-16 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 Made the change to improve the default reason, which now says "killed via SparkContext.killTaskAttempt".

[GitHub] spark pull request #17531: [SPARK-20217][core] Executor should not fail stag...

2017-04-04 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/17531 [SPARK-20217][core] Executor should not fail stage if killed task throws non-interrupted exception ## What changes were proposed in this pull request? If tasks throw non-interrupted

[GitHub] spark pull request #17531: [SPARK-20217][core] Executor should not fail stag...

2017-04-05 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17531#discussion_r109998390 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -432,7 +432,7 @@ private[spark] class Executor

[GitHub] spark issue #17659: [SPARK-20358] [core] Executors failing stage on interrup...

2017-04-19 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17659 Ping.

[GitHub] spark pull request #17659: [SPARK-20358] [core] Executors failing stage on i...

2017-04-17 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/17659 [SPARK-20358] [core] Executors failing stage on interrupted exception thrown by cancelled tasks ## What changes were proposed in this pull request? This was a regression introduced by my

[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...

2017-04-17 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/15398 This seems to have broken the build in branch-2.1, e.g. https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-branch-2.1-compile-maven-hadoop-2.6/591/consoleFull

[GitHub] spark issue #17623: [SPARK-20292][SQL] Clean up string representation of Tre...

2017-04-19 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17623 Thanks for doing this, we recently hit an issue where O(n^2) sized expression tree-strings crashed the cluster and created many hundreds of gigabytes of log files. Could we also add a unit

[GitHub] spark pull request #17692: [SPARK-20398] [SQL] range() operator should inclu...

2017-04-19 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/17692 [SPARK-20398] [SQL] range() operator should include cancellation reason when killed ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-19820

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-06 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 Added `killTask(id: TaskId, reason: String)` to SparkContext and a corresponding test. cc @joshrosen for the API changes. As discussed offline, it's very hard to preserve binary

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104566606 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SchedulerBackend.scala --- @@ -30,8 +30,20 @@ private[spark] trait SchedulerBackend { def

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104572407 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -467,7 +474,7 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104595023 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -158,7 +158,8 @@ private[spark] class Executor( threadPool.execute(tr

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104594970 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2250,6 +2250,25 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104595065 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala --- @@ -40,7 +40,8 @@ private[spark] object

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-04 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/17166 [SPARK-19820] [core] Allow reason to be specified for task kill ## What changes were proposed in this pull request? This refactors the task kill path to allow specifying a reason

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-05 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 Yes -- this is useful if you want to implement extensions to Spark that can kill tasks for other reasons, e.g. if a debugger detects that a task has entered a bad state. Without this change

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-05 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 That's right, it's not here. This PR only adds the distinction between tasks killed due to stage cancellation and speculation attempts. On Sun, Mar 5, 2017, 3:04 AM Mridul Muralidharan

[GitHub] spark pull request #17749: [SPARK-20450] [SQL] Unexpected first-query schema...

2017-04-24 Thread ericl
Github user ericl closed the pull request at: https://github.com/apache/spark/pull/17749 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark pull request #18714: [SPARK-20236][SQL] hive style partition overwrite

2017-07-22 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/18714#discussion_r128909042 --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala --- @@ -52,12 +55,22 @@ class HadoopMapReduceCommitProtocol

[GitHub] spark pull request #18714: [SPARK-20236][SQL] hive style partition overwrite

2017-07-23 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/18714#discussion_r128913707 --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala --- @@ -52,12 +55,22 @@ class HadoopMapReduceCommitProtocol

[GitHub] spark issue #18714: [SPARK-20236][SQL] hive style partition overwrite

2017-07-23 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/18714 Got it. On Sun, Jul 23, 2017, 10:40 PM Wenchen Fan <notificati...@github.com> wrote: > *@cloud-fan* commented on this pul

[GitHub] spark pull request #17633: [SPARK-20331][SQL] Enhanced Hive partition prunin...

2017-04-26 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17633#discussion_r113566732 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -589,18 +590,34 @@ private[client] class Shim_v0_13 extends

[GitHub] spark pull request #17749: [SPARK-20450] [SQL] Unexpected first-query schema...

2017-04-24 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/17749 [SPARK-20450] [SQL] Unexpected first-query schema inference cost with 2.1.1 ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-19611 fixes

[GitHub] spark issue #15306: [SPARK-17740] Spark tests should mock / interpose HDFS t...

2017-06-12 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/15306 Hm, we could move the actual throw to the afterAll(), which would cause a suite abort instead but presumably leave the test errors intact.

[GitHub] spark pull request #18070: [SPARK-20713][Spark Core] Convert CommitDenied to...

2017-05-25 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/18070#discussion_r118628471 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -338,6 +340,9 @@ private[spark] class Executor( metricsSystem

[GitHub] spark issue #21185: [SPARK-23894][CORE][SQL] Defensively clear ActiveSession...

2018-04-30 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/21185 This makes sense to me. It would be slightly better to clear it where the session is getting leaked through threads, but if that's hard then this looks good

[GitHub] spark issue #21934: [SPARK-24951][SQL] Table valued functions should throw A...

2018-07-31 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/21934 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...

2018-04-05 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/20971 retest this please

[GitHub] spark pull request #21058: [SPARK-23971] Should not leak Spark sessions acro...

2018-04-12 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/21058 [SPARK-23971] Should not leak Spark sessions across test suites ## What changes were proposed in this pull request? Many suites currently leak Spark sessions (sometimes with stopped

[GitHub] spark issue #21058: [SPARK-23971] Should not leak Spark sessions across test...

2018-04-12 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/21058 This is a followup to https://github.com/apache/spark/pull/20971 @gatorsmile

[GitHub] spark pull request #20971: [SPARK-23809][SQL][backport] Active SparkSession ...

2018-04-09 Thread ericl
Github user ericl closed the pull request at: https://github.com/apache/spark/pull/20971
