[GitHub] spark issue #20823: [SPARK-23674] Add Spark ML Listener for Tracking ML Pipe...

2018-10-16 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/20823 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22550: [SPARK-25501] Kafka delegation token support

2018-10-01 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/22550 Closing this one since another PR is working on this.

[GitHub] spark pull request #22550: [SPARK-25501] Kafka delegation token support

2018-10-01 Thread merlintang
Github user merlintang closed the pull request at: https://github.com/apache/spark/pull/22550

[GitHub] spark issue #22598: [SPARK-25501][SS] Add kafka delegation token support.

2018-10-01 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/22598 @gaborgsomogyi thanks for your PR. I am going through the details and testing on my local machine.

[GitHub] spark pull request #22550: [SPARK-25501] Kafka delegation token support

2018-09-25 Thread merlintang
GitHub user merlintang opened a pull request: https://github.com/apache/spark/pull/22550 [SPARK-25501] Kafka delegation token support ## What changes were proposed in this pull request? Kafka is going to support delegation tokens; Spark needs to read the delegation token

[GitHub] spark issue #21455: [SPARK-24093][DStream][Minor]Make some fields of KafkaSt...

2018-06-26 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/21455 @gabor. These fields are important for us to understand the Spark Kafka streaming data, such as the topic name. We can use this information to track the system status. On Tue, Jun

[GitHub] spark issue #20823: [SPARK-23674] Add Spark ML Listener for Tracking ML Pipe...

2018-06-11 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/20823 @holdenk can you look at this PR? thanks in advance.

[GitHub] spark issue #21455: [SPARK-24093][DStream][Minor]Make some fields of KafkaSt...

2018-06-11 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/21455 @jerryshao Actually, we can not use reflection to get this field information.

[GitHub] spark pull request #21504: [SPARK-24479][SS] Added config for registering st...

2018-06-07 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/21504#discussion_r193911087 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala --- @@ -55,6 +56,11 @@ class StreamingQueryManager private

[GitHub] spark issue #20823: [SPARK-23674] Add Spark ML Listener for Tracking ML Pipe...

2018-06-07 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/20823 @jmwdpk can you update this PR, since there is a conflict? I have updated this PR: https://github.com/merlintang/spark/commits/SPARK-23674

[GitHub] spark issue #21455: [SPARK-24093][DStream][Minor]Make some fields of KafkaSt...

2018-05-29 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/21455 @jerryshao can you review this minor update ?

[GitHub] spark pull request #21455: [SPARK-24093][DStream][Minor]Make some fields of ...

2018-05-29 Thread merlintang
GitHub user merlintang opened a pull request: https://github.com/apache/spark/pull/21455 [SPARK-24093][DStream][Minor]Make some fields of KafkaStreamWriter/InternalRowMicroBatchWriter visible to outside of the classes ## What changes were proposed in this pull

[GitHub] spark issue #19885: [SPARK-22587] Spark job fails if fs.defaultFS and applic...

2018-01-17 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/19885 @jerryshao can you backport this to branch 2.2 as well. thanks

[GitHub] spark issue #19885: [SPARK-22587] Spark job fails if fs.defaultFS and applic...

2018-01-10 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/19885 @jerryshao and @steveloughran thanks for your comments and review.

[GitHub] spark issue #19885: [SPARK-22587] Spark job fails if fs.defaultFS and applic...

2018-01-09 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/19885 @steveloughran can you review the added system test cases?

[GitHub] spark issue #19885: [SPARK-22587] Spark job fails if fs.defaultFS and applic...

2018-01-02 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/19885 My local test is OK. I will set up a system test and update this soon. Sorry about the delay. On Tue, Jan 2, 2018 at 3:42 PM, Marcelo Vanzin <notificati...@github.com>

[GitHub] spark issue #19885: [SPARK-22587] Spark job fails if fs.defaultFS and applic...

2017-12-14 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/19885 I am sorry for the delay in adding the test function; I will update it soon. On Thu, Dec 14, 2017 at 12:55 PM, UCB AMPLab <notificati...@github.com> wrote: > Can one of t

[GitHub] spark issue #19885: [SPARK-22587] Spark job fails if fs.defaultFS and applic...

2017-12-06 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/19885 I have added a test case for the URI comparison based on Steve's comments. I tested this in my local VM, and it passes. Meanwhile, for the hdfs://namenode1/path1 hdfs

[GitHub] spark issue #19885: [SPARK-22587] Spark job fails if fs.defaultFS and applic...

2017-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/19885 @jerryshao yes, hdfs://us...@nn1.com:8020 and hdfs://us...@nn1.com:8020 would be considered two filesystems, since the authority information should be taken into consideration. that is why need
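
The point about authority information can be illustrated with a minimal sketch. This is not the code from the pull request; `same_filesystem` is a hypothetical Python helper, and the user names `user1` and `user2` are made up because the archive elides the real ones. It assumes two URIs refer to the same filesystem only when scheme, user info, host, and port all match, and it does not substitute default ports.

```python
from urllib.parse import urlsplit


def same_filesystem(uri_a: str, uri_b: str) -> bool:
    """Hypothetical check: compare scheme, user info, host, and port.

    Paths are ignored. Host names are compared case-insensitively
    (urlsplit lowercases them); user names are compared exactly.
    """
    a, b = urlsplit(uri_a), urlsplit(uri_b)
    if a.scheme != b.scheme:
        return False
    if (a.username or "") != (b.username or ""):
        return False
    if (a.hostname or "") != (b.hostname or ""):
        return False
    # A missing port (None) only matches another missing port here.
    return a.port == b.port


# Same host and port, different (made-up) users: treated as different.
print(same_filesystem("hdfs://user1@nn1.com:8020/a", "hdfs://user2@nn1.com:8020/b"))  # False
# Same user, host differs only by case: treated as the same.
print(same_filesystem("hdfs://user1@nn1.com:8020/a", "hdfs://user1@NN1.com:8020/b"))  # True
```

A real implementation would likely also need to resolve default ports and host aliases, which this sketch leaves out.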

[GitHub] spark pull request #19885: [SPARK-22587] Spark job fails if fs.defaultFS and...

2017-12-04 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/19885#discussion_r154827513 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -1428,6 +1428,12 @@ private object Client extends

[GitHub] spark issue #19885: [SPARK-22587] Spark job fails if fs.defaultFS and applic...

2017-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/19885 @jerryshao can you review this patch?

[GitHub] spark pull request #19885: [SPARK-22587] Spark job fails if fs.defaultFS and...

2017-12-04 Thread merlintang
GitHub user merlintang opened a pull request: https://github.com/apache/spark/pull/19885 [SPARK-22587] Spark job fails if fs.defaultFS and application jar are different url ## What changes were proposed in this pull request? Two filesystems comparing does

[GitHub] spark issue #16165: [SPARK-8617] [WEBUI] HistoryServer: Include in-progress ...

2017-04-05 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16165 @markhamstra Thanks all. btw: what if there are many redundant inprogress files in the disk and impact the system performance? --- If your project is set up for it, you can reply

[GitHub] spark issue #16165: [SPARK-8617] [WEBUI] HistoryServer: Include in-progress ...

2017-04-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16165 @vanzin sorry, I mean the 2.1.1

[GitHub] spark issue #16165: [SPARK-8617] [WEBUI] HistoryServer: Include in-progress ...

2017-04-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16165 should we backport this into 2.1? @vanzin

[GitHub] spark issue #17092: [SPARK-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-28 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/17092 @Yunni I tested this patch locally and it works, but I have one idea to improve it. We can discuss it in another ticket.

[GitHub] spark issue #17092: [SPARK-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-28 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/17092 @Yunni ok, let us discuss the further optimization step in another ticket. The current patch LGTM.

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-24 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16965 @Yunni thanks. The L I mentioned is the number of hash tables. This way, the memory usage would be O(L*N). The approximate NN search cost in one partition is O(L*N'), where N

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-24 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16965 @Yunni Ok, if we want to move this quicker, we can keep the current AND-OR implementation. (2)(3) you mention that you explode the inner table (dataset). Does it mean for each tuple

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-24 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16965 @Yunni Yes, we can use AND-OR to increase the probability by using more numHashTables and numHashFunctions. For further user extension, if users have a hash function with lower

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-23 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16965 @Yunni I agree with you that the current NN search and Join use AND-OR. We can discuss how to use OR-AND for those two searches as well. For the OR-AND option

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-22 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16965 It seems this patch provides AND-OR amplification. Can we provide an option for users to choose OR-AND amplification as well?
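
For readers unfamiliar with the two constructions discussed in this thread, the sketch below uses the standard textbook formulas for LSH amplification. It is not taken from the pull request. It assumes independent hash functions that each collide with probability `p`; the names `k` (functions per group) and `L` (number of groups) are illustrative, and how they map onto Spark's `numHashFunctions` and `numHashTables` is not confirmed by these messages.

```python
def and_or_probability(p: float, k: int, L: int) -> float:
    """AND within each group of k functions, then OR across L groups.

    A pair becomes a candidate if, in at least one group, all k
    hash functions collide: 1 - (1 - p**k)**L.
    """
    return 1.0 - (1.0 - p ** k) ** L


def or_and_probability(p: float, k: int, L: int) -> float:
    """OR within each group of k functions, then AND across L groups.

    A pair becomes a candidate if, in every group, at least one of
    the k hash functions collides: (1 - (1 - p)**k)**L.
    """
    return (1.0 - (1.0 - p) ** k) ** L


# With p = 0.5, k = 2, L = 2 the two constructions differ.
print(and_or_probability(0.5, 2, 2))  # 0.4375
print(or_and_probability(0.5, 2, 2))  # 0.5625
```

For the same `p`, `k`, and `L`, the OR-AND probability is never lower than the AND-OR one, so the choice trades false negatives against false positives.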

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-06 Thread merlintang
Github user merlintang closed the pull request at: https://github.com/apache/spark/pull/15819

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2017-01-06 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 Many thanks, Xiao. I learnt lots.

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-05 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r94906952 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -216,5 +219,37 @@ class VersionsSuite extends SparkFunSuite

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-05 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r94727237 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -216,5 +219,37 @@ class VersionsSuite extends SparkFunSuite

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-05 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r94727256 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -216,5 +219,37 @@ class VersionsSuite extends SparkFunSuite

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-05 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r94727246 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -54,6 +63,63 @@ case class InsertIntoHiveTable

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2017-01-03 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @gatorsmile can you retest the patch so that we can merge? Sorry to ping you multiple times; several users are asking about this.

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-02 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r94361979 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -216,5 +218,37 @@ class VersionsSuite extends SparkFunSuite

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-02 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r94359244 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -216,5 +218,37 @@ class VersionsSuite extends SparkFunSuite

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-02 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r94351849 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -54,6 +63,63 @@ case class InsertIntoHiveTable

[GitHub] spark pull request #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory ...

2017-01-02 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r94351862 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -216,5 +218,37 @@ class VersionsSuite extends SparkFunSuite

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2017-01-01 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @gatorsmile I have backported the test case in #16339 with a small modification, because the "INSERT OVERWRITE TABLE tab SELECT '$i'" will bring the issue from the Hive side e.

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-29 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 yes, let me backport the test cases for checking the staging file. On Thu, Dec 29, 2016 at 10:11 PM, Xiao Li <notificati...@github.com> wrote: > Is that possible to

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-29 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 Thanks, Wenchen. I have backported the code of #16339 here and tested it locally. Can you review and verify? On Sun, Dec 25, 2016 at 11:04 PM, Wenchen Fan <notific

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-20 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @gatorsmile Great! thanks so much, because I was pinged multiple times for this bug. :)

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-20 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @cloud-fan @gatorsmile I have backported the code from #16134. Can you verify and backport this to Spark 1.6.x?

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-19 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @gatorsmile one more customer is running into this issue in Spark 1.6.x. I backported the code from #16134 here and tested it manually. Please verify.

[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-15 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16134 This patch is related to the patch #15819 for Spark 1.6. In #15819, I can add the code from this patch (#16134) now; then we can fix the staging file issues in Spark 1.6.x

[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-15 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16134 +1 backport to spark 1.6.x On Thu, Dec 15, 2016 at 8:14 AM, Xiao Li <notificati...@github.com> wrote: > The staging directory and files will not be removed when user

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-14 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @cloud-fan @gatorsmile this patch is related to #16134, It seems #16134 would be merged soon. Meanwhile, should we backport #16104 into 1.6.x? please advise. or else, I just backport #16134

[GitHub] spark pull request #16134: [SPARK-18703] [SQL] Drop Staging Directories and ...

2016-12-13 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/16134#discussion_r92244682 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -328,6 +332,15 @@ case class InsertIntoHiveTable

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-13 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 Great, once the #16134 <https://github.com/apache/spark/pull/16134> is done, we can backport them together. On Tue, Dec 13, 2016 at 12:18 AM, Wenchen Fan <notificati...@g

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-12 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @gatorsmile what is the status of this patch? This is backported code, so can you merge this patch into 1.6.x? More than one user is running into this issue in Spark 1.6.x.

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-06 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 do you exit the spark shell ? I have tested on this, and this staging file would be removed after we exit the spark shell under spark 2.0.x. meanwhile, the staging file are used

[GitHub] spark issue #13670: [SPARK-15951] Change Executors Page to use datatables to...

2016-12-06 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/13670 @kishorvpatil you provided the function allexecutors, which is used to return the dead and active executor information. For the document http://spark.apache.org/docs/latest

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @cloud-fan this is related to this PR in the 2.0.x https://github.com/apache/spark/pull/12770

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 Ok. On Sun, Dec 4, 2016 at 6:25 PM, Reynold Xin <notificati...@github.com> wrote: > We have stopped making new releases for 1.5 so it makes no sense to &

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 this bug is related to 1.5.x as well as 1.6.x. please backport to 1.5.x as well. On Sun, Dec 4, 2016 at 6:20 PM, Reynold Xin <notificati...@github.com>

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 it is updated. On Sun, Dec 4, 2016 at 11:23 AM, Xiao Li <notificati...@github.com> wrote: > @merlintang <https://github.com/merlintang> Could you please add

[GitHub] spark issue #15819: [SPARK-18372][SQL].Staging directory fail to be removed

2016-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 yes, exactly. This patch is only for Spark 1.x. What I proposed here is that we need to use the code of Spark 2.0.x to fix the bug in Spark 1.x. You can see this message from my previous

[GitHub] spark pull request #15819: [SPARK-18372][SQL].Staging directory fail to be r...

2016-11-19 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r88778830 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -54,6 +61,61 @@ case class InsertIntoHiveTable

[GitHub] spark pull request #15819: [SPARK-18372][SQL].Staging directory fail to be r...

2016-11-19 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r88778781 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -54,6 +61,61 @@ case class InsertIntoHiveTable

[GitHub] spark issue #15819: [SPARK-18372][SQL].Staging directory fail to be removed

2016-11-16 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 @cloud-fan @rxin can you review this code? Several customers are complaining about the empty staging files that Hive generates in HDFS.

[GitHub] spark pull request #15819: [SPARK-18372][SQL].Staging directory fail to be r...

2016-11-16 Thread merlintang
Github user merlintang commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r88345264 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -54,6 +61,61 @@ case class InsertIntoHiveTable

[GitHub] spark issue #15819: [SPARK-18372][SQL].Staging directory fail to be removed

2016-11-09 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 Actually, I do not have a unit test, but the code listed below (same as we posted in the JIRA) can reproduce this bug. The related code would be this way: val sqlContext = new

[GitHub] spark pull request #15819: [SPARK-18372][SQL].Staging directory fail to be r...

2016-11-08 Thread merlintang
GitHub user merlintang opened a pull request: https://github.com/apache/spark/pull/15819 [SPARK-18372][SQL].Staging directory fail to be removed ## What changes were proposed in this pull request? This fix is related to the bug: https://issues.apache.org/jira/browse/SPARK