[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 yes, correct --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19086#discussion_r136747432 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -569,46 +569,51 @@ class SessionCatalog

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile I updated; let me know if there are still unresolved comments. Thanks again for the review.

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19086#discussion_r136719337 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -502,17 +502,16 @@ private[spark] class HiveExternalCatalog

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-09-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 Thanks for the notification. Actually, we implement the same logic as Hive, though there's a bug ...

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-08-31 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 Thanks, I will refine soon.

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile Thanks for taking the time to look at this. I updated the description.

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 Thank you so much !

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-08-30 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19086 [SPARK-21874][SQL] Support changing database when rename table. ## What changes were proposed in this pull request? Support changing database of table by `alter table dbA.XXX rename

[GitHub] spark issue #18713: [SPARK-21509][SQL] Add a config to enable adaptive query...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18713 cc @cenyuhai As we discussed offline, you may be interested in this.

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 @gatorsmile Could you please explain why the value of `grouping_id()` generated in Spark is different from `grouping__id` in Hive? Is this by design? A lot of our users

[GitHub] spark pull request #18866: [SPARK-21649][SQL] Support writing data into hive...

2017-08-22 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18866

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-22 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 @cloud-fan Thanks for the reply. It looks like #19001 continues this work and is more comprehensive. I will close this PR for now.

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 @cenyuhai Are you still working on this? Could you please fix the test?

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 @cloud-fan Would you give some advice on this, so I know whether I'm heading in the right direction? I can keep working on it :)

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 Jenkins, retest this please.

[GitHub] spark issue #18866: [WIP][SPARK-21649][SQL] Support writing data into hive b...

2017-08-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 In the current change: 1. `ClusteredDistribution` becomes `ClusteredDistribution(clustering: Seq[Expression], clustersOpt: Option[Int] = None, useHiveHash: Boolean = false)` -- a) number

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 cc @cloud-fan Would you mind giving some comments? I can keep working on this :)

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 @viirya Please take another look when you have time. I've already updated :)

[GitHub] spark pull request #18866: [SPARK-21649][SQL] Support writing data into hive...

2017-08-07 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18866#discussion_r131682897 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -534,4 +534,29 @@ class InsertIntoHiveTableSuite

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 Jenkins, retest this please.

[GitHub] spark pull request #18866: [SPARK-21649][SQL] Support writing data into hive...

2017-08-07 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18866#discussion_r131607680 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -262,7 +262,12 @@ case class

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 I added the unit test with reference to (https://github.com/apache/hive/blob/branch-1/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractBucketJoinProc.java#L393). Hive will sort bucket files

[GitHub] spark pull request #18866: [SPARK-21649][SQL] Support writing data into hive...

2017-08-07 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18866 [SPARK-21649][SQL] Support writing data into hive bucket table. ## What changes were proposed in this pull request? Support writing hive bucket table. Spark internally uses Murmur3Hash

[GitHub] spark pull request #18713: [SPARK-21509][SQL] Add a config to enable adaptiv...

2017-07-27 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18713

[GitHub] spark issue #18713: [SPARK-21509][SQL] Add a config to enable adaptive query...

2017-07-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18713 Ok, I will close this for now.

[GitHub] spark issue #18735: [SPARK-21530] Update description of spark.shuffle.maxChu...

2017-07-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18735 @tgravescs Thanks, I should be more careful :)

[GitHub] spark pull request #18735: [SPARK-21530] Update description of spark.shuffle...

2017-07-26 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18735#discussion_r129608114 --- Diff: docs/configuration.md --- @@ -636,6 +636,8 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark issue #18735: [SPARK-21530] Update description of spark.shuffle.maxChu...

2017-07-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18735 cc @tgravescs @cloud-fan

[GitHub] spark pull request #18735: [SPARK-21530] Update description of spark.shuffle...

2017-07-25 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18735 [SPARK-21530] Update description of spark.shuffle.maxChunksBeingTransferred. ## What changes were proposed in this pull request? Update the description

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks for help. > I think we should expand the description of the config to say what happens when the limit is hit. Since its not using real flow control a user might

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Thanks for merging !

[GitHub] spark issue #18713: [SPARK-21509][SQL] Add a config to enable adaptive query...

2017-07-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18713 cc @cloud-fan @jiangxb1987

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Jenkins, retest this please.

[GitHub] spark pull request #18713: [SPARK-21509][SQL] Add a config to enable adaptiv...

2017-07-22 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18713 [SPARK-21509][SQL] Add a config to enable adaptive query execution only for the last que… ## What changes were proposed in this pull request? Feature of adaptive query execution is a good

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128794455 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/StreamManager.java --- @@ -83,4 +83,16 @@ public void connectionTerminated

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @cloud-fan I understand your concern. A `TransportRequestHandler` is for a channel/connection. We want to track the sending chunks of all connections. So I guess we must have a manager

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-20 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128519475 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java --- @@ -145,7 +172,12 @@ private void

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-20 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128498015 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -257,4 +257,7 @@ public Properties cryptoConf

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-20 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128497296 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java --- @@ -145,7 +172,12 @@ private void

[GitHub] spark pull request #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame t...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18634#discussion_r128227194 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLWindowFunctionSuite.scala --- @@ -356,6 +356,46 @@ class SQLWindowFunctionSuite

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128217700 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -25,6 +25,9 @@ import

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128217450 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -139,6 +153,32 @@ public void

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128217046 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -53,9 +56,13

[GitHub] spark pull request #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame t...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18634#discussion_r128215724 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLWindowFunctionSuite.scala --- @@ -359,37 +359,41 @@ class SQLWindowFunctionSuite

[GitHub] spark pull request #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame t...

2017-07-18 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18634#discussion_r128142152 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLWindowFunctionSuite.scala --- @@ -356,6 +356,42 @@ class SQLWindowFunctionSuite

[GitHub] spark issue #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame to avoid...

2017-07-17 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18634 @cloud-fan @jiangxb1987 Thanks for the help! I will refine and post the result of the manual test later today :)

[GitHub] spark issue #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame to avoid...

2017-07-17 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18634 @jiangxb1987 Thanks a lot for quick reply ! > One concern is the test case don't reflect the improvement of this change. Yes, there is no unit test for `WindowFunctionFrame`

[GitHub] spark issue #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame to avoid...

2017-07-17 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18634 cc @cloud-fan @jiangxb1987

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks a lot for helping with this PR! I changed this PR. In the current change: the shuffle server will track the number of chunks being transferred. Connection will be closed
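
The chunk tracking mentioned in the comment above can be illustrated with a minimal sketch. It is not the code from the pull request: `ChunkTransferTracker` and its method names are hypothetical, and the assumption that the limit corresponds to a setting such as `spark.shuffle.maxChunksBeingTransferred` is ours. The snippet is truncated, so the exact condition under which a connection is closed is also an assumption (here: the in-flight count is strictly greater than the limit).

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration; not a class from the Spark code base.
final class ChunkTransferTracker {
    // Assumed to mirror a limit such as spark.shuffle.maxChunksBeingTransferred.
    private final long maxChunksBeingTransferred;

    // One counter shared by all connections, so updates must be thread-safe.
    private final AtomicLong chunksBeingTransferred = new AtomicLong(0);

    ChunkTransferTracker(long maxChunksBeingTransferred) {
        this.maxChunksBeingTransferred = maxChunksBeingTransferred;
    }

    // Call when the server starts writing a chunk to any connection.
    void chunkBeingSent() {
        chunksBeingTransferred.incrementAndGet();
    }

    // Call when the write completes, whether it succeeded or failed.
    void chunkSent() {
        chunksBeingTransferred.decrementAndGet();
    }

    // Assumed policy: the caller refuses new requests while this returns true.
    boolean isOverLimit() {
        return chunksBeingTransferred.get() > maxChunksBeingTransferred;
    }

    long inFlight() {
        return chunksBeingTransferred.get();
    }
}
```

A request handler could check `isOverLimit()` before serving a new chunk request and close the channel when it returns true. Keeping a single counter outside the per-connection handler is what lets the limit apply across all connections.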

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18487 With `maxReqsInFlight` and `maxBytesInFlight` it is hard to control the number of blocks in a single request. When the number of maps is very high, this change can alleviate the pressure on the shuffle server. @dhruve

[GitHub] spark issue #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame to avoid...

2017-07-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18634 Jenkins, retest this please.

[GitHub] spark issue #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame to avoid...

2017-07-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18634 retest please

[GitHub] spark pull request #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame t...

2017-07-14 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18634 [SPARK-21414] Refine SlidingWindowFunctionFrame to avoid OOM. ## What changes were proposed in this pull request? In `SlidingWindowFunctionFrame`, it is now adding all rows to the buffer

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks a lot for advice. > the flow control part should allow everyone to start fetching without rejecting a bunch, especially if the network can't push it out that fast any

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 I think it could be more efficient to do the control on the shuffle service side.

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Previously I said that I have 200k+ connections to one shuffle service. I'm sorry, that information was wrong. It turns out that each of our `NodeManager`s has two auxiliary shuffle

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 > Is this all a single application? No, it's a data warehouse; there are thousands of ETL jobs. > You say 6000 nodes with 64 executors on each host, how many cores per executor?

[GitHub] spark pull request #18405: [SPARK-21194][SQL] Fail the putNullmethod when co...

2017-07-11 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18405

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-07-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Sure, I will close this pr for now :)

[GitHub] spark pull request #18405: [SPARK-21194][SQL] Fail the putNullmethod when co...

2017-07-10 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18405#discussion_r126584992 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala --- @@ -758,6 +758,35 @@ class ColumnarBatchSuite

[GitHub] spark issue #18593: [SPARK-21369][Core]Don't use Scala Tuple2 in common/netw...

2017-07-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18593 LGTM

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-07-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 @cloud-fan Any more comments on this :) ? If this is too trivial, should I close it?

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks a lot for your advice :) very helpful. I will try more on this.

[GitHub] spark issue #18541: [SPARK-21315][SQL]Skip some spill files when generateIte...

2017-07-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18541 @jiangxb1987 Thanks for approving! I've already updated the PR description.

[GitHub] spark issue #18565: [SPARK-21342] Fix DownloadCallback to work well with Ret...

2017-07-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18565 Jenkins, retest this please

[GitHub] spark pull request #18541: [SPARK-21315][SQL]Skip some spill files when gene...

2017-07-10 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18541#discussion_r126342950 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java --- @@ -589,29 +589,51 @@ public long getKeyPrefix

[GitHub] spark pull request #18565: [SPARK-21342] Fix DownloadCallback to work well w...

2017-07-09 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18565#discussion_r126306193 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -119,9 +132,9 @@ public void onSuccess

[GitHub] spark pull request #18565: [SPARK-21342] Fix DownloadCallback to work well w...

2017-07-09 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18565#discussion_r126306207 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -58,29 +59,41 @@ private final

[GitHub] spark issue #18541: [SPARK-21315][SQL]Skip some spill files when generateIte...

2017-07-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18541 @cloud-fan Would you mind taking a look when you have time :)

[GitHub] spark pull request #18541: [SPARK-21315][SQL]Skip some spill files when gene...

2017-07-09 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18541#discussion_r126299265 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java --- @@ -589,29 +589,50 @@ public long getKeyPrefix

[GitHub] spark issue #18565: [SPARK-21342] Fix DownloadCallback to work well with Ret...

2017-07-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18565 In current change: 1. I'm using `java.util.function.Supplier` instead of `TmpFileCreater` 2. Pass `shuffleBlockFetcherIteratorIsZombie` from `ShuffleBlockFetcherIterator
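
To illustrate the first point in the comment above, here is a minimal sketch of passing a `java.util.function.Supplier<File>` where a dedicated file-creating interface might otherwise be used. It is not the code from the pull request: `DownloadTargetAllocator`, `systemTempFiles`, and `nextTarget` are hypothetical names, and tracking the allocated files for later cleanup is an assumption about why a fetcher would want to own file creation.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical class for illustration; not Spark's OneForOneBlockFetcher.
final class DownloadTargetAllocator {
    private final Supplier<File> tempFileSupplier;
    private final List<File> allocated = new ArrayList<>();

    DownloadTargetAllocator(Supplier<File> tempFileSupplier) {
        this.tempFileSupplier = tempFileSupplier;
    }

    // Supplier.get() cannot throw checked exceptions, so IOException is wrapped.
    static Supplier<File> systemTempFiles() {
        return () -> {
            try {
                return File.createTempFile("fetch-sketch", ".tmp");
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    // Asks the supplier for a fresh file and remembers it for later cleanup.
    File nextTarget() {
        File target = tempFileSupplier.get();
        allocated.add(target);
        return target;
    }

    // Returns a copy so callers cannot mutate the internal list.
    List<File> allocatedFiles() {
        return new ArrayList<>(allocated);
    }
}
```

Using the standard functional interface means the caller can pass a lambda or method reference, and no extra single-method interface has to be maintained. The trade-off is that checked exceptions must be wrapped, as shown in `systemTempFiles`.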

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Thanks for the reply. I will figure out what I can do for this :)

[GitHub] spark issue #18565: [SPARK-21342] Fix DownloadCallback to work well with Ret...

2017-07-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18565 There are some corner cases because of creating and deleting in both `ShuffleBlockFetcherIterator` and `OneForOneBlockFetcher`. It seems no need to pass the `shuffleFiles` from

[GitHub] spark pull request #18565: [SPARK-21342] Fix DownloadCallback to work well w...

2017-07-08 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18565#discussion_r126277863 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -151,15 +152,27 @@ private void

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks a lot for reviewing this PR so thoroughly. I think I'm making a stupid mistake. May I ask how to determine the number of connections? I'm just counting

[GitHub] spark issue #18566: Refine the document for spark.reducer.maxReqSizeShuffleT...

2017-07-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18566 I didn't include this config in configuration.md. Do I need to? cc @zsxwing @cloud-fan

[GitHub] spark pull request #18566: Refine the document for spark.reducer.maxReqSizeS...

2017-07-07 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18566 Refine the document for spark.reducer.maxReqSizeShuffleToMem. ## What changes were proposed in this pull request? In current code, reducer can break the old shuffle service when

[GitHub] spark pull request #18565: [SPARK-21342] Fix DownloadCallback to work well w...

2017-07-07 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18565#discussion_r126178968 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -151,15 +152,27 @@ private void

[GitHub] spark issue #18565: [SPARK-21342] Fix DownloadCallback to work well with Ret...

2017-07-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18565 cc @zsxwing @cloud-fan @jiangxb1987

[GitHub] spark pull request #18565: [SPARK-21342] Fix DownloadCallback to work well w...

2017-07-07 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18565 [SPARK-21342] Fix DownloadCallback to work well with RetryingBlockFetcher. ## What changes were proposed in this pull request? When `RetryingBlockFetcher` retries fetching blocks

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 We didn't change `spark.shuffle.io.numConnectionsPerPeer`. Our biggest cluster has 6000 `NodeManager`s. There are 50 executors running on the same host at the same time.

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @cloud-fan To be honest, it's a little bit tricky to reject "open blocks" by closing the connection. The following reconnection will surely have extra cost. In current change we a

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Analyzing the heap dump, there are 200K+ connections and 3.5M blocks (`FileSegmentManagedBuffer`) being fetched. Yes, flow control is a good idea. But I still think it makes much sense to control

[GitHub] spark pull request #18482: [SPARK-21262] Stop sending 'stream request' when ...

2017-07-06 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18482

[GitHub] spark issue #18482: [SPARK-21262] Stop sending 'stream request' when shuffle...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18482 Sure, I will update the document soon.

[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r125917832 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -321,6 +321,16 @@ package object config { .intConf

[GitHub] spark issue #18541: [SPARK-21315][SQL]Skip some spill files when generateIte...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18541 @hvanhovell I refined according to your comments. Please take another look when you have time :)

[GitHub] spark issue #18541: [SPARK-21315][SQL]Skip some spill files when generateIte...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18541 Did a small test: 2000200 rows in the `UnsafeExternalSorter`: 2 spill files(each contains 100 rows) and `inMemSorter` contains 200 rows. I want to target the iterator to index=201
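The idea of skipping whole spill files when an iterator should start at a given row index can be sketched as below. This is only an illustration, not the code from the pull request: the class `SpillSkipSketch`, its `locate` method, and the `Position` holder are hypothetical names, and the row counts are illustrative. It assumes the row count of every spill file (and of the in-memory part) is known up front, so that a source lying entirely before `startIndex` never has to be opened.

```java
// Hypothetical sketch: find where an iterator should start reading when
// rows are spread over several spill files followed by an in-memory buffer.
final class SpillSkipSketch {

    /** Which source to open first and how many rows to skip inside it. */
    static final class Position {
        final int sourceIndex;
        final long rowsToSkip;

        Position(int sourceIndex, long rowsToSkip) {
            this.sourceIndex = sourceIndex;
            this.rowsToSkip = rowsToSkip;
        }
    }

    /**
     * rowsPerSource lists the row count of each source in read order
     * (spill files first, the in-memory part last).
     */
    static Position locate(long[] rowsPerSource, long startIndex) {
        if (startIndex < 0) {
            throw new IllegalArgumentException("startIndex must be >= 0");
        }
        long remaining = startIndex;
        for (int i = 0; i < rowsPerSource.length; i++) {
            if (remaining < rowsPerSource[i]) {
                // The start row lives in this source; earlier ones are skipped unread.
                return new Position(i, remaining);
            }
            // The whole source lies before startIndex: skip it without reading rows.
            remaining -= rowsPerSource[i];
        }
        throw new IndexOutOfBoundsException("startIndex beyond total row count");
    }
}
```

With two spill files of 100 rows each followed by 200 in-memory rows, `locate(new long[] {100, 100, 200}, 201)` returns source 2 with 1 row to skip, so neither spill file has to be read.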

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r125877439 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -126,4 +150,38 @@ private void

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 I removed the `OpenBlocksFailed` for compatibility. In the current change, the server rejects the "open blocks" request by closing the connection. Then `RetryingBlockFetcher` will retry.
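The client-side effect of a server that rejects a request by closing the connection can be sketched as a bounded retry loop. This is a generic illustration, not Spark's `RetryingBlockFetcher`: the class `RetrySketch` and the method `fetchWithRetry` are hypothetical, and a closed connection is modelled as an `UncheckedIOException` thrown by the attempt. A real fetcher would also wait between attempts.

```java
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical sketch: retry an operation when the connection is closed under it.
final class RetrySketch {

    /** Runs the attempt once, then up to maxRetries more times on I/O failure. */
    static <T> T fetchWithRetry(Supplier<T> attempt, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        UncheckedIOException last = null;
        for (int i = 0; i <= maxRetries; i++) {
            try {
                return attempt.get();
            } catch (UncheckedIOException e) {
                // Treated as "the server closed the connection": try again.
                last = e;
            }
        }
        // Every attempt failed; surface the last failure to the caller.
        throw last;
    }
}
```

Because the rejection looks like an ordinary connection failure, a client that already retries on I/O errors needs no new message type to cope with it, which matches the compatibility argument in the comment above.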

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs As in the screenshot, we have tons of `ChunkOutboundBuffer$Entry`. Yes we are using `transferTo`. Netty will put the `Entry`(containing reference to the `MessageWithHeader

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs I don't think it hurts that much. In the current change, the new client is compatible with both the old and the new shuffle service. In our clusters, we always upgrade the client first and then server

[GitHub] spark pull request #18541: [SPARK-21315][SQL]Skip some spill files when gene...

2017-07-05 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18541 [SPARK-21315][SQL]Skip some spill files when generateIterator(startIndex) in ExternalAppendOnlyUnsafeRowArray. ## What changes were proposed in this pull request? In current

[GitHub] spark issue #18482: [SPARK-21262] Stop sending 'stream request' when shuffle...

2017-07-04 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18482 Very gentle ping @zsxwing, what do you think about this idea?

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-04 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Very gentle ping @zsxwing, would you mind commenting on this?

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-06-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Yes, there is a change. The server side may return `OpenBlocksFailed` for the "open blocks" request, which means that an old client is not compatible with a new server. Is that acceptable?

[GitHub] spark issue #18482: [SPARK-21262] Stop sending 'stream request' when shuffle...

2017-06-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18482 In the current change, it is fetching big chunks into memory, then writing them to disk, and then releasing the memory. I made this change for the reasons below: 1. The client shouldn't break old shuffle

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-30 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r125068210 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -257,4 +257,31 @@ public Properties cryptoConf

[GitHub] spark issue #18482: [SPARK-21262][WIP] Stop sending 'stream request' when sh...

2017-06-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18482 cc @zsxwing @cloud-fan @jiangxb1987
