[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r236952729 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed

[GitHub] spark pull request #23000: [SPARK-26002][SQL] Fix day of year calculation fo...

2018-11-19 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/23000#discussion_r234819827 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala --- @@ -410,6 +410,30 @@ class DateTimeUtilsSuite

[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22504 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22865: [DOC] Fix doc for spark.sql.parquet.recordLevelFi...

2018-10-28 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/22865#discussion_r228771361 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -462,7 +462,7 @@ object SQLConf { val

[GitHub] spark pull request #22865: [DOC] Fix doc for spark.sql.parquet.recordLevelFi...

2018-10-27 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/22865 [DOC] Fix doc for spark.sql.parquet.recordLevelFilter.enabled ## What changes were proposed in this pull request? Updated the doc string value

[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Yarn Client...

2018-09-28 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22504 The Py4JJavaError StackOverflow happens pretty reliably. I am guessing it's related to the change.

[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Yarn Client...

2018-09-27 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22504 retest this please

[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Yarn Client...

2018-09-27 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22504 retest this please

[GitHub] spark pull request #21950: [SPARK-24914][SQL][WIP] Add configuration to avoi...

2018-09-18 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21950#discussion_r218608537 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -1051,11 +1052,27 @@ private[hive] object

[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-09-17 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21950 I broke this :). Don't ask for a redo.

[GitHub] spark pull request #21950: [SPARK-24914][SQL][WIP] Add configuration to avoi...

2018-09-12 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21950#discussion_r217216975 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PruneFileSourcePartitionsSuite.scala --- @@ -91,4 +91,28 @@ class

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-11 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22192 retest this please

[GitHub] spark issue #22382: [SPARK-23243] [SPARK-20715][CORE][2.2] Fix RDD.repartiti...

2018-09-11 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22382 Thanks! Closing.

[GitHub] spark pull request #22382: [SPARK-23243] [SPARK-20715][CORE][2.2] Fix RDD.re...

2018-09-11 Thread bersprockets
Github user bersprockets closed the pull request at: https://github.com/apache/spark/pull/22382

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-10 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22192 retest this please. It's that old "java.lang.reflect.InvocationTargetException: null" error we've seen

[GitHub] spark issue #21899: [SPARK-24912][SQL] Don't obscure source of OOM during br...

2018-09-10 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21899 cc @jinxing64 @hvanhovell @MaxGekk

[GitHub] spark issue #22382: [SPARK-23243] [SPARK-20715][CORE][2.2] Fix RDD.repartiti...

2018-09-10 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22382 cc @cloud-fan @JoshRosen

[GitHub] spark pull request #22382: [SPARK-23243] [SPARK-20715][CORE][2.2] Fix RDD.re...

2018-09-10 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/22382 [SPARK-23243] [SPARK-20715][CORE][2.2] Fix RDD.repartition() data correctness issue ## What changes were proposed in this pull request? Back port of #22354 and #17955 to 2.2 (#22354

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22192 retest this please.

[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-29 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22209 retest this please

[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-28 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22209 Looks like test failed due to https://issues.apache.org/jira/browse/SPARK-23622

[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-28 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22209 retest this please

[GitHub] spark issue #22188: [SPARK-25164][SQL] Avoid rebuilding column and path list...

2018-08-27 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22188 @gatorsmile Thanks much!

[GitHub] spark issue #22188: [SPARK-25164][SQL] Avoid rebuilding column and path list...

2018-08-27 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22188 @gatorsmile >Why 2.2 only? Only that I forgot that master is already on 2.4. We should do 2.3 as well, but I haven't tested it yet. Do I need to do anything on my

[GitHub] spark issue #22188: [SPARK-25164][SQL] Avoid rebuilding column and path list...

2018-08-27 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22188 @cloud-fan @gatorsmile Should we merge this also onto 2.2? It was a clean cherry-pick for me (from master to branch-2.2), and I ran the top and bottom tests (6000 columns, 1 million rows, 67

[GitHub] spark pull request #21899: [SPARK-24912][SQL] Don't obscure source of OOM du...

2018-08-24 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21899#discussion_r212756302 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -118,12 +119,20 @@ case class

[GitHub] spark pull request #21950: [SPARK-24914][SQL][WIP] Add configuration to avoi...

2018-08-24 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21950#discussion_r212719073 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala --- @@ -76,4 +78,16 @@ private[sql

[GitHub] spark pull request #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-251...

2018-08-23 Thread bersprockets
Github user bersprockets closed the pull request at: https://github.com/apache/spark/pull/22079

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-23 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 @gatorsmile Weird, I don't see it on branch-2.2. Is that a sync issue?

[GitHub] spark issue #22188: [SPARK-25164][SQL] Avoid rebuilding column and path list...

2018-08-22 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22188 OK, I reran the tests for the lower column count cases, and the runs with the patch consistently show a tiny (1-3%) improvement compared to the master branch. So even the lower column count

[GitHub] spark issue #22188: [SPARK-25164][SQL] Avoid rebuilding column and path list...

2018-08-22 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22188 Thanks @vanzin. In my benchmark tests, the tiny degradation (0.5%) in the lower column count cases is pretty consistent, which concerns me a little. I am going to re-run those tests

[GitHub] spark pull request #22188: [SPARK-25164][SQL] Avoid rebuilding column and pa...

2018-08-22 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/22188 [SPARK-25164][SQL] Avoid rebuilding column and path list for each column in parquet reader ## What changes were proposed in this pull request? VectorizedParquetRecordReader

[GitHub] spark pull request #21899: [SPARK-24912][SQL] Don't obscure source of OOM du...

2018-08-21 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21899#discussion_r211833522 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -118,12 +119,20 @@ case class

[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-21 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22154 Re: Your build failure ('statefulOperators.scala:95: value asJava is not a member of scala.collection.immutable.Map[String,Long]). I am also seeing this in my fork on my laptop (I just

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shuffle+Re...

2018-08-21 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 @gatorsmile So I should include all the related PRs merged to master as a single PR here? Just verifying

[GitHub] spark issue #21899: [SPARK-24912][SQL] Don't obscure source of OOM during br...

2018-08-18 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21899 retest this please

[GitHub] spark pull request #21899: [SPARK-24912][SQL] Don't obscure source of OOM du...

2018-08-17 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21899#discussion_r211047556 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -118,12 +119,20 @@ case class

[GitHub] spark issue #21899: [SPARK-24912][SQL] Don't obscure source of OOM during br...

2018-08-17 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21899 retest this please

[GitHub] spark issue #21899: [SPARK-24912][SQL] Don't obscure source of OOM during br...

2018-08-17 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21899 @MaxGekk In the updated message, I left out "hash" from the term "hash relation" only because it seems the relation could

[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-08-17 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21950 retest this please.

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shuffle+Re...

2018-08-16 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 @jiangxb1987 gentle ping.

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shuffle+Re...

2018-08-15 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 Once this is merged, I will also back-port: - [[SPARK-24564][TEST] Add test suite for RecordBinaryComparator](https://github.com/apache/spark/commit

[GitHub] spark issue #22101: [SPARK-25114][Core] Fix RecordBinaryComparator when subt...

2018-08-14 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22101 Should there be a test, or do other sorting-related tests cover this indirectly?

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shuffle+Re...

2018-08-13 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 @jiangxb1987 Here are some of the differences from the original PR - I also ported the follow up PR #20426 - I ported #20088 (for SPARK-22905) to get the tests to pass. I also

[GitHub] spark pull request #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartit...

2018-08-13 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/22079#discussion_r209736691 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala --- @@ -144,7 +144,7 @@ object ChiSqSelectorModel extends Loader

[GitHub] spark issue #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on ...

2018-08-12 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 Hmmm... I somehow managed to break SparkR tests by fixing a comment. It seems to have auto-retried and broke the second time too

[GitHub] spark issue #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on ...

2018-08-12 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 retest this please

[GitHub] spark issue #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on ...

2018-08-12 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 @jiangxb1987 > We shall also include #20088 in this backport PR. I did that shortly after commenting, which allowed the tests to pass. I squashed it into the first commit,

[GitHub] spark issue #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on ...

2018-08-11 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 The test "model load / save" in ChiSqSelectorSuite fails because of this line in [ChiSqSelector.scala](https://github.com/apache/spark/blob/branch-2.2/mllib/src/main/scala/

[GitHub] spark pull request #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartit...

2018-08-11 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/22079 [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on a DataFrame could lead to incorrect answers ## What changes were proposed in this pull request? Currently shuffle repartition

[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-08-02 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21950 retest this please.

[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-08-02 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21950 retest this please.

[GitHub] spark pull request #21950: [SPARK-24912][SQL][WIP] Add configuration to avoi...

2018-08-01 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/21950 [SPARK-24912][SQL][WIP] Add configuration to avoid OOM during broadcast join (and other negative side effects of incorrect table sizing) ## What changes were proposed in this pull request

[GitHub] spark issue #21899: [SPARK-24912][SQL] Don't obscure source of OOM during br...

2018-07-27 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21899 >Is it possible to include the actual size of the in-memory table so far in the msg as well? Only if the relation can be built. If we run out of memory attempting to bu

[GitHub] spark issue #21899: [SPARK-24912][SQL] Don't obscure source of OOM during br...

2018-07-27 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21899 > Is it possible to include the actual size of the in-memory table so far in the msg as well? Possibly. The state of the relation might be messy when I go to query its s

[GitHub] spark pull request #21899: [SPARK-24912][SQL] Don't obscure source of OOM du...

2018-07-27 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/21899 [SPARK-24912][SQL] Don't obscure source of OOM during broadcast join ## What changes were proposed in this pull request? This PR shows the stack trace of the original OutOfMemoryError

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-07-10 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 @ueshin Thanks for all your help!

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-07-06 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 retest this please.

[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-07-02 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21073#discussion_r199678852 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -551,6 +551,36 @@ object TypeCoercion

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-06-26 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 Still working on type coercion.

[GitHub] spark issue #21628: [SPARK-23776][DOC] Update instructions for running PySpa...

2018-06-25 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21628 @HyukjinKwon Thanks for your help!

[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-06-24 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21073#discussion_r197671215 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -475,6 +474,231 @@ case class

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-06-24 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 @ueshin >so I was wondering whether we need the same thing for MapConcat or not. Got it. I will research that, plus I will look at the entire pull request for Concat to

[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-06-24 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21073#discussion_r197669221 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -475,6 +474,231 @@ case class

[GitHub] spark pull request #21628: [SPARK-23776][DOC] Update instructions for runnin...

2018-06-24 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21628#discussion_r197667457 --- Diff: docs/building-spark.md --- @@ -215,19 +215,23 @@ If you are building Spark for use in a Python environment and you wish to pip

[GitHub] spark pull request #21628: [SPARK-23776][DOC] Update instructions for runnin...

2018-06-24 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/21628 [SPARK-23776][DOC] Update instructions for running PySpark after building with SBT ## What changes were proposed in this pull request? This update tells the reader how to build Spark

[GitHub] spark pull request #21621: [SPARK-24633][SQL] Fix codegen when split is requ...

2018-06-24 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21621#discussion_r197646450 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala --- @@ -556,6 +556,17 @@ class DataFrameFunctionsSuite extends

[GitHub] spark pull request #21621: [SPARK-24633][SQL] Fix codegen when split is requ...

2018-06-23 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21621#discussion_r197623576 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala --- @@ -556,6 +556,17 @@ class DataFrameFunctionsSuite extends

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-06-22 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 Hi @ueshin. Just a question while I work on the changes for your review comments. >I'm wondering whether we need type coercion like concat for array type is doing. Which t

[GitHub] spark pull request #20909: [SPARK-23776][python][test] Check for needed comp...

2018-06-09 Thread bersprockets
Github user bersprockets closed the pull request at: https://github.com/apache/spark/pull/20909

[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...

2018-06-09 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/20909 @HyukjinKwon This PR is mostly obsolete. I will close it and re-open something smaller... maybe a one-line documentation change to handle the missing UDF case for those who build with sbt

[GitHub] spark issue #21231: [SPARK-24119][SQL]Add interpreted execution to SortPrefi...

2018-06-08 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21231 @hvanhovell @maropu @viirya @kiszk Thanks for all the help!

[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-06-05 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21073#discussion_r193280073 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -308,6 +308,170 @@ case class

[GitHub] spark issue #21231: [SPARK-24119][SQL]Add interpreted execution to SortPrefi...

2018-05-30 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21231 ping @hvanhovell

[GitHub] spark pull request #21308: SPARK-24253: Add DeleteSupport mix-in for DataSou...

2018-05-25 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21308#discussion_r190963247 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/DeleteSupport.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-05-18 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 retest this please

[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-05-17 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21073#discussion_r189161277 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -56,6 +58,28 @@ class

[GitHub] spark issue #21231: [SPARK-24119][SQL]Add interpreted execution to SortPrefi...

2018-05-17 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21231 @maropu @hvanhovell @viirya Are all pending issues resolved?

[GitHub] spark pull request #21305: [SPARK-24251][SQL] Add AppendData logical plan.

2018-05-17 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21305#discussion_r188960491 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -344,6 +344,36 @@ case class

[GitHub] spark pull request #21308: SPARK-24253: Add DeleteSupport mix-in for DataSou...

2018-05-15 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21308#discussion_r188392219 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/DeleteSupport.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #21231: [SPARK-24119][SQL]Add interpreted execution to So...

2018-05-09 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21231#discussion_r187192485 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala --- @@ -147,7 +148,40 @@ case class SortPrefix(child

[GitHub] spark issue #21144: [SPARK-24043][SQL] Interpreted Predicate should initiali...

2018-05-09 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21144 @cloud-fan I don't think this is an issue in 2.3. It would be an issue only once [SPARK-23580](https://issues.apache.org/jira/browse/SPARK-23580) ("Interpreted mode fallback s

[GitHub] spark issue #21144: [SPARK-24043][SQL] Interpreted Predicate should initiali...

2018-05-07 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21144 Thanks much!

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-05-07 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 @ueshin Hopefully I have addressed all of your review comments. Also, I have a question about what it means to dedup across maps when Spark allows duplicates in maps [here.](https

[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-05-07 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21073#discussion_r186570491 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -116,6 +117,169 @@ case class

[GitHub] spark issue #21144: [SPARK-24043][SQL] Interpreted Predicate should initiali...

2018-05-07 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21144 @hvanhovell @maropu Is there anything on this PR that I should do?

[GitHub] spark pull request #21231: [SPARK-24119][SQL]Add interpreted execution to So...

2018-05-06 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21231#discussion_r186307490 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/SortOrderExpressionsSuite.scala --- @@ -0,0 +1,90

[GitHub] spark issue #21231: [SPARK-24119][SQL]Add interpreted execution to SortPrefi...

2018-05-06 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21231 @maropu @kiszk Hopefully I've addressed all comments. Please take a look.

[GitHub] spark issue #21231: [SPARK-24119][SQL]Add interpreted execution to SortPrefi...

2018-05-05 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21231 retest this please

[GitHub] spark issue #21231: [SPARK-24119][SQL]Add interpreted execution to SortPrefi...

2018-05-03 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21231 retest this please

[GitHub] spark pull request #21231: [SPARK-24119][SQL]Add interpreted execution to So...

2018-05-03 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/21231 [SPARK-24119][SQL]Add interpreted execution to SortPrefix expression ## What changes were proposed in this pull request? Implemented eval in SortPrefix expression. ## How

[GitHub] spark issue #21169: [SPARK-23715][SQL] the input of to/from_utc_timestamp ca...

2018-05-02 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21169 Addresses all of my comments, thanks.

[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-05-01 Thread bersprockets
Github user bersprockets commented on a diff in the pull request: https://github.com/apache/spark/pull/21073#discussion_r185392954 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -116,6 +117,161 @@ case class

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-05-01 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 retest this please

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-05-01 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 A test failed with "./bin/spark-submit ... No such file or directory". Seems like there are lots of spurious test failures right now. I will hold off on re-running for a li

[GitHub] spark issue #21144: [SPARK-24043][SQL] Interpreted Predicate should initiali...

2018-05-01 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21144 @hvanhovell @maropu As it turns out, there are at least two places where an InterpretedPredicate is created but never initialized: SimpleTextSource.buildReader

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-05-01 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 retest this please

[GitHub] spark issue #21141: [SPARK-23853][PYSPARK][TEST] Run Hive-related PySpark te...

2018-04-30 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21141 My experience here is limited. Still, it also looks good to me.

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-04-30 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 retest this please

[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-04-30 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/21073 @mn-mikke @kiszk Thanks for the review. I addressed the comments. Please take a look when you have a chance
