Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/23124#discussion_r236952729
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala
---
@@ -0,0 +1,118 @@
+/*
+ * Licensed
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/23000#discussion_r234819827
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
---
@@ -410,6 +410,30 @@ class DateTimeUtilsSuite
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22504
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/22865#discussion_r228771361
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -462,7 +462,7 @@ object SQLConf {
val
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/22865
[DOC] Fix doc for spark.sql.parquet.recordLevelFilter.enabled
## What changes were proposed in this pull request?
Updated the doc string value
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22504
The Py4JJavaError StackOverflow happens pretty reliably. I am guessing it's
related to the change.
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22504
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22504
retest this please
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21950#discussion_r218608537
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
---
@@ -1051,11 +1052,27 @@ private[hive] object
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21950
I broke this :). Don't ask for a redo.
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21950#discussion_r217216975
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PruneFileSourcePartitionsSuite.scala
---
@@ -91,4 +91,28 @@ class
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22192
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22382
Thanks! Closing.
Github user bersprockets closed the pull request at:
https://github.com/apache/spark/pull/22382
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22192
retest this please.
It's that old "java.lang.reflect.InvocationTargetException: null" error
we've seen
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21899
cc @jinxing64 @hvanhovell @MaxGekk
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22382
cc @cloud-fan @JoshRosen
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/22382
[SPARK-23243] [SPARK-20715][CORE][2.2] Fix RDD.repartition() data
correctness issue
## What changes were proposed in this pull request?
Back port of #22354 and #17955 to 2.2 (#22354
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22192
retest this please.
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22209
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22209
Looks like the test failed due to
https://issues.apache.org/jira/browse/SPARK-23622
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22209
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22188
@gatorsmile Thanks much!
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22188
@gatorsmile
>Why 2.2 only?
Only that I forgot that master is already on 2.4. We should do 2.3 as well,
but I haven't tested it yet.
Do I need to do anything on my
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22188
@cloud-fan @gatorsmile Should we merge this also onto 2.2? It was a clean
cherry-pick for me (from master to branch-2.2), and I ran the top and bottom
tests (6000 columns, 1 million rows, 67
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21899#discussion_r212756302
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
---
@@ -118,12 +119,20 @@ case class
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21950#discussion_r212719073
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
---
@@ -76,4 +78,16 @@ private[sql
Github user bersprockets closed the pull request at:
https://github.com/apache/spark/pull/22079
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
@gatorsmile Weird, I don't see it on branch-2.2. Is that a sync issue?
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22188
OK, I reran the tests for the lower column count cases, and the runs with
the patch consistently show a tiny (1-3%) improvement compared to the master
branch. So even the lower column count
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22188
Thanks @vanzin. In my benchmark tests, the tiny degradation (0.5%) in the
lower column count cases is pretty consistent, which concerns me a little. I am
going to re-run those tests
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/22188
[SPARK-25164][SQL] Avoid rebuilding column and path list for each column in
parquet reader
## What changes were proposed in this pull request?
VectorizedParquetRecordReader
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21899#discussion_r211833522
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
---
@@ -118,12 +119,20 @@ case class
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22154
Re: Your build failure ('statefulOperators.scala:95: value asJava is not a
member of scala.collection.immutable.Map[String,Long]).
I am also seeing this in my fork on my laptop (I just
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
@gatorsmile So I should include all the related PRs merged to master as a
single PR here? Just verifying
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21899
retest this please
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21899#discussion_r211047556
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
---
@@ -118,12 +119,20 @@ case class
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21899
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21899
@MaxGekk In the updated message, I left out "hash" from the term "hash
relation" only because it seems the relation could
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21950
retest this please.
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
@jiangxb1987 gentle ping.
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
Once this is merged, I will also back-port:
- [[SPARK-24564][TEST] Add test suite for
RecordBinaryComparator](https://github.com/apache/spark/commit
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22101
Should there be a test, or do other sorting-related tests cover this
indirectly?
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
@jiangxb1987
Here are some of the differences from the original PR
- I also ported the follow up PR #20426
- I ported #20088 (for SPARK-22905) to get the tests to pass. I also
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/22079#discussion_r209736691
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala ---
@@ -144,7 +144,7 @@ object ChiSqSelectorModel extends
Loader
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
Hmmm... I somehow managed to break SparkR tests by fixing a comment. It
seems to have auto-retried and broke the second time too
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
@jiangxb1987
> We shall also include #20088 in this backport PR.
I did that shortly after commenting, which allowed the tests to pass. I
squashed it into the first commit,
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/22079
The test "model load / save" in ChiSqSelectorSuite fails because of this
line in
[ChiSqSelector.scala](https://github.com/apache/spark/blob/branch-2.2/mllib/src/main/scala/
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/22079
[SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on a DataFrame could
lead to incorrect answers
## What changes were proposed in this pull request?
Currently shuffle repartition
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21950
retest this please.
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21950
retest this please.
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/21950
[SPARK-24912][SQL][WIP] Add configuration to avoid OOM during broadcast
join (and other negative side effects of incorrect table sizing)
## What changes were proposed in this pull request
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21899
>Is it possible to include the actual size of the in-memory table so far in
the msg as well?
Only if the relation can be built. If we run out of memory attempting to
bu
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21899
> Is it possible to include the actual size of the in-memory table so far
in the msg as well?
Possibly. The state of the relation might be messy when I go to query its
s
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/21899
[SPARK-24912][SQL] Don't obscure source of OOM during broadcast join
## What changes were proposed in this pull request?
This PR shows the stack trace of the original OutOfMemoryError
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
@ueshin Thanks for all your help!
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
retest this please.
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21073#discussion_r199678852
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
---
@@ -551,6 +551,36 @@ object TypeCoercion
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
Still working on type coercion.
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21628
@HyukjinKwon Thanks for your help!
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21073#discussion_r197671215
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
---
@@ -475,6 +474,231 @@ case class
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
@ueshin
>so I was wondering whether we need the same thing for MapConcat or not.
Got it. I will research that, plus I will look at the entire pull request
for Concat to
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21073#discussion_r197669221
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
---
@@ -475,6 +474,231 @@ case class
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21628#discussion_r197667457
--- Diff: docs/building-spark.md ---
@@ -215,19 +215,23 @@ If you are building Spark for use in a Python
environment and you wish to pip
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/21628
[SPARK-23776][DOC] Update instructions for running PySpark after building
with SBT
## What changes were proposed in this pull request?
This update tells the reader how to build Spark
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21621#discussion_r197646450
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ---
@@ -556,6 +556,17 @@ class DataFrameFunctionsSuite extends
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21621#discussion_r197623576
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ---
@@ -556,6 +556,17 @@ class DataFrameFunctionsSuite extends
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
Hi @ueshin. Just a question while I work on the changes for your review
comments.
>I'm wondering whether we need type coercion like concat for array type is
doing.
Which t
Github user bersprockets closed the pull request at:
https://github.com/apache/spark/pull/20909
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/20909
@HyukjinKwon This PR is mostly obsolete. I will close it and re-open
something smaller... maybe a one-line documentation change to handle the
missing UDF case for those who build with sbt
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21231
@hvanhovell @maropu @viirya @kiszk Thanks for all the help!
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21073#discussion_r193280073
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
---
@@ -308,6 +308,170 @@ case class
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21231
ping @hvanhovell
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21308#discussion_r190963247
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/DeleteSupport.java ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
retest this please
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21073#discussion_r189161277
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
---
@@ -56,6 +58,28 @@ class
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21231
@maropu @hvanhovell @viirya Are all pending issues resolved?
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21305#discussion_r188960491
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
---
@@ -344,6 +344,36 @@ case class
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21308#discussion_r188392219
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/DeleteSupport.java ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21231#discussion_r187192485
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala
---
@@ -147,7 +148,40 @@ case class SortPrefix(child
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21144
@cloud-fan I don't think this is an issue in 2.3. It would be an issue only
once [SPARK-23580](https://issues.apache.org/jira/browse/SPARK-23580)
("Interpreted mode fallback s
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21144
Thanks much!
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
@ueshin Hopefully I have addressed all of your review comments.
Also, I have a question about what it means to dedup across maps when Spark
allows duplicates in maps
[here.](https
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21073#discussion_r186570491
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
---
@@ -116,6 +117,169 @@ case class
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21144
@hvanhovell @maropu Is there anything on this PR that I should do?
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21231#discussion_r186307490
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/SortOrderExpressionsSuite.scala
---
@@ -0,0 +1,90
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21231
@maropu @kiszk Hopefully I've addressed all comments. Please take a look.
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21231
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21231
retest this please
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/21231
[SPARK-24119][SQL]Add interpreted execution to SortPrefix expression
## What changes were proposed in this pull request?
Implemented eval in SortPrefix expression.
## How
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21169
Addresses all of my comments, thanks.
Github user bersprockets commented on a diff in the pull request:
https://github.com/apache/spark/pull/21073#discussion_r185392954
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
---
@@ -116,6 +117,161 @@ case class
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
A test failed with "./bin/spark-submit ... No such file or directory"
Seems like there's lots of spurious test failures right now. I will hold
off on re-running for a li
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21144
@hvanhovell @maropu As it turns out, there are at least two places where an
InterpretedPredicate is created but never initialized:
SimpleTextSource.buildReader
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21141
My experience here is limited. Still, it also looks good to me.
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
retest this please
Github user bersprockets commented on the issue:
https://github.com/apache/spark/pull/21073
@mn-mikke @kiszk Thanks for the review. I addressed the comments. Please
take a look when you have a chance