Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r155667834
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala ---
@@ -263,6 +262,25 @@ class DatasetAggregatorSuite extends QueryTest
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r155374041
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala ---
@@ -263,6 +262,25 @@ class DatasetAggregatorSuite extends QueryTest
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
There is one issue, however, that I am stuck on: the tests for empty sets ("typed
aggregate: empty") seem to be casting Options to nulls, resulting in the
following:
Decoded
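For context, a minimal sketch of where this bites (hypothetical code, not the PR's): an aggregator whose buffer and output are Option[Double] encodes them as nullable doubles, so a None produced for an empty input decodes back as a SQL NULL rather than a None.

```scala
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical aggregator illustrating the empty-set problem: for an empty
// input, finish() yields None, which the encoder surfaces as NULL.
class OptionalMin[IN](val f: IN => Double)
    extends Aggregator[IN, Option[Double], Option[Double]] {
  def zero: Option[Double] = None
  def reduce(b: Option[Double], a: IN): Option[Double] =
    Some(b.fold(f(a))(math.min(_, f(a))))
  def merge(b1: Option[Double], b2: Option[Double]): Option[Double] =
    (b1.toSeq ++ b2.toSeq).reduceOption(math.min)
  def finish(reduction: Option[Double]): Option[Double] = reduction
  def bufferEncoder: Encoder[Option[Double]] = ExpressionEncoder()
  def outputEncoder: Encoder[Option[Double]] = ExpressionEncoder()
}
```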
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r155070641
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +96,165 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r155068746
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +96,165 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r155068709
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -17,8 +17,10 @@
package
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
@cloud-fan done, could you please have a look?
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r154172794
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +94,91 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r153025270
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +94,91 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r153024799
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +94,91 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r153022290
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +94,91 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r153021545
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +94,91 @@ class TypedAverage[IN](val f
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
@cloud-fan done, some small whitespace changes remain, as it formats the
functions within the file consistently
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r153020474
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -38,13 +38,11 @@ class TypedSumDouble[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r153020437
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -81,14 +77,13 @@ class TypedCount[IN](val f
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
@cloud-fan could you have a look please?
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r150392495
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -76,26 +76,126 @@ class TypedCount[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r150392424
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/expressions/javalang/typed.java ---
@@ -74,4 +71,40 @@
public static TypedColumn<T, L
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
Exactly my point. I'll then return -/+infinity for doubles only, and min/max
values for longs
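A minimal sketch of that convention (illustrative identity values, not the merged code):

```scala
// The identity ("zero") elements a fold-based min/max can start from, which
// is also what an empty input returns under this convention: real
// infinities for Double, extreme finite values for Long (no infinity there).
object MinMaxZeros {
  val minDouble: Double = Double.PositiveInfinity // identity for min
  val maxDouble: Double = Double.NegativeInfinity // identity for max
  val minLong: Long = Long.MaxValue
  val maxLong: Long = Long.MinValue
}
```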
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
Ok, sounds good. What about doubles? We could return the proper mathematical
definition, but that is not consistent with longs
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
An empty set's min and max are defined as -infinity and +infinity:
https://en.wikipedia.org/wiki/Empty_set
This is supported for Java doubles, but not for longs. We could instead use
Long.MIN
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r150381736
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -26,43 +26,64 @@ import
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
@cloud-fan
Sorry, I misread the conclusion of the discussion. I reverted the initial API
to exactly how it was before, while the new functions follow the SQL standard,
as you agreed two weeks ago
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
@gatorsmile @cloud-fan
Could you have a look please?
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18113
Hi, it has been a while, but I can pick it back up when I have time next
weekend or so, if that's OK.
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r121761524
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -26,43 +26,64 @@ import
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r119996472
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -95,7 +93,123 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r119175369
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -95,7 +93,123 @@ class TypedAverage[IN](val f
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18080
This variant is available in other DBs, albeit with slightly different
function and parameter naming. For example, MySQL allows it via the `week()`
function:
http://www.w3resource.com/mysql/date
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r118939565
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +97,67 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r118914424
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +97,67 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r118843548
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +97,67 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r118841626
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +97,67 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r118840234
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +97,67 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r118822478
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +97,67 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r118821329
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +97,67 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18113#discussion_r118821077
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
---
@@ -99,3 +97,67 @@ class TypedAverage[IN](val f
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18080#discussion_r118820467
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
---
@@ -402,23 +402,40 @@ case class DayOfMonth
Github user setjet commented on a diff in the pull request:
https://github.com/apache/spark/pull/18080#discussion_r118820456
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
---
@@ -402,23 +402,40 @@ case class DayOfMonth
GitHub user setjet opened a pull request:
https://github.com/apache/spark/pull/18125
[SPARK-20891][SQL] Reduce duplicate code typedaggregators.scala
## What changes were proposed in this pull request?
The aggregators in typedaggregators.scala were polluted with duplicate
code
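As an illustration of the kind of de-duplication meant here, a sketch under assumed names (not the merged change): the encoder boilerplate shared by every Double-valued typed aggregator is hoisted into a base class, so each concrete aggregator only defines zero/reduce/merge/finish.

```scala
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

// Shared base: every Double-valued typed aggregator used to repeat these
// two encoder definitions; here they live in one place.
abstract class TypedDoubleAggregator[IN] extends Aggregator[IN, Double, Double] {
  def bufferEncoder: Encoder[Double] = ExpressionEncoder[Double]()
  def outputEncoder: Encoder[Double] = ExpressionEncoder[Double]()
}

// A concrete aggregator now only carries its actual logic.
class TypedSum[IN](val f: IN => Double) extends TypedDoubleAggregator[IN] {
  def zero: Double = 0.0
  def reduce(b: Double, a: IN): Double = b + f(a)
  def merge(b1: Double, b2: Double): Double = b1 + b2
  def finish(reduction: Double): Double = reduction
}
```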
GitHub user setjet opened a pull request:
https://github.com/apache/spark/pull/18113
[SPARK-20890][SQL] Added min and max typed aggregation functions
## What changes were proposed in this pull request?
Typed Min and Max functions are missing for aggregations done on dataset
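A hypothetical usage sketch of the proposed API; typed.min and typed.max are the additions this PR describes (scalalang.typed only offered avg, count, sum and sumLong at the time):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.scalalang.typed

case class Sale(region: String, amount: Double)

object TypedMinMaxExample extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  import spark.implicits._

  val sales = Seq(Sale("eu", 1.0), Sale("eu", 3.0), Sale("us", 2.0)).toDS()

  // typed.min / typed.max are the proposed functions, shown here in the
  // same style as the existing typed.avg / typed.sum.
  sales.groupByKey(_.region)
    .agg(typed.min((s: Sale) => s.amount), typed.max((s: Sale) => s.amount))
    .show()
}
```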
GitHub user setjet opened a pull request:
https://github.com/apache/spark/pull/18097
[Spark-20873][SQL] Improve the error message for unsupported Column Type
## What changes were proposed in this pull request?
Upon encountering an invalid column type, the column type object
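Since the description is cut off above, a sketch of the kind of change the title suggests (the helper name is assumed; the idea is to surface a readable type name instead of the raw object's toString):

```scala
import org.apache.spark.sql.types.DataType

object ColumnTypeErrors {
  // Hypothetical helper: report the offending type by its simple string
  // form (e.g. "map<string,int>") rather than the internal object.
  def unsupportedColumnType(dt: DataType): Throwable =
    new UnsupportedOperationException(s"Unsupported column type: ${dt.simpleString}")
}
```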
GitHub user setjet opened a pull request:
https://github.com/apache/spark/pull/18094
[Spark-20775][SQL] Added scala support from_json
## What changes were proposed in this pull request?
The from_json function required a java.util.HashMap to be passed in. For other
functions, a java
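A usage sketch of the Scala-friendly style this refers to, assuming the overload that takes the schema as a DDL string and the options as a plain Scala Map rather than a java.util.HashMap:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json

object FromJsonExample extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  import spark.implicits._

  // Options passed as a Scala Map; no java.util.HashMap needed.
  Seq("""{"a": 1}""").toDF("json")
    .select(from_json($"json", "a INT", Map("mode" -> "PERMISSIVE")))
    .show()
}
```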
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18080
I agree that we shouldn't change the behavior, hence I suggested we could
do it the other way around: make a new function for Gregorian instead and
leave weekofyear as is.
I suppose we
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/18080
Come to think of it, it might actually be better to switch it around:
have ISO8601 as the weekofyear function, and make a separate function for
Gregorian, because ISO is the more commonly used term
GitHub user setjet opened a pull request:
https://github.com/apache/spark/pull/18080
[Spark-20771][SQL] Make weekofyear more intuitive
## What changes were proposed in this pull request?
The current implementation of weekofyear implements ISO8601, which results
in the following
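The quoted example is truncated above; an illustrative case of the ISO8601 behaviour in question (2017-01-01 was a Sunday, so under ISO8601 it belongs to week 52 of 2016):

```scala
import org.apache.spark.sql.SparkSession

object WeekOfYearExample extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()

  // Under the ISO8601 definition, the first days of January can fall into
  // the last ISO week of the previous year.
  spark.sql("SELECT weekofyear('2017-01-01')").show() // prints 52
}
```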
Github user setjet closed the pull request at:
https://github.com/apache/spark/pull/14233
Github user setjet commented on the issue:
https://github.com/apache/spark/pull/17523
Done :)
GitHub user setjet opened a pull request:
https://github.com/apache/spark/pull/17523
[SPARK-20064][PySpark]
## What changes were proposed in this pull request?
The PySpark version in version.py was lagging behind.
The versioning is in line with PEP 440:
https://www.python.org/dev