Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9556#discussion_r44267784
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -146,148 +146,105 @@ private[sql] abstract class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9556#discussion_r44268074
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -146,148 +146,105 @@ private[sql] abstract class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9556#discussion_r44268156
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala
---
@@ -21,22 +21,22 @@ import
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9556#discussion_r44268135
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/SortBasedAggregate.scala
---
@@ -27,15 +27,15 @@ import
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9414
[SPARK-11450][SQL] Add Unsafe Row processing to Expand
This PR enables the Expand operator to process and produce Unsafe Rows.
You can merge this pull request into a Git repository by running
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9417
[SPARK-11449][Core] PortableDataStream should be a factory
```PortableDataStream``` maintains some internal state. This makes it
tricky to reuse a stream (one needs to call ```close``` on both
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9429#discussion_r43724216
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -205,45 +205,30 @@ class Analyzer
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-153147431
It would also help a lot to have unit tests covering this problem.
---
If your project is set up for it, you can reply to this email and have your
reply appear
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-153146768
If I understand the problem correctly, the logical Expand operator makes
items which are not in the grouping set ```null```. This means that if a column
is both used
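The Expand behaviour described in this comment can be illustrated with a small sketch. It is not Spark's implementation: the `expand` function, the dictionary row representation, and the `gid` column (a plain index rather than Spark's grouping ID) are hypothetical names chosen for illustration. It only shows that each input row is emitted once per grouping set, with every column outside that set replaced by null.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[object]]


def expand(rows: Sequence[Row],
           columns: Sequence[str],
           grouping_sets: Sequence[Sequence[str]]) -> List[Row]:
    """Emit one output row per (input row, grouping set) pair.

    Columns that are not part of the grouping set are set to None.
    The "gid" column is a hypothetical tag (the index of the grouping
    set), added so the projections can be told apart downstream.
    """
    out: List[Row] = []
    for row in rows:
        for gid, grouping_set in enumerate(grouping_sets):
            projected: Row = {
                c: (row[c] if c in grouping_set else None) for c in columns
            }
            projected["gid"] = gid
            out.append(projected)
    return out


if __name__ == "__main__":
    result = expand([{"a": 1, "b": 2}], ["a", "b"], [("a",), ("b",), ()])
    for r in result:
        print(r)
```

Because the nulled value replaces the original one, any expression that reads the same column downstream of such an operator sees the null and not the input value.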
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9406#discussion_r44203475
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Utils.scala
---
@@ -213,3 +216,178 @@ object Utils
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9409#discussion_r44203342
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala
---
@@ -419,3 +419,30 @@ case class Greatest
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9409#discussion_r44203267
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala
---
@@ -419,3 +419,30 @@ case class Greatest
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9406#discussion_r44203629
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Utils.scala
---
@@ -213,3 +216,178 @@ object Utils
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9406#discussion_r44203639
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Utils.scala
---
@@ -213,3 +216,178 @@ object Utils
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9541
[SPARK-9241][SQL] Supporting multiple DISTINCT columns - follow-up
This PR is a follow-up for PR https://github.com/apache/spark/pull/9406. It
adds more documentation to the rewriting rule
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9409#issuecomment-154695463
test this please
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9409#discussion_r44210935
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala
---
@@ -419,3 +419,30 @@ case class Greatest
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9409#discussion_r44216923
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Utils.scala
---
@@ -54,10 +54,14 @@ object Utils
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9409#issuecomment-154752963
This one is currently running:
https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2005/consoleFull
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9409#issuecomment-154750495
test this please
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9409#discussion_r44214255
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Utils.scala
---
@@ -54,10 +54,14 @@ object Utils
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9541#issuecomment-154723430
retest this please
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9541#issuecomment-154723404
Funny build failure:
Build was aborted
Aborted by anonymous
ERROR: Step ‘Archive the artifacts’ failed: no workspace
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9409#discussion_r44213709
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Utils.scala
---
@@ -54,10 +54,14 @@ object Utils
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9409#issuecomment-154758153
Seems like I have broken something. I'll need to rebase anyway.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9417#discussion_r43870851
--- Diff:
core/src/main/scala/org/apache/spark/input/PortableDataStream.scala ---
@@ -177,39 +170,24 @@ class PortableDataStream
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9455#discussion_r43854421
--- Diff:
core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
@@ -325,6 +327,11 @@ public Location next() {
try
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9461#discussion_r43852227
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
---
@@ -487,4 +488,9 @@ private[sql] class JDBCRDD
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9456#issuecomment-153630052
Could you add the number of the JIRA ticket this relates to? See other PRs
for an example.
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9406
[SPARK-9241][SQL] Supporting multiple DISTINCT columns (2) - Rewriting Rule
The second PR for SPARK-9241, this adds support for multiple distinct
columns to the new aggregation code path
Github user hvanhovell closed the pull request at:
https://github.com/apache/spark/pull/8298
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9409
[SPARK-11451][SQL] Support single distinct count on multiple columns.
This PR adds support for multiple columns in a single count distinct
aggregate to the new aggregation path.
cc
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9414#discussion_r44112686
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Expand.scala ---
@@ -41,14 +41,34 @@ case class Expand(
// as UNKNOWN partitioning
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9414#discussion_r44112770
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/ExpandSuite.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software
Github user hvanhovell closed the pull request at:
https://github.com/apache/spark/pull/9280
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9406#issuecomment-154368700
H... this is a bit of a strange error.
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9406#issuecomment-154368715
Jenkins retest this please
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9406#issuecomment-154369474
Jenkins is not retesting... @marmbrus could you add me to the whitelist?
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8362#issuecomment-148305817
Another thought on hashing. The ClearSpring hash is a generic hash
function. We could use very specialized (hopefully fast) hashing functions,
because we know
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8362#issuecomment-148209375
@yhuai It doesn't. A 64-bit hashcode is recommended though, especially when
you would want to approximate a billion or more unique values. I have used the
ClearSpring
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8362#issuecomment-148209543
A good article on HLL++ and the hashcode:
http://research.neustar.biz/2013/01/24/hyperloglog-googles-take-on-engineering-hll
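A back-of-the-envelope sketch of why hash width matters at this scale follows. It is not taken from the PR: the function names are hypothetical, and it assumes an ideal uniform hash and the standard occupancy approximation m * (1 - exp(-n / m)) for the expected number of distinct hash codes. A sketch such as HLL can only count distinct hash codes, so values lost to collisions are invisible to it.

```python
import math


def expected_distinct_hashes(n_values: int, hash_bits: int) -> float:
    """Expected number of distinct hash codes when n_values distinct inputs
    are hashed uniformly into 2**hash_bits possible codes.

    Uses the approximation m * (1 - exp(-n / m)); expm1 keeps precision
    when n / m is very small (the 64-bit case).
    """
    m = float(2 ** hash_bits)
    return -m * math.expm1(-n_values / m)


def collision_loss(n_values: int, hash_bits: int) -> float:
    """Fraction of distinct inputs expected to be hidden by hash collisions."""
    return 1.0 - expected_distinct_hashes(n_values, hash_bits) / n_values


if __name__ == "__main__":
    for bits in (32, 64):
        loss = collision_loss(10**9, bits)
        print(f"{bits}-bit hash, 1e9 distinct values: loss fraction = {loss:.3e}")
```

Under these assumptions, a 32-bit hash hides on the order of ten percent of a billion distinct values before the sketch sees them, while with a 64-bit hash the loss is negligible.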
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9167#issuecomment-149464143
We could do a similar thing for window functions.
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9050#issuecomment-147077121
Good catch! Shouldn't we also backport this one into the 1.5 branch?
@davies @yhuai could one of you guys explain to me why/where this is
causing problems? I
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34182012
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +67,615 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34182441
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +67,615 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34169654
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +59,622 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34175570
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +67,615 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34180206
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +67,615 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34180117
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +67,615 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34205943
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDataFrameWindowSuite.scala
---
@@ -189,7 +189,7 @@ class HiveDataFrameWindowSuite extends
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7057#issuecomment-119752352
@yhuai I have updated the PR.
As for the documentation, I will add another section to the general class
documentation, which explains the inner workings
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9642#issuecomment-156721678
So I have been making a lot of fuss about internal classes, which you are
not touching. Sorry about that.
This change is much more benign, but I still wonder
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8362#issuecomment-135619245
Implemented initial non-sparse HLL++. I am going to take a look at the
sparse version next week. The results are still equal to the Clearspring HLL+
implementation
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8362#issuecomment-138493630
@rxin the dense version of HLL++ is ready. We could also add this, and add
the sparse logic in a follow-up PR. Let me know what you think. I'll close if
you'd rather
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8362#discussion_r40782166
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlusSuite.scala
---
@@ -0,0 +1,125
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/6416#issuecomment-142309139
@MLnick I guess it depends. The other ```dataframe.stat``` functions have
not been implemented as UDAFs, so this is not necessary. However, I do think
that CMS
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8362#issuecomment-142323666
@MLnick I am in the process of moving house, so I am a bit slow/late with
my response :(...
I think it is very useful to be able to return the HLL
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9819#issuecomment-160478413
Yes. You can use any Spark aggregate function as a window function. Most
Hive UDAFs should also work except for the pivoted ones...
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10067#discussion_r46353257
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -223,10 +223,13 @@ class Analyzer
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9819#issuecomment-161653994
@zzcclp Just to be absolutely (painfully) clear: You can use the Hive based
window functions without a Hive installation, you just need to have a version
of Spark
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47071980
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -328,3 +281,222 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47073846
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -328,3 +281,222 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47073901
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -328,3 +281,222 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47072914
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -592,11 +594,17 @@ class Analyzer
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47035205
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
---
@@ -70,15 +70,32 @@ trait CheckAnalysis
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47034537
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -870,26 +878,37 @@ class Analyzer
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47034735
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -156,36 +165,90 @@ case class Window(
* @param frame
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47034276
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -567,6 +567,8 @@ class Analyzer(
case u
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47105986
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -870,26 +878,37 @@ class Analyzer
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9819#issuecomment-161247041
@zzcclp this PR is slated for review in the next week or so. It should be
in good shape, but I'll leave the verdict to the reviewers.
SPARK 1.6 is currently
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/10228#issuecomment-163764840
@davies don't get me wrong. I think this PR is an improvement of the
current situation (it never crossed my mind to change partitioning when I was
working
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47408969
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -246,85 +260,244 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47407972
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -246,85 +260,244 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47410151
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -736,15 +691,156 @@ private[execution] final class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47407165
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
---
@@ -70,15 +70,32 @@ trait CheckAnalysis
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47408697
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -870,26 +878,37 @@ class Analyzer
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47411273
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -736,15 +691,156 @@ private[execution] final class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47408553
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala
---
@@ -187,7 +184,7 @@ sealed abstract class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47407885
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -120,6 +121,19 @@ sealed trait
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47410438
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -149,43 +152,102 @@ case class Window
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10335#discussion_r47834244
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
---
@@ -210,6 +210,37 @@ case class Sort
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10335#discussion_r47828090
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
@@ -126,6 +127,69 @@ case class Sample
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10335#discussion_r47828506
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
---
@@ -210,6 +210,37 @@ case class Sort
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10335#discussion_r47827840
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
@@ -126,6 +127,69 @@ case class Sample
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10335#discussion_r47834560
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
@@ -126,6 +127,69 @@ case class Sample
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/10228#issuecomment-164251993
@yhuai I think having the two clearly separated paths (this PR) is an
improvement of the current situation. I also admit that I am responsible for
introducing
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47620489
--- Diff:
sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveWindowFunctionQuerySuite.scala
---
@@ -472,7 +475,7 @@ class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10228#discussion_r47444204
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala
---
@@ -165,237 +137,100 @@ abstract class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10228#discussion_r47444202
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala
---
@@ -165,237 +137,100 @@ abstract class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47450253
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -246,85 +260,238 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47450230
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -246,85 +260,238 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47451483
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -246,85 +260,238 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47432910
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -870,26 +878,37 @@ class Analyzer
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10374#discussion_r48015896
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/rows.scala
---
@@ -201,7 +201,7 @@ class GenericRow(protected[sql] val
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47973756
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
---
@@ -246,85 +260,281 @@ object
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/10374#discussion_r48020875
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/rows.scala
---
@@ -201,7 +201,7 @@ class GenericRow(protected[sql] val
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/10249#issuecomment-163685844
The timestamp is bound to this specific number because CodeGen uses -1L as
its default (null) value for Timestamp (assuming that your timezone is GMT-8
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/10228#issuecomment-163699118
We could move the planning of distinct queries entirely to the
DistinctAggregateRewriter. This would require us to merge the non-distinct
aggregate paths
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9819#discussion_r47565984
--- Diff:
sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveWindowFunctionQuerySuite.scala
---
@@ -472,7 +475,7 @@ class
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9819#issuecomment-164583021
Build failed due to R versioning problem. I'll try again when this is
sorted out.