Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15389
retest this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15389
Seems Jenkins is not working?
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15389
retest this please.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15388
ping @rxin @hvanhovell @cloud-fan @gatorsmile Is there anything else to address?
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14452
@rxin Yeah, as I tried, adding an explicit cache call doesn't improve it, so I
removed it.
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15423#discussion_r82716067
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -207,6 +208,7 @@ class SQLQueryTestSuite extends QueryTest with
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15148#discussion_r82716769
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala ---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15423#discussion_r82718263
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -207,6 +208,7 @@ class SQLQueryTestSuite extends QueryTest with
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15388#discussion_r82718343
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSetSuite.scala
---
@@ -80,6 +80,65 @@ class ExpressionSetSuite
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15423#discussion_r82720911
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -207,6 +208,7 @@ class SQLQueryTestSuite extends QueryTest with
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15398#discussion_r82722525
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
---
@@ -25,26 +25,25 @@ object StringUtils
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15148#discussion_r82725489
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala ---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/15427
[SPARK-17866][SPARK-17867][SQL] Fix Dataset.dropDuplicates
## What changes were proposed in this pull request?
Two issues regarding Dataset.dropDuplicates:
1
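The column-based `dropDuplicates` semantics can be illustrated with a toy single-process sketch (this is illustrative Python, not Spark's implementation; in a distributed setting Spark does not guarantee which duplicate row survives, whereas this sketch deterministically keeps the first seen):

```python
# Toy illustration of dropDuplicates-on-columns semantics (not Spark's code):
# keep one row per distinct key built from the selected columns.
def drop_duplicates(rows, cols):
    seen = set()
    out = []
    for row in rows:
        key = tuple(row[c] for c in cols)  # dedup key from selected columns only
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out
```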
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
Re-opening it to see if we can reach some consensus on this direction.
---
GitHub user viirya reopened a pull request:
https://github.com/apache/spark/pull/14847
[SPARK-17254][SQL] Filter can stop when the condition is false if the child
output is sorted
## What changes were proposed in this pull request?
From
https://issues.apache.org/jira
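The early-stop idea behind SPARK-17254 can be sketched in a few lines (a minimal illustration, not Spark's operator): when the child output is sorted on the filtered column, a range predicate can stop scanning at the first failing row instead of evaluating the condition on every row.

```python
from itertools import takewhile

# Sketch of early-stop filtering on sorted input: for input sorted ascending
# on x, the filter `x < bound` can stop at the first row failing the predicate.
def filter_sorted(rows, bound):
    return list(takewhile(lambda x: x < bound, rows))
```

This only pays off when the optimizer can prove the child's output ordering matches the predicate column, which is what the PR discussion is about.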
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15423#discussion_r82745403
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -168,17 +168,7 @@ class SparkSqlAstBuilder(conf: SQLConf
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15427
cc @cloud-fan @hvanhovell
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15388
Thanks! @rxin @cloud-fan @hvanhovell @gatorsmile
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15423#discussion_r82750409
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -168,17 +168,7 @@ class SparkSqlAstBuilder(conf: SQLConf
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15423#discussion_r82750631
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -168,17 +168,7 @@ class SparkSqlAstBuilder(conf: SQLConf
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
@rxin Thanks for the recommendation! Let me close it now and work on it.
---
Github user viirya closed the pull request at:
https://github.com/apache/spark/pull/14847
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15389#discussion_r82929167
--- Diff: python/pyspark/rdd.py ---
@@ -2029,7 +2028,15 @@ def coalesce(self, numPartitions, shuffle=False):
>>> sc.parallelize([1, 2
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/15445
[SPARK-17817][PySpark][FOLLOWUP] PySpark RDD Repartitioning Results in
Highly Skewed Partition Sizes
## What changes were proposed in this pull request?
This change is a followup for
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15389#discussion_r82930615
--- Diff: python/pyspark/rdd.py ---
@@ -2029,7 +2028,15 @@ def coalesce(self, numPartitions, shuffle=False):
>>> sc.parallelize([1, 2
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15398#discussion_r82931395
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
---
@@ -25,26 +25,25 @@ object StringUtils
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15423#discussion_r82937473
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -207,6 +208,7 @@ class SQLQueryTestSuite extends QueryTest with
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15445
@felixcheung I posted the benchmark in #15389. Now posting it here too.
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15427#discussion_r83140093
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1878,17 +1878,25 @@ class Dataset[T] private[sql](
def dropDuplicates
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15427
Thanks for review! @rxin @cloud-fan
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15148#discussion_r83146607
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala ---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15457#discussion_r83162239
--- Diff: sql/core/src/main/java/org/apache/spark/sql/api/java/UDF1.java ---
@@ -19,14 +19,12 @@
import java.io.Serializable
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15445
@davies @felixcheung I ran another benchmark as follows:
import time
import random
num_partitions = 2
a = sc.parallelize(map(lambda x: [random.randint
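The skew problem in SPARK-17817 can be modeled without Spark at all (a toy model, not PySpark's actual repartition code): PySpark serializes rows in batches, so a placement scheme that sends whole batches to one partition produces extreme skew, while assigning a random partition key per element spreads data roughly evenly.

```python
import random

def partition_all_to_one(elems, num_partitions):
    # Degenerate placement: everything lands in partition 0 -> full skew.
    parts = [[] for _ in range(num_partitions)]
    parts[0].extend(elems)
    return parts

def partition_with_random_keys(elems, num_partitions, seed=0):
    # Per-element random key -> sizes concentrate near len(elems)/num_partitions.
    rng = random.Random(seed)
    parts = [[] for _ in range(num_partitions)]
    for x in elems:
        parts[rng.randrange(num_partitions)].append(x)
    return parts
```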
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/12335#discussion_r83357331
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonUDF.scala
---
@@ -28,10 +28,11 @@ case class PythonUDF(
name: String
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/12335#discussion_r83357461
--- Diff: python/pyspark/sql/functions.py ---
@@ -1741,15 +1742,15 @@ def __call__(self, *cols):
@since(1.3)
-def udf(f, returnType
GitHub user viirya reopened a pull request:
https://github.com/apache/spark/pull/14847
[SPARK-17254][SQL] Filter can stop when the condition is false if the child
output is sorted
## What changes were proposed in this pull request?
From
https://issues.apache.org/jira
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15445
ping @davies @felixcheung Could you take a look to see if we want to apply
this? Thanks!
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15495#discussion_r83526181
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
---
@@ -587,6 +594,30 @@ class SQLQuerySuite extends QueryTest
Github user viirya closed the pull request at:
https://github.com/apache/spark/pull/14847
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15495#discussion_r83528964
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
---
@@ -587,6 +594,30 @@ class SQLQuerySuite extends QueryTest
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/15500
[SPARK-17956][SQL] Fix projection output ordering
## What changes were proposed in this pull request?
Currently `ProjectExec` simply takes child plan's `outputOrdering` a
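Why simply forwarding the child's ordering is wrong can be shown with a small sketch (names here are illustrative, not Spark's API): a projection may rename or drop the columns the child is sorted on, so the ordering must be remapped through the alias list and truncated at the first sort column the projection no longer exposes.

```python
def project_output_ordering(child_ordering, aliases):
    """child_ordering: sort columns in order; aliases: {input_col: output_col}."""
    out = []
    for col in child_ordering:
        if col in aliases:
            out.append(aliases[col])  # column survives, possibly renamed
        else:
            break  # a sort-key prefix is only valid up to the dropped column
    return out
```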
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15398#discussion_r83552457
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
---
@@ -25,26 +25,25 @@ object StringUtils
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15398
For escaping before a non-special character, I don't know if DB2 is special,
because as I tried, MySQL behaves like PostgreSQL.
---
If your project is set up for it, you can reply to this emai
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15495#discussion_r83552790
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
---
@@ -587,6 +594,30 @@ class SQLQuerySuite extends QueryTest
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15398
> If the character after an escape character is not a wildcard character,
the escape character is discarded and the character following the escape is
treated as a regular character in the patt
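The quoted semantics ("escape before a non-wildcard character is discarded, and the character is taken literally") can be sketched as a LIKE-to-regex translation (a hypothetical helper for illustration, not Spark's `StringUtils`; as the discussion notes, other engines treat this case differently):

```python
import re

def like_to_regex(pattern, escape='\\'):
    # Translate a SQL LIKE pattern to an anchored regex. An escape before any
    # character, wildcard or not, makes that character match literally.
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == escape and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped char is literal
            i += 2
        elif c == '%':
            out.append('.*')   # % matches any sequence
            i += 1
        elif c == '_':
            out.append('.')    # _ matches any single character
            i += 1
        else:
            out.append(re.escape(c))
            i += 1
    return '^' + ''.join(out) + '$'
```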
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15398
@gatorsmile That is for ending a pattern with the escape sequence. I mean
escaping before a non-special character.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15398
Maybe more important is how Hive performs `like`. For escaping before a
non-special character, looks like it is different from the above examples. If
you give a pattern like `\a`, it matches exactly `\a`
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15500#discussion_r83582706
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
---
@@ -77,9 +77,40 @@ case class ProjectExec(projectList
GitHub user viirya reopened a pull request:
https://github.com/apache/spark/pull/14847
[SPARK-17254][SQL] Add StopAfter physical plan for the filtering that can
be stopped early
## What changes were proposed in this pull request?
This is motivated by:
From
https
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
retest this please.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15445
ping @davies @felixcheung Could you review this again? Thanks.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/13775
I agree that, from a maintenance standpoint, forking the classes is bad. But
if we really want to have one in Spark, I would like to help too. :)
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/13775
@tejasapatil Thanks for the review comment! I will update this later.
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15423#discussion_r83771158
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -168,17 +168,7 @@ class SparkSqlAstBuilder(conf: SQLConf
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15423
LGTM. Let's see if @cloud-fan has more comments on this.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15500
also cc @cloud-fan @yhuai
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15500#discussion_r83781352
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
---
@@ -77,9 +77,40 @@ case class ProjectExec(projectList
Github user viirya closed the pull request at:
https://github.com/apache/spark/pull/15500
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/13775#discussion_r83796409
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala ---
@@ -118,6 +120,11 @@ class OrcFileFormat extends FileFormat with
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/13775#discussion_r83796484
--- Diff:
sql/hive/src/main/java/org/apache/hadoop/hive/ql/io/orc/VectorizedSparkOrcNewRecordReader.java
---
@@ -0,0 +1,318 @@
+/*
+ * Licensed to
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15445
@davies @felixcheung Thanks!
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15481#discussion_r83990921
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
---
@@ -393,7 +393,7 @@ class
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/15547
[SPARK-18002][SQL] Pruning unnecessary IsNotNull predicates from Filter
## What changes were proposed in this pull request?
In `PruneFilters` rule, we can prune unnecessary `IsNotNull
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r83998540
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r83998771
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r83999452
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r83999827
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r8416
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15481
@mridulm I checked #9963 and it looks like we don't test against
`CoarseGrainedSchedulerBackend.reset`.
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15523#discussion_r84010085
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
---
@@ -87,7 +87,14 @@ case class FilterExec(condition
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15523#discussion_r84011498
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
---
@@ -87,7 +87,14 @@ case class FilterExec(condition
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15547
cc @cloud-fan
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15547
@cloud-fan Yeah, I hadn't noticed that `NullPropagation` already has a rule
for this. Closing this now.
---
Github user viirya closed the pull request at:
https://github.com/apache/spark/pull/15547
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15481#discussion_r84024475
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
---
@@ -145,6 +145,9 @@ class
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15523
@gatorsmile A predicate like `IsNotNull(a + b + Rand())` will cause this
change to wrongly set the nullability of `a` and `b` to true, won't it?
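The concern can be illustrated with a toy model (illustrative tuples, not Catalyst's Expression classes): non-nullability should only be inferred from an `IsNotNull` predicate when the wrapped expression is deterministic, because `IsNotNull(a + b + rand())` holding on surviving rows does not prove anything stable about `a` and `b`.

```python
def attrs_provably_not_null(predicates):
    # Each predicate is (kind, attrs_referenced, is_deterministic).
    # Only deterministic IsNotNull over null-propagating arithmetic lets us
    # conclude the referenced attributes are non-null.
    proven = set()
    for kind, attrs, deterministic in predicates:
        if kind == "IsNotNull" and deterministic:
            proven.update(attrs)
    return proven
```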
---
If your project is set up for it, you can rep
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
@ioana-delaney Thanks for the review!
I replied to a few points first. I will add the tests you mentioned later.
4. This feature is motivated by the bucketed (and sorted, of course)
table
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/15558
[SPARK-17357][SPARK-6624][SQL] Convert filter predicate to CNF in Optimizer
for pushdown
## What changes were proposed in this pull request?
This PR is proposed to solve the problem #14912
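The CNF rewrite at the heart of this proposal can be sketched over a toy predicate representation (tuples, not Spark's Expression trees): distribute OR over AND so that the resulting conjuncts can be pushed down individually.

```python
# Minimal CNF conversion for predicates built from ('lit', name),
# ('and', l, r), and ('or', l, r) nodes.
def to_cnf(e):
    if e[0] == 'lit':
        return e
    a, b = to_cnf(e[1]), to_cnf(e[2])
    if e[0] == 'and':
        return ('and', a, b)
    # e is an OR: distribute it over any AND child.
    if a[0] == 'and':
        return ('and', to_cnf(('or', a[1], b)), to_cnf(('or', a[2], b)))
    if b[0] == 'and':
        return ('and', to_cnf(('or', a, b[1])), to_cnf(('or', a, b[2])))
    return ('or', a, b)
```

A standard caveat applies: CNF conversion can blow up exponentially in the worst case, which is why optimizer rules like this usually bound the size of the rewritten predicate.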
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/15541#discussion_r84222924
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskAssigner.scala
---
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/12904#discussion_r84235107
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala
---
@@ -90,6 +90,7 @@ private[csv] class CSVOptions
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15423
The tests passed, but the results failed to post back to GitHub...
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15423
@cloud-fan Do we need to run the tests again?
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17242
ping @cloud-fan Except for the optimization integration, do you have more
comments on this change? Thanks.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17242
Hmm, so you don't think the canonicalizer should use this?
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17242
Anyway, I will move it to the optimizer in the next update.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17242
retest this please.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17242
@cloud-fan I've moved this to the optimizer, please take a look. Thanks.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17242
ping @cloud-fan @hvanhovell Can you help review this? Thanks.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17186
ping @sameeragarwal This is updated according to your previous comment. Can
you help review this? Thanks.
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17302#discussion_r107089550
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala ---
@@ -70,7 +70,20 @@ object RDDConversions {
object ExternalRDD
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17371#discussion_r107094080
--- Diff: python/pyspark/sql/functions.py ---
@@ -1163,7 +1163,10 @@ def check_string_field(field, fieldName):
raise TypeError("%s shou
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17371#discussion_r107096074
--- Diff: python/pyspark/sql/functions.py ---
@@ -1163,7 +1163,10 @@ def check_string_field(field, fieldName):
raise TypeError("%s shou
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17371
For now, after `withWatermark`, we only update the metadata for the event-time
column. The expression id is the same. So once we use the column from before
adding the watermark, `words.timestamp`, as
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17302#discussion_r107117098
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala ---
@@ -17,6 +17,8 @@
package org.apache.spark.sql.execution
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17371
IMHO, the output after `withWatermark` should be a new attribute with a new
expression id. Maybe @zsxwing @marmbrus have more insight on this?
Btw, does this issue also happen in Scala code
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17302#discussion_r107169764
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala
---
@@ -41,7 +41,20 @@ object CatalystSerde
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17302#discussion_r107192456
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala
---
@@ -41,7 +41,20 @@ object CatalystSerde
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17302#discussion_r107298496
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala
---
@@ -41,7 +41,20 @@ object CatalystSerde
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17302#discussion_r107298835
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala
---
@@ -41,7 +41,20 @@ object CatalystSerde
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/17371
Unfortunately, yes, allowing resolved attributes in the user API will cause
this kind of trouble.
> However, I don't think that piecemeal switching to unresolved attributes
is a g