GitHub user wzhfy opened a pull request:
https://github.com/apache/spark/pull/9349
[SPARK-11398][SQL] misleading dialect conf at the start of spark-sql
Instead of overriding def dialect in conf of HiveContext, I set the
SQLConf.DIALECT directly as "hiveql", such that resu
Github user wzhfy commented on the pull request:
https://github.com/apache/spark/pull/9349#issuecomment-152116547
@davies
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user wzhfy commented on the pull request:
https://github.com/apache/spark/pull/9349#issuecomment-153262162
@davies @liancheng I've updated the description of this problem, hoping to
explain it better now.
Can you review this pr and authorize testing? thx.
Github user wzhfy commented on the pull request:
https://github.com/apache/spark/pull/9349#issuecomment-153922195
@davies The description is updated
Github user wzhfy commented on the pull request:
https://github.com/apache/spark/pull/9349#issuecomment-153553626
@davies Thanks for the advice. The commit has been updated, please check if
that's what we want.
Btw, I think the cause of this problem is the inconsistency
Github user wzhfy commented on the pull request:
https://github.com/apache/spark/pull/10838#issuecomment-173437752
hi, @redsanket , in what situation will the number of requests become very
large?
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
ok I'll modify it with this new command.
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
@rxin Can we add a flag to enable or disable it? Currently there's no other
way to see size and row count except debugging.
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
@hvanhovell I've updated the description which shows a simple example.
The explained plan will become hard to read when joining many tables and
sizeInBytes is computed by the simple way (non
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
@ron8hu Yes, I've already updated this pr. I'll present an example.
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r97482084
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -54,11 +56,32 @@ case class Statistics
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
@gatorsmile I just did a quick fix to show what the improved stats look
like. If @rxin @hvanhovell accept the change proposed in this pr, I'll update
it to remove the flag :)
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r97481455
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -54,11 +56,32 @@ case class Statistics
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
@rxin @gatorsmile @hvanhovell I've updated this pr and made the stats much more
readable:
SizeInBytes is shown in units of B, KB, MB ... PB, e.g. `sizeInBytes=228.8 GB`,
and if it's too
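The unit scaling described in this comment (B, KB, MB ... PB) can be illustrated with a minimal sketch. This is not the code from the pull request: the function name `format_size`, the 1024 divisor, and the one-decimal rounding are all assumptions made for illustration.

```python
# Illustrative sketch only; not Spark's implementation.
# Assumes binary units (1 KB = 1024 B) and one decimal place.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def format_size(size_in_bytes: int) -> str:
    """Render a byte count in the largest unit that keeps the value below 1024."""
    value = float(size_in_bytes)
    index = 0
    # Step up one unit at a time, stopping at PB (the last unit in the list).
    while value >= 1024 and index < len(UNITS) - 1:
        value /= 1024
        index += 1
    return f"{value:.1f} {UNITS[index]}"


if __name__ == "__main__":
    print("sizeInBytes=" + format_size(1536))
```

Values too large for PB stay in PB under this sketch; how the pull request handles that case is cut off in the snippet above and is not assumed here.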
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r97217477
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -649,6 +649,14 @@ object SQLConf {
.doubleConf
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r97216860
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -649,6 +649,14 @@ object SQLConf {
.doubleConf
GitHub user wzhfy opened a pull request:
https://github.com/apache/spark/pull/16696
[SPARK-19350] [SQL] Cardinality estimation of Limit and Sample
## What changes were proposed in this pull request?
Before this pr, LocalLimit/GlobalLimit/Sample propagate the same row count
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16696
cc @cloud-fan @gatorsmile please review
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r102399167
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -54,11 +57,29 @@ case class Statistics
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r102398658
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -282,7 +282,8 @@ class SparkSqlAstBuilder(conf: SQLConf) extends
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r102398371
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -54,11 +57,29 @@ case class Statistics
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r101657416
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,569
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r101659663
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,569
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r102560014
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -54,11 +57,29 @@ case class Statistics
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16696
retest this please
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r102887155
--- Diff:
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -794,6 +795,7 @@ EXPLAIN: 'EXPLAIN';
FORMAT: 'FORMAT
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16696
@cloud-fan @gatorsmile I've updated this pr and also added test cases,
please review.
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
retest this please
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r100682019
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,623
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100681997
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala
---
@@ -0,0 +1,314
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100940580
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala
---
@@ -0,0 +1,316
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100938944
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala
---
@@ -0,0 +1,316
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100939140
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/Range.scala
---
@@ -0,0 +1,120 @@
+/*
+ * Licensed
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100929681
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala
---
@@ -0,0 +1,316
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100660451
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala
---
@@ -0,0 +1,314
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r100660551
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,623
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100660524
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala
---
@@ -0,0 +1,314
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100660417
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala
---
@@ -0,0 +1,314
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100660535
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/Range.scala
---
@@ -0,0 +1,117 @@
+/*
+ * Licensed
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16228#discussion_r100660406
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
---
@@ -340,14 +340,22 @@ case class Join
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16228
This pr is updated, please review @cloud-fan
GitHub user wzhfy opened a pull request:
https://github.com/apache/spark/pull/16631
[SPARK-19271] [SQL] Change non-cbo estimation of aggregate
## What changes were proposed in this pull request?
Change non-cbo estimation behavior of aggregate:
If groupExpression
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r96585594
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveExplainSuite.scala
---
@@ -27,6 +27,21 @@ import
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r96589910
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -649,6 +649,14 @@ object SQLConf {
.doubleConf
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16621#discussion_r96594811
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
---
@@ -215,37 +215,43 @@ case class DataSourceAnalysis
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16594#discussion_r96588756
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -649,6 +649,14 @@ object SQLConf {
.doubleConf
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16631
cc @rxin @cloud-fan Can you please review this?
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16395
@ron8hu @rxin It seems we don't need logic for binary filter conditions
for date/timestamp types, because currently Spark will always cast all related
timestamp/date/string comparisons into string
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16395
ok after more testing, estimation for timestamp/date comparison is still
useful.
e.g. users can write cast to date explicitly:
```
where d_date > date('2000-08-23'), or
where d_d
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16631
@gatorsmile That would be great, thanks
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96181470
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96181148
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96184518
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96182039
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96177721
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -0,0 +1,309
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96185977
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96178759
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96186262
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96177460
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -0,0 +1,309
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96185566
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96177531
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -0,0 +1,309
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96183525
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96187524
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96183994
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96187461
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96182452
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96180586
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96178685
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
GitHub user wzhfy opened a pull request:
https://github.com/apache/spark/pull/16594
[SPARK-17078] [SQL] Show stats when explain
## What changes were proposed in this pull request?
Currently we can only check the estimated stats in logical plans by
debugging. We need
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
cc @rxin @cloud-fan
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96183779
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96185211
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96186022
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96182085
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96183807
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96181736
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96181424
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16594
retest this please
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16631#discussion_r97009203
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
---
@@ -344,7 +344,8 @@ abstract class UnaryNode extends
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/16633
Hi @viirya, the main concern of @scwf is that we can't afford a performance
regression in any customer scenario. I think you can understand that :)
I went through the discussion above
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16631#discussion_r97017902
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
---
@@ -344,7 +344,8 @@ abstract class UnaryNode extends
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96331440
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96331471
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96331982
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,618
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96331920
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,618
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96335150
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -0,0 +1,303
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96335015
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -0,0 +1,303
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/14712
@gatorsmile yes, we will support it in this pr
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75438629
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
---
@@ -95,6 +95,12 @@ abstract class LogicalPlan extends
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75438886
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -32,5 +32,11 @@ package
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75450024
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
---
@@ -95,6 +95,12 @@ abstract class LogicalPlan extends
GitHub user wzhfy opened a pull request:
https://github.com/apache/spark/pull/14712
[SPARK-17072] [SQL] support table-level statistics generation and storing
into/loading from metastore
## What changes were proposed in this pull request?
1. support generation table-level
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75569351
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
---
@@ -33,7 +34,7 @@ import
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75569106
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
---
@@ -88,14 +90,30 @@ case class AnalyzeTableCommand
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75569715
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala ---
@@ -168,6 +170,57 @@ class StatisticsSuite extends QueryTest
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75568863
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -99,9 +99,7 @@ class SparkSqlAstBuilder(conf: SQLConf) extends
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75569609
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala ---
@@ -141,7 +142,16 @@ private[hive] case class MetastoreRelation
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75569605
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
---
@@ -108,4 +126,8 @@ case class AnalyzeTableCommand
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75573715
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala ---
@@ -141,7 +142,16 @@ private[hive] case class MetastoreRelation
Github user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/14712#discussion_r75572799
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
---
@@ -88,14 +90,30 @@ case class AnalyzeTableCommand