[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-06 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@yhuai @hvanhovell @cloud-fan Sorry for the late response; I've been out of 
the office for two days.
@gatorsmile Thanks for fixing it!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
These tests block multiple PRs. It is midnight in China. : ) Let me do a 
quick fix based on the comments of @cloud-fan and @hvanhovell 





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-05 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14712
  
I have created https://issues.apache.org/jira/browse/SPARK-17408. @wzhfy 
Can you take a look?





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-05 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14712
  
Can you take a look at the test at 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64956/testReport/junit/org.apache.spark.sql.hive/StatisticsSuite/test_statistics_of_LogicalRelation_converted_from_MetastoreRelation/?
 It is flaky.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-05 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/14712
  
LGTM. Merging to master. Thanks!





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64886/
Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64886 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64886/consoleFull)**
 for PR 14712 at commit 
[`5d6e559`](https://github.com/apache/spark/commit/5d6e5599b558512000f3f62349276ebf19be366a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64886 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64886/consoleFull)**
 for PR 14712 at commit 
[`5d6e559`](https://github.com/apache/spark/commit/5d6e5599b558512000f3f62349276ebf19be366a).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@gatorsmile Thank you for the good test cases!





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
Below is a test case for a table with no columns. Could you also add it 
here? 

```Scala
  test("statistics collection of a table with zero column") {
    val table_no_cols = "table_no_cols"
    withTable(table_no_cols) {
      // Ten empty rows written through a schema with no fields.
      val rddNoCols = sparkContext.parallelize(1 to 10).map(_ => Row.empty)
      val dfNoCols = spark.createDataFrame(rddNoCols, StructType(Seq.empty))
      dfNoCols.write.format("json").saveAsTable(table_no_cols)
      sql(s"ANALYZE TABLE $table_no_cols COMPUTE STATISTICS")
      checkLogicalRelationStats(table_no_cols, expectedStats =
        Some(Statistics(sizeInBytes = 30, rowCount = Some(10))))
    }
  }
``` 

In the future, we will do column-level statistics collection. This might 
help you when you implement collection of column-level statistics.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
LGTM except for two minor comments about the test cases.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
LGTM, @hvanhovell can you take another look?





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64850 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64850/consoleFull)**
 for PR 14712 at commit 
[`b946df0`](https://github.com/apache/spark/commit/b946df0928da061ca67b93d2587c1e258fd5d63b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64850/
Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #3244 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3244/consoleFull)**
 for PR 14712 at commit 
[`b946df0`](https://github.com/apache/spark/commit/b946df0928da061ca67b93d2587c1e258fd5d63b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64850 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64850/consoleFull)**
 for PR 14712 at commit 
[`b946df0`](https://github.com/apache/spark/commit/b946df0928da061ca67b93d2587c1e258fd5d63b).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
retest this please





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #3244 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3244/consoleFull)**
 for PR 14712 at commit 
[`b946df0`](https://github.com/apache/spark/commit/b946df0928da061ca67b93d2587c1e258fd5d63b).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64848/
Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64848 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64848/consoleFull)**
 for PR 14712 at commit 
[`b946df0`](https://github.com/apache/spark/commit/b946df0928da061ca67b93d2587c1e258fd5d63b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64848 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64848/consoleFull)**
 for PR 14712 at commit 
[`b946df0`](https://github.com/apache/spark/commit/b946df0928da061ca67b93d2587c1e258fd5d63b).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
Looks pretty good now! I left some comments about a few small issues. Thanks 
for working on it!





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@cloud-fan @hvanhovell @gatorsmile Please review again, thanks!





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64789/
Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64789 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64789/consoleFull)**
 for PR 14712 at commit 
[`b6c655a`](https://github.com/apache/spark/commit/b6c655a176c313aa8aa055ef985401f78557e4ec).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64789 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64789/consoleFull)**
 for PR 14712 at commit 
[`b6c655a`](https://github.com/apache/spark/commit/b6c655a176c313aa8aa055ef985401f78557e4ec).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@gatorsmile Yes, we should exclude the staging dir.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
Maybe I found another bug in the master branch?

When calculating statistics for data source tables, we do not exclude the 
staging directory. However, we do exclude it when `AnalyzeTableCommand` 
calculates the size. Since we convert Hive serde tables to data source tables, 
it sounds like we should also exclude the Hive staging directory, right?
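
For illustration, the exclusion described above could be sketched roughly as follows. This is a minimal, self-contained sketch and not the code in `AnalyzeTableCommand`: it walks a local directory with `java.io.File` rather than a Hadoop `FileSystem`, and the names `StagingAwareSize` and `sizeExcludingStaging`, as well as the idea of matching staging directories by a name prefix, are assumptions made for the example.

```Scala
import java.io.File
import java.nio.file.Files

// Hypothetical helper, for illustration only (not part of Spark).
object StagingAwareSize {
  // Recursively sums the sizes of the files under `path`, skipping every
  // directory whose name starts with `stagingPrefix`.
  def sizeExcludingStaging(path: File, stagingPrefix: String): Long = {
    if (path.isDirectory) {
      // listFiles() may return null on an I/O error; treat that as empty.
      val children = Option(path.listFiles()).getOrElse(Array.empty[File])
      children
        .filterNot(c => c.isDirectory && c.getName.startsWith(stagingPrefix))
        .map(sizeExcludingStaging(_, stagingPrefix))
        .sum
    } else {
      path.length()
    }
  }
}
```

With a helper like this, a call such as `StagingAwareSize.sizeExcludingStaging(new File(tablePath), ".hive-staging")` would count only the table's data files. Here `tablePath` is a placeholder, and the `.hive-staging` prefix is assumed rather than read from the Hive configuration.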





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64750/
Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64750 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64750/consoleFull)**
 for PR 14712 at commit 
[`aa438c4`](https://github.com/apache/spark/commit/aa438c43f78d5edd679fd3e6294d953181a40268).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14712
  
Looks much better now.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64750 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64750/consoleFull)**
 for PR 14712 at commit 
[`aa438c4`](https://github.com/apache/spark/commit/aa438c43f78d5edd679fd3e6294d953181a40268).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64726/
Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64726 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64726/consoleFull)**
 for PR 14712 at commit 
[`9715770`](https://github.com/apache/spark/commit/971577086ff4a61824034f7f04b4eba0f6de7e95).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64726 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64726/consoleFull)**
 for PR 14712 at commit 
[`9715770`](https://github.com/apache/spark/commit/971577086ff4a61824034f7f04b4eba0f6de7e95).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64713/
Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64713 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64713/consoleFull)**
 for PR 14712 at commit 
[`56ec68e`](https://github.com/apache/spark/commit/56ec68e455cefcc589db0336a580754662d8257c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64713 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64713/consoleFull)**
 for PR 14712 at commit 
[`56ec68e`](https://github.com/apache/spark/commit/56ec68e455cefcc589db0336a580754662d8257c).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64709/
Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64709 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64709/consoleFull)**
 for PR 14712 at commit 
[`c7cc55f`](https://github.com/apache/spark/commit/c7cc55fe2a42f5af3ffd2c9a81f58a608b636320).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64709 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64709/consoleFull)**
 for PR 14712 at commit 
[`c7cc55f`](https://github.com/apache/spark/commit/c7cc55fe2a42f5af3ffd2c9a81f58a608b636320).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-30 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
This PR has been updated and passes all tests. Please review, @cloud-fan 
@hvanhovell @gatorsmile @viirya.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64641/
Test PASSed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-30 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64641 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64641/consoleFull)**
 for PR 14712 at commit 
[`aef78d4`](https://github.com/apache/spark/commit/aef78d4bffac4dad99d1646c659e626a4eccb26b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-30 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64641 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64641/consoleFull)**
 for PR 14712 at commit 
[`aef78d4`](https://github.com/apache/spark/commit/aef78d4bffac4dad99d1646c659e626a4eccb26b).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64626/
Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64626 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64626/consoleFull)**
 for PR 14712 at commit 
[`7e39a86`](https://github.com/apache/spark/commit/7e39a86030e45f10ae0c171a475c054b7c208d20).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@gatorsmile Thank you for the information!





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64626 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64626/consoleFull)**
 for PR 14712 at commit 
[`7e39a86`](https://github.com/apache/spark/commit/7e39a86030e45f10ae0c171a475c054b7c208d20).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
Since you have been added to the whitelist, you can trigger the tests 
yourself. Below are the commands you can use:

- "ok to test" to accept this pull request for testing
- "test this please" for a one-time test run
- "retest this please" to start a new build if the build fails for 
unrelated reasons





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@cloud-fan @hvanhovell Oh, sorry, it's already been launched. There's a 
latency of about 5 minutes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64625/
Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64625 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64625/consoleFull)**
 for PR 14712 at commit 
[`9c27071`](https://github.com/apache/spark/commit/9c27071c05da5f285726381dff7eff3dfab7eda9).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64625 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64625/consoleFull)**
 for PR 14712 at commit 
[`9c27071`](https://github.com/apache/spark/commit/9c27071c05da5f285726381dff7eff3dfab7eda9).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@cloud-fan @hvanhovell Could you launch a test for this pr? Thank you!





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64567 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64567/consoleFull)**
 for PR 14712 at commit 
[`3407c7f`](https://github.com/apache/spark/commit/3407c7f7aa62503e62c7c4847ad6a2568a676c38).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64567/
Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64567 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64567/consoleFull)**
 for PR 14712 at commit 
[`3407c7f`](https://github.com/apache/spark/commit/3407c7f7aa62503e62c7c4847ad6a2568a676c38).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14712
  
Looks like Jenkins hasn't been working for a while.






[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
add to whitelist





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
retest this please





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-29 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
@wzhfy sounds good to me, let's focus on the top priority in this PR :)





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-28 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14712
  
Yeah, I think we should focus on the top-priority target in this PR. The 
Hive-related translation can be addressed in later PRs.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
@wzhfy Yeah, it sounds good to me to split the whole problem into multiple 
PRs. 

@hvanhovell Sure, let me create the JIRA and I can work on this when the 
other dependent JIRAs are resolved.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-28 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
I like @hvanhovell 's proposal, since providing a perfect Hive translation 
layer is not trivial based on @gatorsmile 's investigation - we need to deal 
with different versions of Hive. It is better to be decoupled from this pr. 
What do you think? @gatorsmile @cloud-fan 





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-28 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/14712
  
@gatorsmile let's also create another ticket for the explicit statistics 
updates. I do like the `alter table s update statistics set ...` option.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-28 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/14712
  
How about we use our own property names for now, and provide a Hive 
translation layer in a different PR? IMO it is fine to break a little bit of 
the behavior in master as long as we fix it (or make a well-founded choice not 
to) before 2.1.
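
To make the proposal concrete, here is a minimal sketch of what such a translation layer could look like. It is not code from the PR: the `spark.stats.*` names and the `to_hive_properties` helper are hypothetical, and the Hive-side names are simply the `numRows` and `rawDataSize` keys mentioned elsewhere in this thread.

```python
# Hypothetical Spark-owned property names (an assumption for illustration),
# mapped to the Hive property names discussed in this thread.
SPARK_TO_HIVE = {
    "spark.stats.numRows": "numRows",
    "spark.stats.rawDataSize": "rawDataSize",
}


def to_hive_properties(table_props):
    """Return a copy of table_props with Hive-named copies of Spark-owned stats."""
    translated = dict(table_props)
    for spark_key, hive_key in SPARK_TO_HIVE.items():
        if spark_key in table_props:
            translated[hive_key] = table_props[spark_key]
    return translated
```

Keeping the mapping in one table means a Hive-version-specific variant could be swapped in later without touching the code that generates the statistics.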





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
One minor comment about the current PR and the future PRs on statistics: 
when we add new statistics-specific properties, we should exclude them from 
DDL commands like `SHOW CREATE TABLE`. 

BTW, we currently have a bug in this part. Hive-generated statistics should not 
be shown either. I will fix it in a separate PR.  
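
As a rough illustration of the filtering described above (a sketch, not Spark's implementation): the key set below only lists property names that appear in this thread and is certainly incomplete, and `SPARK_STATS_PREFIX` and `properties_for_ddl` are hypothetical names.

```python
# Statistics-related keys named in this thread; a real list would be longer.
HIVE_STATS_KEYS = frozenset({
    "numRows",
    "rawDataSize",
    "COLUMN_STATS_ACCURATE",
    "STATS_GENERATED_VIA_STATS_TASK",
})

# Hypothetical prefix for Spark-owned statistics properties.
SPARK_STATS_PREFIX = "spark.stats."


def properties_for_ddl(table_props):
    """Drop statistics properties before rendering DDL such as SHOW CREATE TABLE."""
    return {
        key: value
        for key, value in table_props.items()
        if key not in HIVE_STATS_KEYS and not key.startswith(SPARK_STATS_PREFIX)
    }
```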





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
Found another related JIRA: 
https://issues.apache.org/jira/browse/HIVE-12730. This is only available in 
Hive 2.1. They are trying to resolve the issues we hit:

> We would like to provide a way for developers/users to modify the numRows 
and dataSize for a table/partition. Right now although they are part of the 
table properties, they will be set to -1 when the task is not coming from a 
statsTask.

Now, users can change the statistics with a DDL statement. For example,
```SQL
alter table s update statistics set('numRows'='1212', 
'rawDataSize'='500500');
```

More importantly, after this fix, what we did in this PR does not work. 
`STATS_GENERATED_VIA_STATS_TASK` is not available. Instead, they changed it to 
`STATS_GENERATED`. Two options are available: `TASK` and `USER`. 

Thus, I believe our solution should not be based on how Hive behaves, if 
possible.
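
To show why depending on these flags is fragile, here is a sketch of the version-tolerant check Spark would need if it did rely on them. `stats_origin` is a hypothetical helper, and treating the old flag's value as the string `true` is an assumption; the thread does not show its value.

```python
def stats_origin(table_props):
    """Return 'TASK', 'USER', or None depending on which Hive flag is present."""
    # Newer Hive (after HIVE-12730): STATS_GENERATED is TASK or USER.
    new_flag = table_props.get("STATS_GENERATED")
    if new_flag is not None:
        value = new_flag.upper()
        return value if value in ("TASK", "USER") else None

    # Older Hive: a boolean-like flag (value format assumed here).
    old_flag = table_props.get("STATS_GENERATED_VIA_STATS_TASK")
    if old_flag is not None and old_flag.lower() == "true":
        return "TASK"
    return None
```

Every further change on the Hive side would add another branch here, which is the maintenance cost the comment above argues against.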





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
Very recently, Hive 2.0.0 fixed a serious bug. See the JIRA: 
https://issues.apache.org/jira/browse/HIVE-12661 

We can get a wrong result when the statistics are out of date (i.e., 
wrong). I can easily reproduce it.

Hive made a few changes to `COLUMN_STATS_ACCURATE`. Before Hive 2.0.0, it 
looked like this:
```
COLUMN_STATS_ACCURATE true
```
After the fix, the format is different:
```
COLUMN_STATS_ACCURATE 
{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}
```

I am starting to worry that the statistics values we populate might affect the 
query results through the Hive interface, especially when we set 
`STATS_GENERATED_VIA_STATS_TASK`. 

We are unable to implement concurrency control between Hive and Spark. If 
we use the same names, `statistics` is like a shared memory space. Both Hive 
and Spark can modify it without notice. I have not found a concrete bug, but it 
sounds risky.

Conceptually, setting `STATS_GENERATED_VIA_STATS_TASK` is wrong. As @wzhfy 
pointed out, `numRows` is always `-1` if we do not set it. That also indicates 
that external users are not allowed to set `numRows`, right? 

I am not very confident that we can implement a very stable solution for 
sharing Spark-generated statistics with Hive, since Hive is out of our control. 
However, Spark should be able to leverage Hive-generated statistics and the 
statistics we generated in this CBO work. 
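
For reference, a reader that wants to leverage Hive-generated statistics would have to accept both `COLUMN_STATS_ACCURATE` formats shown above. The following is a minimal sketch under that assumption; `basic_stats_accurate` is a hypothetical helper, and treating a missing or unparseable value as "not accurate" is a design choice of the sketch, not documented Hive behavior.

```python
import json


def basic_stats_accurate(raw):
    """Interpret COLUMN_STATS_ACCURATE in either the plain or the JSON format."""
    if raw is None:
        return False
    text = raw.strip()

    # Format before Hive 2.0.0: a plain boolean string.
    if text.lower() in ("true", "false"):
        return text.lower() == "true"

    # Format after the fix: a JSON object with a BASIC_STATS entry.
    try:
        parsed = json.loads(text)
    except ValueError:
        return False
    if not isinstance(parsed, dict):
        return False
    return str(parsed.get("BASIC_STATS", "")).lower() == "true"
```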





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
@cloud-fan let me do more investigation; I will reply to your question later.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@cloud-fan I think the current implementation is OK and can support the three 
priority levels. The logic is encapsulated in two methods for 
storing/retrieving stats. If the interaction with Hive changes in 
the future, the impact is contained and we just need to modify these 
two methods.
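
The encapsulation idea can be sketched as follows. This is not the PR's code: `store_stats`, `retrieve_stats`, and the `spark.stats.numRows` key are hypothetical, and the choice of keys is arbitrary for the sketch. The only behavior taken from the thread is that an unset `numRows` is `-1`.

```python
SPARK_NUM_ROWS = "spark.stats.numRows"  # hypothetical Spark-owned key
HIVE_NUM_ROWS = "numRows"               # Hive key mentioned in this thread


def store_stats(table_props, num_rows):
    """Return a copy of the table properties with the row count recorded."""
    updated = dict(table_props)
    updated[SPARK_NUM_ROWS] = str(num_rows)
    return updated


def retrieve_stats(table_props):
    """Return the row count, preferring Spark's value, then Hive's, else None."""
    for key in (SPARK_NUM_ROWS, HIVE_NUM_ROWS):
        raw = table_props.get(key)
        if raw is None:
            continue
        try:
            value = int(raw)
        except ValueError:
            continue
        if value >= 0:  # -1 means the statistic was never set
            return value
    return None
```

Because callers only go through these two functions, a change in how Hive stores or flags its statistics would be confined to their bodies.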





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
> What if we use different property names to store statistics?

This may cause consistency issues: when both names exist in the 
metastore, we can't tell which one is the latest.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
> we must set STATS_GENERATED_VIA_STATS_TASK, otherwise the stats won't be 
stored

What if we use different property names to store statistics? 





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
I think the problem lies in the dependence on the Hive metastore. As long as we 
use the Hive metastore to persist/retrieve statistics, we need to deal with these 
flags.

- If we analyze the table in Spark and want to persist the stats in the Hive 
metastore (and retrieve them when we run queries), we must set 
`STATS_GENERATED_VIA_STATS_TASK`; otherwise the stats won't be stored. This is 
the top priority.
- If users alter a table's properties (without setting 
`STATS_GENERATED_VIA_STATS_TASK`) in Spark or Hive, then `COLUMN_STATS_ACCURATE` 
will be false and the stats will be set to invalid, so it's unnecessary to read the 
stats in Spark. This is the second priority.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
I want to make sure we are on the same page:

- **top priority**: users analyze the table in Spark and query it in Spark 
(this must work).
- **second priority**: the table is already analyzed by Hive, and users query it 
in Spark (this should work).
- **low priority**: users analyze the table in Spark and query it in Hive (it 
would be good if this works, but it is also fine if it doesn't).

I'm a little hesitant about dealing with the 
`STATS_GENERATED_VIA_STATS_TASK` and `COLUMN_STATS_ACCURATE` flags. Are they 
only needed for the third (low-priority) target? If they are, I'd like to ignore 
them to simplify the logic.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-27 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
@gatorsmile, do you mean `COLUMN_STATS_ACCURATE` exists in every table's 
properties?





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-26 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@cloud-fan did you see the 
[comment](https://github.com/apache/spark/pull/14712#discussion_r75540560) by 
@gatorsmile ?





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-26 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
@wzhfy do you know how Hive deals with the `STATS_GENERATED_VIA_STATS_TASK` 
flag and the `COLUMN_STATS_ACCURATE` flag?





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14712
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64459/
Test FAILed.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64459 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64459/consoleFull)**
 for PR 14712 at commit 
[`aee4139`](https://github.com/apache/spark/commit/aee4139031b283e95bf0ef9ae6488c3396909ec4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14712
  
**[Test build #64459 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64459/consoleFull)**
 for PR 14712 at commit 
[`aee4139`](https://github.com/apache/spark/commit/aee4139031b283e95bf0ef9ae6488c3396909ec4).





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-26 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14712
  
retest this please





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-25 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@cloud-fan Can you please launch the tests for this PR? Thanks!





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
Great! Will review the new changes when it is ready. Thanks!





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-21 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@gatorsmile yes, we will support it in this pr


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14712
  
What is our CBO plan for data source tables? Is it implemented in the 
prototype? It is not mentioned in the design doc. If we want to support 
`InMemoryCatalog`, we need to provide the support now. 




