[GitHub] [spark] SparkQA commented on issue #24196: [SPARK-27244][CORE] Redact Passwords While Using Option logConf=true

2019-03-25 Thread GitBox
SparkQA commented on issue #24196: [SPARK-27244][CORE] Redact Passwords While 
Using Option logConf=true
URL: https://github.com/apache/spark/pull/24196#issuecomment-476499158
 
 
   **[Test build #103955 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103955/testReport)** for PR 24196 at commit [`1049d5a`](https://github.com/apache/spark/commit/1049d5ad075acf268bb2c1df1732d5075935d72d).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #24196: [SPARK-27244][CORE] Redact Passwords While Using Option logConf=true

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24196: [SPARK-27244][CORE] Redact 
Passwords While Using Option logConf=true
URL: https://github.com/apache/spark/pull/24196#issuecomment-476498794
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
SparkQA commented on issue #24213: [SPARK-27277][INFRA] Recover from setting 
fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476499127
 
 
   **[Test build #103954 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103954/testReport)** for PR 24213 at commit [`9204990`](https://github.com/apache/spark/commit/92049901b8f59da8535fd9cbb4bbce905b82b4d3).





[GitHub] [spark] AmplabJenkins commented on issue #24196: [SPARK-27244][CORE] Redact Passwords While Using Option logConf=true

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24196: [SPARK-27244][CORE] Redact Passwords 
While Using Option logConf=true
URL: https://github.com/apache/spark/pull/24196#issuecomment-476498799
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9313/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24196: [SPARK-27244][CORE] Redact Passwords While Using Option logConf=true

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24196: [SPARK-27244][CORE] Redact 
Passwords While Using Option logConf=true
URL: https://github.com/apache/spark/pull/24196#issuecomment-476498799
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9313/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24213: [SPARK-27277][INFRA] Recover 
from setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476498744
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24196: [SPARK-27244][CORE] Redact Passwords While Using Option logConf=true

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24196: [SPARK-27244][CORE] Redact Passwords 
While Using Option logConf=true
URL: https://github.com/apache/spark/pull/24196#issuecomment-476498794
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24213: [SPARK-27277][INFRA] Recover from 
setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476498744
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24213: [SPARK-27277][INFRA] Recover 
from setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476498750
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9312/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24213: [SPARK-27277][INFRA] Recover from 
setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476498750
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9312/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24213: [SPARK-27277][INFRA] Recover 
from setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476497686
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24213: [SPARK-27277][INFRA] Recover 
from setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476497690
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103940/
   Test FAILed.





[GitHub] [spark] felixcheung commented on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
felixcheung commented on issue #24213: [SPARK-27277][INFRA] Recover from 
setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476497828
 
 
   Jenkins, retest this please





[GitHub] [spark] AmplabJenkins commented on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24213: [SPARK-27277][INFRA] Recover from 
setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476497690
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103940/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24213: [SPARK-27277][INFRA] Recover from 
setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476497686
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
SparkQA removed a comment on issue #24213: [SPARK-27277][INFRA] Recover from 
setting fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476454080
 
 
   **[Test build #103940 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103940/testReport)** for PR 24213 at commit [`9204990`](https://github.com/apache/spark/commit/92049901b8f59da8535fd9cbb4bbce905b82b4d3).





[GitHub] [spark] AmplabJenkins removed a comment on issue #24196: [SPARK-27244][CORE] Redact Passwords While Using Option logConf=true

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24196: [SPARK-27244][CORE] Redact 
Passwords While Using Option logConf=true
URL: https://github.com/apache/spark/pull/24196#issuecomment-476038561
 
 
   Can one of the admins verify this patch?





[GitHub] [spark] SparkQA commented on issue #24213: [SPARK-27277][INFRA] Recover from setting fix version failure in merge script

2019-03-25 Thread GitBox
SparkQA commented on issue #24213: [SPARK-27277][INFRA] Recover from setting 
fix version failure in merge script
URL: https://github.com/apache/spark/pull/24213#issuecomment-476497419
 
 
   **[Test build #103940 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103940/testReport)** for PR 24213 at commit [`9204990`](https://github.com/apache/spark/commit/92049901b8f59da8535fd9cbb4bbce905b82b4d3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] felixcheung commented on issue #24196: [SPARK-27244][CORE] Redact Passwords While Using Option logConf=true

2019-03-25 Thread GitBox
felixcheung commented on issue #24196: [SPARK-27244][CORE] Redact Passwords 
While Using Option logConf=true
URL: https://github.com/apache/spark/pull/24196#issuecomment-476497550
 
 
   Jenkins, ok to test





[GitHub] [spark] SparkQA commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-25 Thread GitBox
SparkQA commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to 
collect tables stats for cached catalog views
URL: https://github.com/apache/spark/pull/24200#issuecomment-476497233
 
 
   **[Test build #103953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103953/testReport)** for PR 24200 at commit [`b4bbdad`](https://github.com/apache/spark/commit/b4bbdad2f77adb11d898846e701e3900c9d1604b).





[GitHub] [spark] AmplabJenkins removed a comment on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24200: [SPARK-27266][SQL] Support 
ANALYZE TABLE to collect tables stats for cached catalog views
URL: https://github.com/apache/spark/pull/24200#issuecomment-476496872
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24200: [SPARK-27266][SQL] Support 
ANALYZE TABLE to collect tables stats for cached catalog views
URL: https://github.com/apache/spark/pull/24200#issuecomment-476496879
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9311/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE 
TABLE to collect tables stats for cached catalog views
URL: https://github.com/apache/spark/pull/24200#issuecomment-476496872
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE 
TABLE to collect tables stats for cached catalog views
URL: https://github.com/apache/spark/pull/24200#issuecomment-476496879
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9311/
   Test PASSed.





[GitHub] [spark] maropu commented on a change in pull request #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-25 Thread GitBox
maropu commented on a change in pull request #24200: [SPARK-27266][SQL] Support 
ANALYZE TABLE to collect tables stats for cached catalog views
URL: https://github.com/apache/spark/pull/24200#discussion_r268959862
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
 ##
 @@ -35,19 +35,27 @@ case class AnalyzeTableCommand(
 val tableIdentWithDB = TableIdentifier(tableIdent.table, Some(db))
 val tableMeta = sessionState.catalog.getTableMetadata(tableIdentWithDB)
 if (tableMeta.tableType == CatalogTableType.VIEW) {
-  throw new AnalysisException("ANALYZE TABLE is not supported on views.")
-}
-
-// Compute stats for the whole table
-val newTotalSize = CommandUtils.calculateTotalSize(sparkSession, tableMeta)
-val newRowCount =
-  if (noscan) None else Some(BigInt(sparkSession.table(tableIdentWithDB).count()))
-
-// Update the metastore if the above statistics of the table are different from those
-// recorded in the metastore.
-val newStats = CommandUtils.compareAndGetNewStats(tableMeta.stats, newTotalSize, newRowCount)
-if (newStats.isDefined) {
-  sessionState.catalog.alterTableStats(tableIdentWithDB, newStats)
+  // Analyzes a catalog view if the view is cached
+  val table = sparkSession.table(tableIdent.quotedString)
+  val cacheManager = sparkSession.sharedState.cacheManager
+  if (cacheManager.lookupCachedData(table.logicalPlan).isDefined) {
+// To collect table stats, materializes an underlying columnar RDD
+table.collect()
 
 Review comment:
   I wrote this code to do the same thing as the normal case:
   https://github.com/apache/spark/blob/90b72512f46ac08008d574577c65263d9b13a33c/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala#L44





[GitHub] [spark] maropu commented on a change in pull request #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-25 Thread GitBox
maropu commented on a change in pull request #24200: [SPARK-27266][SQL] Support 
ANALYZE TABLE to collect tables stats for cached catalog views
URL: https://github.com/apache/spark/pull/24200#discussion_r268959862
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
 ##
 @@ -35,19 +35,27 @@ case class AnalyzeTableCommand(
 val tableIdentWithDB = TableIdentifier(tableIdent.table, Some(db))
 val tableMeta = sessionState.catalog.getTableMetadata(tableIdentWithDB)
 if (tableMeta.tableType == CatalogTableType.VIEW) {
-  throw new AnalysisException("ANALYZE TABLE is not supported on views.")
-}
-
-// Compute stats for the whole table
-val newTotalSize = CommandUtils.calculateTotalSize(sparkSession, tableMeta)
-val newRowCount =
-  if (noscan) None else Some(BigInt(sparkSession.table(tableIdentWithDB).count()))
-
-// Update the metastore if the above statistics of the table are different from those
-// recorded in the metastore.
-val newStats = CommandUtils.compareAndGetNewStats(tableMeta.stats, newTotalSize, newRowCount)
-if (newStats.isDefined) {
-  sessionState.catalog.alterTableStats(tableIdentWithDB, newStats)
+  // Analyzes a catalog view if the view is cached
+  val table = sparkSession.table(tableIdent.quotedString)
+  val cacheManager = sparkSession.sharedState.cacheManager
+  if (cacheManager.lookupCachedData(table.logicalPlan).isDefined) {
+// To collect table stats, materializes an underlying columnar RDD
+table.collect()
 
 Review comment:
   I added this code to do the same thing as the normal case:
   https://github.com/apache/spark/blob/90b72512f46ac08008d574577c65263d9b13a33c/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala#L44





[GitHub] [spark] SparkQA commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-25 Thread GitBox
SparkQA commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should 
compare child plan of `SubqueryExec`
URL: https://github.com/apache/spark/pull/24214#issuecomment-476495345
 
 
   **[Test build #103952 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103952/testReport)** for PR 24214 at commit [`5fa78c8`](https://github.com/apache/spark/commit/5fa78c8b4d38b6d107315850606f660ad87e20f8).





[GitHub] [spark] AmplabJenkins removed a comment on issue #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24203: [SPARK-27269][SQL] File source 
v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#issuecomment-476495124
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103936/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24203: [SPARK-27269][SQL] File source 
v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#issuecomment-476495120
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24203: [SPARK-27269][SQL] File source v2 
should validate data schema only
URL: https://github.com/apache/spark/pull/24203#issuecomment-476495124
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103936/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24203: [SPARK-27269][SQL] File source v2 
should validate data schema only
URL: https://github.com/apache/spark/pull/24203#issuecomment-476495120
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse 
subquery should compare child plan of `SubqueryExec`
URL: https://github.com/apache/spark/pull/24214#issuecomment-476494990
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9310/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery 
should compare child plan of `SubqueryExec`
URL: https://github.com/apache/spark/pull/24214#issuecomment-476494990
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9310/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse 
subquery should compare child plan of `SubqueryExec`
URL: https://github.com/apache/spark/pull/24214#issuecomment-476494986
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery 
should compare child plan of `SubqueryExec`
URL: https://github.com/apache/spark/pull/24214#issuecomment-476494986
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
SparkQA removed a comment on issue #24203: [SPARK-27269][SQL] File source v2 
should validate data schema only
URL: https://github.com/apache/spark/pull/24203#issuecomment-476449200
 
 
   **[Test build #103936 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103936/testReport)**
 for PR 24203 at commit 
[`214bd8b`](https://github.com/apache/spark/commit/214bd8b743dfafdb48372c7a0030ed04e3d3f4ba).





[GitHub] [spark] SparkQA commented on issue #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
SparkQA commented on issue #24203: [SPARK-27269][SQL] File source v2 should 
validate data schema only
URL: https://github.com/apache/spark/pull/24203#issuecomment-476494708
 
 
   **[Test build #103936 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103936/testReport)**
 for PR 24203 at commit 
[`214bd8b`](https://github.com/apache/spark/commit/214bd8b743dfafdb48372c7a0030ed04e3d3f4ba).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] adrian-wang commented on issue #24214: [SPARK-27279][SQL] Reuse subquery correctly

2019-03-25 Thread GitBox
adrian-wang commented on issue #24214: [SPARK-27279][SQL] Reuse subquery 
correctly
URL: https://github.com/apache/spark/pull/24214#issuecomment-476493952
 
 
   retest this please.





[GitHub] [spark] adrian-wang commented on a change in pull request #24214: [SPARK-27279][SQL] Reuse subquery correctly

2019-03-25 Thread GitBox
adrian-wang commented on a change in pull request #24214: [SPARK-27279][SQL] 
Reuse subquery correctly
URL: https://github.com/apache/spark/pull/24214#discussion_r268957962
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala
 ##
 @@ -125,15 +125,15 @@ case class PlanSubqueries(sparkSession: SparkSession) extends Rule[SparkPlan] {
  case class ReuseSubquery(conf: SQLConf) extends Rule[SparkPlan] {
  
    def apply(plan: SparkPlan): SparkPlan = {
-    if (!conf.exchangeReuseEnabled) {
+    if (!conf.subqueryReuseEnabled) {
       return plan
     }
     // Build a hash map using schema of subqueries to avoid O(N*N) sameResult calls.
     val subqueries = mutable.HashMap[StructType, ArrayBuffer[SubqueryExec]]()
     plan transformAllExpressions {
       case sub: ExecSubqueryExpression =>
         val sameSchema = subqueries.getOrElseUpdate(sub.plan.schema, ArrayBuffer[SubqueryExec]())
-        val sameResult = sameSchema.find(_.sameResult(sub.plan))
+        val sameResult = sameSchema.find(_.child.sameResult(sub.plan.child))
 
 Review comment:
   Thanks for the review, you can see the example in the added test.
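
The lookup in the hunk above can be sketched without Spark. This is a minimal model, not the code under review: `Plan`, `Scan`, `Subquery` and `ReuseSketch` are hypothetical stand-ins, and the idea that two wrapper nodes differ only by name is an assumption made for illustration. It shows the schema-bucketed map that avoids O(N*N) `sameResult` calls, and why comparing the children can find a match that comparing the wrappers misses.

```scala
import scala.collection.mutable
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins for Spark's plan classes; not the real API.
sealed trait Plan {
  def schema: Seq[String]
  // Simplification: structural equality stands in for Spark's sameResult.
  def sameResult(other: Plan): Boolean = this == other
}
final case class Scan(table: String, schema: Seq[String]) extends Plan
// Assumed shape: a wrapper that carries a per-subquery name plus the real work.
final case class Subquery(name: String, child: Plan) extends Plan {
  def schema: Seq[String] = child.schema
}

object ReuseSketch {
  // For each subquery, return the earlier equivalent one if any, else itself.
  def dedupe(subqueries: Seq[Subquery]): Seq[Subquery] = {
    // Bucket by schema so sameResult only runs within a bucket.
    val bySchema = mutable.HashMap[Seq[String], ArrayBuffer[Subquery]]()
    subqueries.map { sub =>
      val sameSchema = bySchema.getOrElseUpdate(sub.schema, ArrayBuffer[Subquery]())
      // Compare children: the wrappers differ by name even when the work is identical.
      sameSchema.find(_.child.sameResult(sub.child)) match {
        case Some(existing) => existing
        case None =>
          sameSchema += sub
          sub
      }
    }
  }
}
```

With two subqueries named differently over the same scan, `dedupe` returns the first one twice, whereas a wrapper-level `sameResult` would treat them as distinct.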





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268957136
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
 ##
 @@ -72,6 +82,22 @@ abstract class FileTable(
    * Spark will require that user specify the schema manually.
    */
   def inferSchema(files: Seq[FileStatus]): Option[StructType]
+
+  /**
+   * Returns whether this format supports the given [[DataType]] in write path.
 
 Review comment:
   `write` -> `read/write`.





[GitHub] [spark] SparkQA commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
SparkQA commented on issue #24119: [SPARK-27182][SQL] Move the conflict source 
code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476491882
 
 
   **[Test build #103951 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103951/testReport)**
 for PR 24119 at commit 
[`11bc982`](https://github.com/apache/spark/commit/11bc98284566ae93caffa7d947543c095de03c75).





[GitHub] [spark] AmplabJenkins removed a comment on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24119: [SPARK-27182][SQL] Move the 
conflict source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476491498
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24119: [SPARK-27182][SQL] Move the 
conflict source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476491501
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9309/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24119: [SPARK-27182][SQL] Move the conflict 
source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476491498
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24119: [SPARK-27182][SQL] Move the conflict 
source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476491501
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9309/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24215: [SPARK-27229][SQL] GroupBy 
Placement in Intersect Distinct
URL: https://github.com/apache/spark/pull/24215#issuecomment-476490672
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103946/
   Test FAILed.





[GitHub] [spark] wangyum commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
wangyum commented on issue #24119: [SPARK-27182][SQL] Move the conflict source 
code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476490725
 
 
   retest this please





[GitHub] [spark] AmplabJenkins commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement 
in Intersect Distinct
URL: https://github.com/apache/spark/pull/24215#issuecomment-476490672
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103946/
   Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24215: [SPARK-27229][SQL] GroupBy 
Placement in Intersect Distinct
URL: https://github.com/apache/spark/pull/24215#issuecomment-476490668
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-25 Thread GitBox
SparkQA commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in 
Intersect Distinct
URL: https://github.com/apache/spark/pull/24215#issuecomment-476490535
 
 
   **[Test build #103946 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103946/testReport)**
 for PR 24215 at commit 
[`81705fa`](https://github.com/apache/spark/commit/81705fa660e1e9ae789eff885cb5b015e28932d4).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-25 Thread GitBox
SparkQA removed a comment on issue #24215: [SPARK-27229][SQL] GroupBy Placement 
in Intersect Distinct
URL: https://github.com/apache/spark/pull/24215#issuecomment-476471924
 
 
   **[Test build #103946 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103946/testReport)**
 for PR 24215 at commit 
[`81705fa`](https://github.com/apache/spark/commit/81705fa660e1e9ae789eff885cb5b015e28932d4).





[GitHub] [spark] AmplabJenkins commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement 
in Intersect Distinct
URL: https://github.com/apache/spark/pull/24215#issuecomment-476490668
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24119: [SPARK-27182][SQL] Move the 
conflict source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476489995
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103935/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24119: [SPARK-27182][SQL] Move the conflict 
source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476489987
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
SparkQA removed a comment on issue #24119: [SPARK-27182][SQL] Move the conflict 
source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476445862
 
 
   **[Test build #103935 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103935/testReport)**
 for PR 24119 at commit 
[`11bc982`](https://github.com/apache/spark/commit/11bc98284566ae93caffa7d947543c095de03c75).





[GitHub] [spark] AmplabJenkins removed a comment on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24119: [SPARK-27182][SQL] Move the 
conflict source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476489987
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24119: [SPARK-27182][SQL] Move the conflict 
source code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476489995
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103935/
   Test FAILed.





[GitHub] [spark] SparkQA commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-25 Thread GitBox
SparkQA commented on issue #24119: [SPARK-27182][SQL] Move the conflict source 
code of the sql/core module to sql/core/v1.2.1
URL: https://github.com/apache/spark/pull/24119#issuecomment-476489807
 
 
   **[Test build #103935 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103935/testReport)**
 for PR 24119 at commit 
[`11bc982`](https://github.com/apache/spark/commit/11bc98284566ae93caffa7d947543c095de03c75).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] maropu commented on a change in pull request #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-25 Thread GitBox
maropu commented on a change in pull request #24200: [SPARK-27266][SQL] Support 
ANALYZE TABLE to collect tables stats for cached catalog views
URL: https://github.com/apache/spark/pull/24200#discussion_r268954675
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
 ##
 @@ -200,11 +202,14 @@ case class InMemoryRelation(
    }
  
    override def computeStats(): Statistics = {
-    if (cacheBuilder.sizeInBytesStats.value == 0L) {
+    if (!cacheBuilder.isCachedColumnBuffersLoaded) {
       // Underlying columnar RDD hasn't been materialized, use the stats from the plan to cache.
       statsOfPlanToCache
     } else {
-      statsOfPlanToCache.copy(sizeInBytes = cacheBuilder.sizeInBytesStats.value.longValue)
+      statsOfPlanToCache.copy(
+        sizeInBytes = cacheBuilder.sizeInBytesStats.value.longValue,
+        rowCount = Some(cacheBuilder.rowCountStats.value.longValue)
 
 Review comment:
   Yea, we need it because this change passes `rowCount` into upper nodes;
   ```
   scala> sql("CREATE VIEW v AS SELECT 1 c")
   scala> sql("CACHE TABLE v")
   scala> spark.table("v").explain(true)
   ...
   == Optimized Logical Plan ==
   InMemoryRelation [c#28], StorageLevel(disk, memory, deserialized, 1 replicas)
      +- *(1) Project [1 AS c#1]
         +- Scan OneRowRelation[]
   ...
   
   > w/o this change
   scala> val stats = spark.table("v").queryExecution.optimizedPlan.stats
    Statistics(sizeInBytes=4.0 B)
   
   > w/ this change
   scala> val stats = spark.table("v").queryExecution.optimizedPlan.stats
    Statistics(sizeInBytes=4.0 B, rowCount=1)
                                  ^^^
   ```
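
The branch in the diff can be modelled in isolation. This is a rough sketch under stated assumptions: `Statistics`, `CacheBuilderSketch` and `ComputeStatsSketch` are hypothetical simplified types that borrow field names from the hunk, not Spark's classes, and plain `Long` fields stand in for the accumulators. It shows the fallback to the plan's own stats while the cache is unloaded, and the propagation of both `sizeInBytes` and `rowCount` once it is loaded.

```scala
// Hypothetical, simplified model; names mirror the diff but this is not Spark's API.
final case class Statistics(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

final class CacheBuilderSketch {
  var isCachedColumnBuffersLoaded: Boolean = false
  var sizeInBytes: Long = 0L // stands in for sizeInBytesStats.value
  var rowCount: Long = 0L    // stands in for rowCountStats.value
}

object ComputeStatsSketch {
  def computeStats(statsOfPlanToCache: Statistics, cache: CacheBuilderSketch): Statistics =
    if (!cache.isCachedColumnBuffersLoaded) {
      // Cache not materialized yet: fall back to the stats of the plan being cached.
      statsOfPlanToCache
    } else {
      // Cache materialized: report what was actually observed while building it.
      statsOfPlanToCache.copy(
        sizeInBytes = BigInt(cache.sizeInBytes),
        rowCount = Some(BigInt(cache.rowCount)))
    }
}
```

One possible motivation for testing an explicit loaded flag instead of `size == 0` (an assumption here, not stated in the thread) is that a materialized but empty cache also reports a size of zero.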





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268954126
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/FileTableSuite.scala
 ##
 @@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution.datasources.v2
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.fs.FileStatus
+
+import org.apache.spark.sql.{QueryTest, SparkSession}
+import org.apache.spark.sql.sources.v2.reader.ScanBuilder
+import org.apache.spark.sql.sources.v2.writer.WriteBuilder
+import org.apache.spark.sql.test.{SharedSQLContext, SQLTestUtils}
+import org.apache.spark.sql.types._
+import org.apache.spark.sql.util.CaseInsensitiveStringMap
+
+class DummyFileTable(
+    sparkSession: SparkSession,
+    options: CaseInsensitiveStringMap,
+    paths: Seq[String],
+    expectedDataSchema: StructType,
+    userSpecifiedSchema: Option[StructType])
+  extends FileTable(sparkSession, options, paths, userSpecifiedSchema) {
+  override def inferSchema(files: Seq[FileStatus]): Option[StructType] = Some(expectedDataSchema)
+
+  override def name(): String = "Dummy"
+
+  override def formatName: String = "Dummy"
+
+  override def newScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder = null
+
+  override def newWriteBuilder(options: CaseInsensitiveStringMap): WriteBuilder = null
+
+  override def supportsDataType(dataType: DataType): Boolean = dataType == StringType
+}
+
+class FileTableSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
+
+  test("Data type validation should check data schema only") {
+    withTempPath { dir =>
+      val df = spark.createDataFrame(Seq(("a", 1), ("b", 2))).toDF("v", "p")
+      val pathName = dir.getCanonicalPath
+      df.write.partitionBy("p").text(pathName)
+      val options = new CaseInsensitiveStringMap(Map("path" -> pathName).asJava)
+      val expectedDataSchema = StructType(Seq(StructField("v", StringType, true)))
+      // DummyFileTable doesn't support Integer data type.
+      // However, the partition schema is handled by Spark, so it is allowed to contain
+      // Integer data type here.
+      val table = new DummyFileTable(spark, options, Seq(pathName), expectedDataSchema, None)
+      assert(table.dataSchema == expectedDataSchema)
+      val expectedPartitionSchema = StructType(Seq(StructField("p", IntegerType, true)))
+      assert(table.fileIndex.partitionSchema ==  expectedPartitionSchema)
 
 Review comment:
   nit. additional space after `==`.
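
The behaviour the quoted test exercises can be illustrated without Spark. This is a minimal sketch, not the PR's implementation: `DataType`, `Field` and `SchemaCheckSketch` are hypothetical miniature types. It shows a format that only supports strings still accepting an integer partition column, because the check looks at the data schema alone.

```scala
// Hypothetical miniature of the check; these types are not Spark's.
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType
final case class Field(name: String, dataType: DataType)

object SchemaCheckSketch {
  // Mirrors DummyFileTable in the quoted test: the format itself only handles strings.
  def supportsDataType(dataType: DataType): Boolean = dataType == StringType

  // Names of unsupported fields. partitionSchema is deliberately ignored,
  // since partition values are handled by the engine rather than the format.
  def unsupportedDataFields(dataSchema: Seq[Field], partitionSchema: Seq[Field]): Seq[String] =
    dataSchema.filterNot(f => supportsDataType(f.dataType)).map(_.name)
}
```

Passing `v: StringType` as the data schema and `p: IntegerType` as the partition schema yields no unsupported fields, while moving `p` into the data schema reports it.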





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268953943
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/FileTableSuite.scala
 ##
 @@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution.datasources.v2
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.fs.FileStatus
+
+import org.apache.spark.sql.{QueryTest, SparkSession}
+import org.apache.spark.sql.sources.v2.reader.ScanBuilder
+import org.apache.spark.sql.sources.v2.writer.WriteBuilder
+import org.apache.spark.sql.test.{SharedSQLContext, SQLTestUtils}
+import org.apache.spark.sql.types._
+import org.apache.spark.sql.util.CaseInsensitiveStringMap
+
+class DummyFileTable(
+    sparkSession: SparkSession,
+    options: CaseInsensitiveStringMap,
+    paths: Seq[String],
+    expectedDataSchema: StructType,
+    userSpecifiedSchema: Option[StructType])
+  extends FileTable(sparkSession, options, paths, userSpecifiedSchema) {
+  override def inferSchema(files: Seq[FileStatus]): Option[StructType] = Some(expectedDataSchema)
+
+  override def name(): String = "Dummy"
+
+  override def formatName: String = "Dummy"
+
+  override def newScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder = null
+
+  override def newWriteBuilder(options: CaseInsensitiveStringMap): WriteBuilder = null
+
+  override def supportsDataType(dataType: DataType): Boolean = dataType == StringType
+}
+
+class FileTableSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
+
+  test("Data type validation should check data schema only") {
+    withTempPath { dir =>
+      val df = spark.createDataFrame(Seq(("a", 1), ("b", 2))).toDF("v", "p")
+      val pathName = dir.getCanonicalPath
+      df.write.partitionBy("p").text(pathName)
+      val options = new CaseInsensitiveStringMap(Map("path" -> pathName).asJava)
+      val expectedDataSchema = StructType(Seq(StructField("v", StringType, true)))
+      // DummyFileTable doesn't support Integer data type.
+      // However, the partition schema is handled by Spark, so it is allowed to contain
+      // Integer data type here.
+      val table = new DummyFileTable(spark, options, Seq(pathName), expectedDataSchema, None)
+      assert(table.dataSchema == expectedDataSchema)
+      val expectedPartitionSchema = StructType(Seq(StructField("p", IntegerType, true)))
+      assert(table.fileIndex.partitionSchema ==  expectedPartitionSchema)
+    }
+  }
+
+  test("Returns correct data schema when user specified schema contains partition schema") {
 
 Review comment:
   Thank you for adding this test.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268953658
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcScan.scala
 ##
 @@ -43,10 +43,4 @@ case class OrcScan(
 OrcPartitionReaderFactory(sparkSession.sessionState.conf, broadcastedConf,
   dataSchema, fileIndex.partitionSchema, readSchema)
   }
-
-  override def supportsDataType(dataType: DataType): Boolean = {
-OrcDataSourceV2.supportsDataType(dataType)
-  }
-
-  override def formatName: String = "ORC"
 
 Review comment:
   ```
   -import org.apache.spark.sql.types.{DataType, StructType}
   +import org.apache.spark.sql.types.StructType
   ```





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268953505
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcDataSourceV2.scala
 ##
 @@ -42,19 +42,3 @@ class OrcDataSourceV2 extends FileDataSourceV2 {
   }
 }
 
-object OrcDataSourceV2 {
-  def supportsDataType(dataType: DataType): Boolean = dataType match {
-    case _: AtomicType => true
-
-    case st: StructType => st.forall { f => supportsDataType(f.dataType) }
-
-    case ArrayType(elementType, _) => supportsDataType(elementType)
-
-    case MapType(keyType, valueType, _) =>
-      supportsDataType(keyType) && supportsDataType(valueType)
-
-    case udt: UserDefinedType[_] => supportsDataType(udt.sqlType)
-
-    case _ => false
-  }
-}
 
 Review comment:
   ditto.
   ```
   -import org.apache.spark.sql.types._
   +import org.apache.spark.sql.types.StructType
   ```
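
For readers following the hunk above, the recursion in the removed `supportsDataType` can be sketched without Spark on the classpath. The type hierarchy below is a hypothetical, simplified stand-in for Spark SQL's `DataType` classes (an assumption made only so the example is self-contained), and `SupportsDataTypeSketch` is an illustrative name that is not part of the PR.

```scala
// Hypothetical, simplified stand-ins for Spark SQL's DataType hierarchy.
// They exist only to make this sketch self-contained.
object SupportsDataTypeSketch {
  sealed trait DataType
  case object StringType extends DataType
  case object IntegerType extends DataType
  case object NullType extends DataType
  final case class ArrayType(elementType: DataType) extends DataType
  final case class MapType(keyType: DataType, valueType: DataType) extends DataType
  final case class StructField(name: String, dataType: DataType)
  final case class StructType(fields: Seq[StructField]) extends DataType

  // Simple types are accepted directly; nested types are accepted only if
  // every component type is accepted; anything else is rejected.
  def supportsDataType(dataType: DataType): Boolean = dataType match {
    case StringType | IntegerType => true
    case StructType(fields) => fields.forall(f => supportsDataType(f.dataType))
    case ArrayType(elementType) => supportsDataType(elementType)
    case MapType(keyType, valueType) =>
      supportsDataType(keyType) && supportsDataType(valueType)
    case _ => false
  }
}
```

A nested type is rejected as soon as any component is unsupported, which is why a map with an unsupported value type fails even when its key type is fine.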





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268953410
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/csv/CSVDataSourceV2.scala
 ##
 @@ -41,13 +41,3 @@ class CSVDataSourceV2 extends FileDataSourceV2 {
 CSVTable(tableName, sparkSession, options, paths, Some(schema))
   }
 }
-
-object CSVDataSourceV2 {
-  def supportsDataType(dataType: DataType): Boolean = dataType match {
-    case _: AtomicType => true
-
-    case udt: UserDefinedType[_] => supportsDataType(udt.sqlType)
-
-    case _ => false
-  }
 
 Review comment:
   ```
   -import org.apache.spark.sql.types._
   +import org.apache.spark.sql.types.StructType
   ```





[GitHub] [spark] beliefer edited a comment on issue #23841: [SPARK-26936][SQL] Fix bug of insert overwrite local dir can not create temporary path in local staging directory

2019-03-25 Thread GitBox
beliefer edited a comment on issue #23841: [SPARK-26936][SQL] Fix bug of insert 
overwrite local dir can not create temporary path in local staging directory
URL: https://github.com/apache/spark/pull/23841#issuecomment-476472507
 
 
   > Will we hit this bug when we deploy spark in cluster? Seems to me it's not 
specific to yarn.
   
   Yes. If Spark runs in `yarn-client` deploy mode, this bug will occur.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268952963
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileScan.scala
 ##
 @@ -76,13 +60,5 @@ abstract class FileScan(
 partitions.toArray
   }
 
-  override def toBatch: Batch = {
-    readSchema.foreach { field =>
-      if (!supportsDataType(field.dataType)) {
-        throw new AnalysisException(
-          s"$formatName data source does not support ${field.dataType.catalogString} data type.")
-      }
-    }
-    this
-  }
 
 Review comment:
   Shall we update the `import`s according to these deletions?
   ```scala
   -import org.apache.spark.sql.{AnalysisException, SparkSession}
   +import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.PartitionedFileUtil
import org.apache.spark.sql.execution.datasources._
import org.apache.spark.sql.sources.v2.reader.{Batch, InputPartition, Scan}
   -import org.apache.spark.sql.types.{DataType, StructType}
   +import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap
   ```
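
As a rough illustration of what the PR title asks for (validate the data schema only), the check removed from `toBatch` above can be modelled on plain collections. This is a sketch under stated assumptions: `Field`, `validate`, and the string-typed `dataType` are hypothetical simplifications, not Spark's API; only the error message wording is taken from the removed code.

```scala
object SchemaValidationSketch {
  // Hypothetical minimal field: Spark's StructField carries a real DataType.
  final case class Field(name: String, dataType: String)

  // Returns the error message for the first unsupported field, if any.
  // Callers pass only the data schema, so partition columns are never checked.
  def validate(
      formatName: String,
      dataSchema: Seq[Field],
      supports: String => Boolean): Option[String] =
    dataSchema.find(f => !supports(f.dataType)).map { f =>
      s"$formatName data source does not support ${f.dataType} data type."
    }
}
```

With a format that supports only strings, a data schema of `v: string` passes even when the table is partitioned by an integer column, because the partition column is not part of the validated schema.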





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268952310
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
 ##
 @@ -46,17 +46,27 @@ abstract class FileTable(
   sparkSession, rootPathsSpecified, caseSensitiveMap, userSpecifiedSchema, 
fileStatusCache)
   }
 
-  lazy val dataSchema: StructType = userSpecifiedSchema.orElse {
-    inferSchema(fileIndex.allFiles())
-  }.getOrElse {
-    throw new AnalysisException(
-      s"Unable to infer schema for $name. It must be specified manually.")
-  }.asNullable
+  lazy val dataSchema: StructType = userSpecifiedSchema.map { schema =>
+  val partitionSchema = fileIndex.partitionSchema
+  val equality = sparkSession.sessionState.conf.resolver
+  StructType(schema.filterNot(f => partitionSchema.exists(p => equality(p.name, f.name))))
+}.orElse {
 
 Review comment:
   Indentation? 
(https://github.com/databricks/scala-style-guide#pattern-matching)
   Lines 50-58 should be updated.
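
The added `dataSchema` logic in this hunk drops partition columns from a user-specified schema, comparing names through the session's resolver. A minimal sketch of that filtering, assuming a hypothetical `Field` case class and two hand-written resolvers in place of `sparkSession.sessionState.conf.resolver`:

```scala
object DataSchemaSketch {
  // Hypothetical minimal field: Spark's StructField carries more information.
  final case class Field(name: String, dataType: String)

  // A resolver decides whether two column names refer to the same column.
  type Resolver = (String, String) => Boolean
  val caseSensitive: Resolver = (a, b) => a == b
  val caseInsensitive: Resolver = (a, b) => a.equalsIgnoreCase(b)

  // Keep only the user-specified fields that do not resolve to a partition column.
  def dataSchema(
      userSchema: Seq[Field],
      partitionSchema: Seq[Field],
      resolver: Resolver): Seq[Field] =
    userSchema.filterNot(f => partitionSchema.exists(p => resolver(p.name, f.name)))
}
```

The choice of resolver matters: with the case-insensitive one, a user column `P` is treated as the partition column `p` and removed, whereas the case-sensitive one keeps it.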





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268952310
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
 ##
 @@ -46,17 +46,27 @@ abstract class FileTable(
   sparkSession, rootPathsSpecified, caseSensitiveMap, userSpecifiedSchema, 
fileStatusCache)
   }
 
-  lazy val dataSchema: StructType = userSpecifiedSchema.orElse {
-    inferSchema(fileIndex.allFiles())
-  }.getOrElse {
-    throw new AnalysisException(
-      s"Unable to infer schema for $name. It must be specified manually.")
-  }.asNullable
+  lazy val dataSchema: StructType = userSpecifiedSchema.map { schema =>
+  val partitionSchema = fileIndex.partitionSchema
+  val equality = sparkSession.sessionState.conf.resolver
+  StructType(schema.filterNot(f => partitionSchema.exists(p => equality(p.name, f.name))))
+}.orElse {
 
 Review comment:
   Indentation? 
(https://github.com/databricks/scala-style-guide#pattern-matching)





[GitHub] [spark] AmplabJenkins removed a comment on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24076: [SPARK-27142] Provide REST API 
for SQL level information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476486373
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9308/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24076: [SPARK-27142] Provide REST API 
for SQL level information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476486370
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24076: [SPARK-27142] Provide REST API for SQL 
level information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476486370
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24076: [SPARK-27142] Provide REST API for SQL 
level information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476486373
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9308/
   Test PASSed.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268951694
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
 ##
 @@ -46,17 +46,27 @@ abstract class FileTable(
   sparkSession, rootPathsSpecified, caseSensitiveMap, userSpecifiedSchema, 
fileStatusCache)
   }
 
-  lazy val dataSchema: StructType = userSpecifiedSchema.orElse {
-    inferSchema(fileIndex.allFiles())
-  }.getOrElse {
-    throw new AnalysisException(
-      s"Unable to infer schema for $name. It must be specified manually.")
-  }.asNullable
+  lazy val dataSchema: StructType = userSpecifiedSchema.map { schema =>
+  val partitionSchema = fileIndex.partitionSchema
+  val equality = sparkSession.sessionState.conf.resolver
 
 Review comment:
   If you search for `conf.resolver`, there are more instances with `val resolver`.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268951694
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
 ##
 @@ -46,17 +46,27 @@ abstract class FileTable(
   sparkSession, rootPathsSpecified, caseSensitiveMap, userSpecifiedSchema, 
fileStatusCache)
   }
 
-  lazy val dataSchema: StructType = userSpecifiedSchema.orElse {
-    inferSchema(fileIndex.allFiles())
-  }.getOrElse {
-    throw new AnalysisException(
-      s"Unable to infer schema for $name. It must be specified manually.")
-  }.asNullable
+  lazy val dataSchema: StructType = userSpecifiedSchema.map { schema =>
+  val partitionSchema = fileIndex.partitionSchema
+  val equality = sparkSession.sessionState.conf.resolver
 
 Review comment:
   If you search for `conf.resolver`, there are more instances.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268951495
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
 ##
 @@ -46,17 +46,27 @@ abstract class FileTable(
   sparkSession, rootPathsSpecified, caseSensitiveMap, userSpecifiedSchema, 
fileStatusCache)
   }
 
-  lazy val dataSchema: StructType = userSpecifiedSchema.orElse {
-    inferSchema(fileIndex.allFiles())
-  }.getOrElse {
-    throw new AnalysisException(
-      s"Unable to infer schema for $name. It must be specified manually.")
-  }.asNullable
+  lazy val dataSchema: StructType = userSpecifiedSchema.map { schema =>
+  val partitionSchema = fileIndex.partitionSchema
+  val equality = sparkSession.sessionState.conf.resolver
 
 Review comment:
   Do you mean the [one 
line](https://github.com/apache/spark/blame/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L111)
 written two years ago. All the other new instances use `resolver = 
sparkSession.sessionState.conf.resolver` (more than 7).





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268951495
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
 ##
 @@ -46,17 +46,27 @@ abstract class FileTable(
   sparkSession, rootPathsSpecified, caseSensitiveMap, userSpecifiedSchema, 
fileStatusCache)
   }
 
-  lazy val dataSchema: StructType = userSpecifiedSchema.orElse {
-    inferSchema(fileIndex.allFiles())
-  }.getOrElse {
-    throw new AnalysisException(
-      s"Unable to infer schema for $name. It must be specified manually.")
-  }.asNullable
+  lazy val dataSchema: StructType = userSpecifiedSchema.map { schema =>
+  val partitionSchema = fileIndex.partitionSchema
+  val equality = sparkSession.sessionState.conf.resolver
 
 Review comment:
   Do you mean the [one 
line](https://github.com/apache/spark/blame/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L111)
 written two years ago? All the other new instances use `resolver = 
sparkSession.sessionState.conf.resolver` (more than 7).





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268951495
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileTable.scala
 ##
 @@ -46,17 +46,27 @@ abstract class FileTable(
   sparkSession, rootPathsSpecified, caseSensitiveMap, userSpecifiedSchema, 
fileStatusCache)
   }
 
-  lazy val dataSchema: StructType = userSpecifiedSchema.orElse {
-    inferSchema(fileIndex.allFiles())
-  }.getOrElse {
-    throw new AnalysisException(
-      s"Unable to infer schema for $name. It must be specified manually.")
-  }.asNullable
+  lazy val dataSchema: StructType = userSpecifiedSchema.map { schema =>
+  val partitionSchema = fileIndex.partitionSchema
+  val equality = sparkSession.sessionState.conf.resolver
 
 Review comment:
   There is only [one 
line](https://github.com/apache/spark/blame/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L111)
 written two years ago. All the other new instances use `resolver = 
sparkSession.sessionState.conf.resolver` (more than 7).





[GitHub] [spark] SparkQA commented on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
SparkQA commented on issue #24076: [SPARK-27142] Provide REST API for SQL level 
information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476485215
 
 
   **[Test build #103950 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103950/testReport)**
 for PR 24076 at commit 
[`b1a8deb`](https://github.com/apache/spark/commit/b1a8deb2484a484e9a0d8a70368b64fad8729006).





[GitHub] [spark] ajithme commented on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
ajithme commented on issue #24076: [SPARK-27142] Provide REST API for SQL level 
information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476485167
 
 
   Updated to address the latest comments. Please review.





[GitHub] [spark] AmplabJenkins commented on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24076: [SPARK-27142] Provide REST API for SQL 
level information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476484881
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9307/
   Test PASSed.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268950930
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileScan.scala
 ##
 @@ -37,22 +37,6 @@ abstract class FileScan(
 false
   }
 
-  /**
-   * Returns whether this format supports the given [[DataType]] in write path.
-   * By default all data types are supported.
-   */
-  def supportsDataType(dataType: DataType): Boolean = true
-
-  /**
-   * The string that represents the format that this data source provider uses. This is
-   * overridden by children to provide a nice alias for the data source. For example:
-   *
-   * {{{
-   *   override def formatName(): String = "ORC"
-   * }}}
-   */
-  def formatName: String
-
 
 Review comment:
   Thanks. Yes. It seems to be moved together correctly.





[GitHub] [spark] AmplabJenkins commented on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24076: [SPARK-27142] Provide REST API for SQL 
level information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476484876
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24076: [SPARK-27142] Provide REST API 
for SQL level information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476484881
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/9307/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24076: [SPARK-27142] Provide REST API for SQL level information

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24076: [SPARK-27142] Provide REST API 
for SQL level information
URL: https://github.com/apache/spark/pull/24076#issuecomment-476484876
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24197: [SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API for regression evaluator and metrics

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24197: 
[SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API 
for regression evaluator and metrics
URL: https://github.com/apache/spark/pull/24197#issuecomment-476483406
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103945/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24197: [SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API for regression evaluator and metrics

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24197: 
[SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API 
for regression evaluator and metrics
URL: https://github.com/apache/spark/pull/24197#issuecomment-476483400
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24197: [SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API for regression evaluator and metrics

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24197: 
[SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API 
for regression evaluator and metrics
URL: https://github.com/apache/spark/pull/24197#issuecomment-476483400
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24197: [SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API for regression evaluator and metrics

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24197: 
[SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API 
for regression evaluator and metrics
URL: https://github.com/apache/spark/pull/24197#issuecomment-476483406
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103945/
   Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #24197: [SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API for regression evaluator and metrics

2019-03-25 Thread GitBox
SparkQA removed a comment on issue #24197: 
[SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API 
for regression evaluator and metrics
URL: https://github.com/apache/spark/pull/24197#issuecomment-476469190
 
 
   **[Test build #103945 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103945/testReport)** for PR 24197 at commit [`4cb2137`](https://github.com/apache/spark/commit/4cb213769f89fe4adc07afb08629b71770eb2d12).





[GitHub] [spark] SparkQA commented on issue #24197: [SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] Added weight column to pyspark API for regression evaluator and metrics

2019-03-25 Thread GitBox
SparkQA commented on issue #24197: [SPARK-24102][ML][MLLIB][PYSPARK][FOLLOWUP] 
Added weight column to pyspark API for regression evaluator and metrics
URL: https://github.com/apache/spark/pull/24197#issuecomment-476483135
 
 
   **[Test build #103945 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103945/testReport)** for PR 24197 at commit [`4cb2137`](https://github.com/apache/spark/commit/4cb213769f89fe4adc07afb08629b71770eb2d12).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class RegressionEvaluator(JavaEvaluator, HasLabelCol, HasPredictionCol, HasWeightCol,`





[GitHub] [spark] AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery correctly

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse 
subquery correctly
URL: https://github.com/apache/spark/pull/24214#issuecomment-476482406
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103941/
   Test FAILed.





[GitHub] [spark] gengliangwang commented on a change in pull request #24203: [SPARK-27269][SQL] File source v2 should validate data schema only

2019-03-25 Thread GitBox
gengliangwang commented on a change in pull request #24203: [SPARK-27269][SQL] 
File source v2 should validate data schema only
URL: https://github.com/apache/spark/pull/24203#discussion_r268949009
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileScan.scala
 ##
 @@ -37,22 +37,6 @@ abstract class FileScan(
 false
   }
 
-  /**
-   * Returns whether this format supports the given [[DataType]] in write path.
-   * By default all data types are supported.
-   */
-  def supportsDataType(dataType: DataType): Boolean = true
-
-  /**
-   * The string that represents the format that this data source provider uses. This is
-   * overridden by children to provide a nice alias for the data source. For example:
-   *
-   * {{{
-   *   override def formatName(): String = "ORC"
-   * }}}
-   */
-  def formatName: String
-
 
 Review comment:
   `formatName` was added so that the exception message can show the format name.
   https://github.com/apache/spark/pull/23714#pullrequestreview-203520602





[GitHub] [spark] AmplabJenkins removed a comment on issue #24212: [SPARK-26771][SQL][FOLLOWUP] Make all the uncache operations non-blocking by default

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24212: [SPARK-26771][SQL][FOLLOWUP] 
Make all the uncache operations non-blocking by default
URL: https://github.com/apache/spark/pull/24212#issuecomment-476482267
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery correctly

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse 
subquery correctly
URL: https://github.com/apache/spark/pull/24214#issuecomment-476482400
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery correctly

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery 
correctly
URL: https://github.com/apache/spark/pull/24214#issuecomment-476482400
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery correctly

2019-03-25 Thread GitBox
SparkQA removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery 
correctly
URL: https://github.com/apache/spark/pull/24214#issuecomment-476461973
 
 
   **[Test build #103941 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103941/testReport)** for PR 24214 at commit [`5fa78c8`](https://github.com/apache/spark/commit/5fa78c8b4d38b6d107315850606f660ad87e20f8).





[GitHub] [spark] AmplabJenkins removed a comment on issue #24212: [SPARK-26771][SQL][FOLLOWUP] Make all the uncache operations non-blocking by default

2019-03-25 Thread GitBox
AmplabJenkins removed a comment on issue #24212: [SPARK-26771][SQL][FOLLOWUP] 
Make all the uncache operations non-blocking by default
URL: https://github.com/apache/spark/pull/24212#issuecomment-476482269
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103933/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery correctly

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery 
correctly
URL: https://github.com/apache/spark/pull/24214#issuecomment-476482406
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103941/
   Test FAILed.





[GitHub] [spark] SparkQA commented on issue #24214: [SPARK-27279][SQL] Reuse subquery correctly

2019-03-25 Thread GitBox
SparkQA commented on issue #24214: [SPARK-27279][SQL] Reuse subquery correctly
URL: https://github.com/apache/spark/pull/24214#issuecomment-476482287
 
 
   **[Test build #103941 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103941/testReport)** for PR 24214 at commit [`5fa78c8`](https://github.com/apache/spark/commit/5fa78c8b4d38b6d107315850606f660ad87e20f8).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins commented on issue #24212: [SPARK-26771][SQL][FOLLOWUP] Make all the uncache operations non-blocking by default

2019-03-25 Thread GitBox
AmplabJenkins commented on issue #24212: [SPARK-26771][SQL][FOLLOWUP] Make all 
the uncache operations non-blocking by default
URL: https://github.com/apache/spark/pull/24212#issuecomment-476482269
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103933/
   Test PASSed.




