[GitHub] spark pull request #23208: [SPARK-25530][SQL] data source v2 API refactor (b...

2018-12-09 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23208#discussion_r240101369 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala --- @@ -17,52 +17,49

[GitHub] spark issue #23215: [SPARK-26263][SQL] Validate partition values with user p...

2018-12-06 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23215 Thank you @cloud-fan @viirya @HyukjinKwon . --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #23215: [SPARK-26263][SQL] Validate partition values with user p...

2018-12-06 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23215 retest this please.

[GitHub] spark pull request #23208: [SPARK-25530][SQL] data source v2 API refactor (b...

2018-12-06 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23208#discussion_r239454594 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchWrite.java --- @@ -25,14 +25,14 @@ import

[GitHub] spark pull request #23215: [SPARK-26263][SQL] Validate partition values with...

2018-12-06 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23215#discussion_r239423651 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -272,9 +279,13 @@ object

[GitHub] spark issue #23215: [SPARK-26263][SQL] Validate partition values with user p...

2018-12-06 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23215 retest this please.

[GitHub] spark issue #23240: [SPARK-26281][WebUI] Duration column of task table shoul...

2018-12-05 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23240 Oh, I see. Close this one now. Please change the title in #23160

[GitHub] spark pull request #23240: [SPARK-26281][WebUI] Duration column of task tabl...

2018-12-05 Thread gengliangwang
Github user gengliangwang closed the pull request at: https://github.com/apache/spark/pull/23240

[GitHub] spark issue #23240: [SPARK-26281][WebUI] Duration column of task table shoul...

2018-12-05 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23240 @shahidki31 @pgandhi999 @srowen

[GitHub] spark pull request #23240: [SPARK-26281][WebUI] Duration column of task tabl...

2018-12-05 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23240 [SPARK-26281][WebUI] Duration column of task table should be executor run time instead of real duration ## What changes were proposed in this pull request? In PR https://github.com

[GitHub] spark pull request #23215: [SPARK-26263][SQL] Throw exception when Partition...

2018-12-04 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23215 [SPARK-26263][SQL] Throw exception when Partition column value can't be converted to user specified type ## What changes were proposed in this pull request? Currently if

[GitHub] spark issue #23215: [SPARK-26263][SQL] Throw exception when Partition column...

2018-12-04 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23215 @cloud-fan

[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...

2018-12-01 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23189 retest this please.

[GitHub] spark pull request #23189: [SPARK-26235][Core] Change log level for ClassNot...

2018-11-30 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23189#discussion_r237972852 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -813,14 +813,14 @@ private[spark] class SparkSubmit extends Logging

[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...

2018-11-30 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23189 @vanzin Ah, I see. Thanks for pointing it out! But I am now thinking of overriding `logError` by calling `printMessage("Error...")`. What do

[GitHub] spark pull request #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, v...

2018-11-30 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23186#discussion_r237963856 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -345,15 +346,18 @@ object

[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...

2018-11-30 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23189 @vanzin The `logWarning` call as the other handler below is also not overridden: ``` case e: NoClassDefFoundError => logWarning(s"Failed to load $childM

[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...

2018-11-30 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23189 cc @vanzin This is really trivial, but it can be helpful to users in certain cases.

[GitHub] spark pull request #23189: [SPARK-26235][Core] Change log level for ClassNot...

2018-11-30 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23189 [SPARK-26235][Core] Change log level for ClassNotFoundException/NoClassDefFoundError in SparkSubmit to Error ## What changes were proposed in this pull request? In my local setup, I

[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-30 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23160 cc @tgravescs @pgandhi999

[GitHub] spark issue #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, validate...

2018-11-30 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23186 @cloud-fan

[GitHub] spark pull request #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, v...

2018-11-30 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23186 [SPARK-26230][SQL]FileIndex: if case sensitive, validate partitions with original column names ## What changes were proposed in this pull request? Partition column name is required

[GitHub] spark issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job pa...

2018-11-29 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23068 @gatorsmile thanks! For jobs that don't have an associated SQL query, the text `Associated SQL query` won't show up.

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237523177 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -250,7 +276,13 @@ object

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237516228 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -250,7 +276,13 @@ object

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237515959 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -250,7 +276,13 @@ object

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237513657 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -94,18 +94,34 @@ object

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237512471 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -94,18 +94,34 @@ object

[GitHub] spark issue #23165: [SPARK-26188][SQL] FileIndex: don't infer data types of ...

2018-11-29 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23165 @mgaido91 The user specified schema might not match the full data schema. For the missing columns, we still need to infer their data types. I will come up with a solution soon

[GitHub] spark issue #23165: [SPARK-26188][SQL] FileIndex: don't infer data types of ...

2018-11-28 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23165 @cloud-fan @mgaido91

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-28 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23165 [SPARK-26188][SQL] FileIndex: don't infer data types of partition columns if user specifies schema ## What changes were proposed in this pull request? This PR is to fix a regre

[GitHub] spark pull request #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIn...

2018-11-28 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/21004#discussion_r237092452 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,35 +126,32 @@ abstract

[GitHub] spark issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job pa...

2018-11-27 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23068 ping @vanzin . Please review this one :)

[GitHub] spark pull request #23125: [SPARK-26156][WebUI] Revise summary section of st...

2018-11-23 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23125#discussion_r235995724 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -79,6 +79,9 @@ private[ui] class StagePage(parent: StagesTab, store

[GitHub] spark pull request #23125: [SPARK-26156][WebUI] Revise summary section of st...

2018-11-23 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23125 [SPARK-26156][WebUI] Revise summary section of stage page ## What changes were proposed in this pull request? In the summary section of stage page: ![image](https://user

[GitHub] spark issue #23125: [SPARK-26156][WebUI] Revise summary section of stage pag...

2018-11-23 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23125 @srowen @vanzin

[GitHub] spark pull request #23098: [WIP][SPARK-26132][BUILD][CORE] Remove support fo...

2018-11-21 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23098#discussion_r235317414 --- Diff: bin/load-spark-env.cmd --- @@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and ensures it is only loaded rem

[GitHub] spark issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job pa...

2018-11-21 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23068 The web UI presents "jobs" tab as the default view. Showing the related SQL context is helpful, especially for new users. I think the downside is qu

[GitHub] spark pull request #23081: [SPARK-26109][WebUI]Duration in the task summary ...

2018-11-20 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23081#discussion_r235071529 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -996,7 +996,7 @@ private[ui] object ApiHelper { HEADER_EXECUTOR

[GitHub] spark pull request #23081: [SPARK-26109][WebUI]Duration in the task summary ...

2018-11-20 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23081#discussion_r235065240 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -996,7 +996,7 @@ private[ui] object ApiHelper { HEADER_EXECUTOR

[GitHub] spark issue #23081: [SPARK-26109][WebUI]Duration in the task summary metrics...

2018-11-20 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23081 retest this please.

[GitHub] spark issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job pa...

2018-11-20 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23068 @srowen

[GitHub] spark issue #23067: [SPARK-26097][Web UI] Add the new partitioning descripti...

2018-11-20 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23067 +1. We can also try overriding `toString` for better output. @vanzin

[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-11-20 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23088 As the task section shows all the tasks, I think maybe it would be better to show aggregated metrics for all the tasks

[GitHub] spark pull request #23088: [SPARK-26119][CORE][WEBUI]Task summary table shou...

2018-11-20 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23088#discussion_r235027227 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala --- @@ -238,8 +239,16 @@ private[spark] class AppStatusStore

[GitHub] spark pull request #23038: [SPARK-25451][SPARK-26100][CORE]Aggregated metric...

2018-11-19 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23038#discussion_r234728748 --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusListenerSuite.scala --- @@ -1275,6 +1275,49 @@ class AppStatusListenerSuite extends

[GitHub] spark issue #23065: [SPARK-26090][CORE][SQL][ML] Resolve most miscellaneous ...

2018-11-19 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23065 I tried and it works. There is a similar warning in UnionRDD.scala, which will cause failure in Scala 2.11.

[GitHub] spark issue #23065: [SPARK-26090][CORE][SQL][ML] Resolve most miscellaneous ...

2018-11-19 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23065 Hi @srowen , Could you review and merge https://github.com/srowen/spark/pull/4 ? I see a lot of warnings as well. We should fix them

[GitHub] spark issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job pa...

2018-11-17 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23068 @vanzin

[GitHub] spark issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job pa...

2018-11-17 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23068 retest this please.

[GitHub] spark pull request #23068: [SPARK-26098][WebUI] Show associated SQL query in...

2018-11-17 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23068#discussion_r234402149 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala --- @@ -56,6 +56,11 @@ private[spark] class AppStatusStore

[GitHub] spark pull request #23068: [SPARK-26098][WebUI] Show associated SQL query in...

2018-11-17 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23068#discussion_r234402077 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -70,6 +70,8 @@ private[spark] class AppStatusListener

[GitHub] spark pull request #23068: [SPARK-26098][WebUI] Show associated SQL query in...

2018-11-17 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23068 [SPARK-26098][WebUI] Show associated SQL query in Job page ## What changes were proposed in this pull request? For jobs associated to SQL queries, it would be easier to understand

[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-17 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23049 retest this please.

[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-16 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23049 @vanzin I see your point. I will add a link to https://spark.apache.org/docs/latest/configuration.html. Thanks for the suggestion. In my case, I didn't know where to find or edit `

[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-16 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23049 retest this please.

[GitHub] spark pull request #23047: [BACKPORT][SPARK-25883][SQL][MINOR] Override meth...

2018-11-15 Thread gengliangwang
Github user gengliangwang closed the pull request at: https://github.com/apache/spark/pull/23047

[GitHub] spark issue #23049: [SPARK-26076][Build][Minor] Revise ambiguous error messa...

2018-11-15 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23049 Hi @vanzin , thanks for pointing it out! I have updated the script and PR description.

[GitHub] spark pull request #23049: [SPARK-26076][Build][Minor] Revise ambiguous erro...

2018-11-15 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23049#discussion_r233902416 --- Diff: bin/load-spark-env.sh --- @@ -47,8 +47,8 @@ if [ -z "$SPARK_SCALA_VERSION" ]; then ASSEMBLY_DIR1="${SPARK_HOME}/assemb

[GitHub] spark issue #23049: [SPARK-26076][Build] Revise ambiguous error message from...

2018-11-15 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23049 @srowen

[GitHub] spark pull request #23049: [SPARK-26076][Build] Revise ambiguous error messa...

2018-11-15 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23049 [SPARK-26076][Build] Revise ambiguous error message from load-spark-env.sh ## What changes were proposed in this pull request? When I try to run scripts (e.g. `./sbin/start-history

[GitHub] spark pull request #23047: [SPARK-25883][SQL][MINOR] Override method `pretty...

2018-11-15 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23047 [SPARK-25883][SQL][MINOR] Override method `prettyName` in `from_avro`/`to_avro` ## What changes were proposed in this pull request? Previously in from_avro/to_avro, we override the

[GitHub] spark issue #22890: [SPARK-25883][SQL][Minor] Override method `prettyName` i...

2018-11-15 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22890 This PR turns out to be a bug fix for this issue: https://issues.apache.org/jira/browse/SPARK-26063 Back port this to branch-2.4

[GitHub] spark issue #21688: [SPARK-21809] : Change Stage Page to use datatables to s...

2018-11-14 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/21688 Hi @pgandhi999 Thanks for the work. One minor comment here: Currently the table header looks like this ![image](https://user-images.githubusercontent.com/1097932/48502853-841e4d80

[GitHub] spark issue #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-13 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22966 @dongjoon-hyun sure.

[GitHub] spark pull request #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-13 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22966#discussion_r233097082 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/execution/benchmark/AvroReadBenchmark.scala --- @@ -0,0 +1,226

[GitHub] spark pull request #23002: [SPARK-26003] Improve SQLAppStatusListener.aggreg...

2018-11-12 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23002#discussion_r232713853 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala --- @@ -159,7 +159,7 @@ class SQLAppStatusListener

[GitHub] spark pull request #23002: [SPARK-26003] Improve SQLAppStatusListener.aggreg...

2018-11-12 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23002#discussion_r232713761 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala --- @@ -159,7 +159,7 @@ class SQLAppStatusListener

[GitHub] spark pull request #23002: [SPARK-26003] Improve SQLAppStatusListener.aggreg...

2018-11-12 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23002#discussion_r232706992 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala --- @@ -159,7 +159,7 @@ class SQLAppStatusListener

[GitHub] spark pull request #23002: [SPARK-26003] Improve SQLAppStatusListener.aggreg...

2018-11-12 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/23002#discussion_r232706543 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala --- @@ -159,7 +159,7 @@ class SQLAppStatusListener

[GitHub] spark pull request #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-11 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22966#discussion_r232550388 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/execution/benchmark/AvroReadBenchmark.scala --- @@ -0,0 +1,226

[GitHub] spark pull request #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-10 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22966#discussion_r232461509 --- Diff: external/avro/benchmarks/AvroReadBenchmark-results.txt --- @@ -0,0 +1,122

[GitHub] spark pull request #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-09 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22966#discussion_r232272074 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/execution/benchmark/AvroReadBenchmark.scala --- @@ -0,0 +1,226

[GitHub] spark issue #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-08 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22966 @dongjoon-hyun I think we can merge this one first.

[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22987 @gatorsmile @cloud-fan

[GitHub] spark pull request #22987: [SPARK-25979][SQL] Window function: allow parenth...

2018-11-08 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22987#discussion_r231962213 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLWindowFunctionSuite.scala --- @@ -31,6 +32,19 @@ class SQLWindowFunctionSuite

[GitHub] spark pull request #22987: [SPARK-25979][SQL] Window function: allow parenth...

2018-11-08 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/22987 [SPARK-25979][SQL] Window function: allow parentheses around window reference ## What changes were proposed in this pull request? Very minor parser bug, but possibly problematic for

[GitHub] spark issue #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark/DataSo...

2018-11-08 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22965 @dongjoon-hyun sure, done.

[GitHub] spark pull request #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark...

2018-11-08 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22965#discussion_r231795384 --- Diff: sql/core/benchmarks/DataSourceReadBenchmark-results.txt --- @@ -2,268 +2,268 @@ SQL Single Numeric Column Scan

[GitHub] spark issue #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark/DataSo...

2018-11-08 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22965 retest this please.

[GitHub] spark issue #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22966 Cool, could you introduce it to Spark? That would be very helpful :) @dbtsai @jleach4 and @aokolnychyi

[GitHub] spark issue #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22966 @dbtsai Great! I was thinking the benchmark in this PR is kind of simple, so I didn't add it for months. The benchmark you mentioned should also be workable for other data so

[GitHub] spark pull request #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark...

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22965#discussion_r231766852 --- Diff: sql/core/benchmarks/DataSourceReadBenchmark-results.txt --- @@ -2,268 +2,268 @@ SQL Single Numeric Column Scan

[GitHub] spark pull request #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark...

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22965#discussion_r231765680 --- Diff: sql/core/benchmarks/DataSourceReadBenchmark-results.txt --- @@ -2,268 +2,268 @@ SQL Single Numeric Column Scan

[GitHub] spark issue #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22966 Done, @dongjoon-hyun PTAL.

[GitHub] spark issue #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark/DataSo...

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22965 @dongjoon-hyun @yucai

[GitHub] spark issue #22966: [SPARK-25965][SQL] Add avro read benchmark

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22966 @dongjoon-hyun

[GitHub] spark pull request #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark...

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22965#discussion_r231549251 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcReadBenchmark.scala --- @@ -32,9 +32,11 @@ import org.apache.spark.sql.types

[GitHub] spark pull request #22966: [SPARK-25965][SQL] Add avro read benchmark

2018-11-07 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/22966 [SPARK-25965][SQL] Add avro read benchmark ## What changes were proposed in this pull request? Add read benchmark for Avro, which is missing for a period. The benchmark is similar

[GitHub] spark pull request #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark...

2018-11-07 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22965#discussion_r231541613 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcReadBenchmark.scala --- @@ -266,8 +268,9 @@ object OrcReadBenchmark extends

[GitHub] spark pull request #22965: [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark...

2018-11-07 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/22965 [SPARK-25964][SQL][Minor] Revise OrcReadBenchmark/DataSourceReadBenchmark case names and execution instructions ## What changes were proposed in this pull request? 1

[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-11-04 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22914 retest this please

[GitHub] spark pull request #22878: [SPARK-25789][SQL] Support for Dataset of Avro

2018-11-01 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22878#discussion_r230041592 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroEncoder.scala --- @@ -0,0 +1,533 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-11-01 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22914 > Returning to current page seems more code change is required. Because we are getting all the parameters related to page from the url. Currently we are not passing any parameter to get wh

[GitHub] spark issue #22864: [SPARK-25861][Minor][WEBUI] Remove unused refreshInterva...

2018-11-01 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22864 retest this please.

[GitHub] spark pull request #22878: [SPARK-25789][SQL] Support for Dataset of Avro

2018-11-01 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22878#discussion_r229952971 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala --- @@ -1374,4 +1377,185 @@ class AvroSuite extends QueryTest with

[GitHub] spark pull request #22878: [SPARK-25789][SQL] Support for Dataset of Avro

2018-11-01 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22878#discussion_r229976669 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala --- @@ -1374,4 +1377,185 @@ class AvroSuite extends QueryTest with

[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-10-31 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22914 > May be we can highlight above the table, that "Invalid page number, falling back to first page" Yes, that's what I mean. No big deal but falling back to th

[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-10-31 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22914 I prefer to just highlight the invalid output. E.g. ![image](https://user-images.githubusercontent.com/1097932/47831557-0e6ea800-ddcc-11e8-9fd1-c4d29f944c9d.png) ![image

[GitHub] spark pull request #22895: [SPARK-25886][SQL][Minor] Improve error message o...

2018-10-30 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22895#discussion_r229564121 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala --- @@ -100,9 +100,14 @@ case class AvroDataToCatalyst
