[GitHub] spark issue #14215: [SPARK-16544][SQL][WIP] Support for conversion from comp...

2016-10-09 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/14215 @HyukjinKwon no problem. Take your time. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #14215: [SPARK-16544][SQL][WIP] Support for conversion from comp...

2016-10-07 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/14215 @HyukjinKwon Yep, keeping each PR as small as possible is a good idea. BTW, may I know the target version of your non-vectorized fix? Our production job needs this fix. Separating

[GitHub] spark issue #14215: [SPARK-16544][SQL][WIP] Support for conversion from comp...

2016-10-06 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/14215 @HyukjinKwon Do you have a timeline for this patch? Also, what's your plan for the vectorized Parquet reader?

[GitHub] spark pull request #15303: [SPARK-17671] Changed implementation of HistorySe...

2016-09-30 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15303#discussion_r81423981 --- Diff: core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala --- @@ -53,12 +55,7 @@ private[v1] class ApplicationListResource

[GitHub] spark issue #15264: [SPARK-17477][SQL] SparkSQL cannot handle schema evoluti...

2016-09-29 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15264 @sameeragarwal I agree. I'm looking forward to a comprehensive patch for this. Thanks!

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-29 Thread wgtmac
Github user wgtmac closed the pull request at: https://github.com/apache/spark/pull/15248

[GitHub] spark issue #15303: [SPARK-17671] Changed implementation of HistoryServer.ge...

2016-09-29 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15303 @srowen @ajbozarth I created this PR without adding any new API. I just rewrote the way getApplicationList constructs the iterator. Can you guys take a look? Thanks!

[GitHub] spark pull request #15303: changed implementation of HistoryServer.getApplic...

2016-09-29 Thread wgtmac
GitHub user wgtmac opened a pull request: https://github.com/apache/spark/pull/15303 changed implementation of HistoryServer.getApplicationInfoList for lazy evaluation ## What changes were proposed in this pull request? Changed implementation

[GitHub] spark pull request #15247: [SPARK-17672] Spark 2.0 history server web Ui tak...

2016-09-29 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15247#discussion_r81220197 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala --- @@ -109,4 +109,11 @@ private[history] abstract class

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-28 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15248#discussion_r81062518 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -179,7 +180,11 @@ class HistoryServer( } --- End diff

[GitHub] spark issue #15247: [SPARK-17672] Spark 2.0 history server web Ui takes too ...

2016-09-28 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15247 Jenkins, retest this please

[GitHub] spark issue #15247: [SPARK-17672] Spark 2.0 history server web Ui takes too ...

2016-09-28 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15247 Another test failed in mllib: org.apache.spark.mllib.classification.NaiveBayesSuite.Naive Bayes Multinomial @ajbozarth Have you seen this kind of unrelated failure before? Do I have

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-28 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15248#discussion_r81027833 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -179,7 +180,11 @@ class HistoryServer( } --- End diff

[GitHub] spark pull request #15247: [SPARK-17672] Spark 2.0 history server web Ui tak...

2016-09-28 Thread wgtmac
GitHub user wgtmac reopened a pull request: https://github.com/apache/spark/pull/15247 [SPARK-17672] Spark 2.0 history server web Ui takes too long for a single application ## What changes were proposed in this pull request? Added a new API getApplicationInfo(appId: String

[GitHub] spark pull request #15247: [SPARK-17672] Spark 2.0 history server web Ui tak...

2016-09-28 Thread wgtmac
Github user wgtmac closed the pull request at: https://github.com/apache/spark/pull/15247

[GitHub] spark issue #15247: [SPARK-17672] Spark 2.0 history server web Ui takes too ...

2016-09-28 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15247 Does anyone know why the last test failed in the sql module? My change has nothing to do with it.

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-28 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15248#discussion_r81007488 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -179,7 +180,11 @@ class HistoryServer( } --- End diff

[GitHub] spark issue #15264: [SPARK-17477][SQL] SparkSQL cannot handle schema evoluti...

2016-09-27 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15264 @HyukjinKwon Yup. I made a mistake in managing my branches, so I decided to create the PR again. Sorry for the confusion...

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-27 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15248#discussion_r80818667 --- Diff: core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala --- @@ -42,7 +44,7 @@ private[v1] class ApplicationListResource

[GitHub] spark pull request #15247: [SPARK-17672] Spark 2.0 history server web Ui tak...

2016-09-27 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15247#discussion_r80806416 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -279,7 +279,8 @@ private[history] class ApplicationCache

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-27 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15248#discussion_r80804405 --- Diff: core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala --- @@ -222,6 +222,7 @@ private[spark] object ApiRootResource

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-27 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15248#discussion_r80791402 --- Diff: core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala --- @@ -222,6 +222,7 @@ private[spark] object ApiRootResource

[GitHub] spark pull request #15264: [SPARK-17477][SQL] SparkSQL cannot handle schema ...

2016-09-27 Thread wgtmac
GitHub user wgtmac opened a pull request: https://github.com/apache/spark/pull/15264 [SPARK-17477][SQL] SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type ## What changes were propo

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-27 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15248#discussion_r80754121 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -178,6 +178,23 @@ class HistoryServer( provider.getListing

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-27 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/15248#discussion_r80753534 --- Diff: core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala --- @@ -32,7 +32,14 @@ private[v1] class ApplicationListResource

[GitHub] spark pull request #15248: [SPARK-17671] Spark 2.0 history server summary pa...

2016-09-26 Thread wgtmac
GitHub user wgtmac opened a pull request: https://github.com/apache/spark/pull/15248 [SPARK-17671] Spark 2.0 history server summary page is slow even set spark.history.ui.maxApplications ## What changes were proposed in this pull request? Added an overloaded method

[GitHub] spark pull request #15247: [SPARK-17672] Spark 2.0 history server web Ui tak...

2016-09-26 Thread wgtmac
GitHub user wgtmac opened a pull request: https://github.com/apache/spark/pull/15247 [SPARK-17672] Spark 2.0 history server web Ui takes too long for a single application ## What changes were proposed in this pull request? Added a new API getApplicationInfo(appId: String

[GitHub] spark pull request #15155: [SPARK-17477][SQL] SparkSQL cannot handle schema ...

2016-09-26 Thread wgtmac
Github user wgtmac closed the pull request at: https://github.com/apache/spark/pull/15155

[GitHub] spark issue #15155: [SPARK-17477][SQL] SparkSQL cannot handle schema evoluti...

2016-09-19 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15155 @HyukjinKwon Yup this PR is very similar to yours. For merging parquet schema, it won't work. Think about this: the table contains two parquet files, one has int, one has long. The DataFrame

[GitHub] spark pull request #15155: [SPARK-17477][SQL] SparkSQL cannot handle schema ...

2016-09-19 Thread wgtmac
GitHub user wgtmac opened a pull request: https://github.com/apache/spark/pull/15155 [SPARK-17477][SQL] SparkSQL cannot handle schema evolution from Int -… ## What changes were proposed in this pull request? Using SparkSession in Spark 2.0 to read a Hive table which

[GitHub] spark pull request #15035: [SPARK-17477]: SparkSQL cannot handle schema evol...

2016-09-19 Thread wgtmac
Github user wgtmac closed the pull request at: https://github.com/apache/spark/pull/15035

[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-15 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15035 Just confirmed that this also doesn't work with vectorized reader. What I did is as follows: 1. Created a flat hive table with schema "name: String, id: Long". But the parquet

[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-12 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15035 @HyukjinKwon Yup, that makes sense. Do you have any idea where the best place to fix this is?

[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-12 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15035 @JoshRosen Yes, it may mask an overflow risk. This conversion happens when the user-provided schema or the Hive metastore schema has Long but the parquet files have Int as the schema. We cannot avoid

[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-12 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15035 @HyukjinKwon This is not parquet specific, it applies to other data sources as well. 1. Change the reading path for parquet: It does not solve the problem. Some queries need to read all parquet

[GitHub] spark pull request #15035: [SPARK-17477]: SparkSQL cannot handle schema evol...

2016-09-09 Thread wgtmac
GitHub user wgtmac opened a pull request: https://github.com/apache/spark/pull/15035 [SPARK-17477]: SparkSQL cannot handle schema evolution from Int -> Lo… ## What changes were proposed in this pull request? Using SparkSession in Spark 2.0 to read a Hive table wh

[GitHub] spark pull request #14835: [SPARK-17243] [Web UI] Spark 2.0 History Server w...

2016-08-26 Thread wgtmac
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/spark/pull/14835#discussion_r76490832 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -171,7 +175,7 @@ class HistoryServer( * @return List of all