Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/14215
@HyukjinKwon no problem. Take your time.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/14215
@HyukjinKwon yep, keeping each PR as small as possible is a good idea. BTW,
may I know the target version of your non-vectorized fix? Our production job is
in need of this fix.
Separating
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/14215
@HyukjinKwon Do you have a timeline for this patch?
Also, what's your plan on vectorized parquet reader?
---
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15303#discussion_r81423981
--- Diff:
core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala
---
@@ -53,12 +55,7 @@ private[v1] class ApplicationListResource
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15264
@sameeragarwal I agree. I'm looking forward to a comprehensive patch for
this. Thanks!
---
Github user wgtmac closed the pull request at:
https://github.com/apache/spark/pull/15248
---
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15303
@srowen @ajbozarth I created this PR without adding any new API. I just
rewrote the way getApplicationList constructs the iterator. Can you guys take
a look? Thanks!
---
GitHub user wgtmac opened a pull request:
https://github.com/apache/spark/pull/15303
changed implementation of HistoryServer.getApplicationInfoList for lazy
evaluation
## What changes were proposed in this pull request?
Changed implementation
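The truncated PR description above concerns making getApplicationInfoList lazy. A minimal sketch of the idea in plain Scala, with hypothetical names (AppInfo and the builder functions are illustrative stand-ins, not Spark's actual types):

```scala
// Illustrative sketch of eager vs. lazy construction of the application list.
case class AppInfo(id: String)

// Eager: materializes info for every application before returning anything.
def eagerApps(ids: Seq[String]): Seq[AppInfo] = ids.map(AppInfo(_))

// Lazy: wraps the listing in an Iterator, so each AppInfo is built only
// when the caller pulls it; take(n) touches just the first n elements.
def lazyApps(ids: Seq[String]): Iterator[AppInfo] = ids.iterator.map(AppInfo(_))

val firstTwo = lazyApps(Seq("app-1", "app-2", "app-3")).take(2).toList
```

With a large event-log directory, the lazy form lets a bounded request avoid constructing entries it will never serve.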
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15247#discussion_r81220197
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
---
@@ -109,4 +109,11 @@ private[history] abstract class
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15248#discussion_r81062518
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -179,7 +180,11 @@ class HistoryServer(
}
--- End diff
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15247
Jenkins, retest this please
---
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15247
Another test failed in mllib:
org.apache.spark.mllib.classification.NaiveBayesSuite.Naive Bayes Multinomial
@ajbozarth Have you seen this kind of unrelated failure before? Do I have
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15248#discussion_r81027833
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -179,7 +180,11 @@ class HistoryServer(
}
--- End diff
GitHub user wgtmac reopened a pull request:
https://github.com/apache/spark/pull/15247
[SPARK-17672] Spark 2.0 history server web UI takes too long for a single
application
## What changes were proposed in this pull request?
Added a new API getApplicationInfo(appId: String
Github user wgtmac closed the pull request at:
https://github.com/apache/spark/pull/15247
---
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15247
Does anyone know why the last test failed in the sql module? My change has
nothing to do with it.
---
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15248#discussion_r81007488
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -179,7 +180,11 @@ class HistoryServer(
}
--- End diff
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15264
@HyukjinKwon Yup. I made a mistake in managing my branches, so I
decided to create the PR again. Sorry for the confusion...
---
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15248#discussion_r80818667
--- Diff:
core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala
---
@@ -42,7 +44,7 @@ private[v1] class ApplicationListResource
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15247#discussion_r80806416
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala ---
@@ -279,7 +279,8 @@ private[history] class ApplicationCache
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15248#discussion_r80804405
--- Diff:
core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala ---
@@ -222,6 +222,7 @@ private[spark] object ApiRootResource
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15248#discussion_r80791402
--- Diff:
core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala ---
@@ -222,6 +222,7 @@ private[spark] object ApiRootResource
GitHub user wgtmac opened a pull request:
https://github.com/apache/spark/pull/15264
[SPARK-17477][SQL] SparkSQL cannot handle schema evolution from Int -> Long
when parquet files have Int as its type while hive metastore has Long as its
type
## What changes were propo
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15248#discussion_r80754121
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -178,6 +178,23 @@ class HistoryServer(
provider.getListing
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/15248#discussion_r80753534
--- Diff:
core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala
---
@@ -32,7 +32,14 @@ private[v1] class ApplicationListResource
GitHub user wgtmac opened a pull request:
https://github.com/apache/spark/pull/15248
[SPARK-17671] Spark 2.0 history server summary page is slow even with
spark.history.ui.maxApplications set
## What changes were proposed in this pull request?
Added an overloaded method
GitHub user wgtmac opened a pull request:
https://github.com/apache/spark/pull/15247
[SPARK-17672] Spark 2.0 history server web UI takes too long for a single
application
## What changes were proposed in this pull request?
Added a new API getApplicationInfo(appId: String
Github user wgtmac closed the pull request at:
https://github.com/apache/spark/pull/15155
---
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15155
@HyukjinKwon Yup, this PR is very similar to yours.
As for merging the parquet schemas, it won't work. Consider this: the table
contains two parquet files, one has int and the other has long. The DataFrame
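A toy model of the two-file type conflict described above (the types and the merge rule are illustrative stand-ins, not Parquet's or Spark's actual schema-merging code):

```scala
// Hypothetical column types standing in for parquet's int32/int64.
sealed trait ColType
case object CInt extends ColType
case object CLong extends ColType

// Merging the per-file schemas must settle on a single column type; unless
// the merge promotes Int to the wider Long, the two files are incompatible.
def mergeType(a: ColType, b: ColType): Either[String, ColType] = (a, b) match {
  case (x, y) if x == y              => Right(x)
  case (CInt, CLong) | (CLong, CInt) => Right(CLong) // promote to wider type
  case _                             => Left(s"incompatible: $a vs $b")
}
```

The point of the comment is that a plain schema merge without such a promotion rule cannot reconcile the two files, which is why the fix has to live in the read path instead.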
GitHub user wgtmac opened a pull request:
https://github.com/apache/spark/pull/15155
[SPARK-17477][SQL] SparkSQL cannot handle schema evolution from Int -…
Using SparkSession in Spark 2.0 to read a Hive table which
Github user wgtmac closed the pull request at:
https://github.com/apache/spark/pull/15035
---
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15035
Just confirmed that this also doesn't work with the vectorized reader. What I
did was as follows:
1. Created a flat hive table with schema "name: String, id: Long". But the
parquet
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15035
@HyukjinKwon Yup, that makes sense. Do you have any idea where the best
place to fix this would be?
---
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15035
@JoshRosen yes, it may risk masking overflow. This conversion happens when
the user-provided schema or the hive metastore schema has Long but the parquet
files have Int as the schema. We cannot avoid
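For context, the conversion under discussion is a widening one. A sketch in plain Scala (a stand-in for the reader path, not Spark's actual code) of promoting Int values stored in the files to the Long type the metastore declares:

```scala
// Widening Int (what the parquet files store) to Long (what the metastore
// schema declares). The upcast itself is lossless: every Int fits in a Long.
def widenToLong(fileValues: Array[Int]): Array[Long] =
  fileValues.map(_.toLong)

val widened = widenToLong(Array(1, Int.MaxValue, -7))
```

The widening itself cannot overflow; the concern raised is about what such implicit conversions might hide elsewhere in the pipeline.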
Github user wgtmac commented on the issue:
https://github.com/apache/spark/pull/15035
@HyukjinKwon This is not parquet-specific; it applies to other data sources
as well.
1. Change the reading path for parquet: It does not solve the problem. Some
queries need to read all parquet
GitHub user wgtmac opened a pull request:
https://github.com/apache/spark/pull/15035
[SPARK-17477]: SparkSQL cannot handle schema evolution from Int -> Lo…
## What changes were proposed in this pull request?
Using SparkSession in Spark 2.0 to read a Hive table wh
Github user wgtmac commented on a diff in the pull request:
https://github.com/apache/spark/pull/14835#discussion_r76490832
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -171,7 +175,7 @@ class HistoryServer(
* @return List of all