[GitHub] [spark] AmplabJenkins commented on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink
AmplabJenkins commented on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink URL: https://github.com/apache/spark/pull/24900#issuecomment-502959172 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11846/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink
AmplabJenkins commented on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink URL: https://github.com/apache/spark/pull/24900#issuecomment-502959170 Merged build finished. Test PASSed.
[GitHub] [spark] gengliangwang commented on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink
gengliangwang commented on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink URL: https://github.com/apache/spark/pull/24900#issuecomment-502958978 This PR also resolves several TODO items in https://github.com/apache/spark/pull/24830.
[GitHub] [spark] gengliangwang edited a comment on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink
gengliangwang edited a comment on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink URL: https://github.com/apache/spark/pull/24900#issuecomment-502958978 This PR also resolves several TODO testing items in https://github.com/apache/spark/pull/24830.
[GitHub] [spark] gengliangwang edited a comment on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink
gengliangwang edited a comment on issue #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink URL: https://github.com/apache/spark/pull/24900#issuecomment-502958978 This PR also resolves several TODO testing items in https://github.com/apache/spark/pull/24830.
[GitHub] [spark] gengliangwang opened a new pull request #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink
gengliangwang opened a new pull request #24900: [SPARK-28089][SQL] File source v2: support reading output of file streaming Sink URL: https://github.com/apache/spark/pull/24900

## What changes were proposed in this pull request?

File source V1 supports reading the output of FileStreamSink as a batch (https://github.com/apache/spark/pull/11897). We should support this in file source V2 as well. When reading with paths, we first check whether there is a metadata log of FileStreamSink. If there is, we use `MetadataLogFileIndex` for listing files; otherwise, we use `InMemoryFileIndex`.

## How was this patch tested?

Unit test
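The selection rule in the description above can be sketched roughly as follows. This is a minimal, Spark-free illustration rather than the code in the PR: `FileIndexChoice`, `FileIndexSelection`, and `hasSinkMetadata` are hypothetical names, the check uses the local file system instead of Hadoop's `FileSystem`, and it assumes that the sink's metadata log lives in a `_spark_metadata` directory under a single output path.

```scala
import java.nio.file.{Files, Path}

// Hypothetical stand-ins for the two file index implementations named in the PR.
sealed trait FileIndexChoice
case object UseMetadataLogFileIndex extends FileIndexChoice
case object UseInMemoryFileIndex extends FileIndexChoice

object FileIndexSelection {
  // Assumed name of the directory where FileStreamSink keeps its metadata log.
  val MetadataDir = "_spark_metadata"

  // Assumption: only a single path containing the metadata directory
  // counts as FileStreamSink output; anything else is a plain listing.
  def hasSinkMetadata(paths: Seq[Path]): Boolean = paths match {
    case Seq(single) => Files.isDirectory(single.resolve(MetadataDir))
    case _ => false
  }

  // Metadata log present: list files from the log. Otherwise: list the directory.
  def choose(paths: Seq[Path]): FileIndexChoice =
    if (hasSinkMetadata(paths)) UseMetadataLogFileIndex else UseInMemoryFileIndex
}
```

Listing from the metadata log matters because a streaming sink directory may contain files from failed or uncommitted batches; a plain directory listing would pick those up.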
[GitHub] [spark] jerryshao commented on a change in pull request #24848: [SPARK-28014][core] All waiting apps will be changed to the wrong state of Running after master changed.
jerryshao commented on a change in pull request #24848: [SPARK-28014][core] All waiting apps will be changed to the wrong state of Running after master changed. URL: https://github.com/apache/spark/pull/24848#discussion_r294620175

## File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala

@@ -561,7 +561,7 @@ private[deploy] class Master(
     apps.filter(_.state == ApplicationState.UNKNOWN).foreach(finishApplication)

     // Update the state of recovered apps to RUNNING
-    apps.filter(_.state == ApplicationState.WAITING).foreach(_.state = ApplicationState.RUNNING)
+    apps.filter(_.coresGranted > 0).foreach(_.state = ApplicationState.RUNNING)

Review comment: With dynamic executor allocation enabled, the number of executors can be 0 while the application is idle (when the minimum executor number is set to 0). In that scenario we can hardly say that the application is waiting (and it is not finished either).
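To make the reviewer's concern concrete, here is a small self-contained sketch. `AppInfo` and `RecoverySketch` are simplified, hypothetical stand-ins for the master's bookkeeping, not the classes in `Master.scala`; only the two filter rules are taken from the diff.

```scala
object ApplicationState extends Enumeration {
  val WAITING, RUNNING, UNKNOWN = Value
}

// Hypothetical, simplified per-application record.
final case class AppInfo(id: String, var state: ApplicationState.Value, coresGranted: Int)

object RecoverySketch {
  // Rule on the removed line: every recovered WAITING app becomes RUNNING.
  def promoteWaiting(apps: Seq[AppInfo]): Unit =
    apps.filter(_.state == ApplicationState.WAITING)
      .foreach(app => app.state = ApplicationState.RUNNING)

  // Rule on the added line: only apps that currently hold cores become RUNNING.
  def promoteWithCores(apps: Seq[AppInfo]): Unit =
    apps.filter(_.coresGranted > 0)
      .foreach(app => app.state = ApplicationState.RUNNING)
}
```

Under the second rule, an application that has scaled down to zero executors through dynamic allocation holds no cores and therefore stays in WAITING after recovery, even though it had been running before the master changed. That is the case the review comment points at.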
[GitHub] [spark] AmplabJenkins removed a comment on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function
AmplabJenkins removed a comment on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function URL: https://github.com/apache/spark/pull/24899#issuecomment-502957484 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function
AmplabJenkins removed a comment on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function URL: https://github.com/apache/spark/pull/24899#issuecomment-502957492 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11845/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function
AmplabJenkins commented on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function URL: https://github.com/apache/spark/pull/24899#issuecomment-502957484 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function
AmplabJenkins commented on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function URL: https://github.com/apache/spark/pull/24899#issuecomment-502957492 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11845/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function
SparkQA commented on issue #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function URL: https://github.com/apache/spark/pull/24899#issuecomment-502956254 **[Test build #106608 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106608/testReport)** for PR 24899 at commit [`1fc1455`](https://github.com/apache/spark/commit/1fc1455486b8b7828021b472281099053379d506).
[GitHub] [spark] wangyum opened a new pull request #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function
wangyum opened a new pull request #24899: [SPARK-28088][SQL] Enhance LPAD/RPAD function URL: https://github.com/apache/spark/pull/24899

## What changes were proposed in this pull request?

This PR enhances the `LPAD`/`RPAD` functions to make `pad` optional.

## How was this patch tested?

Unit tests
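As an illustration of what an optional `pad` argument means, here is a plain Scala sketch. It is not the expression code from the PR: `PadSketch` is a hypothetical helper, and it assumes that the default pad is a single space and that a string longer than `len` is truncated to `len` characters.

```scala
object PadSketch {
  // Builds exactly `needed` characters by repeating `pad` and trimming the tail.
  private def fill(pad: String, needed: Int): String =
    (pad * (needed / pad.length + 1)).substring(0, needed)

  // Left-pads `str` to `len` characters; `pad` defaults to a space (assumption).
  def lpad(str: String, len: Int, pad: String = " "): String =
    if (len <= 0) ""
    else if (str.length >= len) str.substring(0, len)
    else if (pad.isEmpty) str
    else fill(pad, len - str.length) + str

  // Right-pads `str` to `len` characters; `pad` defaults to a space (assumption).
  def rpad(str: String, len: Int, pad: String = " "): String =
    if (len <= 0) ""
    else if (str.length >= len) str.substring(0, len)
    else if (pad.isEmpty) str
    else str + fill(pad, len - str.length)
}
```

With the default in place, `PadSketch.lpad("hi", 5)` and `PadSketch.lpad("hi", 5, " ")` give the same result, which is the convenience the PR is after.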
[GitHub] [spark] JkSelf commented on a change in pull request #21950: [SPARK-24914][SQL] Add configuration to avoid OOM during broadcast join (and other negative side effects of incorrect table sizing)
JkSelf commented on a change in pull request #21950: [SPARK-24914][SQL] Add configuration to avoid OOM during broadcast join (and other negative side effects of incorrect table sizing) URL: https://github.com/apache/spark/pull/21950#discussion_r294613350

## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala

@@ -1054,11 +1056,18 @@ private[hive] object HiveClientImpl {
     // When table is external, `totalSize` is always zero, which will influence join strategy.
     // So when `totalSize` is zero, use `rawDataSize` instead. When `rawDataSize` is also zero,
     // return None.
+    // If a table has a deserialization factor, the table owner expects the in-memory
+    // representation of the table to be larger than the table's totalSize value. In that case,
+    // multiply totalSize by the deserialization factor and use that number instead.
+    // If the user has set spark.sql.statistics.ignoreRawDataSize to true (because of HIVE-20079,
+    // for example), don't use rawDataSize.
     // In Hive, when statistics gathering is disabled, `rawDataSize` and `numRows` is always
     // zero after INSERT command. So they are used here only if they are larger than zero.
-    if (totalSize.isDefined && totalSize.get > 0L) {
-      Some(CatalogStatistics(sizeInBytes = totalSize.get, rowCount = rowCount.filter(_ > 0)))
-    } else if (rawDataSize.isDefined && rawDataSize.get > 0) {
+    val adjustedSize = DataSourceUtils.calcDataSize(properties, totalSize.getOrElse(BigInt(0)))
+    val sqlConf = SQLConf.get
+    if (adjustedSize > 0L) {
+      Some(CatalogStatistics(sizeInBytes = adjustedSize, rowCount = rowCount.filter(_ > 0)))
+    } else if (rawDataSize.isDefined && rawDataSize.get > 0 && !sqlConf.ignoreRawDataSize) {

Review comment: @bersprockets Can we calculate the raw size here based on the row count, instead of not using the raw size at all? Thanks.
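The fallback order described in the diff's comments can be sketched without any Spark or Hive dependency. This is a simplified illustration under stated assumptions: `TableSizeSketch` and its parameters are hypothetical, the deserialization factor is passed in directly rather than read from table properties, and the result is a bare size rather than a `CatalogStatistics` value.

```scala
object TableSizeSketch {
  // Returns the size estimate to use for planning, or None if nothing usable exists.
  def estimateSizeInBytes(
      totalSize: Option[BigInt],
      rawDataSize: Option[BigInt],
      deserFactor: Option[Int],      // hypothetical: factor supplied by the table owner
      ignoreRawDataSize: Boolean     // mirrors spark.sql.statistics.ignoreRawDataSize
  ): Option[BigInt] = {
    // Scale totalSize by the factor when one is set; a missing factor means 1.
    val adjusted = totalSize.getOrElse(BigInt(0)) * deserFactor.getOrElse(1)
    if (adjusted > 0) Some(adjusted)
    // Fall back to rawDataSize only when it is positive and not disabled.
    else if (!ignoreRawDataSize && rawDataSize.exists(_ > 0)) rawDataSize
    else None
  }
}
```

The review question above concerns the last branch: when `rawDataSize` is ignored, the sketch returns `None`, whereas the reviewer suggests deriving an estimate from the row count instead.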
[GitHub] [spark] AmplabJenkins removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
AmplabJenkins removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502947251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106601/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
AmplabJenkins removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502947247 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
AmplabJenkins commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502947247 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
AmplabJenkins commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502947251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106601/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
SparkQA removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502914590 **[Test build #106601 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106601/testReport)** for PR 24894 at commit [`9b684af`](https://github.com/apache/spark/commit/9b684af44a7039d3d9b9604e555dd653e276cda0).
[GitHub] [spark] SparkQA commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
SparkQA commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502946924 **[Test build #106601 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106601/testReport)** for PR 24894 at commit [`9b684af`](https://github.com/apache/spark/commit/9b684af44a7039d3d9b9604e555dd653e276cda0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] HyukjinKwon closed pull request #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
HyukjinKwon closed pull request #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894
[GitHub] [spark] HyukjinKwon commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
HyukjinKwon commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502945477 Merged to master and branch-2.4
[GitHub] [spark] dongjoon-hyun commented on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+
dongjoon-hyun commented on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502945094 Thank you for approval, @cloud-fan !
[GitHub] [spark] AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502944337 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502944337 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502944344 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11844/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502944344 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11844/ Test PASSed.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
HyukjinKwon commented on a change in pull request #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#discussion_r294606676

## File path: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala

@@ -54,13 +55,36 @@ object PythonRunner {
     // Launch a Py4J gateway server for the process to connect to; this will let it see our
     // Java system properties and such
     val localhost = InetAddress.getLoopbackAddress()
-    val gatewayServer = new py4j.GatewayServer.GatewayServerBuilder()
-      .authToken(secret)
-      .javaPort(0)
-      .javaAddress(localhost)
-      .callbackClient(py4j.GatewayServer.DEFAULT_PYTHON_PORT, localhost, secret)
-      .build()
-    val thread = new Thread(() => Utils.logUncaughtExceptions { gatewayServer.start() })
+
+    val server = if (sys.env.getOrElse(
+        "PYSPARK_PIN_THREAD", "true").toLowerCase(Locale.ROOT) == "true") {
+      val clientServer = new py4j.ClientServer.ClientServerBuilder()
+        .authToken(secret)
+        .javaPort(0)
+        .javaAddress(localhost)
+        .build()
+
+      (() => clientServer.startServer(),
+        () => clientServer.getJavaServer.getListeningPort,
+        () => clientServer.shutdown())
+    } else {
+      val gatewayServer = new py4j.GatewayServer.GatewayServerBuilder()
+        .authToken(secret)
+        .javaPort(0)
+        .javaAddress(localhost)
+        .callbackClient(py4j.GatewayServer.DEFAULT_PYTHON_PORT, localhost, secret)
+        .build()
+
+      (() => gatewayServer.start(),
+        () => gatewayServer.getListeningPort,
+        () => gatewayServer.shutdown())
+    }
+
+    val start: () => Unit = server._1
+    val getPortNum: () => Int = server._2
+    val shutdown: () => Unit = server._3

Review comment: Let me try to clean up later ...
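One possible shape for the cleanup mentioned in the review comment is to replace the tuple of three closures with a small interface. The sketch below is only an illustration, not what the PR ended up doing: `ServerHandle`, `StubServer`, and `ServerSelection` are hypothetical names, the stub stands in for the Py4J servers so that the example runs without Py4J on the classpath, and the port numbers are made up. The environment check mirrors the one in the diff, including its default of `"true"` at this WIP commit.

```scala
import java.util.Locale

// Hypothetical interface; not part of Spark or Py4J.
trait ServerHandle {
  def name: String
  def start(): Unit
  def listeningPort: Int
  def shutdown(): Unit
}

// Stand-in implementation so the sketch is runnable on its own.
final class StubServer(val name: String, port: Int) extends ServerHandle {
  private var running = false
  def start(): Unit = { running = true }
  def listeningPort: Int = if (running) port else -1
  def shutdown(): Unit = { running = false }
}

object ServerSelection {
  // Same comparison as in the diff: case-insensitive match against "true".
  def pinThreadEnabled(env: Map[String, String]): Boolean =
    env.getOrElse("PYSPARK_PIN_THREAD", "true").toLowerCase(Locale.ROOT) == "true"

  // In real code the two branches would wrap ClientServer and GatewayServer.
  def select(env: Map[String, String]): ServerHandle =
    if (pinThreadEnabled(env)) new StubServer("ClientServer", 25333)
    else new StubServer("GatewayServer", 25334)
}
```

Callers would then write `server.start()`, `server.listeningPort`, and `server.shutdown()` instead of unpacking `_1`, `_2`, and `_3`.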
[GitHub] [spark] SparkQA commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
SparkQA commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943440 **[Test build #106607 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106607/testReport)** for PR 24898 at commit [`a546fc9`](https://github.com/apache/spark/commit/a546fc908c2239cb10a6d2a3802f7fe43ae27089).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943118 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11843/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943111 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943081 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106606/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943076 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943076 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
SparkQA removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502942099 **[Test build #106606 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106606/testReport)** for PR 24898 at commit [`0948598`](https://github.com/apache/spark/commit/094859872d81a74831064fffe39e253c5e55b4a3).
[GitHub] [spark] AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943081 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106606/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
SparkQA commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943072 **[Test build #106606 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106606/testReport)** for PR 24898 at commit [`0948598`](https://github.com/apache/spark/commit/094859872d81a74831064fffe39e253c5e55b4a3).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943111 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502943118 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11843/ Test PASSed.
[GitHub] [spark] cloud-fan closed pull request #24891: [SPARK-28075][SQL] Enhance TRIM function
cloud-fan closed pull request #24891: [SPARK-28075][SQL] Enhance TRIM function URL: https://github.com/apache/spark/pull/24891
[GitHub] [spark] SparkQA commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
SparkQA commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502942099 **[Test build #106606 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106606/testReport)** for PR 24898 at commit [`0948598`](https://github.com/apache/spark/commit/094859872d81a74831064fffe39e253c5e55b4a3).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502941620 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins removed a comment on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502941630 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11842/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502941630 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11842/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
AmplabJenkins commented on issue #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-502941620 Merged build finished. Test PASSed.
[GitHub] [spark] cloud-fan commented on issue #24891: [SPARK-28075][SQL] Enhance TRIM function
cloud-fan commented on issue #24891: [SPARK-28075][SQL] Enhance TRIM function URL: https://github.com/apache/spark/pull/24891#issuecomment-502941414 thanks, merging to master!
[GitHub] [spark] HyukjinKwon commented on a change in pull request #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
HyukjinKwon commented on a change in pull request #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#discussion_r294605232
## File path: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ##
@@ -54,13 +55,32 @@ object PythonRunner {
     // Launch a Py4J gateway server for the process to connect to; this will let it see our
     // Java system properties and such
     val localhost = InetAddress.getLoopbackAddress()
-    val gatewayServer = new py4j.GatewayServer.GatewayServerBuilder()
-      .authToken(secret)
-      .javaPort(0)
-      .javaAddress(localhost)
-      .callbackClient(py4j.GatewayServer.DEFAULT_PYTHON_PORT, localhost, secret)
-      .build()
-    val thread = new Thread(() => Utils.logUncaughtExceptions { gatewayServer.start() })
+
+    val start, getPortNum, shutdown = if (sys.env.getOrElse(
+        "PYSPARK_PIN_THREAD", "true").toLowerCase(Locale.ROOT) == "true") {
Review comment: note to myself: need to make it false default before merging.
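The environment-variable check quoted in the diff above can be sketched in isolation. This is only an illustration, not code from the PR: `PinThreadFlagSketch` and `pinThreadEnabled` are hypothetical names, and the environment is passed in as a `Map` so the behaviour can be exercised without touching `sys.env`. The default is a parameter set to `false`, which is what the review note says the flag should become before merging (the WIP diff itself uses `"true"`).

```scala
import java.util.Locale

// Hypothetical helper, for illustration only (not part of the PR).
object PinThreadFlagSketch {
  // Mirrors the comparison in the quoted diff: the value is lower-cased with
  // Locale.ROOT and compared to "true". Anything else, including typos such
  // as "yes" or "1", counts as disabled.
  def pinThreadEnabled(env: Map[String, String], default: Boolean = false): Boolean =
    env.getOrElse("PYSPARK_PIN_THREAD", default.toString)
      .toLowerCase(Locale.ROOT) == "true"
}
```

With the real environment this would be called as `PinThreadFlagSketch.pinThreadEnabled(sys.env)`. Lower-casing with `Locale.ROOT` keeps the comparison independent of the JVM's default locale.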
[GitHub] [spark] dongjoon-hyun commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
dongjoon-hyun commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-502941505 Hi, @rednaxelafx . Shall we close this PR for now? You can create an Apache Spark JIRA and restart when you're ready~. :)
[GitHub] [spark] AmplabJenkins removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
AmplabJenkins removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502940995 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
AmplabJenkins removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502941001 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106603/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
AmplabJenkins commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502941001 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106603/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
AmplabJenkins commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502940995 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
SparkQA commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502940759 **[Test build #106603 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106603/testReport)** for PR 24868 at commit [`fce9f54`](https://github.com/apache/spark/commit/fce9f54632927f45bfcc50ca096aa8e105cbfe54).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
SparkQA removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502928603 **[Test build #106603 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106603/testReport)** for PR 24868 at commit [`fce9f54`](https://github.com/apache/spark/commit/fce9f54632927f45bfcc50ca096aa8e105cbfe54).
[GitHub] [spark] cloud-fan commented on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+
cloud-fan commented on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502940116 LGTM
[GitHub] [spark] HyukjinKwon opened a new pull request #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
HyukjinKwon opened a new pull request #24898: [WIP][SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898 ## What changes were proposed in this pull request? This PR proposes to pin Python thread into JVM's. WIP. Wanted to check if all tests pass. ## How was this patch tested? Manually.
[GitHub] [spark] dongjoon-hyun commented on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+
dongjoon-hyun commented on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502938150 Hi, @cloud-fan, @srowen, @MaxGekk. This PR (title/description/content) is updated with @cloud-fan's suggestion, which replaces `$tf.apply` with `$tf$$.MODULE$$.apply`. Initially, I thought we didn't have test coverage for the other two cases that @MaxGekk pointed out, but we already have one test covering them. The implementation was correct, so it passed on JDK11. I added only one additional test and added `SPARK-28072` to the relevant test code.
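For readers unfamiliar with the `$tf$$.MODULE$$.apply` form mentioned above: inside a Scala `s"..."` interpolated string, `$$` is the escape for a literal `$`, so the generated Java text becomes `<name>$.MODULE$.apply(...)`, a call on the companion object's singleton instance (`MODULE$` is the static field that scalac emits on the `Name$` class of an `object`). A minimal sketch of the two string templates follows; `CompanionCallSketch`, its method names, and `org.example.Formatter` are hypothetical and do not come from the PR.

```scala
// Hypothetical helper, for illustration only (not Spark's codegen).
object CompanionCallSketch {
  // Form discussed as the fix: "$$" yields a literal "$", so the emitted
  // Java source calls apply on the companion's MODULE$ instance.
  def companionApply(className: String, args: String): String =
    s"${className}$$.MODULE$$.apply(${args})"

  // Form that was replaced: a plain static-style call on the class name.
  def staticStyleApply(className: String, args: String): String =
    s"${className}.apply(${args})"
}
```

For example, `CompanionCallSketch.companionApply("org.example.Formatter", "fmt, tz")` produces the text `org.example.Formatter$.MODULE$.apply(fmt, tz)`, while `staticStyleApply` produces `org.example.Formatter.apply(fmt, tz)`.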
[GitHub] [spark] SparkQA commented on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+
SparkQA commented on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502937476 **[Test build #106605 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106605/testReport)** for PR 24889 at commit [`c15ee4b`](https://github.com/apache/spark/commit/c15ee4bc5cf50a0c45f6498018b4432860a75447).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+
AmplabJenkins removed a comment on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502937209 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11841/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+
AmplabJenkins removed a comment on issue #24889: [SPARK-28072][SQL] Fix IncompatibleClassChangeError in `FromUnixTime` codegen on JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502937205 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #17750: [SPARK-4899][MESOS] Support for Checkpointing on Coarse Grained Mode
AmplabJenkins removed a comment on issue #17750: [SPARK-4899][MESOS] Support for Checkpointing on Coarse Grained Mode URL: https://github.com/apache/spark/pull/17750#issuecomment-395924787 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+
AmplabJenkins commented on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502937205 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+
AmplabJenkins commented on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502937209 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11841/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #17750: [SPARK-4899][MESOS] Support for Checkpointing on Coarse Grained Mode
AmplabJenkins commented on issue #17750: [SPARK-4899][MESOS] Support for Checkpointing on Coarse Grained Mode URL: https://github.com/apache/spark/pull/17750#issuecomment-502937123 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types
AmplabJenkins removed a comment on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types URL: https://github.com/apache/spark/pull/24872#issuecomment-502936132 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types
AmplabJenkins commented on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types URL: https://github.com/apache/spark/pull/24872#issuecomment-502936134 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106600/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types
AmplabJenkins commented on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types URL: https://github.com/apache/spark/pull/24872#issuecomment-502936132 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types
AmplabJenkins removed a comment on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types URL: https://github.com/apache/spark/pull/24872#issuecomment-502936134 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106600/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types
SparkQA removed a comment on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types URL: https://github.com/apache/spark/pull/24872#issuecomment-502902256 **[Test build #106600 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106600/testReport)** for PR 24872 at commit [`e0ae14b`](https://github.com/apache/spark/commit/e0ae14b0f827e55f23b94b5ecc60d59c7a0b8bb1).
[GitHub] [spark] asfgit closed pull request #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF
asfgit closed pull request #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF URL: https://github.com/apache/spark/pull/24897
[GitHub] [spark] SparkQA commented on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types
SparkQA commented on issue #24872: [SPARK-28023][SQL] Trim the string when cast string type to Boolean/Numeric types URL: https://github.com/apache/spark/pull/24872#issuecomment-502935695 **[Test build #106600 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106600/testReport)** for PR 24872 at commit [`e0ae14b`](https://github.com/apache/spark/commit/e0ae14b0f827e55f23b94b5ecc60d59c7a0b8bb1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class CollapseCodegenStages(`
  * `case class AdaptiveSparkPlanExec(`
  * `case class StageSuccess(stage: QueryStageExec, result: Any) extends StageMaterializationEvent`
  * `case class StageFailure(stage: QueryStageExec, error: Throwable) extends StageMaterializationEvent`
  * `case class InsertAdaptiveSparkPlan(session: SparkSession) extends Rule[SparkPlan]`
  * `case class LogicalQueryStage(`
  * `case class PlanAdaptiveSubqueries(`
  * `abstract class QueryStageExec extends LeafExecNode`
  * `case class ShuffleQueryStageExec(`
  * `case class BroadcastQueryStageExec(`
  * `case class ReusedQueryStageExec(`
  * `class ParquetFilters(`
  * `class ParquetOutputWriter(path: String, context: TaskAttemptContext)`
  * `class ParquetReadSupport(val convertTz: Option[TimeZone],`
  * `case class FileTypes(`
  * `class ParquetWriteSupport extends WriteSupport[InternalRow] with Logging`
  * `class ParquetDataSourceV2 extends FileDataSourceV2`
  * `case class ParquetPartitionReaderFactory(`
  * `case class ParquetScan(`
  * `case class ParquetScanBuilder(`
  * `case class ParquetTable(`
  * `class ParquetWriteBuilder(`
  * `case class ArrowEvalPythonExec(udfs: Seq[PythonUDF], resultAttrs: Seq[Attribute], child: SparkPlan,`
  * `case class SparkListenerSQLAdaptiveExecutionUpdate(`
[GitHub] [spark] mengxr commented on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF
mengxr commented on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF URL: https://github.com/apache/spark/pull/24897#issuecomment-502935220 Merged into master.
[GitHub] [spark] dongjoon-hyun commented on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+
dongjoon-hyun commented on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502935070 @cloud-fan Yes. As you suggested, `.MODULE$$.apply` works. I'll update this PR with the new direction.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2
AmplabJenkins removed a comment on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2 URL: https://github.com/apache/spark/pull/24798#issuecomment-502934644 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106602/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2
AmplabJenkins removed a comment on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2 URL: https://github.com/apache/spark/pull/24798#issuecomment-502934640 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2
AmplabJenkins commented on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2 URL: https://github.com/apache/spark/pull/24798#issuecomment-502934644 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106602/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2
AmplabJenkins commented on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2 URL: https://github.com/apache/spark/pull/24798#issuecomment-502934640 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2
SparkQA removed a comment on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2 URL: https://github.com/apache/spark/pull/24798#issuecomment-502917439 **[Test build #106602 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106602/testReport)** for PR 24798 at commit [`2bf4b5f`](https://github.com/apache/spark/commit/2bf4b5fb1c5c9480bba59627a7b61865619f5502).
[GitHub] [spark] SparkQA commented on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2
SparkQA commented on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2 URL: https://github.com/apache/spark/pull/24798#issuecomment-502934479 **[Test build #106602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106602/testReport)** for PR 24798 at commit [`2bf4b5f`](https://github.com/apache/spark/commit/2bf4b5fb1c5c9480bba59627a7b61865619f5502). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF
SparkQA removed a comment on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF URL: https://github.com/apache/spark/pull/24897#issuecomment-502931212 **[Test build #106604 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106604/testReport)** for PR 24897 at commit [`eacb5e4`](https://github.com/apache/spark/commit/eacb5e49911f85f93934f9de5ae1895ab2444b7c).
[GitHub] [spark] SparkQA commented on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF
SparkQA commented on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF URL: https://github.com/apache/spark/pull/24897#issuecomment-502933655 **[Test build #106604 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106604/testReport)** for PR 24897 at commit [`eacb5e4`](https://github.com/apache/spark/commit/eacb5e49911f85f93934f9de5ae1895ab2444b7c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF
SparkQA commented on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF URL: https://github.com/apache/spark/pull/24897#issuecomment-502931212 **[Test build #106604 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106604/testReport)** for PR 24897 at commit [`eacb5e4`](https://github.com/apache/spark/commit/eacb5e49911f85f93934f9de5ae1895ab2444b7c).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
AmplabJenkins removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502931107 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
AmplabJenkins commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502931107 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
AmplabJenkins commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502931113 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106599/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
AmplabJenkins removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502931113 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106599/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
SparkQA commented on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502930802 **[Test build #106599 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106599/testReport)** for PR 24894 at commit [`da00963`](https://github.com/apache/spark/commit/da0096345035e397dabaaa6eda32f8eba8324709). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning
SparkQA removed a comment on issue #24894: [SPARK-28058][DOC] Add a note to doc of mode of CSV for column pruning URL: https://github.com/apache/spark/pull/24894#issuecomment-502896821 **[Test build #106599 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106599/testReport)** for PR 24894 at commit [`da00963`](https://github.com/apache/spark/commit/da0096345035e397dabaaa6eda32f8eba8324709).
[GitHub] [spark] mengxr commented on a change in pull request #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF
mengxr commented on a change in pull request #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF URL: https://github.com/apache/spark/pull/24897#discussion_r294596371
## File path: docs/sql-pyspark-pandas-with-arrow.md ##
@@ -86,6 +86,22 @@ The following example shows how to create a scalar Pandas UDF that computes the
+### Scalar Iterator
+
+Scalar iterator (`SCALAR_ITER`) Pandas UDF is the same as scalar Pandas UDF above except that the
+underlying Python function takes an iterator of batches as input instead of a single batch and
+it yields output batches instead of returning a single output batch.
+It is useful when the UDF execution requires initializing some states, e.g., loading an machine
+learning model file to apply inference to every input batch.
+
Review comment: Updated.
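The doc text quoted above motivates `SCALAR_ITER` with one-time state initialization. A minimal sketch of that pattern in plain Python follows. It does not use the PySpark API: batches are modeled as lists of floats standing in for `pandas.Series`, and `load_model`, `predict`, and `load_count` are hypothetical names introduced only for illustration.

```python
from typing import Iterable, Iterator, List

Batch = List[float]  # assumption: stands in for one pandas.Series batch

load_count = 0  # counts how often the (hypothetical) expensive setup runs


def load_model() -> float:
    """Hypothetical expensive setup, e.g. reading a model file."""
    global load_count
    load_count += 1
    return 2.0  # the "model" here is just a scaling factor


def predict(batches: Iterable[Batch]) -> Iterator[Batch]:
    """Iterator-of-batches function: set up state once, then yield per batch."""
    model = load_model()  # runs once, when the first output batch is requested
    for batch in batches:
        yield [model * x for x in batch]


result = list(predict(iter([[1.0, 2.0], [3.0]])))
```

With a function that receives one batch per call, the setup would have to be repeated or cached outside the function; here it runs once for the whole stream of batches.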
[GitHub] [spark] WeichenXu123 commented on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF
WeichenXu123 commented on issue #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF URL: https://github.com/apache/spark/pull/24897#issuecomment-502930617 LGTM.
[GitHub] [spark] SparkQA commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
SparkQA commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502928603 **[Test build #106603 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106603/testReport)** for PR 24868 at commit [`fce9f54`](https://github.com/apache/spark/commit/fce9f54632927f45bfcc50ca096aa8e105cbfe54).
[GitHub] [spark] AmplabJenkins commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
AmplabJenkins commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502928198 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
AmplabJenkins removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502928207 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11839/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
AmplabJenkins removed a comment on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502928198 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
AmplabJenkins commented on issue #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#issuecomment-502928207 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11839/ Test PASSed.
[GitHub] [spark] zhengruifeng commented on a change in pull request #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
zhengruifeng commented on a change in pull request #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#discussion_r294593992
## File path: mllib/src/main/scala/org/apache/spark/ml/evaluation/MultilabelClassificationEvaluator.scala ##
@@ -63,18 +63,18 @@ class MultilabelClassificationEvaluator (override val uid: String)
 setDefault(metricName -> "f1Measure")
- final val metricClass: DoubleParam = new DoubleParam(this, "metricClass",
+ final val metricLabel: DoubleParam = new DoubleParam(this, "metricLabel",
Review comment: Do you mean `MimaExcludes`? The build is OK, so we may not need to add `metricClass` to it?
[GitHub] [spark] cloud-fan commented on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+
cloud-fan commented on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502925426 @dongjoon-hyun sorry, typo
[GitHub] [spark] cloud-fan edited a comment on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+
cloud-fan edited a comment on issue #24889: [SPARK-28072][SQL] Use `Iso8601TimestampFormatter` in `FromUnixTime` codegen to fix ICCE in JDK9+ URL: https://github.com/apache/spark/pull/24889#issuecomment-502919524 @dongjoon-hyun does replacing `.apply` with `.MODULE$$.apply` fix the issue? According to @MaxGekk, this seems to be what we do in other places.
[GitHub] [spark] WeichenXu123 commented on a change in pull request #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF
WeichenXu123 commented on a change in pull request #24897: [SPARK-28056] [PYTHON] add doc for SCALAR_ITER Pandas UDF URL: https://github.com/apache/spark/pull/24897#discussion_r294592002
## File path: docs/sql-pyspark-pandas-with-arrow.md ##
@@ -86,6 +86,22 @@ The following example shows how to create a scalar Pandas UDF that computes the
+### Scalar Iterator
+
+Scalar iterator (`SCALAR_ITER`) Pandas UDF is the same as scalar Pandas UDF above except that the
+underlying Python function takes an iterator of batches as input instead of a single batch and
+it yields output batches instead of returning a single output batch.
+It is useful when the UDF execution requires initializing some states, e.g., loading an machine
+learning model file to apply inference to every input batch.
+
Review comment: Add a note that the UDF can also return an iterator (instead of being written as a function with `yield`), and that the input iterator is allowed to be prefetched.
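The review comment above distinguishes a function written with `yield` from one that returns an iterator. The difference can be sketched in plain Python as follows. This is an illustration only, not the PySpark API: batches are lists of floats standing in for `pandas.Series`, and `plus_one_generator` and `plus_one_returned` are hypothetical names.

```python
from typing import Iterable, Iterator, List

Batch = List[float]  # assumption: stands in for one pandas.Series batch


def plus_one_generator(batches: Iterable[Batch]) -> Iterator[Batch]:
    """Generator form: yields one output batch per input batch."""
    for batch in batches:
        yield [x + 1 for x in batch]


def plus_one_returned(batches: Iterable[Batch]) -> Iterator[Batch]:
    """Non-generator form: builds an iterator with map() and returns it."""
    return map(lambda batch: [x + 1 for x in batch], batches)


input_batches = [[1.0, 2.0], [3.0]]
generator_result = list(plus_one_generator(iter(input_batches)))
returned_result = list(plus_one_returned(iter(input_batches)))
```

Both forms hand the caller an iterator of output batches, so a consumer that only iterates over the result cannot tell them apart. Since the comment says the input iterator may be prefetched, a function written either way presumably should not depend on exactly when its input batches are pulled.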
[GitHub] [spark] zhengruifeng commented on a change in pull request #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics
zhengruifeng commented on a change in pull request #24868: [SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics URL: https://github.com/apache/spark/pull/24868#discussion_r294591710
## File path: mllib/src/main/scala/org/apache/spark/ml/evaluation/MultilabelClassificationEvaluator.scala ##
@@ -63,18 +63,18 @@ class MultilabelClassificationEvaluator (override val uid: String)
 setDefault(metricName -> "f1Measure")
- final val metricClass: DoubleParam = new DoubleParam(this, "metricClass",
+ final val metricLabel: DoubleParam = new DoubleParam(this, "metricLabel",
 "The class whose metric will be computed in precisionByLabel|recallByLabel|" +
 "f1MeasureByLabel. Must be >= 0. The default value is 0.",
 ParamValidators.gtEq(0.0))
 /** @group getParam */
- def getMetricClass: Double = $(metricClass)
+ def getMetricLabel: Double = $(metricLabel)
Review comment: Yes, they were introduced recently. Since master has not yet been released as version 3.0, maybe this is not a breaking change?