[GitHub] spark issue #16819: [SPARK-16441][YARN] Set maxNumExecutor depends on yarn c...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16819 I don't think this is a necessary change. Already, you can't ask for more resources than the cluster has; the cluster won't grant them. Capping it here means the app can't use more resources if the cluster suddenly gets more. I see the problem you're trying to solve but the resource manager already ramps up requests slowly, so I don't think this is the issue. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
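For context on the discussion: a sketch of the dynamic-allocation properties involved. The keys are real Spark configuration names; the values are illustrative only.

```
# spark-defaults.conf (illustrative values)
spark.dynamicAllocation.enabled           true
spark.shuffle.service.enabled             true   # required for dynamic allocation on YARN
spark.dynamicAllocation.minExecutors      0
spark.dynamicAllocation.initialExecutors  2
# The cap the PR proposes to derive from cluster resources; by default it is
# effectively unbounded, and YARN simply declines requests beyond its capacity.
spark.dynamicAllocation.maxExecutors      100
```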
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16787 **[Test build #72433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72433/testReport)** for PR 16787 at commit [`57521c6`](https://github.com/apache/spark/commit/57521c6edfef58c48c12904ce3b7fb4949a76f82).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72433/
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Merged build finished. Test FAILed.
[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16803 Merged build finished. Test PASSed.
[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72428/
[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16803 **[Test build #72428 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72428/testReport)** for PR 16803 at commit [`1bb31e5`](https://github.com/apache/spark/commit/1bb31e51a73565a07dc703edf51578762a47f5b2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #16810: [SPARK-19464][CORE][YARN][test-hadoop2.6] Remove ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16810#discussion_r99561230 --- Diff: docs/building-spark.md --- @@ -63,57 +63,30 @@ with Maven profile settings and so on like the direct Maven build. Example: This will build Spark distribution along with Python pip and R packages. For more information on usage, run `./dev/make-distribution.sh --help`

Removed section: "## Specifying the Hadoop Version: Because HDFS is not protocol-compatible across versions, if you want to read from HDFS, you'll need to build Spark against the specific HDFS version in your environment. You can do this through the `hadoop.version` property. If unset, Spark will build against Hadoop 2.2.0 by default. Note that certain build profiles are required for particular Hadoop versions:

| Hadoop version | Profile required |
|---|---|
| 2.2.x | hadoop-2.2 |
| 2.3.x | hadoop-2.3 |
| 2.4.x | hadoop-2.4 |
| 2.6.x | hadoop-2.6 |
| 2.7.x and later 2.x | hadoop-2.7 |

Note that support for versions of Hadoop before 2.6 is deprecated as of Spark 2.1.0 and may be removed in Spark 2.2.0."

Added section: "## Specifying the Hadoop Version and Enabling YARN: You can specify the exact version of Hadoop to compile against through the `hadoop.version` property. If unset, Spark will build against Hadoop 2.6.0 by default." --- End diff -- Yeah, good call; let me fix up the version references to uniformly refer to the latest in each maintenance branch.
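As a concrete illustration of the documented profiles, a build against a specific Hadoop release combines the profile flag with the version property (the exact version number here is just an example):

```
# Build against Hadoop 2.7.3, enabling the matching profile and YARN support
./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package
```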
[GitHub] spark issue #16815: [SPARK-19407][SS] defaultFS is used FileSystem.get inste...
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16815 retest this please.
[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16817 Merged build finished. Test PASSed.
[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16817 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72430/
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistices to improve...
Github user sujith71955 commented on the issue: https://github.com/apache/spark/pull/16677 @viirya I tested the above-mentioned approach with sample data, and it improved performance by almost 3X. Please find the test report:

* Total no. of executors = 3
* Total memory assigned = 66 GB
* Total number of cores = 15
* Number of partitions = 200
* Data size = 10745616 rows (>10 million)
* Limit value = 1000

Query executed: `create destination_table as select * from source_table limit 1000;`

Time taken with the current single-partition implementation: 383 seconds. **With map output statistics (the proposed solution): 120 sec**, which is great!!!
[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16817 **[Test build #72430 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72430/testReport)** for PR 16817 at commit [`71a206f`](https://github.com/apache/spark/commit/71a206ff2f16a77359e7fe64086573d6c99795a7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16816: Code style improvement
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16816 @zhoucen please close this PR and read http://spark.apache.org/contributing.html
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16751 Pardon me, but is there anywhere else keeping track of the build break with SBT? It's been failing for a while in master: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.2/ I can have a look at it if nobody else is.
[GitHub] spark pull request #16815: [SPARK-19407][SS] defaultFS is used FileSystem.ge...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16815#discussion_r99556423 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetadata.scala --- @@ -47,7 +47,7 @@ object StreamMetadata extends Logging { /** Read the metadata from file if it exists */ def read(metadataFile: Path, hadoopConf: Configuration): Option[StreamMetadata] = { -val fs = FileSystem.get(hadoopConf) +val fs = FileSystem.get(metadataFile.toUri, hadoopConf) --- End diff -- I think this should be `metadataFile.getFileSystem(hadoopConf)`
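For context on this suggestion: `FileSystem.get(hadoopConf)` always returns the filesystem for the configured `fs.defaultFS`, regardless of where the path actually lives, while `Path.getFileSystem` resolves the filesystem from the path's own URI. A sketch under made-up paths and addresses:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val hadoopConf = new Configuration()
// Suppose fs.defaultFS is hdfs://namenode:8020, but the checkpoint
// lives elsewhere, e.g. on S3 (bucket name is hypothetical):
val metadataFile = new Path("s3a://my-bucket/checkpoint/metadata")

// Ignores the path entirely and returns the default (HDFS) filesystem:
val defaultFs = FileSystem.get(hadoopConf)

// Resolves the filesystem from the path's own scheme and authority,
// which is what the review comment suggests:
val pathFs = metadataFile.getFileSystem(hadoopConf)
```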
[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2
Github user robbinspg commented on the issue: https://github.com/apache/spark/pull/16751 Sorry, I've been away for the w/end. Yes, we use Maven for our test runs. Looks like you have it under control. Thanks.
[GitHub] spark pull request #16819: [SPARK-16441][YARN] Set maxNumExecutor depends on...
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/16819 [SPARK-16441][YARN] Set maxNumExecutor depends on yarn cluster resources. ## What changes were proposed in this pull request? Dynamically set `spark.dynamicAllocation.maxExecutors` based on cluster resources. ## How was this patch tested? Manual test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/wangyum/spark SPARK-16441 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16819.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16819 commit 97e5eee6aaf2335a2af62e816b767d408c37a59e Author: Yuming Wang, Date: 2017-02-06T08:57:12Z Set maxNumExecutor depends on yarn cluster VCores Total.
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16787 **[Test build #72433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72433/testReport)** for PR 16787 at commit [`57521c6`](https://github.com/apache/spark/commit/57521c6edfef58c48c12904ce3b7fb4949a76f82).
[GitHub] spark issue #16818: [SPARK-19451][SQL][Core] Underlying integer overflow in ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16818 **[Test build #72432 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72432/testReport)** for PR 16818 at commit [`ea1f440`](https://github.com/apache/spark/commit/ea1f44026dc9f5b0f8e660b5adf8824b8deb94df).
[GitHub] spark pull request #16818: [SPARK-19451][SQL][Core] Underlying integer overf...
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/16818 [SPARK-19451][SQL][Core] Underlying integer overflow in Window function ## What changes were proposed in this pull request? Reproduce code:
```
val tw = Window.orderBy("date")
  .partitionBy("id")
  .rangeBetween(from, 0)
```
Everything seems OK as long as the `from` value is not too large, even though the `rangeBetween()` method accepts `Long` parameters. But if I set `from` to -216000L, it does not work! There seems to be an underlying integer overflow issue here, i.e. a `Long` is converted to an `Int`:
```
private def between(typ: FrameType, start: Long, end: Long): WindowSpec = {
  val boundaryStart = start match {
    case 0 => CurrentRow
    case Long.MinValue => UnboundedPreceding
    case x if x < 0 => ValuePreceding(-start.toInt)
    case x if x > 0 => ValueFollowing(start.toInt)
  }
  val boundaryEnd = end match {
    case 0 => CurrentRow
    case Long.MaxValue => UnboundedFollowing
    case x if x < 0 => ValuePreceding(-end.toInt)
    case x if x > 0 => ValueFollowing(end.toInt)
  }
  new WindowSpec(partitionSpec, orderSpec, SpecifiedWindowFrame(typ, boundaryStart, boundaryEnd))
}
```
This PR changes the type of the index from `Int` to `Long`. BTW: is there any reason why the type of the index is `Int`? I don't see a strong reason for it. ## How was this patch tested? Existing UTs. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark SPARK-19451 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16818.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16818 commit ea1f44026dc9f5b0f8e660b5adf8824b8deb94df Author: uncleGen, Date: 2017-02-06T01:36:29Z change the type of index from Int to Long
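The narrowing `Long`-to-`Int` conversion that the PR description points at can be seen in plain Scala; a standalone sketch, not Spark code:

```scala
// Long-to-Int narrowing is silent on the JVM: values outside the Int
// range wrap around instead of failing.
val small: Long = -216000L
println(small.toInt)                     // -216000 (fits in Int, no loss)

val big: Long = Int.MaxValue.toLong + 1  // 2147483648
println(big.toInt)                       // -2147483648 (wraps to Int.MinValue)
```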
[GitHub] spark pull request #16625: [SPARK-17874][core] Add SSL port configuration.
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16625#discussion_r99538768 --- Diff: docs/configuration.md --- @@ -1797,6 +1797,20 @@ Apart from these, the following properties are also available, and may be useful +spark.ssl.[namespace].port --- End diff -- How about saying `spark.ssl.port` instead to be consistent with any other property related to SSL?
[GitHub] spark pull request #16625: [SPARK-17874][core] Add SSL port configuration.
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16625#discussion_r99540244 --- Diff: docs/security.md --- @@ -49,10 +49,6 @@ component-specific configuration namespaces used to override the default setting Component -spark.ssl.fs --- End diff -- The namespace `fs` still seems to be referred to (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SecurityManager.scala#L259). We shouldn't remove this description, should we? Also, it's a chance to modify the description of the namespace in `SecurityManager.scala`:
```
 * All the SSL settings like `spark.ssl.xxx` where `xxx` is a particular configuration property,
 * denote the global configuration for all the supported protocols. In order to override the global
 * configuration for the particular protocol, the properties must be overwritten in the
 * protocol-specific namespace. Use `spark.ssl.yyy.xxx` settings to overwrite the global
 * configuration for particular protocol denoted by `yyy`. Currently `yyy` can be only `fs` for
 * broadcast and file server.
```
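To make the namespace convention concrete, a sketch of global vs. per-namespace SSL settings. Values and paths are illustrative; per the quoted scaladoc, `fs` is currently the only supported namespace, and the `[namespace].port` key is what this PR proposes:

```
# Global SSL defaults, applied to all supported protocols
spark.ssl.enabled     true
spark.ssl.keyStore    /path/to/keystore.jks
# Namespace-specific override (yyy = fs); takes precedence over the global value
spark.ssl.fs.port     8443
```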
[GitHub] spark pull request #16625: [SPARK-17874][core] Add SSL port configuration.
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16625#discussion_r99540452 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -394,8 +410,7 @@ private[spark] object JettyUtils extends Logging { val httpsURI = createRedirectURI(scheme, baseRequest.getServerName, securePort, baseRequest.getRequestURI, baseRequest.getQueryString) response.setContentLength(0) -response.encodeRedirectURL(httpsURI) --- End diff -- Nice catch.
[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16817 Thank you, @cloud-fan.
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Merged build finished. Test FAILed.
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72429/
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16787 **[Test build #72429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72429/testReport)** for PR 16787 at commit [`5ef2139`](https://github.com/apache/spark/commit/5ef2139a7628ea5d6568f56b3a87ad9b3cf1caed).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16817 LGTM if tests pass
[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16386 Can we focus on supporting multiline JSON in this PR? We can leave the improvements to new PRs; otherwise this PR is kind of hard to review.
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16476 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72431/
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16476 **[Test build #72431 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72431/testReport)** for PR 16476 at commit [`b5deea1`](https://github.com/apache/spark/commit/b5deea17a7afd39061bcaa0756a28a0dfe292b76).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16476 Merged build finished. Test FAILed.
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16476 **[Test build #72431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72431/testReport)** for PR 16476 at commit [`b5deea1`](https://github.com/apache/spark/commit/b5deea17a7afd39061bcaa0756a28a0dfe292b76).
[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/16476#discussion_r99535960

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -340,3 +344,99 @@ object CaseKeyWhen {
     CaseWhen(cases, elseValue)
   }
 }
+
+/**
+ * A function that returns the index of expr in the (expr1, expr2, ...) list, or 0 if not found.
+ * It takes at least 2 parameters, and all parameters must be subtypes of AtomicType or NullType.
+ * Parameters of different types are also acceptable. When two parameters have different types,
+ * the comparison is decided by type first: '999' has type StringType while 999 has type
+ * IntegerType, so no further comparison is needed once the types are seen to differ.
+ * If the search expression is NULL, the return value is 0, because NULL fails equality
+ * comparison with any value.
+ * Note that this expression performs no implicit casts.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr, expr1, expr2, ...) - Returns the index of expr in the expr1, expr2, ... list, or 0 if not found.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(10, 9, 3, 10, 4);
+       3
+      > SELECT _FUNC_('a', 'b', 'c', 'd', 'a');
+       4
+      > SELECT _FUNC_('999', 'a', 999, 9.99, '999');
+       4
+  """)
+case class Field(children: Seq[Expression]) extends Expression {
+
+  /** Even if expr is not found in the (expr1, expr2, ...) list, the value is 0, not null. */
+  override def nullable: Boolean = false
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  private lazy val ordering = TypeUtils.getInterpretedOrdering(children(0).dataType)
+
+  private val dataTypeMatchIndex: Array[Int] = children.zipWithIndex.tail.filter(
+    _._1.dataType.sameType(children.head.dataType)).map(_._2).toArray
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    if (children.length <= 1) {
+      TypeCheckResult.TypeCheckFailure("FIELD requires at least 2 arguments")
+    } else if (!children.forall(
+        e => e.dataType.isInstanceOf[AtomicType] || e.dataType.isInstanceOf[NullType])) {
+      TypeCheckResult.TypeCheckFailure("FIELD requires all arguments to be of AtomicType")
+    } else {
+      TypeCheckResult.TypeCheckSuccess
+    }
+  }
+
+  override def dataType: DataType = IntegerType
+
+  override def eval(input: InternalRow): Any = {
+    val target = children.head.eval(input)
+    @tailrec def findEqual(index: Int): Int = {
+      if (index == dataTypeMatchIndex.length) {
+        0
+      } else {
+        val value = children(dataTypeMatchIndex(index)).eval(input)
+        if (value != null && ordering.equiv(target, value)) {
+          dataTypeMatchIndex(index)
+        } else {
+          findEqual(index + 1)
+        }
+      }
+    }
+    if (target == null) 0 else findEqual(index = 0)
+  }
+
+  protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val evalChildren = children.map(_.genCode(ctx))
+    val target = evalChildren(0)
+    val targetDataType = children(0).dataType
+    val dataTypes = children.map(_.dataType)
+
+    def updateEval(evalWithIndex: ((ExprCode, DataType), Int)): String = {
+      val ((eval, _), index) = evalWithIndex
+      s"""
+        ${eval.code}
+        if (${ctx.genEqual(targetDataType, eval.value, target.value)}) {
+          ${ev.value} = $index;
+        }
+      """
+    }
+
+    def genIfElseStructure(code1: String, code2: String): String = {
+      s"""
+        $code1
+        else {
+          $code2
+        }
+      """
+    }
+
+    ev.copy(code = s"""
+      ${target.code}
--- End diff --

Thanks, I have added an if to bypass the case when target.isNull == true, which is actually the same as yours.
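For reference, the interpreted evaluation of FIELD discussed above can be sketched as a standalone function (a hypothetical illustration only: plain runtime classes and `==` stand in for Catalyst's DataType matching and interpreted ordering, and `field` is not part of the actual patch):

```scala
// Sketch of FIELD semantics: return the 1-based position of `target` among
// `candidates`, comparing only candidates of the same runtime type, or 0 when
// the target is null or not found. This mirrors the dataTypeMatchIndex
// filtering in the diff above.
def field(target: Any, candidates: Any*): Int = {
  if (target == null) {
    0 // NULL fails equality comparison with any value, so the result is 0
  } else {
    val idx = candidates.indexWhere { c =>
      c != null && c.getClass == target.getClass && c == target
    }
    idx + 1 // indexWhere yields -1 when absent, so this produces 0
  }
}

assert(field(10, 9, 3, 10, 4) == 3)
assert(field("999", "a", 999, 9.99, "999") == 4)
```

As in the Catalyst version, no implicit cast is attempted, so the string '999' never matches the integer 999.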
[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r99535380 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala --- @@ -31,10 +31,17 @@ import org.apache.spark.sql.catalyst.util.{CaseInsensitiveMap, CompressionCodecs * Most of these map directly to Jackson's internal options, specified in [[JsonParser.Feature]]. */ private[sql] class JSONOptions( -@transient private val parameters: CaseInsensitiveMap) +@transient private val parameters: CaseInsensitiveMap, +defaultColumnNameOfCorruptRecord: String) --- End diff -- why do we need this?
[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16817 **[Test build #72430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72430/testReport)** for PR 16817 at commit [`71a206f`](https://github.com/apache/spark/commit/71a206ff2f16a77359e7fe64086573d6c99795a7).
[GitHub] spark pull request #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet fi...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/16817 [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter tests for binary and string ## What changes were proposed in this pull request? This PR proposes to enable the tests for Parquet filter pushdown with binary and string. These were disabled in https://github.com/apache/spark/pull/16106 due to a Parquet issue, but are now revived in https://github.com/apache/spark/pull/16791 after upgrading Parquet to 1.8.2. ## How was this patch tested? Manually tested via IDE. You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark SPARK-17213 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16817.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16817 commit 71a206ff2f16a77359e7fe64086573d6c99795a7 Author: hyukjinkwon Date: 2017-02-06T08:18:46Z Re-enable parquet filter tests for binary and string
[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16817 cc @liancheng, could you see if it makes sense?
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16787 **[Test build #72429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72429/testReport)** for PR 16787 at commit [`5ef2139`](https://github.com/apache/spark/commit/5ef2139a7628ea5d6568f56b3a87ad9b3cf1caed).
[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r99534380 --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala --- @@ -194,5 +195,8 @@ class PortableDataStream( } def getPath(): String = path + + @Since("2.2.0") --- End diff -- can we just make `lazy val conf` not private?
[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16803 **[Test build #72428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72428/testReport)** for PR 16803 at commit [`1bb31e5`](https://github.com/apache/spark/commit/1bb31e51a73565a07dc703edf51578762a47f5b2).
[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r99534181 --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala --- @@ -194,5 +195,8 @@ class PortableDataStream( } def getPath(): String = path + + @Since("2.2.0") --- End diff -- there is no `since` tag in other methods of this class.
[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16803 retest this please
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16476 **[Test build #72427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72427/testReport)** for PR 16476 at commit [`f2d16ec`](https://github.com/apache/spark/commit/f2d16ec73a38f75d968a39616761708c3b1ba735). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16476 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72427/ Test FAILed.
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16476 Merged build finished. Test FAILed.
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16787 retest this please
[GitHub] spark issue #13379: [SPARK-12431][GraphX] Add local checkpointing to GraphX.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13379 Can one of the admins verify this patch?
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16476 **[Test build #72427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72427/testReport)** for PR 16476 at commit [`f2d16ec`](https://github.com/apache/spark/commit/f2d16ec73a38f75d968a39616761708c3b1ba735).
[GitHub] spark pull request #16791: [SPARK-19409][SPARK-17213] Cleanup Parquet workar...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16791
[GitHub] spark issue #16791: [SPARK-19409][SPARK-17213] Cleanup Parquet workarounds/h...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16791 Merging in master.
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72426/ Test FAILed.
[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16803 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72425/ Test FAILed.
[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Merged build finished. Test FAILed.
[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16803 Merged build finished. Test FAILed.
[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16750#discussion_r99531422 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -298,6 +299,8 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging { * `timestampFormat` (default `-MM-dd'T'HH:mm:ss.SSSZZ`): sets the string that * indicates a timestamp format. Custom date formats follow the formats at * `java.text.SimpleDateFormat`. This applies to timestamp type. + * `timeZone` (default session local timezone): sets the string that indicates a timezone --- End diff -- `timeZoneId`? ---
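The reader option being documented above would be set like any other option; a hypothetical usage sketch (option names as shown in the diff, and the input path `path/to/events.json` is made up for illustration — this assumes an active SparkSession and a Spark 2.2-era API):

```scala
// Hypothetical sketch: reading JSON with an explicit time zone, using the
// `timeZone` option discussed in the diff above to override the session
// local timezone when parsing timestamp-typed fields.
val df = spark.read
  .option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss.SSSZZ") // SimpleDateFormat pattern
  .option("timeZone", "America/Los_Angeles")
  .json("path/to/events.json")
```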