[GitHub] spark issue #14195: [SPARK-16538][SPARKR] fix R call with namespace operator...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14195
Hi, @felixcheung
LGTM. I tested it locally, too. By the way, could you add the original reporter's testcase, too?
```
SparkR::sql("")
```
---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14112 **[Test build #62298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62298/consoleFull)** for PR 14112 at commit [`58384d4`](https://github.com/apache/spark/commit/58384d447de9f1fd5959c9bfe0caae2e4bac92ae).
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14192 Merged build finished. Test PASSed.
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14192 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62297/ Test PASSed.
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14192 **[Test build #62297 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62297/consoleFull)** for PR 14192 at commit [`1ed246d`](https://github.com/apache/spark/commit/1ed246db0a9973bc9eb5c52f70e95718042b33e5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/14181 yea, the solution is also okay. Is it okay to fix in that way?
[GitHub] spark pull request #14165: [SPARK-16503] SparkSession should provide Spark v...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14165
[GitHub] spark issue #14165: [SPARK-16503] SparkSession should provide Spark version
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14165 Merging in master/2.0. Thanks.
[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14181 should we just enforce sampling ratio <= 1.0?
[GitHub] spark issue #14190: [SPARK-16536][SQL][PYSPARK][MINOR] Expose `sql` in PySpa...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14190 Thank you, @rxin !
[GitHub] spark pull request #14190: [SPARK-16536][SQL][PYSPARK][MINOR] Expose `sql` i...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14190
[GitHub] spark issue #14190: [SPARK-16536][SQL][PYSPARK][MINOR] Expose `sql` in PySpa...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14190 Merging in master.
[GitHub] spark pull request #14112: [SPARK-16240][ML] Model loading backward compatib...
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/14112#discussion_r70749752
--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala ---
@@ -566,26 +565,52 @@ object LocalLDAModel extends MLReadable[LocalLDAModel] {
     }
   }
 
+  private case class Data(
+    vocabSize: Int,
+    topicsMatrix: Matrix,
+    docConcentration: Vector,
+    topicConcentration: Double,
+    gammaShape: Double)
+
   private class LocalLDAModelReader extends MLReader[LocalLDAModel] {
 
     private val className = classOf[LocalLDAModel].getName
 
     override def load(path: String): LocalLDAModel = {
+      // Import implicits for Dataset Encoder
+      val sparkSession = super.sparkSession
+      import sparkSession.implicits._
+
       val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
       val dataPath = new Path(path, "data").toString
       val data = sparkSession.read.parquet(dataPath)
-        .select("vocabSize", "topicsMatrix", "docConcentration", "topicConcentration",
-          "gammaShape")
-        .head()
-      val vocabSize = data.getAs[Int](0)
-      val topicsMatrix = data.getAs[Matrix](1)
-      val docConcentration = data.getAs[Vector](2)
-      val topicConcentration = data.getAs[Double](3)
-      val gammaShape = data.getAs[Double](4)
+      val vectorConverted = MLUtils.convertVectorColumnsToML(data, "docConcentration")
+      val Row(vocabSize: Int, topicsMatrix: Matrix, docConcentration: Vector,
--- End diff --
It worked when I locally ran the unit tests, but fails here on Jenkins.
[GitHub] spark issue #14195: [SPARK-16538][SPARKR] fix R call with namespace operator...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14195 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62296/ Test PASSed.
[GitHub] spark issue #14195: [SPARK-16538][SPARKR] fix R call with namespace operator...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14195 Merged build finished. Test PASSed.
[GitHub] spark issue #14195: [SPARK-16538][SPARKR] fix R call with namespace operator...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14195 **[Test build #62296 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62296/consoleFull)** for PR 14195 at commit [`75193ee`](https://github.com/apache/spark/commit/75193eebca5631587827ed0125a6df72e38c97a3).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14192 **[Test build #62297 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62297/consoleFull)** for PR 14192 at commit [`1ed246d`](https://github.com/apache/spark/commit/1ed246db0a9973bc9eb5c52f70e95718042b33e5).
[GitHub] spark issue #14173: [SPARKR][SPARK-16507] Add a CRAN checker, fix Rd aliases
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/14173 LGTM
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/14192 LGTM
[GitHub] spark issue #14191: [SPARK-16217][SQL] Support SELECT INTO statement
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14191 Hi, @wuxianxingkong . Although I'm just a contributor like you, I left a few comments for you because I like your PR. I hope your PR will be merged soon.
[GitHub] spark issue #14195: [SPARK-16538][SPARKR] fix R call with namespace operator...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14195 **[Test build #62296 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62296/consoleFull)** for PR 14195 at commit [`75193ee`](https://github.com/apache/spark/commit/75193eebca5631587827ed0125a6df72e38c97a3).
[GitHub] spark pull request #14191: [SPARK-16217][SQL] Support SELECT INTO statement
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14191#discussion_r70748362
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -1755,4 +1755,97 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
       }
     }
   }
+
+  test("select into(check relation)") {
+    val originalConf = sessionState.conf.convertCTAS
+
+    setConf(SQLConf.CONVERT_CTAS, true)
+
+    val defaultDataSource = sessionState.conf.defaultDataSourceName
+    try {
+      sql("DROP TABLE IF EXISTS si1")
+      sql("SELECT key, value INTO si1 FROM src ORDER BY key, value")
+      val message = intercept[AnalysisException] {
+        sql("SELECT key, value INTO si1 FROM src ORDER BY key, value")
+      }.getMessage
+      assert(message.contains("already exists"))
+      checkRelation("si1", true, defaultDataSource)
+      sql("DROP TABLE si1")
+
+      // Specifying database name for query can be converted to data source write path
+      // is not allowed right now.
+      sql("SELECT key, value INTO default.si1 FROM src ORDER BY key, value")
+      checkRelation("si1", true, defaultDataSource)
+      sql("DROP TABLE si1")
+
+    } finally {
+      setConf(SQLConf.CONVERT_CTAS, originalConf)
+      sql("DROP TABLE IF EXISTS si1")
+    }
+  }
+
+  test("select into(check answer)") {
+    sql("DROP TABLE IF EXISTS si1")
+    sql("DROP TABLE IF EXISTS si2")
+    sql("DROP TABLE IF EXISTS si3")
+
+    sql("SELECT key, value INTO si1 FROM src")
+    checkAnswer(
+      sql("SELECT key, value FROM si1 ORDER BY key"),
+      sql("SELECT key, value FROM src ORDER BY key").collect().toSeq)
+
+    sql("SELECT key k, value INTO si2 FROM src ORDER BY k,value").collect()
+    checkAnswer(
+      sql("SELECT k, value FROM si2 ORDER BY k, value"),
+      sql("SELECT key, value FROM src ORDER BY key, value").collect().toSeq)
+
+    sql("SELECT 1 AS key,value INTO si3 FROM src LIMIT 1").collect()
+    intercept[AnalysisException] {
+      sql("SELECT key, value INTO si3 FROM src ORDER BY key, value").collect()
+    }
--- End diff --
Checking the real error message is better.
```
val m = intercept[AnalysisException] {
  sql("SELECT key, value INTO si3 FROM src ORDER BY key, value").collect()
}.getMessage
assert(m.contains("your exception message"))
```
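The review suggestion above (assert on the exception message, not only on the exception type) can be illustrated without Spark or ScalaTest. This is a minimal sketch under stated assumptions: `interceptMessage` and `createTable` are hypothetical stand-ins, not Spark or ScalaTest APIs, and `IllegalStateException` takes the place of `AnalysisException`.

```scala
import scala.collection.mutable
import scala.reflect.ClassTag

object InterceptSketch {
  // Hypothetical stand-in for ScalaTest's intercept[T]: runs `body` and returns
  // the message of the expected exception; fails if none or a different one is thrown.
  def interceptMessage[T <: Throwable](body: => Any)(implicit ct: ClassTag[T]): String = {
    val thrown: Option[Throwable] =
      try { body; None } catch { case e: Throwable => Some(e) }
    thrown match {
      case Some(e) if ct.runtimeClass.isInstance(e) => String.valueOf(e.getMessage)
      case Some(other) => throw new AssertionError(s"unexpected exception: $other")
      case None => throw new AssertionError("expected an exception, none thrown")
    }
  }

  // Hypothetical in-memory catalog; creating an existing table fails.
  private val tables = mutable.Set.empty[String]

  def createTable(name: String): Unit = {
    if (!tables.add(name)) throw new IllegalStateException(s"Table $name already exists")
  }

  def main(args: Array[String]): Unit = {
    createTable("si3")
    // Capture the message and check its content, as the reviewer proposes.
    val m = interceptMessage[IllegalStateException] { createTable("si3") }
    assert(m.contains("already exists"))
  }
}
```

Checking the message guards against the test passing for the wrong reason, for example when a different error of the same exception type is raised.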
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14112 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62295/ Test FAILed.
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14112 Merged build finished. Test FAILed.
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14112 **[Test build #62295 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62295/consoleFull)** for PR 14112 at commit [`216777f`](https://github.com/apache/spark/commit/216777fdac275c8865d54e7193aff7e02714cba9).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #14191: [SPARK-16217][SQL] Support SELECT INTO statement
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14191#discussion_r70748266
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -1755,4 +1755,97 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
       }
     }
   }
+
+  test("select into(check relation)") {
+    val originalConf = sessionState.conf.convertCTAS
+
+    setConf(SQLConf.CONVERT_CTAS, true)
+
+    val defaultDataSource = sessionState.conf.defaultDataSourceName
+    try {
+      sql("DROP TABLE IF EXISTS si1")
+      sql("SELECT key, value INTO si1 FROM src ORDER BY key, value")
+      val message = intercept[AnalysisException] {
+        sql("SELECT key, value INTO si1 FROM src ORDER BY key, value")
+      }.getMessage
+      assert(message.contains("already exists"))
+      checkRelation("si1", true, defaultDataSource)
+      sql("DROP TABLE si1")
+
+      // Specifying database name for query can be converted to data source write path
+      // is not allowed right now.
+      sql("SELECT key, value INTO default.si1 FROM src ORDER BY key, value")
+      checkRelation("si1", true, defaultDataSource)
+      sql("DROP TABLE si1")
+
+    } finally {
+      setConf(SQLConf.CONVERT_CTAS, originalConf)
+      sql("DROP TABLE IF EXISTS si1")
+    }
+  }
+
+  test("select into(check answer)") {
+    sql("DROP TABLE IF EXISTS si1")
+    sql("DROP TABLE IF EXISTS si2")
+    sql("DROP TABLE IF EXISTS si3")
--- End diff --
```
withTable("si1", "si2", "si3") {
```
[GitHub] spark pull request #14191: [SPARK-16217][SQL] Support SELECT INTO statement
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14191#discussion_r70748215
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -1755,4 +1755,97 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
       }
     }
   }
+
+  test("select into(check relation)") {
+    val originalConf = sessionState.conf.convertCTAS
+
+    setConf(SQLConf.CONVERT_CTAS, true)
+
+    val defaultDataSource = sessionState.conf.defaultDataSourceName
+    try {
+      sql("DROP TABLE IF EXISTS si1")
--- End diff --
Please consider the following convention.
```
withTable("si1") {
```
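The `withTable` convention recommended above is an instance of the loan pattern: the helper runs the test body and cleans up the named tables afterwards, even when the body throws. The sketch below only illustrates that idea; the in-memory `catalog` and this `withTable` definition are assumptions made for the example, not Spark's test utilities.

```scala
import scala.collection.mutable

object WithTableSketch {
  // Hypothetical in-memory catalog standing in for a real table catalog.
  val catalog = mutable.Set.empty[String]

  // Loan-pattern sketch: run `body`, then drop the named tables even if `body` throws.
  def withTable(tableNames: String*)(body: => Unit): Unit = {
    try body
    finally tableNames.foreach(name => catalog.remove(name))
  }

  def main(args: Array[String]): Unit = {
    withTable("si1", "si2") {
      catalog += "si1"
      catalog += "si2"
      assert(catalog.contains("si1"))
    }
    // Both tables are gone once the block exits.
    assert(catalog.isEmpty)
  }
}
```

Compared with paired `DROP TABLE IF EXISTS` statements at the start and end of a test, the cleanup cannot be forgotten or skipped by an early failure.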
[GitHub] spark pull request #14195: [SPARK-16538][SPARKR] fix R call with namespace o...
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/14195
[SPARK-16538][SPARKR] fix R call with namespace operator SparkSession functions
## What changes were proposed in this pull request?
Fix function routing to work with and without namespace operator `SparkR::createDataFrame`
## How was this patch tested?
manual, unit tests
@shivaram
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/felixcheung/spark rroutedefault
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/14195.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #14195
commit 75193eebca5631587827ed0125a6df72e38c97a3
Author: Felix Cheung
Date: 2016-07-14T04:48:21Z
    fix call with namespace
[GitHub] spark pull request #14191: [SPARK-16217][SQL] Support SELECT INTO statement
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14191#discussion_r70748124
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -1755,4 +1755,97 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
       }
     }
   }
+
+  test("select into(check relation)") {
+    val originalConf = sessionState.conf.convertCTAS
+
+    setConf(SQLConf.CONVERT_CTAS, true)
--- End diff --
```
withSQLConf(SQLConf.CONVERT_CTAS.key -> "true") {
```
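The `withSQLConf` suggestion above replaces the manual save, set, and restore of a configuration value with a scoped helper. A minimal sketch of that idea follows, assuming a plain mutable map as the configuration store; `withConf`, `conf`, and the key `example.convertCTAS` are hypothetical names for illustration, not Spark APIs or real configuration keys.

```scala
import scala.collection.mutable

object WithConfSketch {
  // Hypothetical mutable configuration standing in for a session's SQL conf.
  val conf = mutable.Map.empty[String, String]

  // Set the given pairs, run `body`, then restore each key to its previous
  // value, or unset it if it had no value before.
  def withConf(pairs: (String, String)*)(body: => Unit): Unit = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None) => conf.remove(k)
    }
  }

  def main(args: Array[String]): Unit = {
    conf("example.convertCTAS") = "false"
    withConf("example.convertCTAS" -> "true") {
      assert(conf("example.convertCTAS") == "true")
    }
    // The original value is back after the block, with no explicit finally in the test.
    assert(conf("example.convertCTAS") == "false")
  }
}
```

With a helper of this shape, the `originalConf` variable and the `try`/`finally` block in the test under review become unnecessary.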
[GitHub] spark pull request #14191: [SPARK-16217][SQL] Support SELECT INTO statement
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14191#discussion_r70747940
--- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -338,7 +338,7 @@ querySpecification
        (RECORDREADER recordReader=STRING)?
        fromClause?
        (WHERE where=booleanExpression)?)
-    | ((kind=SELECT setQuantifier? namedExpressionSeq fromClause?
+    | ((kind=SELECT setQuantifier? namedExpressionSeq (intoClause? fromClause)?
--- End diff --
Hi, @wuxianxingkong .
Currently, the following seems to be not considered yet. Could you modify the syntax to support this too?
```
SELECT 1 INTO newtable
```
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14112 **[Test build #62295 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62295/consoleFull)** for PR 14112 at commit [`216777f`](https://github.com/apache/spark/commit/216777fdac275c8865d54e7193aff7e02714cba9).
[GitHub] spark issue #14194: [SPARK-16485][DOC][ML] Fixed several inline formatting i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14194 Can one of the admins verify this patch?
[GitHub] spark pull request #14194: [SPARK-16485][DOC][ML] Fixed several inline forma...
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/14194
[SPARK-16485][DOC][ML] Fixed several inline formatting in ml features doc
## What changes were proposed in this pull request?
Fixed several inline formatting in ml features doc.
## How was this patch tested?
Generate the docs locally by `SKIP_API=1 jekyll build` and view it in the browser.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/lins05/spark fix-docs-formatting
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/14194.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #14194
commit 23af691be41b6b18b3655f01f6755789ff891c7a
Author: Shuai Lin
Date: 2016-07-14T04:05:55Z
    [SPARK-16485][DOC][ML] Fixed several inline formatting in ml features doc.
[GitHub] spark pull request #14036: [SPARK-16323] [SQL] Add IntegerDivide to avoid un...
Github user lianhuiwang commented on a diff in the pull request: https://github.com/apache/spark/pull/14036#discussion_r70746279
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -234,6 +234,7 @@ object FunctionRegistry {
     expression[Subtract]("-"),
     expression[Multiply]("*"),
     expression[Divide]("/"),
+    expression[IntegerDivide]("div"),
--- End diff --
'select 4 div 2' is the right code.
[GitHub] spark issue #14193: [Minor][Build] Remove empty tags in parent pom.xml
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14193 **[Test build #62294 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62294/consoleFull)** for PR 14193 at commit [`534f8c6`](https://github.com/apache/spark/commit/534f8c677244fe09a25b103bc0bb8ae3de059f7e).
[GitHub] spark pull request #14193: [Minor][Build] Remove empty tags in parent pom.xm...
GitHub user keypointt opened a pull request: https://github.com/apache/spark/pull/14193

[Minor][Build] Remove empty tags in parent pom.xml

## What changes were proposed in this pull request?
Remove empty tags in parent pom.xml.

While working on another ticket and scanning code files at random, I found these empty tags. I'm not sure whether they are needed or were left on purpose. If this is not a valid PR, please just let me know and I'll close it.

## How was this patch tested?
Tested by re-building the project on my local machine.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/keypointt/spark emptyTag

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14193.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14193

commit 534f8c677244fe09a25b103bc0bb8ae3de059f7e
Author: Xin Ren
Date: 2016-07-14T03:52:31Z

remove emtpy tags in parent pom.xml
[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14181 Merged build finished. Test PASSed.
[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14181 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62291/ Test PASSed.
[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14181 **[Test build #62291 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62291/consoleFull)** for PR 14181 at commit [`5c4d0df`](https://github.com/apache/spark/commit/5c4d0df7798e7e1428d01af7ef600d4f81690f5a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #14192: [SPARK-16509][SPARKR] Rename window.partitionBy a...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14192#discussion_r70744731

--- Diff: R/pkg/R/window.R ---
@@ -17,23 +17,23 @@
 # window.R - Utility functions for defining window in DataFrames
-#' window.partitionBy
+#' windowPartitionBy
 #'
 #' Creates a WindowSpec with the partitioning defined.
 #'
-#' @rdname window.partitionBy
-#' @name window.partitionBy
+#' @rdname windowPartitionBy
+#' @name windowPartitionBy
 #' @export
 #' @examples
 #' \dontrun{
-#' ws <- window.partitionBy("key1", "key2")
+#' ws <- windowPartitionBy("key1", "key2")
 #' df1 <- select(df, over(lead("value", 1), ws))
 #'
-#' ws <- window.partitionBy(df$key1, df$key2)
+#' ws <- windowPartitionBy(df$key1, df$key2)
 #' df1 <- select(df, over(lead("value", 1), ws))
 #' }
-#' @note window.partitionBy(character) since 2.0.0
-setMethod("window.partitionBy",
+#' @note windowPartitionBy(character) since 2.0.0
+setMethod("windowPartitionBy",
--- End diff --

minor comment: Can we document the parameter `@param col` that is in all the 4 functions? That'll also remove some of the CRAN warnings
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14192 Thanks @sun-rui for the PR. LGTM. I had a minor comment inline
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14112 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62293/ Test FAILed.
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14112 Merged build finished. Test FAILed.
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14112 **[Test build #62293 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62293/consoleFull)** for PR 14112 at commit [`0c2e51c`](https://github.com/apache/spark/commit/0c2e51c2d38207003c5cf659423e71fd2739d003). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14192 Merged build finished. Test PASSed.
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14192 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62292/ Test PASSed.
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14192 **[Test build #62292 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62292/consoleFull)** for PR 14192 at commit [`38b256a`](https://github.com/apache/spark/commit/38b256accd4ff1dabbdb5602eaaa600d9df9562a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14112 **[Test build #62293 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62293/consoleFull)** for PR 14112 at commit [`0c2e51c`](https://github.com/apache/spark/commit/0c2e51c2d38207003c5cf659423e71fd2739d003).
[GitHub] spark issue #14192: [SPARK-16509][SPARKR] Rename window.partitionBy and wind...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14192 **[Test build #62292 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62292/consoleFull)** for PR 14192 at commit [`38b256a`](https://github.com/apache/spark/commit/38b256accd4ff1dabbdb5602eaaa600d9df9562a).
[GitHub] spark pull request #14192: [SPARK-16509][SPARKR] Rename window.partitionBy a...
GitHub user sun-rui opened a pull request: https://github.com/apache/spark/pull/14192

[SPARK-16509][SPARKR] Rename window.partitionBy and window.orderBy to windowPartitionBy and windowOrderBy.

## What changes were proposed in this pull request?
Rename window.partitionBy and window.orderBy to windowPartitionBy and windowOrderBy to pass CRAN package check.

## How was this patch tested?
SparkR unit tests.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sun-rui/spark SPARK-16509

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14192.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14192

commit 38b256accd4ff1dabbdb5602eaaa600d9df9562a
Author: Sun Rui
Date: 2016-07-14T02:34:40Z

[SPARK-16509][SPARKR] Rename window.partitionBy and window.orderBy to windowPartitionBy and windowOrderBy.
[GitHub] spark issue #14169: [WIP][SPARK-16515][SQL]set default record reader and wri...
Github user jameszhouyi commented on the issue: https://github.com/apache/spark/pull/14169 Hi, Cool! All of my test cases related to transformation scripts PASSED after applying this PR. Could the Spark maintainers please review this code and merge this PR? Thanks a lot! Best Regards, Yi
[GitHub] spark issue #14191: [SPARK-16217][SQL] Support SELECT INTO statement
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14191 Can one of the admins verify this patch?
[GitHub] spark pull request #14191: [SPARK-16217][SQL] Support SELECT INTO statement
GitHub user wuxianxingkong opened a pull request: https://github.com/apache/spark/pull/14191

[SPARK-16217][SQL] Support SELECT INTO statement

## What changes were proposed in this pull request?
This PR implements the *SELECT INTO* statement. The *SELECT INTO* statement selects data from one table and inserts it into a new table as follows.

SELECT column_name(s) INTO newtable FROM table1;

This statement is commonly used in SQL but not currently supported in SparkSQL. We investigated the Catalyst and found that this statement can be implemented by improving the grammar and reusing the logical plan of *CTAS*. The related JIRA is https://issues.apache.org/jira/browse/SPARK-16217

## How was this patch tested?
SQLQuerySuite.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wuxianxingkong/spark select_into

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14191.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14191

commit 605634deb779a0cf0eaece8420692d9bf44dab64
Author: cuiguangfan <736068...@qq.com>
Date: 2016-07-12T13:16:43Z

SELECT INTO Implements
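The equivalence this PR description draws between *SELECT INTO* and *CTAS* can be sketched as a plain string rewrite. This is only an illustration of the idea: the PR itself works on the grammar and the logical plan, whereas the hypothetical `SelectIntoSketch.toCtas` below uses a regular expression and handles only the simple single-statement form quoted in the description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Spark or of the PR.
class SelectIntoSketch {
    // Captures: (1) the select list, (2) the new table name, (3) the FROM clause.
    private static final Pattern SELECT_INTO = Pattern.compile(
        "(?is)^\\s*SELECT\\s+(.+?)\\s+INTO\\s+(\\S+)\\s+(FROM\\s+.+?);?\\s*$");

    // Rewrites "SELECT cols INTO t FROM src" as "CREATE TABLE t AS SELECT cols FROM src".
    // Statements that do not match the simple form are returned unchanged.
    static String toCtas(String sql) {
        Matcher m = SELECT_INTO.matcher(sql);
        if (!m.matches()) {
            return sql;
        }
        return "CREATE TABLE " + m.group(2) + " AS SELECT " + m.group(1) + " " + m.group(3);
    }
}
```

For example, `SelectIntoSketch.toCtas("SELECT a, b INTO newtable FROM table1;")` returns `CREATE TABLE newtable AS SELECT a, b FROM table1`. A real implementation would need a parser rather than a regular expression, since subqueries or string literals containing `INTO` would defeat this pattern.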
[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14185

I tried playing a little bit with what this API would look like, and I'm starting to question my previous idea that exposing the ProcessBuilder is the way to go here... The above issue with logging redirection is just one source of issues. There are other problems, such as the API becoming a little bit convoluted:

```
SparkLauncher launcher = ...;
ProcessBuilder pb = launcher.createProcessBuilder();
launcher.startApplication(pb);
```

And all the different ways to start the Spark app (3 different methods in SparkLauncher + `ProcessBuilder.start()`).

At this point I'm starting to think it might be better to mirror the parts of the ProcessBuilder API that are interesting, e.g., have:

```
SparkLauncher directory(File directory)
SparkLauncher redirectErrorStream(boolean redirectErrorStream)
SparkLauncher redirectError(ProcessBuilder.Redirect destination)
SparkLauncher redirectOutput(ProcessBuilder.Redirect destination)
```

Optionally these (since you can use `Redirect.to(File)`):

```
SparkLauncher redirectError(File destination)
SparkLauncher redirectOutput(File destination)
```

And add this one which implements the current logger redirection:

```
SparkLauncher redirectToLog(String loggerName)
```

By default logging redirection would be done when using `startApplication`, using the current semantics, unless the user has overridden that by calling one of the new methods (which would also apply to `launch`).

This adds more methods and is a bit more work, but it avoids certain oddities in the API, avoids overloading `startApplication`, and hides ProcessBuilder APIs we don't want to expose (like `command()`).

What do you think?
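The mirrored-API idea in this comment can be sketched in plain Java. `LauncherSketch` is a hypothetical stand-in, not the real `SparkLauncher`. It assumes that calling any of the redirect methods counts as overriding the default logger redirection, and it never starts a process.

```java
import java.io.File;
import java.lang.ProcessBuilder.Redirect;

// Hypothetical stand-in for SparkLauncher, for illustration only.
class LauncherSketch {
    // The ProcessBuilder stays private, so callers never reach command() and similar APIs.
    private final ProcessBuilder builder = new ProcessBuilder();
    private boolean redirectConfigured = false;
    private String loggerName = null;

    LauncherSketch directory(File dir) {
        builder.directory(dir);
        return this;
    }

    LauncherSketch redirectErrorStream(boolean redirect) {
        builder.redirectErrorStream(redirect);
        redirectConfigured = true;
        return this;
    }

    LauncherSketch redirectError(Redirect to) {
        builder.redirectError(to);
        redirectConfigured = true;
        return this;
    }

    LauncherSketch redirectOutput(Redirect to) {
        builder.redirectOutput(to);
        redirectConfigured = true;
        return this;
    }

    // Convenience overloads built on Redirect.to(File).
    LauncherSketch redirectError(File to) {
        return redirectError(Redirect.to(to));
    }

    LauncherSketch redirectOutput(File to) {
        return redirectOutput(Redirect.to(to));
    }

    // Records an explicit logger name instead of the default one.
    LauncherSketch redirectToLog(String name) {
        loggerName = name;
        redirectConfigured = true;
        return this;
    }

    // True when a startApplication-style call should apply its default logger redirection.
    boolean usesDefaultLogRedirection() {
        return !redirectConfigured;
    }

    File workingDirectory() {
        return builder.directory();
    }

    String loggerName() {
        return loggerName;
    }
}
```

A caller would chain the setters, for example `new LauncherSketch().directory(new File(".")).redirectOutput(Redirect.INHERIT)`, after which `usesDefaultLogRedirection()` returns `false`. Keeping the builder private is what hides the `ProcessBuilder` methods the comment does not want to expose.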
[GitHub] spark issue #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression wrapper ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14182 Merged build finished. Test PASSed.
[GitHub] spark issue #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression wrapper ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14182 **[Test build #62290 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62290/consoleFull)** for PR 14182 at commit [`c02573f`](https://github.com/apache/spark/commit/c02573fa79fb94fd15e45bdbbf9b359b33c3c226). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression wrapper ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14182 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62290/ Test PASSed.
[GitHub] spark issue #14181: [SPARK-15382][SQL] Fix a rule to push down projects bene...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14181 **[Test build #62291 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62291/consoleFull)** for PR 14181 at commit [`5c4d0df`](https://github.com/apache/spark/commit/5c4d0df7798e7e1428d01af7ef600d4f81690f5a).
[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14185 **[Test build #62289 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62289/consoleFull)** for PR 14185 at commit [`7fe36f5`](https://github.com/apache/spark/commit/7fe36f5970e7e577a47d8b6a7534cc95d22a94c2). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14185 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62289/ Test FAILed.
[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14185 Merged build finished. Test FAILed.
[GitHub] spark issue #14189: [SPARK-16535][Build] In pom.xml, remove groupId which is...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14189 Merged build finished. Test PASSed.
[GitHub] spark issue #14189: [SPARK-16535][Build] In pom.xml, remove groupId which is...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14189 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62285/ Test PASSed.
[GitHub] spark issue #14189: [SPARK-16535][Build] In pom.xml, remove groupId which is...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14189 **[Test build #62285 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62285/consoleFull)** for PR 14189 at commit [`815aa05`](https://github.com/apache/spark/commit/815aa052ec55336c9a38665a0e5d871ef3110d44). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14139 Merged build finished. Test PASSed.
[GitHub] spark issue #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14139 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62288/ Test PASSed.
[GitHub] spark issue #14173: [SPARKR][SPARK-16507] Add a CRAN checker, fix Rd aliases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14173 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62284/ Test PASSed.
[GitHub] spark issue #14173: [SPARKR][SPARK-16507] Add a CRAN checker, fix Rd aliases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14173 Merged build finished. Test PASSed.
[GitHub] spark issue #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14139 **[Test build #62288 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62288/consoleFull)** for PR 14139 at commit [`82d3711`](https://github.com/apache/spark/commit/82d371112cd5ae7dddeadb8d10b0d204e4c76e88). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14173: [SPARKR][SPARK-16507] Add a CRAN checker, fix Rd aliases
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14173 **[Test build #62284 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62284/consoleFull)** for PR 14173 at commit [`3299242`](https://github.com/apache/spark/commit/32992426f834ec0ad84163a16d43286f08382536). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression wrapper ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14182 **[Test build #62290 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62290/consoleFull)** for PR 14182 at commit [`c02573f`](https://github.com/apache/spark/commit/c02573fa79fb94fd15e45bdbbf9b359b33c3c226).
[GitHub] spark issue #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14139 Merged build finished. Test PASSed.
[GitHub] spark issue #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14139 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62287/ Test PASSed.
[GitHub] spark issue #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14139 **[Test build #62287 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62287/consoleFull)** for PR 14139 at commit [`5d66df7`](https://github.com/apache/spark/commit/5d66df76dd04930e8b877d0b4e56acb749ce9257). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression wrapper ...
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/14182 test this please.
[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/14177 I'd hit these errors fairly randomly if hive = T, even when stop is called
```
java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@1522765a, see the next exception for details.
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.getNewEmbedConnection(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:349)
    at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:416)
    at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:120)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:501)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:298)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
    at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
    at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
    at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
    at
```
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13758 @cloud-fan , would it be possible to review this? If I need to prepare additional benchmark results or anything else, please let me know.
[GitHub] spark issue #14022: [SPARK-16272][core] Allow config values to reference con...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14022 IMO `${sparkconf:spark.master}` is clearer to the unfamiliar reader, but it also seems ok to go with `${spark.master}`. Though there might also be an issue if someone adds a Spark conf that doesn't start with "spark.", which could be confusing.
[GitHub] spark issue #14190: [SPARK-16536][SQL][PYSPARK][MINOR] Expose `sql` in PySpa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14190 Merged build finished. Test PASSed.
[GitHub] spark issue #14190: [SPARK-16536][SQL][PYSPARK][MINOR] Expose `sql` in PySpa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14190 **[Test build #62286 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62286/consoleFull)** for PR 14190 at commit [`c5dc235`](https://github.com/apache/spark/commit/c5dc2355a5d9afbe98499767bf714a112f55d784). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14190: [SPARK-16536][SQL][PYSPARK][MINOR] Expose `sql` in PySpa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14190 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62286/ Test PASSed.
[GitHub] spark issue #14022: [SPARK-16272][core] Allow config values to reference con...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14022 That's mostly how it works; I would like to avoid an explicit `sparkconf:` prefix to avoid things like `sparkconf:spark.master`, but I can easily enforce that only variables starting with `spark.` are expanded.
[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14185 **[Test build #62289 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62289/consoleFull)** for PR 14185 at commit [`7fe36f5`](https://github.com/apache/spark/commit/7fe36f5970e7e577a47d8b6a7534cc95d22a94c2).
[GitHub] spark issue #14022: [SPARK-16272][core] Allow config values to reference con...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14022 To reduce the risk, how about changing the semantics to ``` * - spark/sparkconf/hiveconf: looks for the value in the spark config * - system: looks for the value in the system properties * - env: looks for the value in the environment * - (any other or no prefix): left as-is ```
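A minimal Java sketch of the prefix rules proposed above, for illustration only. The class name `ConfigRefSketch`, the `expand` helper, and the choice to leave unresolved references untouched are assumptions, not code from the PR; the pattern has the same `${prefix:name}` shape as the one in the PR's `ConfigEntry` diff.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ConfigRefSketch {

  // Matches ${name} or ${prefix:name}; group 1 is the optional prefix, group 2 the name.
  private static final Pattern REF = Pattern.compile("\\$\\{(?:(\\w+?):)?(\\S+?)\\}");

  /** Hypothetical helper: expands references in `value` following the proposed prefix rules. */
  static String expand(String value, Map<String, String> conf, Function<String, String> getenv) {
    Matcher m = REF.matcher(value);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String prefix = m.group(1);
      String name = m.group(2);
      String resolved = null;
      if (prefix != null) {
        switch (prefix) {
          case "spark":
          case "sparkconf":
          case "hiveconf":
            resolved = conf.get(name);            // value from the Spark config
            break;
          case "system":
            resolved = System.getProperty(name);  // value from the system properties
            break;
          case "env":
            resolved = getenv.apply(name);        // value from the environment
            break;
          default:
            break;                                // any other prefix: left as-is
        }
      }
      // Assumption: a reference with no prefix, or with no value found, is kept verbatim.
      String replacement = (resolved != null) ? resolved : m.group();
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

Under these rules `${sparkconf:spark.master}` would be replaced by the configured master, while a bare `${spark.master}` would pass through unchanged.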
[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14185 ok to test
[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/14185#discussion_r70731128 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java --- @@ -418,14 +414,26 @@ public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) thr } } -String loggerPrefix = getClass().getPackage().getName(); -String loggerName = String.format("%s.app.%s", loggerPrefix, appName); -ProcessBuilder pb = createBuilder().redirectErrorStream(true); +ProcessBuilder pb = createProcessBuilder().redirectErrorStream(true); +return startApplication(appName, pb, listeners); + } + + public SparkAppHandle startApplication(String appName, ProcessBuilder pb, + SparkAppHandle.Listener... listeners) throws IOException { +ChildProcAppHandle handle = LauncherServer.newAppHandle(); +for (SparkAppHandle.Listener l : listeners) { + handle.addListener(l); +} + pb.environment().put(LauncherProtocol.ENV_LAUNCHER_PORT, String.valueOf(LauncherServer.getServerInstance().getPort())); pb.environment().put(LauncherProtocol.ENV_LAUNCHER_SECRET, handle.getSecret()); + +String loggerPrefix = getClass().getPackage().getName(); --- End diff -- There's an issue now that the user might call `.redirectOutput` on the `ProcessBuilder` and this code might break. So the log redirection really should only happen if the `ProcessBuilder` is configured to redirect to `ProcessBuilder.Redirect.PIPE`. And you'd need to make sure that `redirectErrorStream()` is set in that case. An alternative approach would be to only do this log redirection from the existing `startApplication` call that does not take a `ProcessBuilder`. In that case, output of the child process for the new `startApplication` should be handled by the caller, just like for the old `launch()` API. I really wish you could create new `ProcessBuilder.Redirect` values so that we could create a new `LOG` one and fix this properly, but it doesn't look like that's possible. 
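The guard described in the review comment above can be sketched in Java as follows. This is only an illustration: `RedirectGuardSketch` and `prepareForLogRedirection` are hypothetical names, not part of `SparkLauncher`; the sketch relies only on the standard `ProcessBuilder.redirectOutput()`, `ProcessBuilder.Redirect.PIPE`, and `redirectErrorStream(boolean)` calls.

```java
import java.lang.ProcessBuilder.Redirect;

final class RedirectGuardSketch {

  /**
   * Hypothetical helper: returns true when the launcher may forward the child's
   * output to a logger, and prepares the builder for that case.
   */
  static boolean prepareForLogRedirection(ProcessBuilder pb) {
    // If the caller redirected stdout (to a file, INHERIT, ...), there is no
    // pipe for the parent to read, so leave the builder untouched.
    if (pb.redirectOutput() != Redirect.PIPE) {
      return false;
    }
    // Merge stderr into stdout so a single reader sees all child output.
    pb.redirectErrorStream(true);
    return true;
  }
}
```

A caller that gets `false` back would handle the child's output itself, as with the older `launch()` API.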
[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14185 This PR should really be referencing SPARK-14702, which is the older bug. SPARK-16511 should be closed as a duplicate of it.
[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/14185#discussion_r70730228 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java --- @@ -418,14 +414,26 @@ public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) thr } } -String loggerPrefix = getClass().getPackage().getName(); -String loggerName = String.format("%s.app.%s", loggerPrefix, appName); -ProcessBuilder pb = createBuilder().redirectErrorStream(true); +ProcessBuilder pb = createProcessBuilder().redirectErrorStream(true); +return startApplication(appName, pb, listeners); + } + + public SparkAppHandle startApplication(String appName, ProcessBuilder pb, + SparkAppHandle.Listener... listeners) throws IOException { +ChildProcAppHandle handle = LauncherServer.newAppHandle(); +for (SparkAppHandle.Listener l : listeners) { + handle.addListener(l); +} + pb.environment().put(LauncherProtocol.ENV_LAUNCHER_PORT, String.valueOf(LauncherServer.getServerInstance().getPort())); pb.environment().put(LauncherProtocol.ENV_LAUNCHER_SECRET, handle.getSecret()); + +String loggerPrefix = getClass().getPackage().getName(); +String loggerName = String.format("%s.app.%s", loggerPrefix, appName); try { - handle.setChildProc(pb.start(), loggerName); + Process proc = pb.start(); --- End diff -- This change is not necessary.
[GitHub] spark issue #14022: [SPARK-16272][core] Allow config values to reference con...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14022 > Wouldn't enabling it by default break backwards compatibility? Yes, maybe. But having a flag to disable everything would also potentially break features that rely on it... although you could argue it's the user's fault at that point. I just don't like adding new options that only exist because we can't make a decision about what their behavior should be. If variable substitution applies to all configs, then it does apply to all configs and you can't disable it. Since no configs that I know of have this functionality (the SQL variable substitution would be broken), from a user's perspective it's very unlikely doing it for all would break things.
[GitHub] spark issue #14190: [SPARK-16536][SQL][PYSPARK] Expose `sql` in PySpark Shel...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14190 Hi, @rxin . Could you review this trivial PR exposing `sql()` in PySpark Shell for consistency?
[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/14185#discussion_r70729812 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java --- @@ -418,14 +414,26 @@ public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) thr } } -String loggerPrefix = getClass().getPackage().getName(); -String loggerName = String.format("%s.app.%s", loggerPrefix, appName); -ProcessBuilder pb = createBuilder().redirectErrorStream(true); +ProcessBuilder pb = createProcessBuilder().redirectErrorStream(true); +return startApplication(appName, pb, listeners); + } + + public SparkAppHandle startApplication(String appName, ProcessBuilder pb, --- End diff -- Style is: ``` public Type method( Type arg1, Type arg2, ...) { ```
[GitHub] spark issue #14184: [SPARK-16529][SQL][TEST] `withTempDatabase` should set `...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14184 Hi, @liancheng . Could you review this PR? You added `withTempDatabase` in https://github.com/apache/spark/commit/72981bc8f0d421e2563e2543a8c16a8cc76ad3aa#diff-e59968489e4f36f43010dd7acd60341dR106 and it is used in many places. Since [SPARK-16459](https://github.com/apache/spark/pull/14115) now prevents dropping the current database, we had better update it.
[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/14185#discussion_r70729733 --- Diff: core/src/test/scala/org/apache/spark/launcher/LauncherBackendSuite.scala --- @@ -17,15 +17,16 @@ package org.apache.spark.launcher + --- End diff -- don't add this extra line
[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/14185#discussion_r70729726 --- Diff: core/src/test/scala/org/apache/spark/launcher/LauncherBackendSuite.scala --- @@ -17,15 +17,16 @@ package org.apache.spark.launcher + import java.util.concurrent.TimeUnit import scala.concurrent.duration._ import scala.language.postfixOps -import org.scalatest.Matchers import org.scalatest.concurrent.Eventually._ +import org.scalatest.Matchers --- End diff -- Please avoid making unnecessary changes to imports. This one, in particular, is wrong and would trigger a style checker error if the style checker didn't have a bug...
[GitHub] spark issue #14022: [SPARK-16272][core] Allow config values to reference con...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14022 Wouldn't enabling it by default break backwards compatibility? I agree that would be better, but it seems likely that '${...}' may be used in existing configs.
[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14177 Does the Hive metastore not shut down properly even if we do `sparkSession.stop()` in all the test files? The reason I'm trying to avoid having `enableHiveMetastore=F` in most test files is that Hive support is enabled by default, so keeping it on is closer to what users will see.
[GitHub] spark issue #14022: [SPARK-16272][core] Allow config values to reference con...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14022 > how about a global flag for enabling config expansion I think that would be more confusing. Why would someone disable expansion? My only concern with enabling it for all options is performance - reading configs becomes more expensive. Maybe that's not a big deal and we can just do that.
[GitHub] spark pull request #14022: [SPARK-16272][core] Allow config values to refere...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/14022#discussion_r70728894 --- Diff: core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala --- @@ -99,13 +118,83 @@ private class FallbackConfigEntry[T] ( key: String, doc: String, isPublic: Boolean, -private val fallback: ConfigEntry[T]) -extends ConfigEntry[T](key, fallback.valueConverter, fallback.stringConverter, doc, isPublic) { +private[config] val fallback: ConfigEntry[T], +private val expandVars: Boolean) +extends ConfigEntry[T]( +key, +fallback.valueConverter, +fallback.stringConverter, +doc, +isPublic, +expandVars) { override def defaultValueString: String = s"" - override def readFrom(conf: SparkConf): T = { - conf.getOption(key).map(valueConverter).getOrElse(fallback.readFrom(conf)) + override def readFrom(conf: JMap[String, String], getenv: String => String): T = { + Option(conf.get(key)).map(valueConverter).getOrElse(fallback.readFrom(conf, getenv)) + } + +} + +private object ConfigEntry { + + private val knownConfigs = new java.util.concurrent.ConcurrentHashMap[String, ConfigEntry[_]]() + + private val REF_RE = "\\$\\{(?:(\\w+?):)?(\\S+?)\\}".r.pattern + + def registerEntry(entry: ConfigEntry[_]): Unit = { +val existing = knownConfigs.putIfAbsent(entry.key, entry) +require(existing == null, s"Config entry ${entry.key} already registered!") + } + + def findEntry(key: String): ConfigEntry[_] = knownConfigs.get(key) + + /** + * Expand the `value` according to the rules explained in `ConfigBuilder.withVariableExpansion`. + */ + def expand( + value: String, + conf: JMap[String, String], + getenv: String => String, + usedRefs: Set[String]): String = { +val matcher = REF_RE.matcher(value) +val result = new StringBuilder() +var end = 0 + +while (end < value.length() && matcher.find(end)) { --- End diff -- That looks interesting, let me take a look. 
[GitHub] spark issue #14022: [SPARK-16272][core] Allow config values to reference con...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14022
Instead of selectively enabling this for certain confs / config builders, how about a global flag for enabling config expansion? I think that would be less likely to be confusing. Also, perhaps a warning should be logged if there is a config value that looks expandable but the flag is off.
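The warning suggested here could be sketched roughly as follows. This is only an illustration: the `expansionEnabled` flag, the object name, and both helper functions are hypothetical and do not appear in the pull request. The pattern is the `REF_RE` expression from the diff under review, and the sketch returns the offending keys instead of logging so that it stays self-contained.

```scala
// Hypothetical sketch of the "looks expandable but the flag is off" check.
object ExpansionWarningSketch {
  // Same expression as REF_RE in the diff under review.
  private val RefRe = "\\$\\{(?:(\\w+?):)?(\\S+?)\\}".r

  // True when the value contains at least one "${...}" reference.
  def looksExpandable(value: String): Boolean =
    RefRe.findFirstIn(value).isDefined

  // Keys whose values would have been expanded had the (assumed) global flag been on.
  // A caller could log one warning per returned key.
  def unexpandedKeys(settings: Map[String, String], expansionEnabled: Boolean): Seq[String] =
    if (expansionEnabled) Seq.empty
    else settings.collect { case (key, value) if looksExpandable(value) => key }.toSeq.sorted
}
```

A check like this scans every value once, so its cost is of the same order as the per-read cost that the expansion itself would add.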
[GitHub] spark issue #14079: [SPARK-8425][CORE] New Blacklist Mechanism
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14079
Just some minor stuff, I'll let people more familiar with the scheduler comment further.