[GitHub] spark issue #15230: [SPARK-17657] [SQL] Disallow Users to Change Table Type
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15230 **[Test build #65863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65863/consoleFull)** for PR 15230 at commit [`f028b5e`](https://github.com/apache/spark/commit/f028b5ec4730f658c545b6920e4af79ce5acc957). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15230: [SPARK-17657] [SQL] Disallow Users to Change Tabl...
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/15230

[SPARK-17657] [SQL] Disallow Users to Change Table Type

### What changes were proposed in this pull request?
Hive allows users to change the table type from `Managed` to `External` or from `External` to `Managed` by altering the table's `EXTERNAL` property. See the JIRA: https://issues.apache.org/jira/browse/HIVE-1329

So far, Spark SQL does not correctly support this, although users can do it. Doing so breaks many assumptions in the implementation. Thus, this PR disallows users from changing the table type.

### How was this patch tested?
Added test cases

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark alterTableSetExternal

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15230.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15230

commit f028b5ec4730f658c545b6920e4af79ce5acc957
Author: gatorsmile
Date: 2016-09-24T06:02:46Z

    fix.
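The kind of guard this PR describes can be sketched as a check that rejects any attempt to set or unset the `EXTERNAL` property. This is an assumption about the shape of such a check, not the code in the patch: `TableTypeGuard`, `verifyNoTableTypeChange`, the error message, and the case-insensitive comparison are all hypothetical.

```scala
object TableTypeGuard {
  // Hypothetical check: fail if any property key would change the table type.
  // The case-insensitive match is an assumption made for this sketch.
  def verifyNoTableTypeChange(propertyKeys: Iterable[String]): Unit = {
    if (propertyKeys.exists(_.equalsIgnoreCase("EXTERNAL"))) {
      throw new IllegalArgumentException(
        "Cannot set or change the table property 'EXTERNAL'")
    }
  }

  def main(args: Array[String]): Unit = {
    // Unrelated properties pass silently.
    verifyNoTableTypeChange(Seq("comment", "owner"))
    // Any spelling of EXTERNAL is rejected.
    try {
      verifyNoTableTypeChange(Seq("external"))
    } catch {
      case e: IllegalArgumentException => println(s"rejected: ${e.getMessage}")
    }
  }
}
```

Running the check before both the "set properties" and "unset properties" paths would cover the two ways a user could flip the table type.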
[GitHub] spark issue #15229: [SPARK-17654] [SQL] Propagate bucketing information for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15229 Merged build finished. Test FAILed.
[GitHub] spark issue #15229: [SPARK-17654] [SQL] Propagate bucketing information for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15229 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65862/ Test FAILed.
[GitHub] spark issue #15229: [SPARK-17654] [SQL] Propagate bucketing information for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15229 **[Test build #65862 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65862/consoleFull)** for PR 15229 at commit [`8726cc6`](https://github.com/apache/spark/commit/8726cc6430cbeaf8c2eebd7cef40199a7c563218). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14808: [SPARK-17156][ML][EXAMPLE] Add multiclass logistic regre...
Github user jaceklaskowski commented on the issue: https://github.com/apache/spark/pull/14808 @sethah I seem to have missed the other comment with the notes on what has to be done to make the PR an example for the change. Sorry. Since I'm very new to it and the only way to learn it better is to help with the PR here or the one that's coming, I'd like to be engaged one way or the other. I'm gonna close this PR (to make it consistent JIRA-wise) and open another with the changes you've mentioned to have the example aligned with the changes and the requirements. Thanks @sethah for your help! See you in the other PR...
[GitHub] spark pull request #14808: [SPARK-17156][ML][EXAMPLE] Add multiclass logisti...
Github user jaceklaskowski closed the pull request at: https://github.com/apache/spark/pull/14808
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65860/ Test PASSed.
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12601 Merged build finished. Test PASSed.
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12601 **[Test build #65860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65860/consoleFull)** for PR 12601 at commit [`8fb86b4`](https://github.com/apache/spark/commit/8fb86b482929e321f4ec8865124b8661f1a29bbf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15229: [SPARK-17654] [SQL] Propagate bucketing information for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15229 **[Test build #65862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65862/consoleFull)** for PR 15229 at commit [`8726cc6`](https://github.com/apache/spark/commit/8726cc6430cbeaf8c2eebd7cef40199a7c563218).
[GitHub] spark pull request #15229: [SPARK-17654] [SQL] Propagate bucketing informati...
GitHub user tejasapatil opened a pull request:

https://github.com/apache/spark/pull/15229

[SPARK-17654] [SQL] Propagate bucketing information for Hive tables to / from Catalog

## What changes were proposed in this pull request?
Currently Spark does not respect bucketing for Hive tables. This PR includes the following changes:
- `HiveClientImpl` now extracts the table's bucketing information
- while writing table info to the metastore, `MetastoreRelation` now populates the bucketing information in the Hive `Table` object
- `HiveTableScanExec` now exposes `outputPartitioning` and `outputOrdering` as per the bucketing spec
- `InsertIntoHiveTable` now exposes `requiredChildDistribution` and `requiredChildOrdering` based on the target table's bucketing spec

TODOs (which will be done in linked PRs and not this one):
- [ ] `ClusteredDistribution` does not guarantee the number of partitions generated (which corresponds to the number of output bucket files created). This will require adding strict guarantees to `ClusteredDistribution`. I think this needs more thought and is better done incrementally rather than packed into this PR.
- [ ] While writing to bucketed files, Hive's hashing function should be used. I have a PR open to implement Hive hashing natively in Spark: https://github.com/apache/spark/pull/15047
- [ ] Allow creating Hive bucketed tables

## How was this patch tested?
Tested with Hive tables created locally. Adding a new test case will require implementing bucketed table creation, which is not supported :( Suggestions welcome.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tejasapatil/spark SPARK-17654_hive_extract_bucketing

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15229.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15229

commit caef89a198dac2fee4afaad622e2ecc11f200836
Author: Tejas Patil
Date: 2016-08-23T20:45:00Z

    Support bucketing for Hive tables

commit ee79dd2ae1e174ab38fc5f6b10f5a9a2e2721533
Author: Tejas Patil
Date: 2016-08-23T20:45:00Z

    Support bucketing for Hive tables

commit 8726cc6430cbeaf8c2eebd7cef40199a7c563218
Author: Tejas Patil
Date: 2016-09-24T03:22:07Z

    Merge remote-tracking branch 'origin/SPARK-17654_hive_extract_bucketing' into SPARK-17654_hive_extract_bucketing_2

    # Conflicts:
    # sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala
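Why bucketing lets a scan report an `outputPartitioning` can be sketched without Spark: if every row is assigned to a bucket by hashing its bucketing column, rows with equal keys always land in the same bucket. The sketch below is an illustration only. `BucketingSketch` and `bucketIdFor` are hypothetical names, and Scala's `hashCode` is used as a stand-in; as the TODO list above notes, Hive's own hash function would be needed for the files to be compatible with Hive.

```scala
object BucketingSketch {
  // Hypothetical helper: maps a key to a bucket in [0, numBuckets).
  // Masking with Int.MaxValue clears the sign bit, so the result is never negative.
  // hashCode is a stand-in here, not Hive's hash function.
  def bucketIdFor(key: Any, numBuckets: Int): Int = {
    require(numBuckets > 0, "numBuckets must be positive")
    (key.hashCode & Int.MaxValue) % numBuckets
  }

  // Groups rows by the bucket of their key column.
  def bucketize[K, V](rows: Seq[(K, V)], numBuckets: Int): Map[Int, Seq[(K, V)]] =
    rows.groupBy { case (key, _) => bucketIdFor(key, numBuckets) }

  def main(args: Array[String]): Unit = {
    val rows = Seq("a" -> 1, "b" -> 2, "a" -> 3, "c" -> 4)
    bucketize(rows, numBuckets = 4).toSeq.sortBy(_._1).foreach {
      case (bucket, members) => println(s"bucket $bucket: $members")
    }
  }
}
```

Because equal keys share a bucket, an operator that needs rows clustered by the bucketing column can, in principle, consume the buckets directly instead of shuffling.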
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12601 Merged build finished. Test PASSed.
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65858/ Test PASSed.
[GitHub] spark issue #15168: [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15168 **[Test build #65861 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65861/consoleFull)** for PR 15168 at commit [`ba22975`](https://github.com/apache/spark/commit/ba22975232bd64263ef0b513f11887378e0de43f).
[GitHub] spark issue #15168: [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15168 retest this please
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/12601 Mostly LGTM, except three minor comments. Thank you for your hard work, @JustinPihony !
[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/12601#discussion_r80353253

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -420,62 +420,11 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
   def jdbc(url: String, table: String, connectionProperties: Properties): Unit = {
     assertNotPartitioned("jdbc")
     assertNotBucketed("jdbc")
-
-    // to add required options like URL and dbtable
-    val params = extraOptions.toMap ++ Map("url" -> url, "dbtable" -> table)
-    val jdbcOptions = new JDBCOptions(params)
-    val jdbcUrl = jdbcOptions.url
-    val jdbcTable = jdbcOptions.table
-
-    val props = new Properties()
-    extraOptions.foreach { case (key, value) =>
-      props.put(key, value)
-    }
     // connectionProperties should override settings in extraOptions
-    props.putAll(connectionProperties)
-    val conn = JdbcUtils.createConnectionFactory(jdbcUrl, props)()
-
-    try {
-      var tableExists = JdbcUtils.tableExists(conn, jdbcUrl, jdbcTable)
-
-      if (mode == SaveMode.Ignore && tableExists) {
-        return
-      }
-
-      if (mode == SaveMode.ErrorIfExists && tableExists) {
-        sys.error(s"Table $jdbcTable already exists.")
-      }
-
-      if (mode == SaveMode.Overwrite && tableExists) {
-        if (jdbcOptions.isTruncate &&
-            JdbcUtils.isCascadingTruncateTable(jdbcUrl) == Some(false)) {
-          JdbcUtils.truncateTable(conn, jdbcTable)
-        } else {
-          JdbcUtils.dropTable(conn, jdbcTable)
-          tableExists = false
-        }
-      }
-
-      // Create the table if the table didn't exist.
-      if (!tableExists) {
-        val schema = JdbcUtils.schemaString(df, jdbcUrl)
-        // To allow certain options to append when create a new table, which can be
-        // table_options or partition_options.
-        // E.g., "CREATE TABLE t (name string) ENGINE=InnoDB DEFAULT CHARSET=utf8"
-        val createtblOptions = jdbcOptions.createTableOptions
-        val sql = s"CREATE TABLE $jdbcTable ($schema) $createtblOptions"
-        val statement = conn.createStatement
-        try {
-          statement.executeUpdate(sql)
-        } finally {
-          statement.close()
-        }
-      }
-    } finally {
-      conn.close()
-    }
-
-    JdbcUtils.saveTable(df, jdbcUrl, jdbcTable, props)
+    this.extraOptions = this.extraOptions ++ (connectionProperties.asScala)
+    // explicit url and dbtable should override all
+    this.extraOptions += ("url" -> url, "dbtable" -> table)
+    format("jdbc").save
--- End diff --

The omission of parentheses on methods should only be used when the method has no side-effects. Thus, please change it to `save()`
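The convention behind this comment, along with the option precedence the new `jdbc` body relies on, can be sketched in plain Scala. `SketchWriter` and its members are hypothetical stand-ins, not Spark's `DataFrameWriter`; the sketch relies only on the fact that the right-hand side wins when Scala maps are merged with `++`.

```scala
// Hypothetical stand-in for a writer; not Spark's DataFrameWriter.
final class SketchWriter {
  private var extraOptions: Map[String, String] = Map.empty
  private var saved: Vector[Map[String, String]] = Vector.empty

  // Returns this so that calls can be chained, builder-style.
  def option(key: String, value: String): SketchWriter = {
    extraOptions += (key -> value)
    this
  }

  // Pure accessors: no side effects, so declared and called without parentheses.
  def savedCount: Int = saved.size
  def lastSaved: Option[Map[String, String]] = saved.lastOption

  // Side-effecting method: declared with (), and should be called as save().
  def save(): Unit = {
    saved :+= extraOptions
  }

  // Mirrors the precedence in the diff: connection properties override
  // earlier options, and the explicit url and dbtable override everything.
  def jdbc(url: String, table: String, connectionProperties: Map[String, String]): Unit = {
    extraOptions = extraOptions ++ connectionProperties
    extraOptions = extraOptions ++ Map("url" -> url, "dbtable" -> table)
    save()
  }
}

object SketchWriterDemo {
  def main(args: Array[String]): Unit = {
    val writer = new SketchWriter().option("url", "jdbc:old").option("user", "alice")
    writer.jdbc("jdbc:new", "TEST.SAVETEST", Map("user" -> "bob"))
    println(writer.savedCount)
    println(writer.lastSaved)
  }
}
```

Writing `save()` with parentheses at the call site signals to the reader that something happens beyond returning a value, which is the point of the review comment.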
[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/12601#discussion_r80353203

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCWriteSuite.scala ---
@@ -208,4 +210,84 @@ class JDBCWriteSuite extends SharedSQLContext with BeforeAndAfter {
     assert(2 === spark.read.jdbc(url1, "TEST.PEOPLE1", properties).count())
     assert(2 === spark.read.jdbc(url1, "TEST.PEOPLE1", properties).collect()(0).length)
   }
+
+  test("save works for format(\"jdbc\") if url and dbtable are set") {
+    val df = sqlContext.createDataFrame(sparkContext.parallelize(arr2x2), schema2)
+
+    df.write.format("jdbc")
+    .options(Map("url" -> url, "dbtable" -> "TEST.SAVETEST"))
+    .save
--- End diff --

Nit: `save` -> `save()`
[GitHub] spark issue #15168: [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQ...
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/15168

The failure seems to be irrelevant. Retest this please.
```
[info] - Naive Bayes Multinomial *** FAILED *** (137 milliseconds)
[info] Expected 0.7 and 0.6494565217391305 to be within 0.05 using absolute tolerance.
[info] validateModelFit:
```
[GitHub] spark issue #15228: [SPARK-17654] [SQL] Propagate bucketing information for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15228 Build finished. Test FAILed.
[GitHub] spark issue #15228: [SPARK-17654] [SQL] Propagate bucketing information for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15228 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65857/ Test FAILed.
[GitHub] spark issue #15228: [SPARK-17654] [SQL] Propagate bucketing information for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15228 **[Test build #65857 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65857/consoleFull)** for PR 15228 at commit [`caef89a`](https://github.com/apache/spark/commit/caef89a198dac2fee4afaad622e2ecc11f200836). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/12601#discussion_r80353010

--- Diff: docs/sql-programming-guide.md ---
@@ -1096,13 +1096,17 @@ the Data Sources API. The following options are supported:
 {% highlight sql %}
-CREATE TEMPORARY VIEW jdbcTable
+CREATE TEMPORARY TABLE jdbcTable
--- End diff --

Please change it back. `CREATE TEMPORARY TABLE` is deprecated. You will get a Parser error
```
CREATE TEMPORARY TABLE is not supported yet. Please use CREATE TEMPORARY VIEW as an alternative.(line 1, pos 0)
```
[GitHub] spark issue #15168: [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15168 Merged build finished. Test FAILed.
[GitHub] spark issue #15168: [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15168 **[Test build #65859 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65859/consoleFull)** for PR 15168 at commit [`ba22975`](https://github.com/apache/spark/commit/ba22975232bd64263ef0b513f11887378e0de43f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15168: [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15168 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65859/ Test FAILed.
[GitHub] spark issue #15217: [SPARK-17577][Core] Update SparkContext.addFile to make ...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/15217 Close this PR. Thanks!
[GitHub] spark pull request #15217: [SPARK-17577][Core] Update SparkContext.addFile t...
Github user yanboliang closed the pull request at: https://github.com/apache/spark/pull/15217
[GitHub] spark pull request #15228: [SPARK-17654] [SQL] Propagate bucketing informati...
Github user tejasapatil closed the pull request at: https://github.com/apache/spark/pull/15228
[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...
Github user JustinPihony commented on a diff in the pull request:

https://github.com/apache/spark/pull/12601#discussion_r80352586

--- Diff: examples/src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java ---
@@ -21,6 +21,7 @@
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.List;
+import java.util.Properties;
 // $example off:schema_merging$
--- End diff --

@HyukjinKwon Yes, that is what I was talking about...just fixed it back
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12601 **[Test build #65860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65860/consoleFull)** for PR 12601 at commit [`8fb86b4`](https://github.com/apache/spark/commit/8fb86b482929e321f4ec8865124b8661f1a29bbf).
[GitHub] spark issue #15168: [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15168 **[Test build #65859 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65859/consoleFull)** for PR 15168 at commit [`ba22975`](https://github.com/apache/spark/commit/ba22975232bd64263ef0b513f11887378e0de43f).
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/12601 Thanks for mentioning me. It looks good to me in my personal view.
[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12601#discussion_r80352317 --- Diff: examples/src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java --- @@ -21,6 +21,7 @@ import java.util.ArrayList; import java.util.Arrays; import java.util.List; +import java.util.Properties; // $example off:schema_merging$ --- End diff -- Oh, maybe my previous comment was not clear. I meant
```java
import java.util.List;
// $example off:schema_merging$
import java.util.Properties;
```
I haven't tried to build the doc against the current state, but I guess we won't need this import for Parquet's schema merging example.
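The placement matters because of the `// $example on:…$` / `// $example off:…$` markers in the example file. The following is a minimal sketch of that idea, assuming the docs build copies only the lines between a matching pair of markers; `ExampleSnippetSketch` and `extract` are hypothetical names for illustration, not part of Spark's actual docs tooling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a docs snippet extractor.
// Assumption: only lines between "$example on:<label>$" and
// "$example off:<label>$" are copied into the rendered example.
final class ExampleSnippetSketch {
    static List<String> extract(List<String> lines, String label) {
        String on = "// $example on:" + label + "$";
        String off = "// $example off:" + label + "$";
        List<String> snippet = new ArrayList<>();
        boolean inside = false;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.equals(on)) {
                inside = true;          // start copying after the "on" marker
            } else if (trimmed.equals(off)) {
                inside = false;         // stop copying at the "off" marker
            } else if (inside) {
                snippet.add(line);      // marker lines themselves are never copied
            }
        }
        return snippet;
    }
}
```

Under that assumption, an import placed after the `off` marker still compiles with the file but stays out of the rendered schema merging snippet, which is the effect the suggested ordering aims for.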
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12601 **[Test build #65858 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65858/consoleFull)** for PR 12601 at commit [`06c1cba`](https://github.com/apache/spark/commit/06c1cba1da5ab140d71c29f41afd608e863bfe1b).
[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...
Github user JustinPihony commented on the issue: https://github.com/apache/spark/pull/12601 @gatorsmile I added the R and SQL documentation. I took the SQL portion from https://github.com/apache/spark/pull/6121/files
[GitHub] spark issue #15228: [SPARK-17654] [SQL] Propagate bucketing information for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15228 **[Test build #65857 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65857/consoleFull)** for PR 15228 at commit [`caef89a`](https://github.com/apache/spark/commit/caef89a198dac2fee4afaad622e2ecc11f200836).
[GitHub] spark issue #15071: [SPARK-17517][SQL]Improve generated Code for BroadcastHa...
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/15071 cc @davies
[GitHub] spark pull request #15228: [SPARK-17654] [SQL] Propagate bucketing informati...
GitHub user tejasapatil opened a pull request: https://github.com/apache/spark/pull/15228 [SPARK-17654] [SQL] Propagate bucketing information for Hive tables to / from Catalog

## What changes were proposed in this pull request?

Currently Spark does not respect bucketing for Hive tables. This PR includes the following changes:
- Extracts the table's bucketing information in `HiveClientImpl`.
- While writing table info to the metastore, `MetastoreRelation` now populates the bucketing information in the Hive `Table` object.
- `HiveTableScanExec` now exposes `outputPartitioning` and `outputOrdering` as per the bucketing spec.
- `InsertIntoHiveTable` now exposes `requiredChildDistribution` and `requiredChildOrdering` based on the target table's bucketing spec.

TODOs (which will be done in linked PRs and not this one):
- [ ] `ClusteredDistribution` does not guarantee the number of partitions generated (which corresponds to the output bucket files created). This will require adding strict guarantees to `ClusteredDistribution`. I think this needs more thought and is better done incrementally rather than packed into this PR.
- [ ] While writing to bucketed files, Hive's hashing function should be used. I have a PR open to implement Hive hashing natively in Spark: https://github.com/apache/spark/pull/15047
- [ ] Allow creating Hive bucketed tables

## How was this patch tested?

Tested with Hive tables created locally. Adding a new test case would require implementing bucketed table creation, which is not supported :( Suggestions welcome.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/tejasapatil/spark SPARK-17654_hive_extract_bucketing Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15228.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15228
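To illustrate why the writer has to use the same hashing function as the reader, here is a minimal sketch of bucket assignment. It assumes a bucket is chosen as a non-negative hash modulo the bucket count; `BucketAssignmentSketch` and `bucketFor` are hypothetical names, and the hash values passed in are placeholders, not Hive's or Spark's actual hash functions.

```java
// Illustrative only: not the code from this PR, nor Hive's implementation.
final class BucketAssignmentSketch {
    /**
     * Maps a hash code to a bucket index in [0, numBuckets).
     * Assumption: bucket = non-negative hash modulo the number of buckets.
     */
    static int bucketFor(int hash, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        // Clear the sign bit so that negative hash codes still map to a valid index.
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }
}
```

Under this scheme, two engines place a given key in the same bucket file only if they compute the same hash for it, which is why the hashing TODO above matters for tables shared between Hive and Spark.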
[GitHub] spark issue #15227: [SPARK-17655][SQL]Remove unused variables declarations a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15227 Can one of the admins verify this patch?
[GitHub] spark pull request #15227: [SPARK-17655][SQL]Remove unused variables declara...
GitHub user yaooqinn opened a pull request: https://github.com/apache/spark/pull/15227 [SPARK-17655][SQL]Remove unused variables declarations and definations in a WholeStageCodeGened stage ## What changes were proposed in this pull request? A WholeStageCodeGened stage with multiple CodegenSupport operators generates unused result rows and their associated buffer holders and row writers, which can be removed. ## How was this patch tested? Existing unit tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/yaooqinn/spark rm-unused-object Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15227.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15227 commit eabd4a55cbe8fd57c722396c95087a2b6c695587 Author: Kent Yao Date: 2016-09-24T01:58:42Z remove redundant variables declarations and definations
[GitHub] spark issue #15218: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15218 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65856/ Test PASSed.
[GitHub] spark issue #15218: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15218 Merged build finished. Test PASSed.
[GitHub] spark issue #15218: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15218 **[Test build #65856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65856/consoleFull)** for PR 15218 at commit [`f71f1c0`](https://github.com/apache/spark/commit/f71f1c0f245aa9534330c9b4913ce40a1cfa250e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12601#discussion_r80350919 --- Diff: examples/src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java --- @@ -23,6 +23,8 @@ import java.util.List; // $example off:schema_merging$ +import java.util.Properties; + --- End diff -- Is there any reason not to follow the guideline?
[GitHub] spark issue #15226: [SPARK-17649][CORE] Log how many Spark events got droppe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15226 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65855/ Test PASSed.
[GitHub] spark issue #15226: [SPARK-17649][CORE] Log how many Spark events got droppe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15226 Merged build finished. Test PASSed.
[GitHub] spark issue #15226: [SPARK-17649][CORE] Log how many Spark events got droppe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15226 **[Test build #65855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65855/consoleFull)** for PR 15226 at commit [`0e014b0`](https://github.com/apache/spark/commit/0e014b02d03eeda8373cd8892662ed6ce9de664c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...
Github user JustinPihony commented on a diff in the pull request: https://github.com/apache/spark/pull/12601#discussion_r80350755 --- Diff: examples/src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java --- @@ -23,6 +23,8 @@ import java.util.List; // $example off:schema_merging$ +import java.util.Properties; + --- End diff -- Should this really be added to the example, though?
[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12601#discussion_r80350458 --- Diff: examples/src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java --- @@ -23,6 +23,8 @@ import java.util.List; // $example off:schema_merging$ +import java.util.Properties; + --- End diff -- I think we should put the `java.util` imports together, without an additional newline.
[GitHub] spark issue #15224: [SPARK-17650] malformed url's throw exceptions before br...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15224 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65854/ Test PASSed.
[GitHub] spark issue #15224: [SPARK-17650] malformed url's throw exceptions before br...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15224 Merged build finished. Test PASSed.
[GitHub] spark issue #15224: [SPARK-17650] malformed url's throw exceptions before br...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15224 **[Test build #65854 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65854/consoleFull)** for PR 15224 at commit [`49afc56`](https://github.com/apache/spark/commit/49afc5686d7ccf9a7864fc9b9c9eb5217a281086). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15226: [SPARK-17649][CORE] Log how many Spark events got...
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15226#discussion_r80350179 --- Diff: core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala --- @@ -117,6 +124,24 @@ private[spark] abstract class AsynchronousListenerBus[L <: AnyRef, E](name: Stri eventLock.release() } else { onDropEvent(event) + droppedEventsCounter.incrementAndGet() +} + +val droppedEvents = droppedEventsCounter.get +if (droppedEvents > 0) { + // Don't log too frequently + if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) { --- End diff -- Won't nanotime be overkill? Once there is even a single dropped event, this check gets executed on every post(), so currentTimeMillis (which is less costly) is preferable.
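The pattern under discussion, counting drops and reporting at most once per interval, can be sketched as follows. This is a hedged Java illustration rather than the Scala code in the diff: `DroppedEventReporter`, `onDrop`, and `maybeReport` are hypothetical names, and the clock reading is passed in by the caller so that either `System.currentTimeMillis()` or a value derived from `System.nanoTime()` can be supplied.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of rate-limited drop reporting; not the actual
// AsynchronousListenerBus implementation.
final class DroppedEventReporter {
    private final AtomicLong droppedEventsCounter = new AtomicLong(0);
    private final long reportIntervalMillis;
    private volatile long lastReportTimestamp;

    DroppedEventReporter(long reportIntervalMillis, long startMillis) {
        this.reportIntervalMillis = reportIntervalMillis;
        this.lastReportTimestamp = startMillis;
    }

    /** Records one dropped event. */
    void onDrop() {
        droppedEventsCounter.incrementAndGet();
    }

    /**
     * Intended to be called on every post. Returns a log message when drops
     * are pending and the interval has elapsed; returns null otherwise.
     */
    String maybeReport(long nowMillis) {
        long dropped = droppedEventsCounter.get();
        if (dropped > 0 && nowMillis - lastReportTimestamp >= reportIntervalMillis) {
            // Reset only if no new drop raced in; otherwise a later call reports.
            if (droppedEventsCounter.compareAndSet(dropped, 0)) {
                lastReportTimestamp = nowMillis;
                return "Dropped " + dropped + " events since last report";
            }
        }
        return null;
    }
}
```

Passing the timestamp in keeps the sketch neutral on the clock question raised in the review: `System.nanoTime()` is monotonic, while `System.currentTimeMillis()` follows the wall clock and can jump if the system time is adjusted.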
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Do not add failedStages when abortS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Merged build finished. Test PASSed.
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Do not add failedStages when abortS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65853/ Test PASSed.
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Do not add failedStages when abortS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65853/consoleFull)** for PR 15213 at commit [`1127ca1`](https://github.com/apache/spark/commit/1127ca1538e9a9ded9e91ead65af8c710e99003d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15220 Merged build finished. Test PASSed.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15220 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65851/ Test PASSed.
[GitHub] spark pull request #15226: [SPARK-17649][CORE] Log how many Spark events got...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15226#discussion_r80347195 --- Diff: core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala --- @@ -117,6 +124,24 @@ private[spark] abstract class AsynchronousListenerBus[L <: AnyRef, E](name: Stri eventLock.release() } else { onDropEvent(event) + droppedEventsCounter.incrementAndGet() +} + +val droppedEvents = droppedEventsCounter.get +if (droppedEvents > 0) { + // Don't log too frequently + if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) { --- End diff -- use nanotime
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15220 **[Test build #65851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65851/consoleFull)** for PR 15220 at commit [`77d7ba0`](https://github.com/apache/spark/commit/77d7ba0ad3f2382c52a15a24cabcb02c3c0009f1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15226: [SPARK-17649][CORE] Log how many Spark events got droppe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15226 **[Test build #65855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65855/consoleFull)** for PR 15226 at commit [`0e014b0`](https://github.com/apache/spark/commit/0e014b02d03eeda8373cd8892662ed6ce9de664c).
[GitHub] spark issue #15218: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15218 **[Test build #65856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65856/consoleFull)** for PR 15218 at commit [`f71f1c0`](https://github.com/apache/spark/commit/f71f1c0f245aa9534330c9b4913ce40a1cfa250e).
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user scwf commented on the issue: https://github.com/apache/spark/pull/15213
> actual problem is not in abortStage but rather in improper additions to failedStages

Correct. I think a more accurate description for this issue is "do not add `failedStages` when abortStage for fetch failure".
[GitHub] spark pull request #15226: [SPARK-17649][CORE] Log how many Spark events got...
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/15226 [SPARK-17649][CORE] Log how many Spark events got dropped in AsynchronousListenerBus ## What changes were proposed in this pull request? Backport #15220 to 1.6. ## How was this patch tested? Jenkins You can merge this pull request into a Git repository by running: $ git pull https://github.com/zsxwing/spark SPARK-17649-branch-1.6 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15226.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15226 commit 0e014b02d03eeda8373cd8892662ed6ce9de664c Author: Shixiong Zhu Date: 2016-09-23T23:57:28Z [SPARK-17649][CORE] Log how many Spark events got dropped in LiveListenerBus Log how many Spark events got dropped in LiveListenerBus so that the user can get insights on how to set a correct event queue size. Jenkins Author: Shixiong Zhu Closes #15220 from zsxwing/SPARK-17649.
[GitHub] spark issue #15218: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user zhzhan commented on the issue: https://github.com/apache/spark/pull/15218 @gatorsmile Thanks. #65832 is the latest one which does not have the same failure.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15220 Thanks! Merging to master / 2.0. I will submit a patch for 1.6.
[GitHub] spark issue #15223: [SPARKR][SPARK-17651] Set R package version number along...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15223 Merged build finished. Test PASSed.
[GitHub] spark issue #15223: [SPARKR][SPARK-17651] Set R package version number along...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15223 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65849/ Test PASSed.
[GitHub] spark issue #15223: [SPARKR][SPARK-17651] Set R package version number along...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15223 **[Test build #65849 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65849/consoleFull)** for PR 15223 at commit [`a0122f0`](https://github.com/apache/spark/commit/a0122f0569b9caa8995c65eb27314edb0234a5ff). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15213 Right, but `abortStage` occurs elsewhere. "When abort stage" seems to imply that this fix is necessary for all usages of `abortStage` when the actual problem is not in `abortStage` but rather in improper additions to `failedStages`. I've got to go now, but I'll come back to this soon(ish).
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user scwf commented on the issue: https://github.com/apache/spark/pull/15213 Actually, this is the only place in Spark where stages are added to `failedStages`.
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15213 @scwf That description would actually be at least as bad since there are multiple routes to `abortStage` and this issue of adding to `failedStages` only applies to these two. I'll take another look soon and see if I can come up with a clean refactoring and a better description for the commit message.
[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15089 Merged build finished. Test PASSed.
[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15089 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65850/ Test PASSed.
[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15089 **[Test build #65850 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65850/consoleFull)** for PR 15089 at commit [`5239042`](https://github.com/apache/spark/commit/52390429fb1f7b20705ddad5621e8267c2aff12b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15213 Ok, that makes better sense. The `disallowStageRetryForTest` case doesn't worry me too much since it is only used in tests. If we can fix this case, great; else if it remains possible to create failing tests that can never happen outside of the tests, then that is not all that important (but should at least be noted in comments in the test suite.) Yes, not adding to `failedStages` after going down either of those two paths to `abortStage` is a correct fix even if the description of the problem wasn't really accurate. I'll take another look over the weekend to see if the logic can be expressed a bit more clearly.
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user scwf commented on the issue: https://github.com/apache/spark/pull/15213 Thanks @zsxwing for explaining this. @markhamstra The issue happens in the case described in my PR description. It usually depends on jobs being submitted from multiple threads and on the order of the fetch failures, which is why I called it a race condition. If you think that is confusing, how about changing the title to "do not add failedStages when abort stage"?
[GitHub] spark issue #15224: [SPARK-17650] malformed url's throw exceptions before br...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15224 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65846/ Test PASSed.
[GitHub] spark issue #15224: [SPARK-17650] malformed url's throw exceptions before br...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15224 **[Test build #65846 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65846/consoleFull)** for PR 15224 at commit [`c65f94f`](https://github.com/apache/spark/commit/c65f94f440fd67c1d3b555e647dede95ac71fa25). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15220 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65847/ Test PASSed.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15220 Merged build finished. Test PASSed.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15220 **[Test build #65847 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65847/consoleFull)** for PR 15220 at commit [`2f47c30`](https://github.com/apache/spark/commit/2f47c30bf9b3ad1e929fe9bf0da4b835e7ea13cd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15089 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65848/ Test PASSed.
[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15089 Merged build finished. Test PASSed.
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15213 @markhamstra I agree this is not a race condition, since there is only a single thread. The issue is that the code doesn't handle the following two corner cases: - `failedStage.failedOnFetchAndShouldAbort(task.stageAttemptId) && failedStages.isEmpty` is true - `disallowStageRetryForTest && failedStages.isEmpty` is true In the above cases, `ResubmitFailedStages` won't be posted.
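The two corner cases above can be reproduced with a small Python model. This is only a sketch under stated assumptions, not the DAGScheduler code: it assumes the handler adds the stage to `failedStages` after every fetch failure, including the abort paths, and posts `ResubmitFailedStages` only on the non-abort path when the set is empty. The names `handle_fetch_failure`, `simulate`, `should_abort`, and `add_after_abort` are hypothetical.

```python
def handle_fetch_failure(state, stage, should_abort, add_after_abort):
    """Simplified model of handling one fetch failure.

    should_abort stands in for either abort condition from the comment above;
    add_after_abort=True models the behavior under discussion, False the fix.
    """
    if should_abort:
        state["aborted"].append(stage)
        if not add_after_abort:
            return  # fix: an aborted stage is never added to failed_stages
    elif not state["failed_stages"]:
        # A resubmit event is posted only when nothing is pending yet.
        state["resubmit_posted"] += 1
    state["failed_stages"].add(stage)


def simulate(add_after_abort):
    """Abort stage A while failed_stages is empty, then fail stage B normally."""
    state = {"failed_stages": set(), "resubmit_posted": 0, "aborted": []}
    handle_fetch_failure(state, "A", True, add_after_abort)
    handle_fetch_failure(state, "B", False, add_after_abort)
    return state


buggy = simulate(add_after_abort=True)
fixed = simulate(add_after_abort=False)
print(buggy["resubmit_posted"], sorted(buggy["failed_stages"]))
print(fixed["resubmit_posted"], sorted(fixed["failed_stages"]))
```

In the first run, stage A makes the set non-empty without any resubmit event being posted, so the later failure of stage B never triggers one either. In the second run, stage B finds the set empty and the resubmit event is posted.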
[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15089 **[Test build #65848 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65848/consoleFull)** for PR 15089 at commit [`87ecc0d`](https://github.com/apache/spark/commit/87ecc0db2c5c980273e06d37ecb764fd03ad2b65). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15224: [SPARK-17650] malformed url's throw exceptions before br...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15224 **[Test build #65854 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65854/consoleFull)** for PR 15224 at commit [`49afc56`](https://github.com/apache/spark/commit/49afc5686d7ccf9a7864fc9b9c9eb5217a281086).
[GitHub] spark issue #15224: [SPARK-17650] malformed url's throw exceptions before br...
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15224 @zsxwing Thanks for the review. Addressed the nit.
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15213 This doesn't make sense to me. The DAGSchedulerEventProcessLoop runs on a single thread and processes a single event from its queue at a time. When the first CompletionEvent is run as a result of a fetch failure, failedStages is added to and a ResubmitFailedStages event is queued. After handleTaskCompletion is done, the next event from the queue will be processed. As events are sequentially dequeued and handled, either the ResubmitFailedStages event will be handled before the CompletionEvent for the second fetch failure, or the CompletionEvent will be handled before the ResubmitFailedStages event. If the ResubmitFailedStages is handled first, then failedStages will be cleared in resubmitFailedStages, and there will be nothing preventing the subsequent CompletionEvent from queueing another ResubmitFailedStages event to handle additional fetch failures. In the alternative that the second CompletionEvent is queued and handled before the ResubmitFailedStages event, then the additional stages are added to the non-empty failedStages, but there is no need to schedule another ResubmitFailedStages event because the one from the first CompletionEvent is still on the queue and the handling of that queued event will also handle the newly added failedStages from the second CompletionEvent. In either ordering, all the failedStages are handled and there is no race condition.
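The ordering argument above can be checked with a toy single-threaded event loop in Python. This is a simplified model of the description in the comment, not the DAGSchedulerEventProcessLoop implementation; the class `EventLoop`, the event tuples, and the two scenario functions are hypothetical names, and the abort paths are deliberately left out.

```python
from collections import deque


class EventLoop:
    """Toy single-threaded loop that handles one queued event at a time."""

    def __init__(self):
        self.queue = deque()
        self.failed_stages = set()
        self.resubmitted = []

    def post(self, event):
        self.queue.append(event)

    def step(self):
        kind, stage = self.queue.popleft()
        if kind == "fetch_failure":
            # Queue a resubmit only if none can be pending, i.e. the set is empty.
            if not self.failed_stages:
                self.queue.append(("resubmit", None))
            self.failed_stages.add(stage)
        else:
            # Resubmit handles everything accumulated so far, then clears the set.
            self.resubmitted.extend(sorted(self.failed_stages))
            self.failed_stages.clear()

    def drain(self):
        while self.queue:
            self.step()


def completion_first():
    """The second failure is handled before the pending resubmit event."""
    loop = EventLoop()
    loop.post(("fetch_failure", "A"))
    loop.post(("fetch_failure", "B"))
    loop.drain()
    return loop


def resubmit_first():
    """The resubmit event is handled before the second failure arrives."""
    loop = EventLoop()
    loop.post(("fetch_failure", "A"))
    loop.step()  # handles the failure of A and queues a resubmit
    loop.step()  # handles the resubmit and clears failed_stages
    loop.post(("fetch_failure", "B"))
    loop.drain()
    return loop


for scenario in (completion_first, resubmit_first):
    result = scenario()
    print(scenario.__name__, result.resubmitted, result.failed_stages)
```

Both scenarios end with every failed stage resubmitted and an empty set, which matches the conclusion that neither ordering loses a stage as long as only the non-abort path is involved.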
[GitHub] spark issue #15223: [SPARKR][SPARK-17651] Set R package version number along...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15223 Merged build finished. Test PASSed.
[GitHub] spark issue #15223: [SPARKR][SPARK-17651] Set R package version number along...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15223 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65844/ Test PASSed.
[GitHub] spark issue #15223: [SPARKR][SPARK-17651] Set R package version number along...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15223 **[Test build #65844 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65844/consoleFull)** for PR 15223 at commit [`742a787`](https://github.com/apache/spark/commit/742a7879865a4b85883337798c36af99c867ccae). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14808: [SPARK-17156][ML][EXAMPLE] Add multiclass logistic regre...
Github user sethah commented on the issue: https://github.com/apache/spark/pull/14808 I think we should close this. The new example and the user guide should be updated against [SPARK-17239](https://issues.apache.org/jira/browse/SPARK-17239). @jaceklaskowski If you'd still like to do it, please let me know; otherwise I am happy to do it. We should try to get this in soon. Thanks!
[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65853 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65853/consoleFull)** for PR 15213 at commit [`1127ca1`](https://github.com/apache/spark/commit/1127ca1538e9a9ded9e91ead65af8c710e99003d).
[GitHub] spark issue #15200: Skip building R vignettes if Spark is not built
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15200 If it's part of the `-Psparkr` profile of the build, it will be regenerated by default. If it's changed and not in `.gitignore`, it should be flagged for commit.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15220 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65843/ Test PASSed.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15220 Merged build finished. Test PASSed.
[GitHub] spark issue #15220: [SPARK-17649][Core]Log how many Spark events got dropped...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15220 **[Test build #65843 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65843/consoleFull)** for PR 15220 at commit [`b4f56a0`](https://github.com/apache/spark/commit/b4f56a073ac8f5b76db929a456f18b77b8e8910f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.