[GitHub] spark issue #20726: [SPARK-23574][CORE] Report SinglePartition in DataSource...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20726 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87924/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20726: [SPARK-23574][CORE] Report SinglePartition in DataSource...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20726 Merged build finished. Test FAILed.
[GitHub] spark issue #20726: [SPARK-23574][CORE] Report SinglePartition in DataSource...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20726 **[Test build #87924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87924/testReport)** for PR 20726 at commit [`efb8397`](https://github.com/apache/spark/commit/efb839759ddc1df1eec1b14500eebe5e4ca903c5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20705 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1251/ Test PASSed.
[GitHub] spark pull request #20705: [SPARK-23553][TESTS] Tests should not assume the ...
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20705#discussion_r172007674

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala ---
    @@ -57,6 +57,16 @@ class ParquetPartitionDiscoverySuite extends QueryTest with ParquetTest with Sha
       val timeZone = TimeZone.getDefault()
       val timeZoneId = timeZone.getID

    +  protected override def beforeAll(): Unit = {
    +    super.beforeAll()
    +    spark.conf.set(SQLConf.DEFAULT_DATA_SOURCE_NAME.key, "parquet")
    --- End diff --

    Since this is `ParquetPartitionDiscoverySuite`, the test cases' assumption is legitimate.
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20705 Merged build finished. Test PASSed.
[GitHub] spark pull request #20705: [SPARK-23553][TESTS] Tests should not assume the ...
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20705#discussion_r172007654

    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -147,6 +147,7 @@ def load(self, path=None, format=None, schema=None, **options):
                 or a DDL-formatted string (For example ``col0 INT, col1 DOUBLE``).
             :param options: all other string options

    +        >>> spark.conf.set("spark.sql.sources.default", "parquet")
             >>> df = spark.read.load('python/test_support/sql/parquet_partitioned', opt1=True,
    --- End diff --

    The built-in test data is `parquet`.
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20705 **[Test build #87925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87925/testReport)** for PR 20705 at commit [`144460d`](https://github.com/apache/spark/commit/144460d791eb92316f609b5934ff10892e3c9be0).
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87922/ Test FAILed.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test FAILed.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #87922 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87922/testReport)** for PR 19222 at commit [`abf6ba0`](https://github.com/apache/spark/commit/abf6ba02554091d974ec7a289d318cae559bc3cb). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20705 Since we have verified the JSON result, I'll update the PR to address @HyukjinKwon 's comment.
[GitHub] spark issue #20714: [SPARK-23457][SQL][BRANCH-2.3] Register task completion ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20714 Hi, @cloud-fan and @gatorsmile . This is a backport of #20619 .
[GitHub] spark issue #20715: [SPARK-23434][SQL][BRANCH-2.2] Spark should not warn `me...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20715 @cloud-fan and @zsxwing . This is a backport of #20616 .
[GitHub] spark issue #20713: [SPARK-23434][SQL][BRANCH-2.3] Spark should not warn `me...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20713 @cloud-fan and @zsxwing . This is a backport of #20616 .
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/20705

    @gatorsmile and @HyukjinKwon .

    Two failures are due to a limitation of the current JSON data source implementation. Here, we can see that the test suite correctly tests the target data source.

    1. `resolveRelation for a FileFormat DataSource without userSchema scan filesystem only once`

       For the JSON source, the statistic count becomes 2.

    2. `Pre insert nullability check (MapType)`

       Because the JSON source saves the data as strings, it raises a ClassCastException when the given user or table schema is different.

    ```scala
    scala> (Tuple1(Map(1 -> (null: Integer))) :: Nil).toDF("a").write.mode("overwrite").save("/tmp/json")

    scala> spark.read.json("/tmp/json").printSchema
    root
     |-- a: struct (nullable = true)
     |    |-- 1: string (nullable = true)

    scala> (Tuple1(Map(1 -> (null: Integer))) :: Nil).toDF("a").write.mode("overwrite").saveAsTable("map")
    18/03/02 21:13:49 WARN HiveExternalCatalog: Couldn't find corresponding Hive SerDe for data source provider json. Persisting data source table `default`.`map` into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive.

    scala> spark.read.json("/tmp/json").printSchema
    root
     |-- a: struct (nullable = true)
     |    |-- 1: string (nullable = true)

    scala> spark.table("map").printSchema
    root
     |-- a: map (nullable = true)
     |    |-- key: integer
     |    |-- value: integer (valueContainsNull = true)

    scala> spark.table("map").show
    18/03/02 21:14:12 ERROR Executor: Exception in task 0.0 in stage 10.0 (TID 10)
    java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer
        at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
    ```

    For the JSON format, could you confirm this, @HyukjinKwon ?
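The string-typed `1` field in the schemas above is consistent with a property of JSON itself: object keys are always strings, so an integer map key does not survive a round trip through JSON text. The following is a minimal sketch of that property using only Python's standard `json` module. It does not use Spark, and the variable names are illustrative assumptions rather than anything from the PR.

```python
import json

# A map with an integer key and a null value, mirroring Map(1 -> null) above.
original = {1: None}

# JSON object keys are always strings, so the integer key is written as "1".
encoded = json.dumps(original)

# Reading the text back yields a string key; the integer type cannot be
# recovered from the JSON text alone.
decoded = json.loads(encoded)
key_types = {type(k).__name__ for k in decoded}

print(encoded)
print(key_types)
```

This only demonstrates the format-level behavior. Whether Spark's JSON source could work around it is the question the comment raises for confirmation.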
[GitHub] spark issue #20726: [SPARK-23574][CORE] Report SinglePartition in DataSource...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20726 **[Test build #87924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87924/testReport)** for PR 20726 at commit [`efb8397`](https://github.com/apache/spark/commit/efb839759ddc1df1eec1b14500eebe5e4ca903c5).
[GitHub] spark issue #20726: [SPARK-23574][CORE] Report SinglePartition in DataSource...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20726 Can one of the admins verify this patch?
[GitHub] spark issue #20726: [SPARK-23574][CORE] Report SinglePartition in DataSource...
Github user jose-torres commented on the issue: https://github.com/apache/spark/pull/20726 @cloud-fan @rdblue
[GitHub] spark pull request #20726: [SPARK-23574][CORE] Report SinglePartition in Dat...
GitHub user jose-torres opened a pull request:

    https://github.com/apache/spark/pull/20726

    [SPARK-23574][CORE] Report SinglePartition in DataSourceV2ScanExec when there's exactly 1 data reader factory.

    ## What changes were proposed in this pull request?

    Report SinglePartition in DataSourceV2ScanExec when there's exactly 1 data reader factory.

    ## How was this patch tested?

    existing unit tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jose-torres/spark SPARK-23574

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20726.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20726

commit efb839759ddc1df1eec1b14500eebe5e4ca903c5
Author: Jose Torres
Date: 2018-03-03T04:34:05Z

    SinglePartition check
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87921/ Test FAILed.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test FAILed.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #87921 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87921/testReport)** for PR 19222 at commit [`3a93d61`](https://github.com/apache/spark/commit/3a93d6163b659f5ebb76a48b4948d46d44750878). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20710 Merged build finished. Test PASSed.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20710 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87917/ Test PASSed.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20710 **[Test build #87917 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87917/testReport)** for PR 20710 at commit [`9fb74e2`](https://github.com/apache/spark/commit/9fb74e2ccbe668ac6a1f2d2240b67a04400ba78b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20710 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87918/ Test PASSed.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20710 Merged build finished. Test PASSed.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20710 **[Test build #87918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87918/testReport)** for PR 20710 at commit [`215c225`](https://github.com/apache/spark/commit/215c225c5a1623cfa02f617201e21067bbf6088a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20710 Merged build finished. Test PASSed.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20710 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87915/ Test PASSed.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20710 **[Test build #87915 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87915/testReport)** for PR 20710 at commit [`79495b1`](https://github.com/apache/spark/commit/79495b1f9e994f77ccf40c47eb2fb0baf5873f66). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20678: [SPARK-23380][PYTHON] Adds a conf for Arrow fallback in ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20678 gentle ping, I believe this is ready for another look.
[GitHub] spark issue #20618: [SPARK-23329][SQL] Fix documentation of trigonometric fu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20618 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87914/ Test PASSed.
[GitHub] spark issue #20618: [SPARK-23329][SQL] Fix documentation of trigonometric fu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20618 Merged build finished. Test PASSed.
[GitHub] spark issue #20618: [SPARK-23329][SQL] Fix documentation of trigonometric fu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20618 **[Test build #87914 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87914/testReport)** for PR 20618 at commit [`627e204`](https://github.com/apache/spark/commit/627e204ed03cfd6caa06e8f64dc605b62f4d2e5e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20724 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87923/ Test PASSed.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20710 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87912/ Test PASSed.
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20724 Merged build finished. Test PASSed.
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20724 **[Test build #87923 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87923/testReport)** for PR 20724 at commit [`07e1829`](https://github.com/apache/spark/commit/07e18299c83d0874f7b5301d0eb80746ab01). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20710 Merged build finished. Test PASSed.
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20710 **[Test build #87912 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87912/testReport)** for PR 20710 at commit [`544eb1b`](https://github.com/apache/spark/commit/544eb1b296bceb213965bf3c5dc1a6264c5b7acd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when ...
Github user attilapiros commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20702#discussion_r172003781

    --- Diff: sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java ---
    @@ -665,6 +667,25 @@ public void close() throws HiveSQLException {
         }
       }

    +  private void cleanupPipeoutFile() {
    +    String lScratchDir = hiveConf.getVar(ConfVars.LOCALSCRATCHDIR);
    +    String sessionID = hiveConf.getVar(ConfVars.HIVESESSIONID);
    +
    +    File[] fileAry = new File(lScratchDir).listFiles(
    --- End diff --

    No problem. I hope this works.

    ```
    File[] fileAry = new File(lScratchDir).listFiles(
        (dir, name) -> name.startsWith(sessionID) && name.endsWith(".pipeout"));
    ```

    I think it would be good to have a unit test for your change.
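As a rough illustration of what such a unit test could check, here is a Python sketch of the same filtering rule: the file name starts with the session ID and ends with `.pipeout`. It is not the Java patch. The function names `find_pipeout_files` and `cleanup_pipeout_files`, the sample file names, and the session ID `abc123` are all hypothetical.

```python
import os
import tempfile


def find_pipeout_files(scratch_dir, session_id):
    """Return sorted names that start with session_id and end with '.pipeout'."""
    return sorted(
        name
        for name in os.listdir(scratch_dir)
        if name.startswith(session_id) and name.endswith(".pipeout")
    )


def cleanup_pipeout_files(scratch_dir, session_id):
    """Delete the matching files and return the names that were removed."""
    removed = find_pipeout_files(scratch_dir, session_id)
    for name in removed:
        os.remove(os.path.join(scratch_dir, name))
    return removed


# Exercise the cleanup against a throwaway scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    for name in ("abc123.pipeout", "abc123_extra.pipeout", "abc123.log", "zzz999.pipeout"):
        open(os.path.join(scratch, name), "w").close()

    removed = cleanup_pipeout_files(scratch, "abc123")
    remaining = sorted(os.listdir(scratch))
```

The scenario covers the two cases a test would most likely need: files of other sessions and non-`.pipeout` files of the same session must both be left in place.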
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20724 **[Test build #87923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87923/testReport)** for PR 20724 at commit [`07e1829`](https://github.com/apache/spark/commit/07e18299c83d0874f7b5301d0eb80746ab01).
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #87922 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87922/testReport)** for PR 19222 at commit [`abf6ba0`](https://github.com/apache/spark/commit/abf6ba02554091d974ec7a289d318cae559bc3cb).
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test PASSed.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1250/ Test PASSed.
[GitHub] spark issue #20723: [SPARK-23538][core] Remove custom configuration for SSL ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20723 Merged build finished. Test PASSed.
[GitHub] spark issue #20723: [SPARK-23538][core] Remove custom configuration for SSL ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20723 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87907/ Test PASSed.
[GitHub] spark issue #20458: changed scala example from java "style" to scala
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20458 Can one of the admins verify this patch?
[GitHub] spark issue #20723: [SPARK-23538][core] Remove custom configuration for SSL ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20723 **[Test build #87907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87907/testReport)** for PR 20723 at commit [`c83611e`](https://github.com/apache/spark/commit/c83611eca573f3f460790f4fde7bea7ef7887839). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #20698: [SPARK-23541][SS] Allow Kafka source to read data...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20698
[GitHub] spark issue #20698: [SPARK-23541][SS] Allow Kafka source to read data with g...
Github user tdas commented on the issue: https://github.com/apache/spark/pull/20698 Thank you. Merging to master only as this is a new feature touching production code paths.
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20705 Merged build finished. Test FAILed.
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20705 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87916/ Test FAILed.
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20705 **[Test build #87916 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87916/testReport)** for PR 20705 at commit [`3ec9309`](https://github.com/apache/spark/commit/3ec9309f923405873d73a6e5d9376bba08e050d0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #87921 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87921/testReport)** for PR 19222 at commit [`3a93d61`](https://github.com/apache/spark/commit/3a93d6163b659f5ebb76a48b4948d46d44750878).
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test PASSed.
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1249/ Test PASSed.
[GitHub] spark pull request #20718: [SPARK-23514][FOLLOW-UP] Remove more places using...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20718
[GitHub] spark issue #20718: [SPARK-23514][FOLLOW-UP] Remove more places using sparkC...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20718 thanks, merging to master!
[GitHub] spark issue #20208: [SPARK-23007][SQL][TEST] Add schema evolution test suite...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20208 Merged build finished. Test PASSed.
[GitHub] spark issue #20208: [SPARK-23007][SQL][TEST] Add schema evolution test suite...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20208 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87906/ Test PASSed.
[GitHub] spark issue #20208: [SPARK-23007][SQL][TEST] Add schema evolution test suite...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20208 **[Test build #87906 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87906/testReport)** for PR 20208 at commit [`6ae471c`](https://github.com/apache/spark/commit/6ae471c8ecaae3eb3888eecaac1c4e7552bedcc6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20725: [WIP][SPARK-23555][PYTHON] Add BinaryType support for Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20725 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87919/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20725: [WIP][SPARK-23555][PYTHON] Add BinaryType support for Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20725 **[Test build #87919 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87919/testReport)** for PR 20725 at commit [`afcb0d5`](https://github.com/apache/spark/commit/afcb0d5608c17d2fc004a0b6d4af4573abca4e4b). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20725: [WIP][SPARK-23555][PYTHON] Add BinaryType support for Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20725 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20724 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20724 **[Test build #87920 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87920/testReport)** for PR 20724 at commit [`d36c1a1`](https://github.com/apache/spark/commit/d36c1a10cd318d9ddeb2717737248c974a2349f1). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20724 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87920/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20724 **[Test build #87920 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87920/testReport)** for PR 20724 at commit [`d36c1a1`](https://github.com/apache/spark/commit/d36c1a10cd318d9ddeb2717737248c974a2349f1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/20724 add to whitelist --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20714: [SPARK-23457][SQL][BRANCH-2.3] Register task completion ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20714 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87905/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20714: [SPARK-23457][SQL][BRANCH-2.3] Register task completion ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20714 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20698: [SPARK-23541][SS] Allow Kafka source to read data with g...
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/20698 LGTM pending tests. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20714: [SPARK-23457][SQL][BRANCH-2.3] Register task completion ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20714 **[Test build #87905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87905/testReport)** for PR 20714 at commit [`d15eba7`](https://github.com/apache/spark/commit/d15eba754a59721bc7d9cdc7d374f2f323d21e41). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20725: [WIP][SPARK-23555][PYTHON] Add BinaryType support for Ar...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/20725

This is a WIP as some issues need to be worked out on the Arrow side and need tests for pandas_udfs. Currently get the following error when converting from pandas:

```
  File "/home/bryan/git/spark/python/pyspark/serializers.py", line 237, in create_array
    return pa.Array.from_pandas(s, mask=mask, type=t)
  File "array.pxi", line 335, in pyarrow.lib.Array.from_pandas
  File "array.pxi", line 170, in pyarrow.lib.array
  File "array.pxi", line 70, in pyarrow.lib._ndarray_to_array
  File "error.pxi", line 85, in pyarrow.lib.check_status
ArrowNotImplementedError: No cast implemented from binary to binary
```

The corresponding JIRA is https://issues.apache.org/jira/browse/ARROW-2141
[GitHub] spark issue #20725: [WIP][SPARK-23555][PYTHON] Add BinaryType support for Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20725 **[Test build #87919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87919/testReport)** for PR 20725 at commit [`afcb0d5`](https://github.com/apache/spark/commit/afcb0d5608c17d2fc004a0b6d4af4573abca4e4b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20725: [WIP][SPARK-23555][PYTHON] Add BinaryType support for Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20725 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20724 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20725: [WIP][SPARK-23555][PYTHON] Add BinaryType support for Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20725 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1248/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20725: [WIP][SPARK-23555][PYTHON] Add BinaryType support...
GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/20725

[WIP][SPARK-23555][PYTHON] Add BinaryType support for Arrow

## What changes were proposed in this pull request?
Adding `BinaryType` support for Arrow in pyspark.

## How was this patch tested?
Additional unit tests in pyspark for code paths that use Arrow

You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/BryanCutler/spark arrow-binary-type-support-SPARK-23555
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/20725.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #20725

commit afcb0d5608c17d2fc004a0b6d4af4573abca4e4b
Author: Bryan Cutler
Date: 2018-03-03T00:32:45Z
    added support for binary type, not currently working due to arrow error
[GitHub] spark issue #20724: [SPARK-18630][PYTHON][ML] Move del method from JavaParam...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20724 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20632: [SPARK-3159][ML] Add decision tree pruning
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20632 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 Merged with master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20724: [SPARK-18630][PYTHON][ML] Move del method from Ja...
GitHub user yogeshg opened a pull request: https://github.com/apache/spark/pull/20724

[SPARK-18630][PYTHON][ML] Move del method from JavaParams to JavaWrapper; add tests

## What changes were proposed in this pull request?
Move del method from JavaParams to JavaWrapper; add tests

## How was this patch tested?
I ran pyspark tests against the `pyspark-ml` module:
`./python/run-tests --python-executables=$(which python) --modules=pyspark-ml`

You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/yogeshg/spark java_wrapper_memory
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/20724.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #20724

commit 50acecc3d12778f1f30ca636b6e83163f1fc775a
Author: Yogesh Garg
Date: 2018-03-03T00:00:40Z
    add test case for JavaWrapper that displays memory leak for JavaWrapper but not JavaParams
commit d36c1a10cd318d9ddeb2717737248c974a2349f1
Author: Yogesh Garg
Date: 2018-03-03T00:01:19Z
    send the delete method from JavaParams to JavaWrapper
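For readers following the PR above, here is a minimal sketch of why the location of the cleanup method matters. It does not use PySpark or Py4J: `FakeGateway`, `Wrapper`, and `Params` are hypothetical stand-ins for a JVM gateway, a base wrapper class, and a subclass. The assumption is simply that a finalizer defined on the base class is inherited by every subclass, so no wrapper kind leaks its handle.

```python
import gc


class FakeGateway:
    """Hypothetical stand-in for a JVM gateway; tracks handles still attached."""

    def __init__(self):
        self.attached = set()

    def attach(self, handle):
        self.attached.add(handle)
        return handle

    def detach(self, handle):
        self.attached.discard(handle)


class Wrapper:
    """Hypothetical base wrapper: owns a handle and releases it when collected."""

    def __init__(self, gateway, handle):
        self._gateway = gateway
        self._handle = gateway.attach(handle)

    def __del__(self):
        # Defined on the base class so every subclass releases its handle.
        # getattr guards against a partially constructed instance.
        gateway = getattr(self, "_gateway", None)
        handle = getattr(self, "_handle", None)
        if gateway is not None and handle is not None:
            gateway.detach(handle)


class Params(Wrapper):
    """Hypothetical subclass with no __del__ of its own; it inherits the cleanup."""


gateway = FakeGateway()
plain = Wrapper(gateway, "obj-1")
params = Params(gateway, "obj-2")
del plain, params
gc.collect()  # CPython frees these on `del`; collect() covers other runtimes
remaining = len(gateway.attached)
```

If the finalizer lived only on `Params`, the plain `Wrapper` instance would stay in `gateway.attached` after collection, which is the kind of leak the PR's first commit message describes.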
[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning
Github user zaycev commented on the issue: https://github.com/apache/spark/pull/16578

I observed about 5x better performance in reading a small subset of fields of a highly nested parquet table:

master:
https://user-images.githubusercontent.com/283938/36928047-e07e5b52-1e36-11e8-98e4-a614ad7589b6.png
https://user-images.githubusercontent.com/283938/36928033-c9a21022-1e36-11e8-81bf-7008e1f40d6f.png

master with @mallman patch:
https://user-images.githubusercontent.com/283938/36928037-cdc9ec10-1e36-11e8-8830-5e77c074e4ab.png
https://user-images.githubusercontent.com/283938/36928048-e3e15a88-1e36-11e8-8dda-9b384c4a04c8.png
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20710 **[Test build #87918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87918/testReport)** for PR 20710 at commit [`215c225`](https://github.com/apache/spark/commit/215c225c5a1623cfa02f617201e21067bbf6088a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20698: [SPARK-23541][SS] Allow Kafka source to read data with g...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20698 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20698: [SPARK-23541][SS] Allow Kafka source to read data with g...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20698 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87913/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20698: [SPARK-23541][SS] Allow Kafka source to read data with g...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20698 **[Test build #87913 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87913/testReport)** for PR 20698 at commit [`602ab36`](https://github.com/apache/spark/commit/602ab36490a692080682867f98a8a5d8f7b2390d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20710 **[Test build #87917 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87917/testReport)** for PR 20710 at commit [`9fb74e2`](https://github.com/apache/spark/commit/9fb74e2ccbe668ac6a1f2d2240b67a04400ba78b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20705 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20705 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1247/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFactory.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20710 **[Test build #87915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87915/testReport)** for PR 20710 at commit [`79495b1`](https://github.com/apache/spark/commit/79495b1f9e994f77ccf40c47eb2fb0baf5873f66). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20705 **[Test build #87916 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87916/testReport)** for PR 20705 at commit [`3ec9309`](https://github.com/apache/spark/commit/3ec9309f923405873d73a6e5d9376bba08e050d0). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFacto...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/20710#discussion_r171993983

--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/StreamWriter.java ---
@@ -39,21 +36,21 @@
    * If this method fails (by throwing an exception), this writing job is considered to have been
    * failed, and the execution engine will attempt to call {@link #abort(WriterCommitMessage[])}.
    *
-   * To support exactly-once processing, writer implementations should ensure that this method is
-   * idempotent. The execution engine may call commit() multiple times for the same epoch
-   * in some circumstances.
+   * The execution engine may call commit() multiple times for the same epoch in some circumstances.
--- End diff --

Somewhere in this file, add docs about what epochId means for MB and C execution.
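The doc text under discussion says the engine may call commit() more than once for the same epoch. As an illustration only, the following sketch shows one way a sink could tolerate that. `EpochSink` and its fields are hypothetical and are not part of the `StreamWriter` interface; the assumption is that remembering applied epoch ids is enough to make a repeated commit a no-op.

```python
class EpochSink:
    """Hypothetical sink that tolerates repeated commit() calls for one epoch."""

    def __init__(self):
        self.rows = []
        self.committed_epochs = set()

    def commit(self, epoch_id, messages):
        # A repeated commit for an epoch that was already applied is a no-op.
        if epoch_id in self.committed_epochs:
            return False
        for message in messages:
            self.rows.extend(message)
        self.committed_epochs.add(epoch_id)
        return True


sink = EpochSink()
first = sink.commit(0, [["a", "b"], ["c"]])
retry = sink.commit(0, [["a", "b"], ["c"]])  # engine retries the same epoch
second = sink.commit(1, [["d"]])
```

A real sink would need to persist the set of committed epochs together with the data so that the check survives a restart; the in-memory set here only demonstrates the idea.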
[GitHub] spark pull request #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFacto...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/20710#discussion_r171993716

--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriterFactory.java ---
@@ -48,6 +48,9 @@
    * same task id but different attempt number, which means there are multiple
    * tasks with the same task id running at the same time. Implementations can
    * use this attempt number to distinguish writers of different task attempts.
+   * @param epochId A monotonically increasing id for streaming queries that are split in to
+   * discrete periods of execution. For queries that execute as a single batch, this
--- End diff --

Also, make it clear that this is batchId for MicroBatch processing and epochId for Continuous processing.
[GitHub] spark pull request #20710: [SPARK-23559][SS] Add epoch ID to DataWriterFacto...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/20710#discussion_r171993622

--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriterFactory.java ---
@@ -48,6 +48,9 @@
    * same task id but different attempt number, which means there are multiple
    * tasks with the same task id running at the same time. Implementations can
    * use this attempt number to distinguish writers of different task attempts.
+   * @param epochId A monotonically increasing id for streaming queries that are split in to
+   * discrete periods of execution. For queries that execute as a single batch, this
--- End diff --

For non-streaming queries, this...
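To make the role of the new parameter concrete, here is a small sketch of how a writer might combine the attempt number and the epoch id described in the Javadoc above. `writer_path` and its path layout are hypothetical and do not come from the PR; the only assumption is that a writer wants output that is unique per epoch, partition, and task attempt.

```python
def writer_path(base_dir, partition_id, attempt_number, epoch_id):
    """Build an output path unique per (epoch, partition, attempt).

    Hypothetical helper: the layout is illustrative, not Spark's.
    """
    return "{}/epoch={}/part-{:05d}-attempt-{}".format(
        base_dir, epoch_id, partition_id, attempt_number
    )


# Two attempts of the same task in the same epoch get different paths,
# and the same task in a later epoch does not overwrite earlier output.
first_attempt = writer_path("/out", 3, 0, 7)
second_attempt = writer_path("/out", 3, 1, 7)
next_epoch = writer_path("/out", 3, 0, 8)
```

Keying the path on the epoch id is what lets a retried or speculative attempt write without clobbering another attempt's output for the same period of execution.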
[GitHub] spark issue #20705: [SPARK-23553][TESTS] Tests should not assume the default...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20705 Sure, @gatorsmile . --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org