[GitHub] [spark] cloud-fan commented on a change in pull request #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
cloud-fan commented on a change in pull request #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786#discussion_r387497757 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -367,12 +367,12 @@ final class ShuffleBlockFetcherIterator( // For batch fetch, the actual block in flight should count for merged block. val mayExceedsMaxBlocks = !doBatchFetch && curBlocks.size >= maxBlocksInFlightPerAddress if (curRequestSize >= targetRemoteRequestSize || mayExceedsMaxBlocks) { -createFetchRequests() +createFetchRequests(true) Review comment: let's write down the parameter name. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
cloud-fan commented on a change in pull request #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786#discussion_r387497646 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -339,14 +339,14 @@ final class ShuffleBlockFetcherIterator( + s"with ${blocks.size} blocks") } -def createFetchRequests(): Unit = { +def createFetchRequests(hasMore: Boolean): Unit = { Review comment: nit: `isLast`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] zhengruifeng commented on a change in pull request #27640: [SPARK-30667][CORE] Add all gather method to BarrierTaskContext
zhengruifeng commented on a change in pull request #27640: [SPARK-30667][CORE] Add all gather method to BarrierTaskContext URL: https://github.com/apache/spark/pull/27640#discussion_r387492294 ## File path: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala ## @@ -163,6 +155,73 @@ class BarrierTaskContext private[spark] ( timerTask.cancel() timer.purge() } +json + } + + /** + * :: Experimental :: + * Sets a global barrier and waits until all tasks in this stage hit this barrier. Similar to + * MPI_Barrier function in MPI, the barrier() function call blocks until all tasks in the same + * stage have reached this routine. + * + * CAUTION! In a barrier stage, each task must have the same number of barrier() calls, in all + * possible code branches. Otherwise, you may get the job hanging or a SparkException after + * timeout. Some examples of '''misuses''' are listed below: + * 1. Only call barrier() function on a subset of all the tasks in the same barrier stage, it + * shall lead to timeout of the function call. + * {{{ + * rdd.barrier().mapPartitions { iter => + * val context = BarrierTaskContext.get() + * if (context.partitionId() == 0) { + * // Do nothing. + * } else { + * context.barrier() + * } + * iter + * } + * }}} + * + * 2. Include barrier() function in a try-catch code block, this may lead to timeout of the + * second function call. + * {{{ + * rdd.barrier().mapPartitions { iter => + * val context = BarrierTaskContext.get() + * try { + * // Do something that might throw an Exception. + * doSomething() + * context.barrier() + * } catch { + * case e: Exception => logWarning("...", e) + * } + * context.barrier() + * iter + * } + * }}} + */ + @Experimental + @Since("2.4.0") + def barrier(): Unit = { +runBarrier(RequestMethod.BARRIER) +() + } + + /** + * :: Experimental :: + * Blocks until all tasks in the same stage have reached this routine. Each task passes in + * a message and returns with a list of all the messages passed in by each of those tasks. + * + * CAUTION! The allGather method requires the same precautions as the barrier method + * + * The message is type String rather than Array[Byte] because it is more convenient for + * the user at the cost of worse performance. + */ + @Experimental + @Since("3.0.0") + def allGather(message: String): ArrayBuffer[String] = { Review comment: Just out of curiosity, why return an `ArrayBuffer[String]` instead of an `Array[String]` here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk commented on issue #27771: [SPARK-31020][SQL] Support foldable schemas by `from_csv`
MaxGekk commented on issue #27771: [SPARK-31020][SQL] Support foldable schemas by `from_csv` URL: https://github.com/apache/spark/pull/27771#issuecomment-594370346 @dongjoon-hyun WDYT of mixing all changes into one PR? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
AmplabJenkins commented on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786#issuecomment-594369875 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24025/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
AmplabJenkins commented on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786#issuecomment-594369869 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
AmplabJenkins removed a comment on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786#issuecomment-594369875 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24025/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
AmplabJenkins removed a comment on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786#issuecomment-594369869 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
SparkQA commented on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786#issuecomment-594369497 **[Test build #119285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119285/testReport)** for PR 27786 at commit [`fc36eb1`](https://github.com/apache/spark/commit/fc36eb10856943fbeb6b0d099d130041d587edc6). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 commented on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
Ngone51 commented on issue #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786#issuecomment-594367935 cc @cloud-fan @xuanyuanking This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 commented on a change in pull request #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit
Ngone51 commented on a change in pull request #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit URL: https://github.com/apache/spark/pull/27767#discussion_r387488135 ## File path: core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala ## @@ -341,31 +433,85 @@ class ShuffleBlockFetcherIteratorSuite extends SparkFunSuite with PrivateMethodT assert(blockManager.hostLocalDirManager.get.getCachedHostLocalDirs().size === 1) } - test("fetch continuous blocks in batch respects maxSize and maxBlocks") { + test("fetch continuous blocks in batch should respects maxBytesInFlight") { val blockManager = mock(classOf[BlockManager]) val localBmId = BlockManagerId("test-client", "test-local-host", 1) doReturn(localBmId).when(blockManager).blockManagerId // Make sure remote blocks would return the merged block -val remoteBmId = BlockManagerId("test-client-1", "test-client-1", 2) -val remoteBlocks = Seq[BlockId]( +val remoteBmId1 = BlockManagerId("test-client-1", "test-client-1", 1) +val remoteBmId2 = BlockManagerId("test-client-2", "test-client-2", 2) +val remoteBlocks1 = (0 until 15).map(ShuffleBlockId(0, 3, _)) +val remoteBlocks2 = Seq[BlockId](ShuffleBlockId(0, 4, 0), ShuffleBlockId(0, 4, 1)) +val mergedRemoteBlocks = Map[BlockId, ManagedBuffer]( + ShuffleBlockBatchId(0, 3, 0, 3) -> createMockManagedBuffer(), + ShuffleBlockBatchId(0, 3, 3, 6) -> createMockManagedBuffer(), + ShuffleBlockBatchId(0, 3, 6, 9) -> createMockManagedBuffer(), + ShuffleBlockBatchId(0, 3, 9, 12) -> createMockManagedBuffer(), + ShuffleBlockBatchId(0, 3, 12, 15) -> createMockManagedBuffer(), + ShuffleBlockBatchId(0, 4, 0, 2) -> createMockManagedBuffer()) +val transfer = createMockTransfer(mergedRemoteBlocks) + +val blocksByAddress = Seq[(BlockManagerId, Seq[(BlockId, Long, Int)])]( + (remoteBmId1, remoteBlocks1.map(blockId => (blockId, 100L, 1))), + (remoteBmId2, remoteBlocks2.map(blockId => (blockId, 100L, 1.toIterator + +val taskContext = TaskContext.empty() +val metrics = taskContext.taskMetrics.createTempShuffleReadMetrics() +val iterator = new ShuffleBlockFetcherIterator( + taskContext, + transfer, + blockManager, + blocksByAddress, + (_, in) => in, + 1500, + Int.MaxValue, + Int.MaxValue, + Int.MaxValue, + true, + false, + metrics, + true) + +var numResults = 0 +// After initialize(), there will be 6 FetchRequests, and the each of the first 5 +// includes 3 merged blocks and the last one has 1 merged block. So, only the +// first 5 requests(5 * 3 * 100 >= 1500) can be sent. +verify(transfer, times(5)).fetchBlocks(any(), any(), any(), any(), any(), any()) +while (iterator.hasNext) { + val (blockId, inputStream) = iterator.next() + // Make sure we release buffers when a wrapped input stream is closed. + val mockBuf = mergedRemoteBlocks(blockId) + verifyBufferRelease(mockBuf, inputStream) + numResults += 1 +} +// The last request will be sent after next() is called. +verify(transfer, times(6)).fetchBlocks(any(), any(), any(), any(), any(), any()) +assert(numResults == 6) + } + + // TODO(wuyi) test this after the fix Review comment: should be fixed by #27786. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 opened a new pull request #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group
Ngone51 opened a new pull request #27786: [SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group URL: https://github.com/apache/spark/pull/27786 ### What changes were proposed in this pull request? This is a bug fix of #27280. This PR fix the bug where ShuffleBlockFetcherIterator may forget to create request for the last block group. ### Why are the changes needed? When all blocks.sum < `targetRemoteRequestSize` and all blocks.length > `maxBlocksInFlightPerAddress` and (last block group).size < `maxBlocksInFlightPerAddress`, `ShuffleBlockFetcherIterator` will not create a request for the last group. Thus, it will lost data for the reduce task. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Should be covered by #27767. And I tested it locally to verify this fix. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594359985 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119280/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594359976 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594359985 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119280/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594359976 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
SparkQA commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594359632 **[Test build #119280 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119280/testReport)** for PR 27570 at commit [`2156bed`](https://github.com/apache/spark/commit/2156bed223ec28279fbaa18e2bc0f8c47ade7d0d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit
SparkQA commented on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit URL: https://github.com/apache/spark/pull/27767#issuecomment-594359699 **[Test build #119284 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119284/testReport)** for PR 27767 at commit [`73803b7`](https://github.com/apache/spark/commit/73803b76c0c2d4ab1b5451eced23f1a1f41dd9f0). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
SparkQA removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594338843 **[Test build #119280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119280/testReport)** for PR 27570 at commit [`2156bed`](https://github.com/apache/spark/commit/2156bed223ec28279fbaa18e2bc0f8c47ade7d0d). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
AmplabJenkins removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594359072 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
AmplabJenkins removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594359077 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119282/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
AmplabJenkins commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594359072 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
AmplabJenkins commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594359077 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119282/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
SparkQA commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594358944 **[Test build #119282 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119282/testReport)** for PR 27785 at commit [`2ef974e`](https://github.com/apache/spark/commit/2ef974e276db711d623a61e40f558d09789f9bc8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
SparkQA removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594355182 **[Test build #119282 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119282/testReport)** for PR 27785 at commit [`2ef974e`](https://github.com/apache/spark/commit/2ef974e276db711d623a61e40f558d09789f9bc8). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit
AmplabJenkins removed a comment on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit URL: https://github.com/apache/spark/pull/27767#issuecomment-594357803 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit
AmplabJenkins removed a comment on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit URL: https://github.com/apache/spark/pull/27767#issuecomment-594357806 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24024/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys
AmplabJenkins removed a comment on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys URL: https://github.com/apache/spark/pull/27772#issuecomment-594357748 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys
AmplabJenkins removed a comment on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys URL: https://github.com/apache/spark/pull/27772#issuecomment-594357754 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24023/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit
AmplabJenkins commented on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit URL: https://github.com/apache/spark/pull/27767#issuecomment-594357806 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24024/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys
AmplabJenkins commented on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys URL: https://github.com/apache/spark/pull/27772#issuecomment-594357748 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit
AmplabJenkins commented on issue #27767: [SPARK-31017][TEST][CORE] Test for shuffle requests packaging with different size and numBlocks limit URL: https://github.com/apache/spark/pull/27767#issuecomment-594357803 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys
AmplabJenkins commented on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys URL: https://github.com/apache/spark/pull/27772#issuecomment-594357754 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24023/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys
SparkQA commented on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys URL: https://github.com/apache/spark/pull/27772#issuecomment-594357303 **[Test build #119283 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119283/testReport)** for PR 27772 at commit [`b201c9e`](https://github.com/apache/spark/commit/b201c9e68d7f8c5d3e4145b10f0820c247ca4ad8). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
AmplabJenkins removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594355503 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24022/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
AmplabJenkins commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594355503 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24022/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
AmplabJenkins commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594355496 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
AmplabJenkins removed a comment on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594355496 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
SparkQA commented on issue #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#issuecomment-594355182 **[Test build #119282 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119282/testReport)** for PR 27785 at commit [`2ef974e`](https://github.com/apache/spark/commit/2ef974e276db711d623a61e40f558d09789f9bc8). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] huaxingao commented on a change in pull request #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
huaxingao commented on a change in pull request #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785#discussion_r387474968 ## File path: docs/ml-migration-guide.md ## @@ -33,16 +33,65 @@ Please refer [Migration Guide: SQL, Datasets and DataFrame](sql-migration-guide. * `OneHotEncoder` which is deprecated in 2.3, is removed in 3.0 and `OneHotEncoderEstimator` is now renamed to `OneHotEncoder`. * `org.apache.spark.ml.image.ImageSchema.readImages` which is deprecated in 2.3, is removed in 3.0, use `spark.read.format('image')` instead. +* `org.apache.spark.mllib.clustering.KMeans.train` with param Int `runs` which is deprecated in 2.1, is removed in 3.0, use `train` method without `runs` instead. +* `org.apache.spark.mllib.classification.LogisticRegressionWithSGD` which is deprecated in 2.0, is removed in 3.0, use `org.apache.spark.ml.classification.LogisticRegression` or `spark.mllib.classification.LogisticRegressionWithLBFGS` instead. +* `org.apache.spark.mllib.feature.ChiSqSelectorModel.isSorted ` which is deprecated in 2.1, is removed in 3.0, is not intended for subclasses to use. +* `org.apache.spark.mllib.regression.RidgeRegressionWithSGD` which is deprecated in 2.0, is removed in 3.0, use `org.apache.spark.ml.regression.LinearRegression` with `elasticNetParam` = 0.0. Note the default `regParam` is 0.01 for `RidgeRegressionWithSGD`, but is 0.0 for `LinearRegression`. +* `org.apache.spark.mllib.regression.LassoWithSGD` which is deprecated in 2.0, is removed in 3.0, use `org.apache.spark.ml.regression.LinearRegression` with `elasticNetParam` = 1.0. Note the default `regParam` is 0.01 for `LassoWithSGD`, but is 0.0 for `LinearRegression`. +* `org.apache.spark.mllib.regression.LinearRegressionWithSGD` which is deprecated in 2.0, is removed in 3.0, use `org.apache.spark.ml.regression.LinearRegression` or `LBFGS` instead. +* `org.apache.spark.mllib.clustering.KMeans.getRuns` and `setRuns` which are deprecated in 2.1, is removed in 3.0, have no effect since Spark 2.0.0. +* `org.apache.spark.ml.LinearSVCModel.setWeightCol` which is deprecated in 2.4, is removed in 3.0, is not intended for users. +* From 3.0, `org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel` extends `MultilayerPerceptronParams` to expose the training params. As a result, layers in `MultilayerPerceptronClassificationModel` has been changed from Array[Int] to IntArrayParam. Users should use `MultilayerPerceptronClassificationModel.getLayers` instead of `MultilayerPerceptronClassificationModel.layers` to retrieve the size of layers. +* `org.apache.spark.ml.classification.GBTClassifier.numTrees` which is deprecated in 2.4.5, is removed in 3.0, use `getNumTrees` instead. +* `org.apache.spark.ml.clustering.KMeansModel.computeCost` which is deprecated in 2.4, is removed in 3.0, use `ClusteringEvaluator` instead. +* `org.apache.spark.mllib.evaluation.MulticlassMetrics.precision` which is deprecated in 2.0, is removed in 3.0, use `accuracy` instead. +* `org.apache.spark.mllib.evaluation.MulticlassMetrics.recall` which is deprecated in 2.0, is removed in 3.0, use `accuracy` instead. +* `org.apache.spark.mllib.evaluation.MulticlassMetrics.fMeasure` which is deprecated in 2.0, is removed in 3.0, use `accuracy` instead. +* `org.apache.spark.ml.util.GeneralMLWriter.context` which is deprecated in 2.0, is removed in 3.0, use `session` instead. +* `org.apache.spark.ml.util.MLWriter.context` which is deprecated in 2.0, is removed in 3.0, use `session` instead. +* `org.apache.spark.ml.util.MLReader.context` which is deprecated in 2.0, is removed in 3.0, use `session` instead. +* `abstract class UnaryTransformer[IN, OUT, T <: UnaryTransformer[IN, OUT, T]]` is changed to `abstract class UnaryTransformer[IN: TypeTag, OUT: TypeTag, T <: UnaryTransformer[IN, OUT, T]]` in 3.0. -### Changes of behavior +### Deprecations and changes of behavior {:.no_toc} +**Deprecations** + +* [SPARK-11215](https://issues.apache.org/jira/browse/SPARK-11215): +`labels` in `StringIndexer` is deprecated and will be removed in 3.1.0. Use `labelsArray` instead. +* [SPARK-25758](https://issues.apache.org/jira/browse/SPARK-25758): +`computeCost` in `BisectingKMeans` is deprecated and will be removed in future versions. Use `ClusteringEvaluator` instead. + +**Changes of behavior** Review comment: Do we need to include the newly added stuff in this section? For example, ```[SPARK-29960](https://issues.apache.org/jira/browse/SPARK-29960) `hammingLoss` support is addd to MulticlassClassificationEvaluator```? If we do, then there are a lot more to add. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this
[GitHub] [spark] gengliangwang commented on a change in pull request #27765: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach
gengliangwang commented on a change in pull request #27765: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach URL: https://github.com/apache/spark/pull/27765#discussion_r387474359 ## File path: common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java ## @@ -177,7 +177,7 @@ public void clear() { * iterators. https://bugs.openjdk.java.net/browse/JDK-8078645 */ private static class CountingRemoveIfForEach implements BiConsumer, T> { - private final ConcurrentMap, T> data; + private final InstanceList data; Review comment: Nit: how about rename this to `list` or `instanceList`? It would be more readable. There is another `private final ConcurrentMap, InstanceList> data = new ConcurrentHashMap<>();` in `InMemoryLists`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang commented on a change in pull request #27765: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach
gengliangwang commented on a change in pull request #27765: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach URL: https://github.com/apache/spark/pull/27765#discussion_r387474359 ## File path: common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java ## @@ -177,7 +177,7 @@ public void clear() { * iterators. https://bugs.openjdk.java.net/browse/JDK-8078645 */ private static class CountingRemoveIfForEach implements BiConsumer, T> { - private final ConcurrentMap, T> data; + private final InstanceList data; Review comment: Nit: how about rename this to `list` or `instanceList`? It would be more readable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] huaxingao opened a new pull request #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release
huaxingao opened a new pull request #27785: [SPARK-30934][ML][DOCS] Update ml-guide and ml-migration-guide for 3.0 release URL: https://github.com/apache/spark/pull/27785 ### What changes were proposed in this pull request? Update ml-guide and ml-migration-guide for 3.0. ### Why are the changes needed? This is required for each release. ### Does this PR introduce any user-facing change? Yes. ![image](https://user-images.githubusercontent.com/13592258/75851710-478fa180-5d9f-11ea-9a1e-47e97c764f2e.png) ![image](https://user-images.githubusercontent.com/13592258/75851734-524a3680-5d9f-11ea-9eb6-6124a9663e47.png) ![image](https://user-images.githubusercontent.com/13592258/75851754-5c6c3500-5d9f-11ea-9542-d5feb664d49a.png) ### How was this patch tested? Manually build and check. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#discussion_r387471894 ## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ## @@ -3954,6 +3954,17 @@ object functions { def to_json(e: Column): Column = to_json(e, Map.empty[String, String]) + /** Review comment: I believe it should be allowed to be used directly. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables
AmplabJenkins removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables URL: https://github.com/apache/spark/pull/27776#issuecomment-594347699 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables
SparkQA removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables URL: https://github.com/apache/spark/pull/27776#issuecomment-594255590 **[Test build #119267 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119267/testReport)** for PR 27776 at commit [`b7c02b7`](https://github.com/apache/spark/commit/b7c02b77ce4eb91ef30c00cb3608a2bc0d681811). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables
AmplabJenkins removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables URL: https://github.com/apache/spark/pull/27776#issuecomment-594347705 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119267/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables
AmplabJenkins commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables URL: https://github.com/apache/spark/pull/27776#issuecomment-594347699 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables
AmplabJenkins commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables URL: https://github.com/apache/spark/pull/27776#issuecomment-594347705 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119267/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #27765: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach
cloud-fan commented on a change in pull request #27765: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach URL: https://github.com/apache/spark/pull/27765#discussion_r387467686 ## File path: common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java ## @@ -278,7 +276,22 @@ public void put(T value) throws Exception { public boolean delete(Object key) { boolean entryExists = data.remove(asKey(key)) != null; - if (entryExists && hasNaturalParentIndex) { + if (entryExists) { +deleteParentIndex(key); Review comment: we should only do it when `hasNaturalParentIndex=true`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #27765: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach
cloud-fan commented on a change in pull request #27765: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach URL: https://github.com/apache/spark/pull/27765#discussion_r387467686 ## File path: common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java ## @@ -278,7 +276,22 @@ public void put(T value) throws Exception { public boolean delete(Object key) { boolean entryExists = data.remove(asKey(key)) != null; - if (entryExists && hasNaturalParentIndex) { + if (entryExists) { +deleteParentIndex(key); Review comment: we should only do it when `hasNaturalParentIndex=true`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#discussion_r387467415 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -438,60 +438,66 @@ object DataSourceStrategy { } private def translateLeafNodeFilter(predicate: Expression): Option[Filter] = predicate match { -case expressions.EqualTo(a: Attribute, Literal(v, t)) => - Some(sources.EqualTo(a.name, convertToScala(v, t))) -case expressions.EqualTo(Literal(v, t), a: Attribute) => - Some(sources.EqualTo(a.name, convertToScala(v, t))) - -case expressions.EqualNullSafe(a: Attribute, Literal(v, t)) => - Some(sources.EqualNullSafe(a.name, convertToScala(v, t))) -case expressions.EqualNullSafe(Literal(v, t), a: Attribute) => - Some(sources.EqualNullSafe(a.name, convertToScala(v, t))) - -case expressions.GreaterThan(a: Attribute, Literal(v, t)) => - Some(sources.GreaterThan(a.name, convertToScala(v, t))) -case expressions.GreaterThan(Literal(v, t), a: Attribute) => - Some(sources.LessThan(a.name, convertToScala(v, t))) - -case expressions.LessThan(a: Attribute, Literal(v, t)) => - Some(sources.LessThan(a.name, convertToScala(v, t))) -case expressions.LessThan(Literal(v, t), a: Attribute) => - Some(sources.GreaterThan(a.name, convertToScala(v, t))) - -case expressions.GreaterThanOrEqual(a: Attribute, Literal(v, t)) => - Some(sources.GreaterThanOrEqual(a.name, convertToScala(v, t))) -case expressions.GreaterThanOrEqual(Literal(v, t), a: Attribute) => - Some(sources.LessThanOrEqual(a.name, convertToScala(v, t))) - -case expressions.LessThanOrEqual(a: Attribute, Literal(v, t)) => - Some(sources.LessThanOrEqual(a.name, convertToScala(v, t))) -case expressions.LessThanOrEqual(Literal(v, t), a: Attribute) => - Some(sources.GreaterThanOrEqual(a.name, convertToScala(v, t))) - -case expressions.InSet(a: Attribute, set) => - val toScala = CatalystTypeConverters.createToScalaConverter(a.dataType) - Some(sources.In(a.name, set.toArray.map(toScala))) +case expressions.EqualTo(PushableColumn(name), Literal(v, t)) => + Some(sources.EqualTo(name, convertToScala(v, t))) +case expressions.EqualTo(Literal(v, t), PushableColumn(name)) => + Some(sources.EqualTo(name, convertToScala(v, t))) + +case expressions.EqualNullSafe(PushableColumn(name), Literal(v, t)) => + Some(sources.EqualNullSafe(name, convertToScala(v, t))) +case expressions.EqualNullSafe(Literal(v, t), PushableColumn(name)) => + Some(sources.EqualNullSafe(name, convertToScala(v, t))) + +case expressions.GreaterThan(PushableColumn(name), Literal(v, t)) => + Some(sources.GreaterThan(name, convertToScala(v, t))) +case expressions.GreaterThan(Literal(v, t), PushableColumn(name)) => + Some(sources.LessThan(name, convertToScala(v, t))) + +case expressions.LessThan(PushableColumn(name), Literal(v, t)) => + Some(sources.LessThan(name, convertToScala(v, t))) +case expressions.LessThan(Literal(v, t), PushableColumn(name)) => + Some(sources.GreaterThan(name, convertToScala(v, t))) + +case expressions.GreaterThanOrEqual(PushableColumn(name), Literal(v, t)) => + Some(sources.GreaterThanOrEqual(name, convertToScala(v, t))) +case expressions.GreaterThanOrEqual(Literal(v, t), PushableColumn(name)) => + Some(sources.LessThanOrEqual(name, convertToScala(v, t))) + +case expressions.LessThanOrEqual(PushableColumn(name), Literal(v, t)) => + Some(sources.LessThanOrEqual(name, convertToScala(v, t))) +case expressions.LessThanOrEqual(Literal(v, t), PushableColumn(name)) => + Some(sources.GreaterThanOrEqual(name, convertToScala(v, t))) + +case expressions.InSet(e: Expression, set) => e match { Review comment: Seems we don't need `: Expression` here too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#discussion_r387466848 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -635,3 +641,16 @@ object DataSourceStrategy { } } } + +/** + * Find the column name of an expression that can be pushed down. + */ +private[sql] object PushableColumn { Review comment: @dbtsai, if this is `private[sql]` for the test purpose, we could just `private[datasources]`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables
SparkQA commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables URL: https://github.com/apache/spark/pull/27776#issuecomment-594346981 **[Test build #119267 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119267/testReport)** for PR 27776 at commit [`b7c02b7`](https://github.com/apache/spark/commit/b7c02b77ce4eb91ef30c00cb3608a2bc0d681811). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] zhengruifeng commented on issue #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
zhengruifeng commented on issue #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines URL: https://github.com/apache/spark/pull/27546#issuecomment-594342913 @yma11 Yes, but there are still some algorithms have not been migrated to `.ml`. BTW, did you also change `.mllib.linalg.BLAS` in your KMeans tests? Since KMeans is actually implemented in the `.mllib` side, so `.mllib.linalg.BLAS` is used instead of `.ml.linalg.BLAS`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan closed pull request #27732: [SPARK-30984][SS]Add UI test for Structured Streaming UI
cloud-fan closed pull request #27732: [SPARK-30984][SS]Add UI test for Structured Streaming UI URL: https://github.com/apache/spark/pull/27732 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on issue #27732: [SPARK-30984][SS]Add UI test for Structured Streaming UI
cloud-fan commented on issue #27732: [SPARK-30984][SS]Add UI test for Structured Streaming UI URL: https://github.com/apache/spark/pull/27732#issuecomment-594341571 thanks, merging to master/3.0! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594341056 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24021/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594341046 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594341046 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594341056 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24021/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594340673 **[Test build #119281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119281/testReport)** for PR 27778 at commit [`09826d1`](https://github.com/apache/spark/commit/09826d1502be444118fe1aad30c43f6ba4f8af58). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594339189 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24020/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594339181 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594339189 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24020/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
AmplabJenkins removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#issuecomment-594338745 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119266/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
AmplabJenkins removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#issuecomment-594338740 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594339181 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
AmplabJenkins commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#issuecomment-594338740 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables
cloud-fan commented on a change in pull request #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables URL: https://github.com/apache/spark/pull/27776#discussion_r387459156 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ## @@ -769,9 +771,9 @@ class SessionCatalog( desc = metadata, output = metadata.schema.toAttributes, child = parser.parsePlan(viewText)) - SubqueryAlias(table, db, child) + SubqueryAlias(multiParts, child) } else { - SubqueryAlias(table, db, UnresolvedCatalogRelation(metadata)) + SubqueryAlias(multiParts, UnresolvedCatalogRelation(metadata)) Review comment: Yea let's handle 3 parts qualifier, to avoid perf regression. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
AmplabJenkins commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#issuecomment-594338745 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119266/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
SparkQA commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594338843 **[Test build #119280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119280/testReport)** for PR 27570 at commit [`2156bed`](https://github.com/apache/spark/commit/2156bed223ec28279fbaa18e2bc0f8c47ade7d0d). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk commented on a change in pull request #27774: [SPARK-31023][SQL] Support foldable schemas by `from_json`
MaxGekk commented on a change in pull request #27774: [SPARK-31023][SQL] Support foldable schemas by `from_json` URL: https://github.com/apache/spark/pull/27774#discussion_r387459222 ## File path: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ## @@ -653,4 +653,18 @@ class JsonFunctionsSuite extends QueryTest with SharedSparkSession { assert(json_tuple_result === len) } } + + test("support foldable schema by from_json") { +val options = Map[String, String]().asJava +val schema = regexp_replace(lit("dpt_org_id INT, dpt_org_city STRING"), "dpt_org_", "") Review comment: > I couldn't come up with the case where the foldable expression is used. For example, you import data from another db by dumping the data to csv files. You take the schema from the dbms, and find out difference in types - the dbms uses VARCHAR(100) for strings. By using replace, you could replace it by STRING. Or text format of schema could be different, so, using Spark's string functions you could correct it. If you cannot edit it on-the-fly, you have to hardcoded schema in your app, and take care of sync. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
SparkQA removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#issuecomment-594251307 **[Test build #119266 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119266/testReport)** for PR 27728 at commit [`2c65b17`](https://github.com/apache/spark/commit/2c65b1753de2306742c491ad34b610dd01294023). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
SparkQA commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet URL: https://github.com/apache/spark/pull/27728#issuecomment-594338181 **[Test build #119266 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119266/testReport)** for PR 27728 at commit [`2c65b17`](https://github.com/apache/spark/commit/2c65b1753de2306742c491ad34b610dd01294023). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] huaxingao commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR
huaxingao commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR URL: https://github.com/apache/spark/pull/27570#issuecomment-594337920 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function
iRakson commented on a change in pull request #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#discussion_r387458054 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ## @@ -781,3 +782,59 @@ case class SchemaOfJson( override def prettyName: String = "schema_of_json" } + +/** + * A function that returns number of elements in outer Json Array. + */ +@ExpressionDescription( + usage = "_FUNC_(jsonArray) - Returns length of the jsonArray", Review comment: arguments description is added. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594337296 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24019/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594337291 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594337296 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24019/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594337291 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594336960 **[Test build #119279 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119279/testReport)** for PR 27759 at commit [`58a9e0d`](https://github.com/apache/spark/commit/58a9e0dfdb7d72bc8ffd67b5c69f4b52a65e4b5e). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] xuanyuanking commented on issue #27782: [SPARK-30289][FOLLOWUP][DOC] Update the migration guide for `spark.sql.legacy.ctePrecedencePolicy`
xuanyuanking commented on issue #27782: [SPARK-30289][FOLLOWUP][DOC] Update the migration guide for `spark.sql.legacy.ctePrecedencePolicy` URL: https://github.com/apache/spark/pull/27782#issuecomment-594336485 Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
AmplabJenkins commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594336248 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119265/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
AmplabJenkins commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594336243 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
AmplabJenkins removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594336243 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
AmplabJenkins removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594336248 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119265/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
SparkQA commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594335739 **[Test build #119265 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119265/testReport)** for PR 27780 at commit [`ecf1f9d`](https://github.com/apache/spark/commit/ecf1f9d8470c667ebf05094f9791e2c1f5455bab). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
SparkQA removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594247247 **[Test build #119265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119265/testReport)** for PR 27780 at commit [`ecf1f9d`](https://github.com/apache/spark/commit/ecf1f9d8470c667ebf05094f9791e2c1f5455bab). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594335461 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24018/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] yma11 edited a comment on issue #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
yma11 edited a comment on issue #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines URL: https://github.com/apache/spark/pull/27546#issuecomment-594335144 mllib package is in maintenance mode that's why change applied in ml. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594335459 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594335461 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24018/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594335459 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594335198 **[Test build #119278 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119278/testReport)** for PR 27778 at commit [`d08f0ed`](https://github.com/apache/spark/commit/d08f0ed1e49c3610597a8bf88e407007eec02cd8). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
AmplabJenkins commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594335045 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
AmplabJenkins commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594335056 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119275/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dbtsai commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dbtsai commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594335106 @HyukjinKwon @dongjoon-hyun @viirya @rdblue @cloud-fan Some changes based on @HyukjinKwon 's suggestion to use extractor pattern have been done. Would like to have your final reviews. Thanks, This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
AmplabJenkins removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594335045 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] yma11 commented on issue #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
yma11 commented on issue #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines URL: https://github.com/apache/spark/pull/27546#issuecomment-594335144 > n the `.mllib` side. mllib package is in maintenance mode that why change applied in ml. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org