[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123352359 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-9019][YARN] Add RM delegation token to ...
Github user bolkedebruin commented on the pull request: https://github.com/apache/spark/pull/7489#issuecomment-123366143 @tgravescs I have added a full debug log to the jira issue (SPARK-9019). For this cluster the job enters the accepted state but never reaches running. I don't think I have seen another issue, but you might see one. @harishreedharan I have commented out the body of the function getDriverLogUrls in YarnClusterSchedulerBackend and am building now. I hope I have enough time to test it tonight.
[GitHub] spark pull request: [SPARK-9081][SPARK-9168][SQL] nanvl dropna/f...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7523
[GitHub] spark pull request: [SPARK-8695][CORE][MLlib] TreeAggregation shou...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7397#issuecomment-123373169 [Test build #37956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37956/consoleFull) for PR 7397 at commit [`041620c`](https://github.com/apache/spark/commit/041620c93dc72010bb0907c0c5363808878d2496).
[GitHub] spark pull request: [SPARK-8915] [Documentation, MLlib] Added @sin...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7371
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123388415 [Test build #43 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SlowSparkPullRequestBuilder/43/console) for PR 7561 at commit [`aea58e0`](https://github.com/apache/spark/commit/aea58e0737a60a9f3dcdab49c2c8dfd66d1f8e49).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123388600 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123392783 [Test build #37955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37955/console) for PR 7561 at commit [`aea58e0`](https://github.com/apache/spark/commit/aea58e0737a60a9f3dcdab49c2c8dfd66d1f8e49).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [WIP][SPARK-4751] Dynamic allocation in standa...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/7532#discussion_r35123818

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/ApplicationInfo.scala ---
@@ -96,6 +109,47 @@ private[spark] class ApplicationInfo(

   private[master] def coresLeft: Int = requestedCores - coresGranted

+  /**
+   * Return the number of executors waiting to be scheduled once space frees up.
+   *
+   * This is only defined if the application explicitly set the executor limit. For instance,
+   * if an application asks for 8 executors but there is only space for 5, then there will be
+   * 3 waiting executors.
+   */
+  private[master] def numWaitingExecutors: Int = {
+    if (executorLimit != Integer.MAX_VALUE) {
+      math.max(0, executorLimit - executors.size)
+    } else {
+      0
+    }
+  }
+
+  /**
+   * Add a worker to the blacklist, called when the executor running on the worker is killed.
+   * This is used only if cores per executor is not set.
+   */
+  private[master] def blacklistWorker(workerId: String): Unit = {
+    blacklistedWorkers += workerId
+  }
+
+  /**
+   * Remove workers from the blacklist, called when the application requests new executors.
+   * This is used only if cores per executor is not set.
+   */
+  private[master] def removeFromBlacklist(numWorkers: Int): Unit = {
+    blacklistedWorkers.take(numWorkers).foreach { workerId =>
--- End diff --

drop returns a copy
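For context on the review note: a minimal, hypothetical Scala sketch (the names are illustrative, not taken from the PR) of why `take` and `drop` on a mutable collection don't shrink it -- both return new collections, so the blacklist must be mutated explicitly:

```scala
import scala.collection.mutable

object TakeDropCopyDemo {
  def main(args: Array[String]): Unit = {
    // Hypothetical worker IDs standing in for the PR's blacklist.
    val blacklistedWorkers = mutable.HashSet("worker-1", "worker-2", "worker-3")

    // take(n) returns a NEW collection; the original set is untouched.
    val toRemove = blacklistedWorkers.take(2)
    assert(blacklistedWorkers.size == 3)

    // drop(n) likewise returns a copy of the remaining elements.
    val remaining = blacklistedWorkers.drop(2)
    assert(blacklistedWorkers.size == 3)
    assert(remaining.size == 1)

    // To actually shrink the blacklist, mutate it explicitly:
    toRemove.foreach { workerId => blacklistedWorkers -= workerId }
    assert(blacklistedWorkers.size == 1)

    println("ok")
  }
}
```

This is the pitfall the comment points at: iterating over `take(numWorkers)` without removing the elements leaves the blacklist unchanged.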
[GitHub] spark pull request: [SPARK-9024] [WIP] Unsafe HashJoin
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7480#issuecomment-123370771 [Test build #1149 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1149/consoleFull) for PR 7480 at commit [`1a40f02`](https://github.com/apache/spark/commit/1a40f02df481263d7dc25aa5b96157e2f6a5380f).
[GitHub] spark pull request: [SPARK-9019][YARN] Add RM delegation token to ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/7489#issuecomment-123371110

hmm, perhaps it's an interaction with RM HA:

15/07/21 16:02:35 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm1
15/07/21 16:02:35 INFO retry.RetryInvocationHandler: Exception while invoking getClusterNodes of class ApplicationClientProtocolPBClientImpl over rm1 after 6 fail over attempts. Trying to fail over immediately.

We don't use RM HA. Either way I think we should just remove the getNodeReport call in https://issues.apache.org/jira/browse/SPARK-8988. Then this wouldn't be an issue.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure shuffle metadata a...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4055#discussion_r35121095

--- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -739,6 +742,88 @@ class DAGSchedulerSuite
     assertDataStructuresEmpty()
   }

+  test("verify not submit next stage while not have registered mapStatus") {
+    val firstRDD = new MyRDD(sc, 3, Nil)
+    val firstShuffleDep = new ShuffleDependency(firstRDD, null)
+    val firstShuffleId = firstShuffleDep.shuffleId
+    val shuffleMapRdd = new MyRDD(sc, 3, List(firstShuffleDep))
+    val shuffleDep = new ShuffleDependency(shuffleMapRdd, null)
+    val reduceRdd = new MyRDD(sc, 1, List(shuffleDep))
+    submit(reduceRdd, Array(0))
+
+    // things start out smoothly, stage 0 completes with no issues
+    complete(taskSets(0), Seq(
+      (Success, makeMapStatus("hostB", shuffleMapRdd.partitions.size)),
+      (Success, makeMapStatus("hostB", shuffleMapRdd.partitions.size)),
+      (Success, makeMapStatus("hostA", shuffleMapRdd.partitions.size))
+    ))
+
+    // then one executor dies, and a task fails in stage 1
+    runEvent(ExecutorLost("exec-hostA"))
+    runEvent(CompletionEvent(taskSets(1).tasks(0),
+      FetchFailed(null, firstShuffleId, 2, 0, "Fetch failed"),
+      null, null, createFakeTaskInfo(), null))
+
+    // so we resubmit stage 0, which completes happily
+    Thread.sleep(1000)
+    val stage0Resubmit = taskSets(2)
+    assert(stage0Resubmit.stageId == 0)
+    assert(stage0Resubmit.stageAttemptId === 1)
+    val task = stage0Resubmit.tasks(0)
+    assert(task.partitionId === 2)
+    runEvent(CompletionEvent(task, Success,
+      makeMapStatus("hostC", shuffleMapRdd.partitions.size), null, createFakeTaskInfo(), null))
+
+    // now here is where things get tricky: we will now have a task set representing
+    // the second attempt for stage 1, but we *also* have some tasks for the first attempt for
+    // stage 1 still going
+    val stage1Resubmit = taskSets(3)
+    assert(stage1Resubmit.stageId == 1)
+    assert(stage1Resubmit.stageAttemptId === 1)
+    assert(stage1Resubmit.tasks.length === 3)
+
+    // we'll have some tasks finish from the first attempt, and some finish from the second attempt,
+    // so that we actually have all stage outputs, though no attempt has completed all its
+    // tasks
+    runEvent(CompletionEvent(taskSets(3).tasks(0), Success,
+      makeMapStatus("hostC", reduceRdd.partitions.size), null, createFakeTaskInfo(), null))
+    runEvent(CompletionEvent(taskSets(3).tasks(1), Success,
+      makeMapStatus("hostC", reduceRdd.partitions.size), null, createFakeTaskInfo(), null))
+    // late task finish from the first attempt
+    runEvent(CompletionEvent(taskSets(1).tasks(2), Success,
+      makeMapStatus("hostB", reduceRdd.partitions.size), null, createFakeTaskInfo(), null))
+
+    // What should happen now is that we submit stage 2. However, we might not see an error
+    // b/c of DAGScheduler's error handling (it tends to swallow errors and just log them). But
+    // we can check some conditions.
+    // Note that the really important thing here is not so much that we submit stage 2 *immediately*
+    // but that we don't end up with some error from these interleaved completions. It would also
+    // be OK (though sub-optimal) if stage 2 simply waited until the resubmission of stage 1 had
+    // all its tasks complete
+
+    // check that we have all the map output for stage 0 (it should have been there even before
+    // the last round of completions from stage 1, but just to double check it hasn't been messed
+    // up)
+    (0 until 3).foreach { reduceIdx =>
+      val arr = mapOutputTracker.getServerStatuses(0, reduceIdx)
+      assert(arr != null)
+      assert(arr.nonEmpty)
--- End diff --

`getServerStatuses` has been removed in master -- I guess both of these should be

```scala
val statuses = mapOutputTracker.getMapSizesByExecutorId(0, reduceIdx)
assert(statuses != null)
assert(statuses.nonEmpty)
```

The new code will now throw an exception if we're missing the map output data, but I feel like it's probably still good to leave those asserts in.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure shuffle metadata a...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4055#discussion_r35121106

--- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -739,6 +742,88 @@ class DAGSchedulerSuite
(... same quoted test as in the previous comment, continuing ...)
+    // and check we have all the map output for stage 1
+    (0 until 1).foreach { reduceIdx =>
+      val arr = mapOutputTracker.getServerStatuses(1, reduceIdx)
+      assert(arr != null)
+      assert(arr.nonEmpty)
--- End diff --

same here
[GitHub] spark pull request: [SPARK-8906][SQL] Move all internal data sourc...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7565#issuecomment-123394190 Merged build triggered.
[GitHub] spark pull request: [SPARK-8922] [Documentation, MLlib] Add @since...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7429#issuecomment-123394101 [Test build #37958 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37958/console) for PR 7429 at commit [`59dc104`](https://github.com/apache/spark/commit/59dc104c0336bc09501b172faffd04e8b5c567d0).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-9189][CORE] Takes locality and the sum ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7536#issuecomment-123370835 [Test build #37951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37951/console) for PR 7536 at commit [`cb72d0f`](https://github.com/apache/spark/commit/cb72d0f2ce1432ad58246fbeae60c4565fbb4ce7).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8357] Fix unsafe memory leak on empty i...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/7560#issuecomment-123378564 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-9193] Avoid assigning tasks to lost e...
Github user GraceH commented on the pull request: https://github.com/apache/spark/pull/7528#issuecomment-123378688 Thanks @squito.
[GitHub] spark pull request: [SPARK-8364][SPARKR] Add crosstab to SparkR Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/7318#issuecomment-123386484 Hmm okay. Let's leave it as `crosstab` in this PR -- before the release I'll try to do one more pass over the API and we can revisit this if required. Other than the minor unit test comment this looks good to me.
[GitHub] spark pull request: [SPARK-8906][SQL] Move all internal data sourc...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/7565#issuecomment-123393312 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-9193] Avoid assigning tasks to lost e...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7528
[GitHub] spark pull request: [SPARK-9189][CORE] Takes locality and the sum ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7536#issuecomment-123366165 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-9193] Avoid assigning tasks to lost e...
Github user squito commented on the pull request: https://github.com/apache/spark/pull/7528#issuecomment-123369818 yeah, I don't love the idea of adding things without tests, but in this case I suppose it's best left for the future. lgtm pending the tests passing
[GitHub] spark pull request: [SPARK-9019][YARN] Add RM delegation token to ...
Github user bolkedebruin commented on the pull request: https://github.com/apache/spark/pull/7489#issuecomment-123377041 True, but SPARK-8988 mentions that the Node Report API is not available in a secure cluster, which is not true (my patch basically enables it). So I am - personally - fine with removing it, but "not available in a secure cluster" should not be the reason, I would say.
[GitHub] spark pull request: [SPARK-8357] Fix unsafe memory leak on empty i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7560#issuecomment-123380341 Merged build started.
[GitHub] spark pull request: [SPARK-8671] [ML]. Added isotonic regression t...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/7517#issuecomment-123380348 @jkbradley Isotonic regression expects a single feature instead of a feature vector. Do we want to make it a `Regressor` and use `featuresCol` as a param? One common use case of isotonic regression is to calibrate probabilities output by logistic regression. However, logistic regression only outputs probabilities as vectors (of size 2). It would be hard to connect logistic regression with isotonic regression. Any suggestions?
[GitHub] spark pull request: [SPARK-8357] Fix unsafe memory leak on empty i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7560#issuecomment-123380296 Merged build triggered.
[GitHub] spark pull request: [SPARK-8922] [Documentation, MLlib] Add @since...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7429#issuecomment-123383741 Merged build started.
[GitHub] spark pull request: [SPARK-8922] [Documentation, MLlib] Add @since...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7429#issuecomment-123383681 Merged build triggered.
[GitHub] spark pull request: [SPARK-8922] [Documentation, MLlib] Add @since...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7429#issuecomment-123384128 [Test build #37958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37958/consoleFull) for PR 7429 at commit [`59dc104`](https://github.com/apache/spark/commit/59dc104c0336bc09501b172faffd04e8b5c567d0).
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123392901 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8906][SQL] Move all internal data sourc...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7565#issuecomment-123394206 Merged build started.
[GitHub] spark pull request: [SPARK-9209] Using executor allocation, a exec...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/7559#issuecomment-123397022 add to whitelist
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123355527 [Test build #37955 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37955/consoleFull) for PR 7561 at commit [`aea58e0`](https://github.com/apache/spark/commit/aea58e0737a60a9f3dcdab49c2c8dfd66d1f8e49).
[GitHub] spark pull request: [SPARK-9121][SparkR] Get rid of the warnings a...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/7567#discussion_r35114903 --- Diff: R/pkg/inst/tests/test_sparkSQL.R --- @@ -21,10 +21,10 @@ context("SparkSQL functions") # Utility function for easily checking the values of a StructField checkStructField <- function(actual, expectedName, expectedType, expectedNullable) { - expect_equal(class(actual), "structField") - expect_equal(actual$name(), expectedName) - expect_equal(actual$dataType.toString(), expectedType) - expect_equal(actual$nullable(), expectedNullable) + testthat::expect_equal(class(actual), "structField") --- End diff -- We don't need to do this -- instead we can include `library(testthat)` in our `lint-R` script, as Jenkins and developers who run unit tests should have this package installed.
[GitHub] spark pull request: [SPARK-8695][CORE][MLlib] TreeAggregation shou...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7397#issuecomment-123369725 [Test build #1148 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1148/console) for PR 7397 at commit [`041620c`](https://github.com/apache/spark/commit/041620c93dc72010bb0907c0c5363808878d2496). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8695][CORE][MLlib] TreeAggregation shou...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/7397#issuecomment-123371365 retest this please.
[GitHub] spark pull request: [SPARK-9193] Avoid assigning tasks to lost e...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7528#issuecomment-123381479 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-9193] Avoid assigning tasks to lost e...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7528#issuecomment-123381194 [Test build #37954 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37954/console) for PR 7528 at commit [`ecc1da6`](https://github.com/apache/spark/commit/ecc1da60869554211fa053908778a2abc1656160). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8922] [Documentation, MLlib] Add @since...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/7429#issuecomment-123382351 @coderxiang Could you help review this PR? Thanks!
[GitHub] spark pull request: [SPARK-8922] [Documentation, MLlib] Add @since...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/7429#issuecomment-123382296 ok to test
[GitHub] spark pull request: [SPARK-8915] [Documentation, MLlib] Added @sin...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/7371#issuecomment-123382153 Merged into master. Thanks!
[GitHub] spark pull request: [SPARK-9019][YARN] Add RM delegation token to ...
Github user harishreedharan commented on the pull request: https://github.com/apache/spark/pull/7489#issuecomment-123382315 That exception message is misleading. The catch block assumes the failure happened because the API is not available, but it is actually due to security issues. On Tuesday, July 21, 2015, bolkedebruin notificati...@github.com wrote: True, but SPARK-8988 mentions that the Node Report API is not available in a secure cluster. Which is not true (my patch basically enables it). So I am - personally - fine with removing it, but being not available in a secure cluster should not be the reason, I would say. — Reply to this email directly or view it on GitHub https://github.com/apache/spark/pull/7489#issuecomment-123377041. -- Thanks, Hari
[GitHub] spark pull request: [SPARK-8906][SQL] Move all internal data sourc...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7565#issuecomment-123394368 [Test build #37959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37959/consoleFull) for PR 7565 at commit [`7661aff`](https://github.com/apache/spark/commit/7661aff472de1bcddc91d9bd325d8572abf69474).
[GitHub] spark pull request: [SPARK-5989] [MLlib] Model save/load for LDA
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6948#issuecomment-123395868 Sounds good.
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123352192 Merged build triggered.
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123351206 Merged build triggered.
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123351305 Merged build started.
[GitHub] spark pull request: [SPARK-9152][SQL] Implement code generation fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7561#issuecomment-123351651 [Test build #43 has started](https://amplab.cs.berkeley.edu/jenkins/job/SlowSparkPullRequestBuilder/43/consoleFull) for PR 7561 at commit [`aea58e0`](https://github.com/apache/spark/commit/aea58e0737a60a9f3dcdab49c2c8dfd66d1f8e49).
[GitHub] spark pull request: [SPARK-9189][CORE] Takes locality and the sum ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7536#issuecomment-123366100 [Test build #42 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SlowSparkPullRequestBuilder/42/console) for PR 7536 at commit [`cb72d0f`](https://github.com/apache/spark/commit/cb72d0f2ce1432ad58246fbeae60c4565fbb4ce7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-9081][SPARK-9168][SQL] nanvl dropna/f...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/7523#issuecomment-123369477 Merging this into master, thanks!
[GitHub] spark pull request: [SPARK-9189][CORE] Takes locality and the sum ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7536#issuecomment-123370978 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8695][CORE][MLlib] TreeAggregation shou...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/7397#issuecomment-123376608 @piganesh You don't need to make a new PR for updates. You can push new commits to your remote branch, which you used to create the PR. Please address @srowen 's comment and remove `(...)` around `numPartitions`. Thanks!
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure shuffle metadata a...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4055#discussion_r35121122 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -749,7 +834,7 @@ class DAGSchedulerSuite * | \ | * | \ | * | \ | - * reduceRdd1 reduceRdd2 + * reduceRdd1 reduceRddi2 --- End diff -- looks like an accidental change
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure shuffle metadata a...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4055#discussion_r35121200 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -17,6 +17,8 @@ package org.apache.spark.scheduler +import org.apache.spark.shuffle.MetadataFetchFailedException + --- End diff -- this is not used, delete
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure shuffle metadata a...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/4055#discussion_r35121783 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -487,8 +487,8 @@ private[spark] class TaskSetManager( // a good proxy to task serialization time. // val timeTaken = clock.getTime() - startTime val taskName = s"task ${info.id} in stage ${taskSet.id}" - logInfo("Starting %s (TID %d, %s, %s, %d bytes)".format( - taskName, taskId, host, taskLocality, serializedTask.limit)) + logInfo(s"Starting $taskName (TID $taskId, $host, ${task.partitionId}, " + s"$taskLocality, ${serializedTask.limit} bytes)") --- End diff -- I like the inclusion of the partitionId in the msg, but can you add a "partition" label in there, eg ```scala logInfo(s"Starting $taskName (TID $taskId, $host, partition ${task.partitionId}, " + s"$taskLocality, ${serializedTask.limit} bytes)") ```
[GitHub] spark pull request: [SPARK-8935][SQL] Implement code generation fo...
Github user yjshen commented on the pull request: https://github.com/apache/spark/pull/7365#issuecomment-123396367 More comments on this?
[GitHub] spark pull request: [SPARK-9019][YARN] Add RM delegation token to ...
Github user bolkedebruin commented on the pull request: https://github.com/apache/spark/pull/7489#issuecomment-123396052 @harishreedharan your SPARK-8988 trace looks very much like (or is exactly the same as) mine when testing on a CDH 5.4 cluster. With my patch the token is added and getDriverLogUrls works in a secure setup (you might say SPARK-9019 duplicates SPARK-8988). So if you mean the error happens because of a security issue caused by a missing token, then I would say my patch fixes that issue. If there are other issues for some reason, then maybe it is indeed smart to remove the call in general, as @tgravescs suggested, and change it to get the information from the environment.
[GitHub] spark pull request: [WIP][SPARK-4751] Dynamic allocation in standa...
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/7532#discussion_r35173556 --- Diff: core/src/main/scala/org/apache/spark/deploy/client/AppClient.scala --- @@ -256,4 +272,33 @@ private[spark] class AppClient( endpoint = null } } + + /** + * Request executors from the Master by specifying the total number desired, + * including existing pending and running executors. + * + * @return whether the request is acknowledged. + */ + def requestTotalExecutors(requestedTotal: Int): Boolean = { --- End diff -- is it necessary to validate the value of `requestedTotal`, like `>= 0`? though negative numbers do not have any impact on the correctness of the program (if I understand the code correctly)
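The guard CodingCat is asking about amounts to rejecting a negative total before the request is sent, so the caller gets an immediate error instead of a silently ignored value. A minimal Python sketch of that validation pattern (hypothetical names, not Spark's actual API):

```python
def request_total_executors(requested_total: int) -> bool:
    """Hypothetical stand-in for AppClient.requestTotalExecutors.

    Validates the argument up front: a negative total raises immediately
    rather than being passed along to the master.
    """
    if requested_total < 0:
        raise ValueError(
            f"requested total executors must be >= 0, got {requested_total}")
    # ... send the request to the master here (omitted in this sketch) ...
    return True
```

Whether to add such a check is a judgment call, as the comment notes: negative values may be harmless downstream, but failing fast makes caller bugs visible.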
[GitHub] spark pull request: [SPARK-8484] [ML]. Added TrainValidationSplit ...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/7337#discussion_r35173552 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.tuning + +import scala.reflect.ClassTag + +import org.apache.spark.Logging +import org.apache.spark.annotation.Experimental +import org.apache.spark.ml.evaluation.Evaluator +import org.apache.spark.ml.{Estimator, Model} +import org.apache.spark.ml.param.{DoubleParam, ParamMap, ParamValidators} +import org.apache.spark.ml.util.Identifiable +import org.apache.spark.rdd.{RDD, PartitionwiseSampledRDD} +import org.apache.spark.sql.DataFrame +import org.apache.spark.sql.types.StructType +import org.apache.spark.util.Utils +import org.apache.spark.util.random.BernoulliCellSampler + +/** + * Params for [[TrainValidatorSplit]] and [[TrainValidatorSplitModel]]. + */ +private[ml] trait TrainValidatorSplitParams extends ValidatorParams { + /** + * Param for ratio between train and validation data. Must be between 0 and 1. + * Default: 0.75 + * @group param + */ + val trainRatio: DoubleParam = new DoubleParam(this, "trainRatio", + "ratio between training set and validation set (>= 0, <= 1)", ParamValidators.inRange(0, 1)) + + /** @group getParam */ + def getTrainRatio: Double = $(trainRatio) + + setDefault(trainRatio -> 0.75) +} + +/** + * :: Experimental :: + * Validation for hyper-parameter tuning. + * Randomly splits the input dataset into train and validation sets, + * and uses evaluation metric on the validation set to select the best model. + * Similar to [[CrossValidator]], but only splits the set once. + */ +@Experimental +class TrainValidatorSplit(override val uid: String) extends Estimator[TrainValidatorSplitModel] + with TrainValidatorSplitParams with Logging { + + def this() = this(Identifiable.randomUID("tvs")) + + /** @group setParam */ + def setEstimator(value: Estimator[_]): this.type = set(estimator, value) + + /** @group setParam */ + def setEstimatorParamMaps(value: Array[ParamMap]): this.type = set(estimatorParamMaps, value) + + /** @group setParam */ + def setEvaluator(value: Evaluator): this.type = set(evaluator, value) + + /** @group setParam */ + def setTrainRatio(value: Double): this.type = set(trainRatio, value) + + private[this] def sample[T: ClassTag]( + rdd: RDD[T], + lb: Double, + ub: Double, + seed: Int = Utils.random.nextInt()): (RDD[T], RDD[T]) = { --- End diff -- Should the method be a one-liner: `val (train, validation) = df.randomSplit([trainRatio, 1 - trainRatio], seed)`?
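mengxr's suggested one-liner delegates the split to `randomSplit`, which assigns each row to exactly one side by a single random draw per element. A small pure-Python sketch of that idea (illustrative only, not the Spark implementation):

```python
import random

def random_split(rows, train_ratio, seed=42):
    """Split rows into (train, validation) using one uniform draw per row,
    the idea behind DataFrame.randomSplit([trainRatio, 1 - trainRatio], seed).
    Each row lands in exactly one side; sizes are random but close to the ratio.
    """
    rng = random.Random(seed)
    train, validation = [], []
    for row in rows:
        (train if rng.random() < train_ratio else validation).append(row)
    return train, validation
```

With a fixed seed the split is reproducible, which matters when the same train/validation partition must be reused across candidate parameter maps.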
[GitHub] spark pull request: [SPARK-8484] [ML]. Added TrainValidationSplit ...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/7337#discussion_r35173547 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.tuning + +import scala.reflect.ClassTag + +import org.apache.spark.Logging +import org.apache.spark.annotation.Experimental +import org.apache.spark.ml.evaluation.Evaluator +import org.apache.spark.ml.{Estimator, Model} +import org.apache.spark.ml.param.{DoubleParam, ParamMap, ParamValidators} +import org.apache.spark.ml.util.Identifiable +import org.apache.spark.rdd.{RDD, PartitionwiseSampledRDD} +import org.apache.spark.sql.DataFrame +import org.apache.spark.sql.types.StructType +import org.apache.spark.util.Utils +import org.apache.spark.util.random.BernoulliCellSampler + +/** + * Params for [[TrainValidatorSplit]] and [[TrainValidatorSplitModel]]. + */ +private[ml] trait TrainValidatorSplitParams extends ValidatorParams { --- End diff -- `TrainValidatorSplit` -> `TrainValidationSplit`
[GitHub] spark pull request: [SPARK-9180] fix spark-shell to accept --name ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7512#issuecomment-123525589 [Test build #38007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38007/consoleFull) for PR 7512 at commit [`e24991a`](https://github.com/apache/spark/commit/e24991a6195bf21ff765ccbc02cb8f64b14437f0).
[GitHub] spark pull request: [SPARK-7254][MLlib] Run PowerIterationClusteri...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/6054#discussion_r35174305

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala ---
@@ -152,7 +152,28 @@ class PowerIterationClustering private[clustering] (
     }
     this
   }
-
+
+  /**
+   * Run the PIC algorithm on Graph.
+   *
+   * @param graph an affinity matrix represented as graph, which is the matrix A in the PIC paper.
+   *              The similarity s,,ij,, represented as the edge between vertices (i, j) must

--- End diff --

fix indentation
[GitHub] spark pull request: [SPARK-7254][MLlib] Run PowerIterationClusteri...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/6054#discussion_r35174307

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala ---
@@ -213,6 +234,29 @@ object PowerIterationClustering extends Logging {
   case class Assignment(id: Long, cluster: Int)

   /**
+   * Normalizes the affinity graph (A) and returns the normalized affinity matrix (W).
+   */
+  private[clustering]
+  def normalize(graph: Graph[Double, Double]): Graph[Double, Double] = {
+    val vD = graph.aggregateMessages[Double](
+      sendMsg = ctx => {
+        val i = ctx.srcId
+        val j = ctx.dstId
+        val s = ctx.attr
+        if (s < 0.0) {
+          throw new SparkException(s"Similarity must be nonnegative but found s($i, $j) = $s.")
+        }
+        ctx.sendToSrc(s)

--- End diff --

Add `if s > 0.0`?
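The `normalize` step in this diff computes W = D⁻¹A: each edge's similarity is divided by the sum of its source vertex's outgoing similarities (vD, gathered via `aggregateMessages`). A self-contained sketch of the same row-normalization on a plain edge list, without GraphX (names are illustrative, not Spark's):

```scala
object AffinityNormalize {
  // Edges as (src, dst, similarity); returns edges with each weight divided by
  // the sum of outgoing similarities of its source vertex, i.e. W = D^-1 * A.
  def normalize(edges: Seq[(Long, Long, Double)]): Seq[(Long, Long, Double)] = {
    edges.foreach { case (i, j, s) =>
      require(s >= 0.0, s"Similarity must be nonnegative but found s($i, $j) = $s.")
    }
    // Row sums play the role of the degree matrix D.
    val rowSums: Map[Long, Double] =
      edges.groupBy(_._1).map { case (i, es) => i -> es.map(_._3).sum }
    edges.map { case (i, j, s) => (i, j, s / rowSums(i)) }
  }
}
```

After normalization every vertex's outgoing weights sum to 1, which is what makes the subsequent power iteration a random-walk step.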
[GitHub] spark pull request: [SPARK-8850] [SQL] [WIP] Enable Unsafe mode by...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7564#issuecomment-123534982 **[Test build #37990 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37990/console)** for PR 7564 at commit [`5464206`](https://github.com/apache/spark/commit/54642067fd49794cb29882a7cdc0fb0bb16180b1) after a configured wait of `175m`.
[GitHub] spark pull request: [SPARK-8850] [SQL] [WIP] Enable Unsafe mode by...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7564#issuecomment-123535090 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8484] [ML]. Added TrainValidationSplit ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7337#issuecomment-123539941 Merged build triggered.
[GitHub] spark pull request: [SPARK-8484] [ML]. Added TrainValidationSplit ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7337#issuecomment-123539981 Merged build started.
[GitHub] spark pull request: [SPARK-8881] Fix algorithm for scheduling exec...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7274#issuecomment-123548432 [Test build #38019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38019/consoleFull) for PR 7274 at commit [`da0f491`](https://github.com/apache/spark/commit/da0f491a930e8b7f7a761ff666afaac5cbb13aaa).
[GitHub] spark pull request: [SPARK-8232][SQL] Add sort_array support
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7581#issuecomment-123548392 [Test build #38018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38018/consoleFull) for PR 7581 at commit [`f7974ce`](https://github.com/apache/spark/commit/f7974ceb971e0e9d6f37d526ad7b6efe1e172ea1).
[GitHub] spark pull request: [SPARK-4366] [SQL] Aggregation Improvement
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7458#issuecomment-123558966 [Test build #38023 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38023/consoleFull) for PR 7458 at commit [`35b0520`](https://github.com/apache/spark/commit/35b05207dc329173b6c59778830fbbf59752128a).
[GitHub] spark pull request: [SPARK-4366] [SQL] Aggregation Improvement
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/7458#issuecomment-123559602 I will merge this one once it passes jenkins to unblock other work. If you have any comments on this, feel free to leave them here. I will address them in a follow-up PR.
[GitHub] spark pull request: [SPARK-9024] Unsafe HashJoin/HashOuterJoin/Has...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7480#issuecomment-123561145 Merged build triggered.
[GitHub] spark pull request: [SPARK-9121][SparkR] Get rid of the warnings a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7567#issuecomment-123561274 **[Test build #38009 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38009/console)** for PR 7567 at commit [`c8cfd63`](https://github.com/apache/spark/commit/c8cfd63cdca66a9429565e9546a1d4f05a913c60) after a configured wait of `175m`.
[GitHub] spark pull request: [SPARK-9024] Unsafe HashJoin/HashOuterJoin/Has...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7480#issuecomment-123561237 [Test build #38028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38028/consoleFull) for PR 7480 at commit [`6294b1e`](https://github.com/apache/spark/commit/6294b1e3de357c94646c323eba2d4bde80971c45).
[GitHub] spark pull request: [SPARK-9024] Unsafe HashJoin/HashOuterJoin/Has...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7480#issuecomment-123561157 Merged build started.
[GitHub] spark pull request: [SPARK-9238][SQL]two extra useless entries for...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7582#issuecomment-123561073 [Test build #38026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38026/consoleFull) for PR 7582 at commit [`8bddd01`](https://github.com/apache/spark/commit/8bddd0143cff2f24e17a5b3ed53103f6fd59e4fb).
[GitHub] spark pull request: [SPARK-8264][SQL]add substring_index function
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/7533#discussion_r35181361

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala ---
@@ -356,6 +358,92 @@ case class StringInstr(str: Expression, substr: Expression)
 }

 /**
+ * Returns the substring from string str before count occurrences of the delimiter delim.
+ * If count is positive, everything to the left of the final delimiter (counting from the
+ * left) is returned. If count is negative, everything to the right of the final delimiter
+ * (counting from the right) is returned. substring_index performs a case-sensitive match
+ * when searching for delim.
+ */
+case class Substring_index(strExpr: Expression, delimExpr: Expression, countExpr: Expression)
+  extends Expression with ImplicitCastInputTypes with CodegenFallback {
+
+  override def dataType: DataType = StringType
+  override def inputTypes: Seq[DataType] = Seq(StringType, StringType, IntegerType)
+  override def nullable: Boolean = strExpr.nullable || delimExpr.nullable || countExpr.nullable
+  override def children: Seq[Expression] = Seq(strExpr, delimExpr, countExpr)
+  override def prettyName: String = "substring_index"
+  override def toString: String = s"substring_index($strExpr, $delimExpr, $countExpr)"
+
+  override def eval(input: InternalRow): Any = {
+    val str = strExpr.eval(input)
+    val delim = delimExpr.eval(input)
+    val count = countExpr.eval(input)
+    if (str == null || delim == null || count == null) {
+      null
+    } else {
+      subStrIndex(
+        str.asInstanceOf[UTF8String],
+        delim.asInstanceOf[UTF8String],
+        count.asInstanceOf[Int])
+    }
+  }
+
+  private def lastOrdinalIndexOf(
+      str: UTF8String, searchStr: UTF8String, ordinal: Int, lastIndex: Boolean = false): Int = {
+    ordinalIndexOf(str, searchStr, ordinal, true)
+  }
+
+  private def ordinalIndexOf(
+      str: UTF8String, searchStr: UTF8String, ordinal: Int, lastIndex: Boolean = false): Int = {
+    if (str == null || searchStr == null || ordinal <= 0) {
+      return -1
+    }
+    val strNumChars = str.numChars()
+    if (searchStr.numBytes() == 0) {
+      return if (lastIndex) { strNumChars } else { 0 }
+    }
+    var found = 0
+    var index = if (lastIndex) { strNumChars } else { 0 }
+    do {
+      if (lastIndex) {
+        index = str.lastIndexOf(searchStr, index - 1)
+      } else {
+        index = str.indexOf(searchStr, index + 1)
+      }
+      if (index < 0) {
+        return index
+      }
+      found += 1
+    } while (found < ordinal)
+    index
+  }
+
+  private def subStrIndex(strUtf8: UTF8String, delimUtf8: UTF8String, count: Int): UTF8String = {
+    if (strUtf8 == null || delimUtf8 == null || count == null) {
+      return null
+    }
+    if (strUtf8.numBytes() == 0 || delimUtf8.numBytes() == 0 || count == 0) {
+      return UTF8String.fromString("")

--- End diff --

`UTF8String.EMPTY_UTF8`
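The semantics described in this diff match MySQL's SUBSTRING_INDEX. The same positive/negative-count behavior can be sketched on plain `java.lang.String` (the PR itself operates on `UTF8String`; this standalone version is only illustrative):

```scala
object SubstringIndexSketch {
  // Returns the substring of `str` before `count` occurrences of `delim`.
  // Positive count: everything left of the count-th delimiter from the left.
  // Negative count: everything right of the |count|-th delimiter from the right.
  def substringIndex(str: String, delim: String, count: Int): String = {
    if (str.isEmpty || delim.isEmpty || count == 0) return ""
    if (count > 0) {
      var idx = -1
      var found = 0
      do {
        idx = str.indexOf(delim, idx + 1)
        if (idx < 0) return str      // fewer than `count` occurrences: whole string
        found += 1
      } while (found < count)
      str.substring(0, idx)
    } else {
      var idx = str.length
      var found = 0
      do {
        idx = str.lastIndexOf(delim, idx - 1)
        if (idx < 0) return str
        found += 1
      } while (found < -count)
      str.substring(idx + delim.length)
    }
  }
}
```

For example, `substringIndex("www.apache.org", ".", 2)` yields `"www.apache"` and a count of `-2` yields `"apache.org"`.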
[GitHub] spark pull request: [SPARK-4233] [SPARK-4367] [SPARK-3947] [SPARK-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7458#issuecomment-123562244 Merged build finished. Test FAILed.
[GitHub] spark pull request: [WIP] [SPARK-8176] [SPARK-8197] [SQL] Udf to_d...
GitHub user adrian-wang reopened a pull request: https://github.com/apache/spark/pull/6988 [WIP] [SPARK-8176] [SPARK-8197] [SQL] Udf to_date/ trunc I'll add unit test/function registry/codegen after #6782 get in. You can merge this pull request into a Git repository by running: $ git pull https://github.com/adrian-wang/spark udftodatetruc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6988.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6988 commit 662e2bfdad48640016d6afc0e67eb628a71549b1 Author: Daoyuan Wang daoyuan.w...@intel.com Date: 2015-06-24T11:57:48Z to_date commit 450159cd9e6d645350192cfeee6f950c32406960 Author: Daoyuan Wang daoyuan.w...@intel.com Date: 2015-06-24T13:27:33Z udf trunc
[GitHub] spark pull request: [SPARK-9053][SparkR] Fix spaces around parens,...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/7584#discussion_r35183469

--- Diff: R/pkg/inst/tests/test_sparkSQL.R ---
@@ -664,10 +664,10 @@ test_that("column binary mathfunctions", {
   expect_equal(collect(select(df, atan2(df$a, df$b)))[2, "ATAN2(a, b)"], atan2(2, 6))
   expect_equal(collect(select(df, atan2(df$a, df$b)))[3, "ATAN2(a, b)"], atan2(3, 7))
   expect_equal(collect(select(df, atan2(df$a, df$b)))[4, "ATAN2(a, b)"], atan2(4, 8))
-  expect_equal(collect(select(df, hypot(df$a, df$b)))[1, "HYPOT(a, b)"], sqrt(1^2 + 5^2))
-  expect_equal(collect(select(df, hypot(df$a, df$b)))[2, "HYPOT(a, b)"], sqrt(2^2 + 6^2))
-  expect_equal(collect(select(df, hypot(df$a, df$b)))[3, "HYPOT(a, b)"], sqrt(3^2 + 7^2))
-  expect_equal(collect(select(df, hypot(df$a, df$b)))[4, "HYPOT(a, b)"], sqrt(4^2 + 8^2))
+  expect_equal(collect(select(df, hypot(df$a, df$b)))[1, "HYPOT(a, b)"], sqrt(1 ^ 2 + 5 ^ 2))

--- End diff --

I'm not sure we should change these. It's more readable to have `1^2` rather than `1 ^ 2`. Could we add a style ignore around these 4 lines alone?
[GitHub] spark pull request: [SPARK-8231][SQL] Add array_contains
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7580#issuecomment-123526078 [Test build #38008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38008/consoleFull) for PR 7580 at commit [`6e01e53`](https://github.com/apache/spark/commit/6e01e53eccc41b4216f7ab2f0f8e7f879aaf689c).
[GitHub] spark pull request: [SPARK-6485] [MLlib] [Python] Add CoordinateMa...
Github user dusenberrymw commented on the pull request: https://github.com/apache/spark/pull/7554#issuecomment-123526084 @mengxr Thanks for the thoughts! I'll trim this PR down to just the Python wrappers, and then open another JIRA up for further discussion on adding a DistributedMatrices class to Scala.
[GitHub] spark pull request: [SPARK-8231][SQL] Add array_contains
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7580#issuecomment-123526251 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-9154][SQL] Rename formatString to forma...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7579#issuecomment-123526235 [Test build #37998 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37998/console) for PR 7579 at commit [`53ee54f`](https://github.com/apache/spark/commit/53ee54f570660caa5cfbbad1e4cd42e6f0e2adf7). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [WIP][SPARK-4751] Dynamic allocation in standa...
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/7532#discussion_r35173745

--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1387,8 +1374,6 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
    * This is currently only supported in YARN mode. Return whether the request is received.

--- End diff --

outdated comments?
[GitHub] spark pull request: [SPARK-8269][SQL]string function: initcap
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7208#issuecomment-123527415 Merged build triggered.
[GitHub] spark pull request: [SPARK-9232] [SQL] Duplicate code in JSONRelat...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7576#issuecomment-123527437 [Test build #1151 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1151/consoleFull) for PR 7576 at commit [`ea80803`](https://github.com/apache/spark/commit/ea808034de4a5e358535cc82e58501e85f4d9d9d).
[GitHub] spark pull request: [SPARK-8269][SQL]string function: initcap
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7208#issuecomment-123527425 Merged build started.
[GitHub] spark pull request: [SPARK-9121][SparkR] Get rid of the warnings a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7567#issuecomment-123532059 [Test build #37997 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37997/console) for PR 7567 at commit [`1a03987`](https://github.com/apache/spark/commit/1a0398735a113869d75bbce2d864d109ff7f0920). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8935][SQL] Implement code generation fo...
Github user yjshen commented on a diff in the pull request: https://github.com/apache/spark/pull/7365#discussion_r35175062

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -418,51 +418,518 @@ case class Cast(child: Expression, dataType: DataType)
   protected override def nullSafeEval(input: Any): Any = cast(input)

   override def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = {
-    // TODO: Add support for more data types.
-    (child.dataType, dataType) match {
+    val nullSafeCast = nullSafeCastFunction(child.dataType, dataType, ctx)
+    if (nullSafeCast != null) {
+      val eval = child.gen(ctx)
+      eval.code +
+        castCode(ctx, eval.primitive, eval.isNull, ev.primitive, ev.isNull, dataType, nullSafeCast)
+    } else {
+      super.genCode(ctx, ev)
+    }
+  }
+
+  // three function arguments are: child.primitive, result.primitive and result.isNull
+  // it returns the code snippets to be put in null safe evaluation region
+  private[this] type CastFunction = (String, String, String) => String
+
+  private[this] def nullSafeCastFunction(
+      from: DataType,
+      to: DataType,
+      ctx: CodeGenContext): CastFunction = to match {
+
+    case _ if from == NullType => (c, evPrim, evNull) => s"$evNull = true;"
+    case _ if to == from => (c, evPrim, evNull) => s"$evPrim = $c;"
+    case StringType => castToStringCode(from, ctx)
+    case BinaryType => castToBinaryCode(from)
+    case DateType => castToDateCode(from, ctx)
+    case decimal: DecimalType => castToDecimalCode(from, decimal)
+    case TimestampType => castToTimestampCode(from, ctx)
+    case IntervalType => castToIntervalCode(from)
+    case BooleanType => castToBooleanCode(from)
+    case ByteType => castToByteCode(from)
+    case ShortType => castToShortCode(from)
+    case IntegerType => castToIntCode(from)
+    case FloatType => castToFloatCode(from)
+    case LongType => castToLongCode(from)
+    case DoubleType => castToDoubleCode(from)
+
+    case array: ArrayType => castArrayCode(from.asInstanceOf[ArrayType], array, ctx)
+    case map: MapType => castMapCode(from.asInstanceOf[MapType], map, ctx)
+    case struct: StructType => castStructCode(from.asInstanceOf[StructType], struct, ctx)
+    case other => null
+  }
+
+  private[this] def castCode(ctx: CodeGenContext, childPrim: String, childNull: String,
+      resultPrim: String, resultNull: String, resultType: DataType, cast: CastFunction): String = {
+    s"""
+      boolean $resultNull = $childNull;
+      ${ctx.javaType(resultType)} $resultPrim = ${ctx.defaultValue(resultType)};
+      if (!${childNull}) {
+        ${cast(childPrim, resultPrim, resultNull)}
+      }
+    """
+  }
+
+  private[this] def castToStringCode(from: DataType, ctx: CodeGenContext): CastFunction = {
+    from match {
+      case BinaryType =>
+        (c, evPrim, evNull) => s"$evPrim = UTF8String.fromBytes($c);"
+      case DateType =>
+        (c, evPrim, evNull) => s"""$evPrim = UTF8String.fromString(
+          org.apache.spark.sql.catalyst.util.DateTimeUtils.dateToString($c));"""
+      case TimestampType =>
+        (c, evPrim, evNull) => s"""$evPrim = UTF8String.fromString(
+          org.apache.spark.sql.catalyst.util.DateTimeUtils.timestampToString($c));"""
+      case _ =>
+        (c, evPrim, evNull) => s"$evPrim = UTF8String.fromString(String.valueOf($c));"
+    }
+  }
+
+  private[this] def castToBinaryCode(from: DataType): CastFunction = from match {
+    case StringType =>
+      (c, evPrim, evNull) => s"$evPrim = $c.getBytes();"
+  }
+
+  private[this] def castToDateCode(
+      from: DataType,
+      ctx: CodeGenContext): CastFunction = from match {
+    case StringType =>
+      val intOpt = ctx.freshName("intOpt")
+      (c, evPrim, evNull) => s"""
+        scala.Option<Integer> $intOpt =
+          org.apache.spark.sql.catalyst.util.DateTimeUtils.stringToDate($c);
+        if ($intOpt.isDefined()) {
+          $evPrim = ((Integer) $intOpt.get()).intValue();
+        } else {
+          $evNull = true;
+        }
+      """
+    case TimestampType =>
+      (c, evPrim, evNull) =>
+        s"$evPrim = org.apache.spark.sql.catalyst.util.DateTimeUtils.millisToDays($c / 1000L);"
+    case _ =>
+      (c, evPrim, evNull) => s"$evNull = true;"
+  }
+
+  private[this] def changePrecision(d: String, decimalType: DecimalType,
+      evPrim: String, evNull: String): String = {
+    decimalType match {
+      case DecimalType.Unlimited =>
+        s"$evPrim = $d;"
+      case DecimalType.Fixed(precision, scale) =>
+        s"""
+          if
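The core pattern in this diff is that each cast is represented as a function from variable names to a Java source snippet, and a wrapper splices that snippet inside null-safe boilerplate. A minimal standalone sketch of that idea, with hypothetical names (`CastCodegenSketch`, `castCode`) standing in for the Catalyst machinery:

```scala
object CastCodegenSketch {
  // A cast is a function (inputVar, resultVar, resultIsNullVar) => Java snippet.
  type CastFunction = (String, String, String) => String

  // Wrap a cast snippet in null-safe boilerplate: declare the result variables,
  // then evaluate the cast only when the input is non-null.
  def castCode(childPrim: String, childNull: String,
               resultPrim: String, resultNull: String,
               javaType: String, defaultValue: String,
               cast: CastFunction): String = {
    s"""boolean $resultNull = $childNull;
       |$javaType $resultPrim = $defaultValue;
       |if (!$childNull) {
       |  ${cast(childPrim, resultPrim, resultNull)}
       |}""".stripMargin
  }
}
```

A cast that can fail (like string-to-date above) sets the result's null flag instead of assigning; a cast that cannot fail just emits an assignment, e.g. `(c, p, n) => s"$p = (int) $c;"`.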
[GitHub] spark pull request: [SPARK-8231][SQL] Add array_contains
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7580#issuecomment-123531919 [Test build #38011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38011/console) for PR 7580 at commit [`fe88d63`](https://github.com/apache/spark/commit/fe88d631cd9c27b7a2dd0871978536a5ba4f3d03). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class ArrayContains(left: Expression, right: Expression) extends BinaryExpression `
[GitHub] spark pull request: [SPARK-8231][SQL] Add array_contains
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7580#issuecomment-123531926 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-9121][SparkR] Get rid of the warnings a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7567#issuecomment-123532304 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8881] Fix algorithm for scheduling exec...
Github user nishkamravi2 commented on a diff in the pull request: https://github.com/apache/spark/pull/7274#discussion_r35176367

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -543,59 +544,108 @@ private[master] class Master(
    * multiple executors from the same application may be launched on the same worker if the worker
    * has enough cores and memory. Otherwise, each executor grabs all the cores available on the
    * worker by default, in which case only one executor may be launched on each worker.
+   *
+   * It is important to allocate coresPerExecutor on each worker at a time (instead of 1 core
+   * at a time). Consider the following example: cluster has 4 workers with 16 cores each.
+   * User requests 3 executors (spark.cores.max = 48, spark.executor.cores = 16). If 1 core is
+   * allocated at a time, 12 cores from each worker would be assigned to each executor.
+   * Since 12 < 16, no executors would launch [SPARK-8881].
    */
-  private def startExecutorsOnWorkers(): Unit = {
-    // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
-    // in the queue, then the second app, etc.
+  private[master] def scheduleExecutorsOnWorkers(
+      app: ApplicationInfo,
+      usableWorkers: Array[WorkerInfo],
+      spreadOutApps: Boolean): Array[Int] = {
+    // If the number of cores per executor is not specified, then we can just schedule
+    // 1 core at a time since we expect a single executor to be launched on each worker
+    val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(1)
+    val memoryPerExecutor = app.desc.memoryPerExecutorMB
+    val numUsable = usableWorkers.length
+    val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker
+    val assignedMemory = new Array[Int](numUsable) // Amount of memory to give to each worker
+    var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
+    var pos = 0
+    var lastCoresToAssign = coresToAssign
     if (spreadOutApps) {
-      // Try to spread out each app among all the workers, until it has all its cores
-      for (app <- waitingApps if app.coresLeft > 0) {
-        val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
-          .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
-            worker.coresFree >= app.desc.coresPerExecutor.getOrElse(1))
-          .sortBy(_.coresFree).reverse
-        val numUsable = usableWorkers.length
-        val assigned = new Array[Int](numUsable) // Number of cores to give on each node
-        var toAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
-        var pos = 0
-        while (toAssign > 0) {
-          if (usableWorkers(pos).coresFree - assigned(pos) > 0) {
-            toAssign -= 1
-            assigned(pos) += 1
-          }
-          pos = (pos + 1) % numUsable
+      // Try to spread out executors among workers (sparse scheduling)
+      while (coresToAssign > 0) {
+        if (usableWorkers(pos).coresFree - assignedCores(pos) >= coresPerExecutor &&
+            usableWorkers(pos).memoryFree - assignedMemory(pos) >= memoryPerExecutor) {
+          coresToAssign -= coresPerExecutor
+          assignedCores(pos) += coresPerExecutor
+          assignedMemory(pos) += memoryPerExecutor
         }
-        // Now that we've decided how many cores to give on each node, let's actually give them
-        for (pos <- 0 until numUsable if assigned(pos) > 0) {
-          allocateWorkerResourceToExecutors(app, assigned(pos), usableWorkers(pos))
+        pos = (pos + 1) % numUsable
+        if (pos == 0) {
+          if (lastCoresToAssign == coresToAssign) {
+            return assignedCores
+          }
+          lastCoresToAssign = coresToAssign
         }
       }
     } else {
-      // Pack each app into as few workers as possible until we've assigned all its cores
-      for (worker <- workers if worker.coresFree > 0 && worker.state == WorkerState.ALIVE) {
-        for (app <- waitingApps if app.coresLeft > 0) {
-          allocateWorkerResourceToExecutors(app, app.coresLeft, worker)
+      // Pack executors into as few workers as possible (dense scheduling)
+      while (coresToAssign > 0) {
+        while (usableWorkers(pos).coresFree - assignedCores(pos) >= coresPerExecutor &&
+            usableWorkers(pos).memoryFree - assignedMemory(pos) >= memoryPerExecutor &&
+            coresToAssign > 0) {
+          coresToAssign -= coresPerExecutor
+          assignedCores(pos) += coresPerExecutor
+          assignedMemory(pos) += memoryPerExecutor
+        }
+        pos = (pos
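The SPARK-8881 example in the diff's doc comment can be checked with a small standalone sketch (plain Scala, not the Master scheduler itself): with 4 workers of 16 cores and a request for 48 cores at 16 cores per executor, one-core-at-a-time round-robin leaves 12 cores on each worker, too few for any executor, while allocating coresPerExecutor at a time fills three workers completely.

```scala
// SPARK-8881 scenario: 4 workers x 16 cores, request 48 cores,
// 16 cores per executor.
val numWorkers = 4
val coresFree = Array.fill(numWorkers)(16)
val coresPerExecutor = 16
val coresRequested = 48

// One core at a time (the buggy strategy): round-robin spreads the
// 48 cores evenly, leaving 12 on each worker.
val oneAtATime = new Array[Int](numWorkers)
var pos = 0
var left = coresRequested
while (left > 0) {
  if (coresFree(pos) - oneAtATime(pos) > 0) {
    oneAtATime(pos) += 1
    left -= 1
  }
  pos = (pos + 1) % numWorkers
}
assert(oneAtATime.forall(_ == 12)) // 12 < 16: no executor can launch

// coresPerExecutor at a time (the fix): three workers each get a full 16.
val perExecutor = new Array[Int](numWorkers)
pos = 0
left = coresRequested
while (left > 0) {
  if (coresFree(pos) - perExecutor(pos) >= coresPerExecutor) {
    perExecutor(pos) += coresPerExecutor
    left -= coresPerExecutor
  }
  pos = (pos + 1) % numWorkers
}
assert(perExecutor.count(_ == coresPerExecutor) == 3)
```

The difference is purely in the allocation granularity: both strategies hand out the same 48 cores, but only the second leaves any single worker with enough contiguous capacity to start a 16-core executor.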
[GitHub] spark pull request: [SPARK-9216][Streaming] Define KinesisBackedBl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7578#issuecomment-123544537 [Test build #38016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38016/consoleFull) for PR 7578 at commit [`575bdbc`](https://github.com/apache/spark/commit/575bdbcc5ccf766ecaf324623e5d1204f7634224).
[GitHub] spark pull request: [SPARK-9216][Streaming] Define KinesisBackedBl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7578#issuecomment-123552016 Merged build triggered.
[GitHub] spark pull request: [SPARK-9216][Streaming] Define KinesisBackedBl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7578#issuecomment-123552029 Merged build started.
[GitHub] spark pull request: [SPARK-9222] [MLlib] Make class instantiation ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7573#issuecomment-123560226 Merged build started.
[GitHub] spark pull request: [SQL][minor] remove literal in agg group expre...
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/7583

[SQL][minor] remove literal in agg group expressions during analysis

a follow-up of https://github.com/apache/spark/pull/4169

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7583.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #7583

commit a93293edc6b89b1be48686063bd12a25e72841d1
Author: Wenchen Fan cloud0...@outlook.com
Date: 2015-07-22T04:29:32Z

    remove literal in agg group expressions during analysis
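The rule this PR follows up on is safe because a constant in a grouping expression cannot change which rows fall into the same group. A minimal illustration in plain Scala collections (not the Catalyst analysis rule itself):

```scala
// Grouping by (key, literal) partitions the rows exactly as grouping by
// key alone does, so the literal can be dropped during analysis.
val rows = Seq(("a", 1), ("b", 2), ("a", 3))

val withLiteral = rows.groupBy { case (k, _) => (k, 42) } // constant 42 in the key
val withoutLiteral = rows.groupBy { case (k, _) => k }

assert(withLiteral.size == withoutLiteral.size)
assert(withLiteral.values.toSet == withoutLiteral.values.toSet)
```

Both maps contain the same two groups; the literal only decorates the key without discriminating between rows, which is why the analyzer can remove it without affecting aggregate results.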