[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168538194 **[Test build #48612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48612/consoleFull)** for PR 10561 at commit [`696296d`](https://github.com/apache/spark/commit/696296df412f9e64b656aa63431df13b8e4d52c9). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168540065 **[Test build #48614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48614/consoleFull)** for PR 10561 at commit [`39f8d12`](https://github.com/apache/spark/commit/39f8d120f77423396293b279be5ca37e63759034).
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168540700 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48614/ Test FAILed.
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168540684 **[Test build #48614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48614/consoleFull)** for PR 10561 at commit [`39f8d12`](https://github.com/apache/spark/commit/39f8d120f77423396293b279be5ca37e63759034). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168540699 Merged build finished. Test FAILed.
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168561218 Merged build finished. Test FAILed.
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168561184 **[Test build #48618 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48618/consoleFull)** for PR 10561 at commit [`a990a79`](https://github.com/apache/spark/commit/a990a79c8a28d8801b4c2ebd9b39d8d108a56195). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168561220 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48618/ Test FAILed.
[GitHub] spark pull request: [SPARK-12537] [SQL] Add option to accept quoti...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10497#issuecomment-168561815 @Cazen you probably want to add this email to your github profile so the commit shows up under your account: ca...@korea.com
[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168561746 Does this actually change anything w.r.t. the bytecode signature?
[GitHub] spark pull request: [SPARK-12486] Worker should kill the executors...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10438#issuecomment-168562638 **[Test build #48624 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48624/consoleFull)** for PR 10438 at commit [`9ed4377`](https://github.com/apache/spark/commit/9ed43770799350721d84817ad309bb8ebec2827b).
[GitHub] spark pull request: Adding zipPartitions to PySpark
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168568081 **[Test build #48630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48630/consoleFull)** for PR 10550 at commit [`d4b855d`](https://github.com/apache/spark/commit/d4b855d175cba948fc90c53387515772dcf65afe).
[GitHub] spark pull request: Adding zipPartitions to PySpark
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168568011 retest this please
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168544367 Created a fix in https://github.com/apache/spark/pull/10564. It seems that some code changed underneath the test during the lag between the test being written and it getting merged.
[GitHub] spark pull request: [SPARK-12611][SQL][PYSPARK][TESTS] Fix test_in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10564#issuecomment-168548692 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48617/ Test PASSed.
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168554230 **[Test build #48618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48618/consoleFull)** for PR 10561 at commit [`a990a79`](https://github.com/apache/spark/commit/a990a79c8a28d8801b4c2ebd9b39d8d108a56195).
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.4] A...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/10565#discussion_r48699087
--- Diff: dev/run-tests.py ---
@@ -295,13 +295,14 @@ def exec_sbt(sbt_args=()):
 def get_hadoop_profiles(hadoop_version):
     """
-    For the given Hadoop version tag, return a list of SBT profile flags for
+    For the given Hadoop version tag, return a list of Maven/SBT profile flags for
--- End diff --
These flags are used in both Maven and SBT, hence this clarification (to see this, grep the file for usages of `get_hadoop_profiles`).
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.4] A...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/10565#discussion_r48699084
--- Diff: dev/run-tests.py ---
@@ -295,13 +295,14 @@ def exec_sbt(sbt_args=()):
 def get_hadoop_profiles(hadoop_version):
     """
-    For the given Hadoop version tag, return a list of SBT profile flags for
+    For the given Hadoop version tag, return a list of Maven/SBT profile flags for
     building and testing against that Hadoop version.
     """
     sbt_maven_hadoop_profiles = {
         "hadoop2.2": ["-Pyarn", "-Phadoop-2.2"],
-        "hadoop2.3": ["-Pyarn", "-Phadoop-2.3", "-Dhadoop.version=2.3.0"],
--- End diff --
Setting `hadoop.version` is unnecessary here; it's already set in the profile.
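The two review comments above can be illustrated with a minimal sketch of the lookup, assuming the function does nothing more than map a version tag to a flag list. Only the `hadoop2.2` entry and the old `hadoop2.3` entry are visible in the diff; the trimmed `hadoop2.3` entry, the error handling, and the `build/mvn` / `build/sbt` command lines below are assumptions for illustration, not the contents of `dev/run-tests.py`.

```python
def get_hadoop_profiles(hadoop_version):
    """Return the Maven/SBT profile flags for a Hadoop version tag (sketch)."""
    sbt_maven_hadoop_profiles = {
        "hadoop2.2": ["-Pyarn", "-Phadoop-2.2"],
        # Assumed post-review form: no -Dhadoop.version, because the
        # profile is said to set that property already.
        "hadoop2.3": ["-Pyarn", "-Phadoop-2.3"],
    }
    if hadoop_version not in sbt_maven_hadoop_profiles:
        # Hypothetical error handling; the real script may behave differently.
        raise ValueError("unknown Hadoop version tag: %s" % hadoop_version)
    return sbt_maven_hadoop_profiles[hadoop_version]


# The same flag list is handed to either build tool (illustrative commands).
flags = get_hadoop_profiles("hadoop2.3")
maven_cmd = ["build/mvn"] + flags + ["package"]
sbt_cmd = ["build/sbt"] + flags + ["test"]
```

Because one list feeds both command lines, describing the flags as "Maven/SBT profile flags" matches how they are consumed.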
[GitHub] spark pull request: [WIP][SPARK-12196][Core] Store/retrieve blocks...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/10225#discussion_r48699576
--- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -53,35 +53,98 @@ private[spark] class DiskBlockManager(blockManager: BlockManager, conf: SparkCon
   private val shutdownHook = addShutdownHook()
 
+  private abstract class FileAllocationStrategy {
+    def apply(filename: String): File
+
+    protected def getFile(filename: String, storageDirs: Array[File]): File = {
+      require(storageDirs.nonEmpty, "could not find file when the directories are empty")
+
+      // Figure out which local directory it hashes to, and which subdirectory in that
+      val hash = Utils.nonNegativeHash(filename)
+      val dirId = localDirs.indexOf(storageDirs(hash % storageDirs.length))
+      val subDirId = (hash / storageDirs.length) % subDirsPerLocalDir
+
+      // Create the subdirectory if it doesn't already exist
+      val subDir = subDirs(dirId).synchronized {
+        val old = subDirs(dirId)(subDirId)
+        if (old != null) {
+          old
+        } else {
+          val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
+          if (!newDir.exists() && !newDir.mkdir()) {
+            throw new IOException(s"Failed to create local dir in $newDir.")
+          }
+          subDirs(dirId)(subDirId) = newDir
+          newDir
+        }
+      }
+
+      new File(subDir, filename)
+    }
+  }
+
   /** Looks up a file by hashing it into one of our local subdirectories. */
   // This method should be kept in sync with
   // org.apache.spark.network.shuffle.ExternalShuffleBlockResolver#getFile().
-  def getFile(filename: String): File = {
-    // Figure out which local directory it hashes to, and which subdirectory in that
-    val hash = Utils.nonNegativeHash(filename)
-    val dirId = hash % localDirs.length
-    val subDirId = (hash / localDirs.length) % subDirsPerLocalDir
-
-    // Create the subdirectory if it doesn't already exist
-    val subDir = subDirs(dirId).synchronized {
-      val old = subDirs(dirId)(subDirId)
-      if (old != null) {
-        old
-      } else {
-        val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
-        if (!newDir.exists() && !newDir.mkdir()) {
-          throw new IOException(s"Failed to create local dir in $newDir.")
-        }
-        subDirs(dirId)(subDirId) = newDir
-        newDir
+  private object hashAllocator extends FileAllocationStrategy {
+    def apply(filename: String): File = getFile(filename, localDirs)
+  }
+
+  /** Looks up a file by hierarchy way in different speed storage devices. */
+  private val hierarchyStore = conf.getOption("spark.storage.hierarchyStore")
+  private class HierarchyAllocator extends FileAllocationStrategy {
+    case class LayerInfo(key: String, threshold: Long, dirs: Array[File])
+    val hsSpecs: Array[(String, Long)] =
+      // e.g.: hierarchyStore = "ssd 200GB, hdd 100GB"
+      hierarchyStore.get.trim.split(",").map {
+        s => val x = s.trim.split(" +")
+          (x(0).toLowerCase, Utils.byteStringAsGb(x(1)))
       }
+    val hsLayers: Array[LayerInfo] = hsSpecs.map(
+      s => LayerInfo(s._1, s._2, localDirs.filter(_.getPath.toLowerCase.containsSlice(s._1)))
+    )
+    val lastLayerDirs = localDirs.filter(dir => !hsLayers.exists(_.dirs.contains(dir)))
+    val allLayers: Array[LayerInfo] = hsLayers :+
+      LayerInfo("Last Storage", 10.toLong, lastLayerDirs)
+    val finalLayers: Array[LayerInfo] = allLayers.filter(_.dirs.nonEmpty)
+    logInfo("Hierarchy store info:")
+    for (layer <- finalLayers) {
+      logInfo("Layer: %s, Threshold: %dGB".format(layer.key, layer.threshold))
+      layer.dirs.foreach { dir => logInfo("\t%s".format(dir.getCanonicalPath)) }
     }
 
-    new File(subDir, filename)
+    def apply(filename: String): File = {
+      var availableFile: File = null
+      for (layer <- finalLayers) {
+        val file = getFile(filename, layer.dirs)
+        if (file.exists()) return file
+
+        if (availableFile == null && file.getParentFile.getUsableSpace>>30 >= layer.threshold) {
--- End diff --
There are some issues here:
1. `file.getParentFile` may return null. This is what I get when trying it on Windows:
```scala
scala> val f = new java.io.File("c:/")
f: java.io.File = c:\

scala> f.getParentFile
res7: java.io.File = null
```
2. `getUsableSpace` returns the available space in bytes. If we save the threshold in bytes, then we don't need to convert the available
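The review comment above is cut off, but its two points can be sketched in Python under stated assumptions: `shutil.disk_usage(...).free` stands in for Java's `getUsableSpace`, a `None` directory stands in for a null parent, and all function names are hypothetical. The sketch shows a guarded space check and that comparing bytes directly gives the same answer as shifting by 30 bits when the threshold is a whole number of GiB.

```python
import os
import shutil

GIB = 1 << 30


def has_room(directory, threshold_bytes):
    """Return True if `directory` is usable and has enough free space."""
    if directory is None or not os.path.isdir(directory):
        # Analogue of the null-parent concern: skip the space check
        # instead of failing when there is no directory to inspect.
        return False
    return shutil.disk_usage(directory).free >= threshold_bytes


def meets_threshold_gb(usable_bytes, threshold_gb):
    # Form used in the diff: shift the byte count down to whole GiB.
    return (usable_bytes >> 30) >= threshold_gb


def meets_threshold_bytes(usable_bytes, threshold_bytes):
    # Suggested form: keep the threshold in bytes, no conversion needed.
    return usable_bytes >= threshold_bytes
```

For a non-negative byte count and an integer GiB threshold the two comparisons agree, so storing the threshold in bytes only removes the conversion step from each lookup.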
[GitHub] spark pull request: [WIP][SPARK-12196][Core] Store/retrieve blocks...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/10225#discussion_r48699693
--- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -53,35 +53,98 @@ private[spark] class DiskBlockManager(blockManager: BlockManager, conf: SparkCon
   private val shutdownHook = addShutdownHook()
 
+  private abstract class FileAllocationStrategy {
+    def apply(filename: String): File
+
+    protected def getFile(filename: String, storageDirs: Array[File]): File = {
+      require(storageDirs.nonEmpty, "could not find file when the directories are empty")
+
+      // Figure out which local directory it hashes to, and which subdirectory in that
+      val hash = Utils.nonNegativeHash(filename)
+      val dirId = localDirs.indexOf(storageDirs(hash % storageDirs.length))
+      val subDirId = (hash / storageDirs.length) % subDirsPerLocalDir
+
+      // Create the subdirectory if it doesn't already exist
+      val subDir = subDirs(dirId).synchronized {
+        val old = subDirs(dirId)(subDirId)
+        if (old != null) {
+          old
+        } else {
+          val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
+          if (!newDir.exists() && !newDir.mkdir()) {
+            throw new IOException(s"Failed to create local dir in $newDir.")
+          }
+          subDirs(dirId)(subDirId) = newDir
+          newDir
+        }
+      }
+
+      new File(subDir, filename)
+    }
+  }
+
   /** Looks up a file by hashing it into one of our local subdirectories. */
   // This method should be kept in sync with
   // org.apache.spark.network.shuffle.ExternalShuffleBlockResolver#getFile().
--- End diff --
We also need to update `org.apache.spark.network.shuffle.ExternalShuffleBlockResolver#getFile()` to match, so that this feature can also be used with the external shuffle service under dynamic allocation on YARN.
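Why the two `getFile` implementations have to agree can be sketched as follows. This is a Python rendering of the hash-based layout from the removed `getFile` in the diff, not Spark code: `non_negative_hash` uses CRC32 as a stand-in for `Utils.nonNegativeHash`, and the function names and the default of 64 subdirectories are assumptions.

```python
import os
import zlib


def non_negative_hash(name):
    # Stand-in for Utils.nonNegativeHash; CRC32 is an assumption here,
    # chosen only because it is deterministic across processes.
    return zlib.crc32(name.encode("utf-8")) & 0x7FFFFFFF


def locate(filename, local_dirs, sub_dirs_per_local_dir=64):
    """Map a file name to <local dir>/<two-hex-digit subdir>/<filename>."""
    if not local_dirs:
        raise ValueError("no local directories configured")
    h = non_negative_hash(filename)
    dir_id = h % len(local_dirs)
    sub_dir_id = (h // len(local_dirs)) % sub_dirs_per_local_dir
    return os.path.join(local_dirs[dir_id], "%02x" % sub_dir_id, filename)
```

A writer and a separate reader process find the same path only if both apply the same mapping to the same directory list. If the writer starts choosing directories by storage layer while the reader still hashes over all local directories, the reader may look in the wrong place, which is the concern raised in the comment.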
[GitHub] spark pull request: [SPARK-12537] [SQL] Add option to accept quoti...
Github user Cazen commented on the pull request: https://github.com/apache/spark/pull/10497#issuecomment-168565440 Thank you @rxin, I've added my email address to my GitHub profile. I'm very pleased with your help. Have a good day!
[GitHub] spark pull request: Adding zipPartitions to PySpark
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168567526 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48626/ Test FAILed.
[GitHub] spark pull request: Adding zipPartitions to PySpark
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168567522 **[Test build #48626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48626/consoleFull)** for PR 10550 at commit [`c9bf4a0`](https://github.com/apache/spark/commit/c9bf4a0aa2ce412d700c29bd9c1b850dd93b211a). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: Adding zipPartitions to PySpark
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168567525 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168568360 **[Test build #48628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48628/consoleFull)** for PR 10542 at commit [`ee29dd2`](https://github.com/apache/spark/commit/ee29dd230702413238abed77fc6a7691bd4ff6a5).
[GitHub] spark pull request: Outer Join Elimination by Parent Join Conditio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10566#issuecomment-168569409 **[Test build #48631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48631/consoleFull)** for PR 10566 at commit [`d6a6e9c`](https://github.com/apache/spark/commit/d6a6e9cc31b0f7547b35cf25884135ea65b03676).
[GitHub] spark pull request: [SPARK-12611][SQL][PYSPARK][TESTS] Fix test_in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10564#issuecomment-168548691 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12611][SQL][PYSPARK][TESTS] Fix test_in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10564#issuecomment-168548668 **[Test build #48617 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48617/consoleFull)** for PR 10564 at commit [`72009b7`](https://github.com/apache/spark/commit/72009b7715f404c2903a2a7d28f8c79fc2f49cfd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5682][Core] Add encrypted shuffle in sp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8880#issuecomment-168568380 **[Test build #48629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48629/consoleFull)** for PR 8880 at commit [`6bcc1ef`](https://github.com/apache/spark/commit/6bcc1efe897a5680687b00488acfdf4e4ffb91b5).
[GitHub] spark pull request: [SPARK-12486] Worker should kill the executors...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10438#issuecomment-168570140 **[Test build #48624 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48624/consoleFull)** for PR 10438 at commit [`9ed4377`](https://github.com/apache/spark/commit/9ed43770799350721d84817ad309bb8ebec2827b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12486] Worker should kill the executors...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10438#issuecomment-168570220 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12600][SQL] Remove deprecated methods i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10559#issuecomment-168569884 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12600][SQL] Remove deprecated methods i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10559#issuecomment-168569887 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48623/ Test FAILed.
[GitHub] spark pull request: [SPARK-12600][SQL] Remove deprecated methods i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10559#issuecomment-168569815 **[Test build #48623 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48623/consoleFull)** for PR 10559 at commit [`4df345e`](https://github.com/apache/spark/commit/4df345e58cf8582f7a93a8e00e715854d5e2148d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12486] Worker should kill the executors...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10438#issuecomment-168570221 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48624/ Test PASSed.
[GitHub] spark pull request: [SPARK-12580][SQL] Remove string concatenation...
Github user kiszk commented on the pull request: https://github.com/apache/spark/pull/10524#issuecomment-168571357 I see. Sounds good. I will reformat them.
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10567#issuecomment-168573786 **[Test build #48633 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48633/consoleFull)** for PR 10567 at commit [`d516ed4`](https://github.com/apache/spark/commit/d516ed443b043261005c39917922f79094385358).
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168539249 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48612/ Test FAILed.
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168539244 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10359][PROJECT-INFRA] Use more random n...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10558#issuecomment-168540598 **[Test build #48615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48615/consoleFull)** for PR 10558 at commit [`a2d59e5`](https://github.com/apache/spark/commit/a2d59e53fd02286b5e99546a759732dd687950eb).
[GitHub] spark pull request: [SPARK-12300][SQL][PYSPARK] fix schema inferan...
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/10275#issuecomment-168541454 I'm on vacation with less than great internet but I'll try and repro locally.
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.4] A...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168556248 LGTM
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168557925 @rxin, it looks like this has now passed MiMa checks in case you want to merge it.
[GitHub] spark pull request: [SPARK-12340][SQL]fix Int overflow in the Spar...
Github user QiangCai commented on the pull request: https://github.com/apache/spark/pull/10562#issuecomment-168564772 I have removed some blank lines.
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.3] A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168564702 **[Test build #48620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48620/consoleFull)** for PR 10565 at commit [`426c3e2`](https://github.com/apache/spark/commit/426c3e2fdea7e9a40a3149974a3f9b9912a0bb8e). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10515#issuecomment-168565338 **[Test build #2309 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2309/consoleFull)** for PR 10515 at commit [`d01ecac`](https://github.com/apache/spark/commit/d01ecac4966896cdd05698006c032d58f511c36b).
[GitHub] spark pull request: Outer join elimination by parent join predicat...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/10566

Outer join elimination by parent join predicate

This PR is another enhancement to the Optimizer. It does not conflict with the other PRs (https://github.com/apache/spark/pull/10542 and https://github.com/apache/spark/pull/10551).

When an outer join is an input to another join (called the parent join), and the join type of the parent join is inner, left-semi, left-outer, or right-outer, check whether the join condition of the parent join satisfies the following two conditions:
1) there exist null-filtering predicates against the columns on the null-supplying side of the parent join;
2) these columns come from the child join.

If such join predicates exist, apply the elimination rules:
- full outer -> inner if both sides of the child join have such predicates
- left outer -> inner if the right side of the child join has such predicates
- right outer -> inner if the left side of the child join has such predicates
- full outer -> left outer if only the left side of the child join has such predicates
- full outer -> right outer if only the right side of the child join has such predicates

If applicable, this can greatly improve performance, since an outer join is much slower than an inner join, and a full outer join is much slower than a left/right outer join.

BTW, since this rule is different from the rule in https://github.com/apache/spark/pull/10542, I did not merge them into one PR, to simplify the code review.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark OuterJoinEliminationByParentJoinPredicate

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10566.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10566

commit bde74f83e24c2dc9bd9fd9e5541362049594c972
Author: gatorsmile
Date: 2016-01-03T17:45:38Z

    Merge remote-tracking branch 'upstream/master' into OuterJoinEliminationByParentJoinPredicate

commit e18ba758aa94cc75115cc689f49b75ccd5d0ce51
Author: gatorsmile
Date: 2016-01-04T02:21:59Z

    Merge remote-tracking branch 'upstream/master' into OuterJoinEliminationByParentJoinPredicate

commit d6a6e9cc31b0f7547b35cf25884135ea65b03676
Author: gatorsmile
Date: 2016-01-04T02:40:26Z

    outer join elimination by parent join.
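The five elimination rules listed in this pull request can be read as a small decision table. The following is only a Spark-free sketch of that table, not the code in the PR: `JoinType`, `OuterJoinRules`, `eliminate`, and the two Boolean flags are hypothetical names introduced for illustration, and the flags stand for "a null-filtering predicate references columns from the left/right side of the child join".

```scala
// Hypothetical model of the rewrite rules; none of these names come from Spark.
sealed trait JoinType
case object Inner extends JoinType
case object LeftOuter extends JoinType
case object RightOuter extends JoinType
case object FullOuter extends JoinType

object OuterJoinRules {
  /**
   * leftFiltered / rightFiltered: whether a null-filtering predicate
   * references columns from the left / right side of the child join.
   */
  def eliminate(joinType: JoinType, leftFiltered: Boolean, rightFiltered: Boolean): JoinType =
    joinType match {
      case FullOuter if leftFiltered && rightFiltered => Inner
      case FullOuter if leftFiltered                  => LeftOuter  // only the left side is filtered
      case FullOuter if rightFiltered                 => RightOuter // only the right side is filtered
      case LeftOuter if rightFiltered                 => Inner
      case RightOuter if leftFiltered                 => Inner
      case other                                      => other      // no rule applies
    }
}
```

The order of the `FullOuter` cases matters: the "both sides" case has to be tried before the single-side cases, otherwise a full outer join with predicates on both sides would only be reduced to a left outer join.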
[GitHub] spark pull request: [SPARK-11400][SQL] BroadcastNestedLoopJoin sho...
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/9351
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168571474 **[Test build #48632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48632/consoleFull)** for PR 10542 at commit [`63d5d62`](https://github.com/apache/spark/commit/63d5d62bcc719fc988510281d6ed0b4add4eaa3c).
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10567#issuecomment-168574127 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48633/ Test FAILed.
[GitHub] spark pull request: [SPARK-10359][PROJECT-INFRA] Use more random n...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10558#issuecomment-168546678 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48615/ Test FAILed.
[GitHub] spark pull request: [WIP][SPARK-12196][Core] Store/retrieve blocks...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/10225#discussion_r48699527

--- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -53,35 +53,98 @@ private[spark] class DiskBlockManager(blockManager: BlockManager, conf: SparkCon
   private val shutdownHook = addShutdownHook()
 
+  private abstract class FileAllocationStrategy {
+    def apply(filename: String): File
+
+    protected def getFile(filename: String, storageDirs: Array[File]): File = {
+      require(storageDirs.nonEmpty, "could not find file when the directories are empty")
+
+      // Figure out which local directory it hashes to, and which subdirectory in that
+      val hash = Utils.nonNegativeHash(filename)
+      val dirId = localDirs.indexOf(storageDirs(hash % storageDirs.length))
+      val subDirId = (hash / storageDirs.length) % subDirsPerLocalDir
+
+      // Create the subdirectory if it doesn't already exist
+      val subDir = subDirs(dirId).synchronized {
+        val old = subDirs(dirId)(subDirId)
+        if (old != null) {
+          old
+        } else {
+          val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
+          if (!newDir.exists() && !newDir.mkdir()) {
+            throw new IOException(s"Failed to create local dir in $newDir.")
+          }
+          subDirs(dirId)(subDirId) = newDir
+          newDir
+        }
+      }
+
+      new File(subDir, filename)
+    }
+  }
+
   /** Looks up a file by hashing it into one of our local subdirectories. */
   // This method should be kept in sync with
   // org.apache.spark.network.shuffle.ExternalShuffleBlockResolver#getFile().
-  def getFile(filename: String): File = {
-    // Figure out which local directory it hashes to, and which subdirectory in that
-    val hash = Utils.nonNegativeHash(filename)
-    val dirId = hash % localDirs.length
-    val subDirId = (hash / localDirs.length) % subDirsPerLocalDir
-
-    // Create the subdirectory if it doesn't already exist
-    val subDir = subDirs(dirId).synchronized {
-      val old = subDirs(dirId)(subDirId)
-      if (old != null) {
-        old
-      } else {
-        val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
-        if (!newDir.exists() && !newDir.mkdir()) {
-          throw new IOException(s"Failed to create local dir in $newDir.")
-        }
-        subDirs(dirId)(subDirId) = newDir
-        newDir
+  private object hashAllocator extends FileAllocationStrategy {
+    def apply(filename: String): File = getFile(filename, localDirs)
+  }
+
+  /** Looks up a file by hierarchy way in different speed storage devices. */
+  private val hierarchyStore = conf.getOption("spark.storage.hierarchyStore")
+  private class HierarchyAllocator extends FileAllocationStrategy {
+    case class LayerInfo(key: String, threshold: Long, dirs: Array[File])
+    val hsSpecs: Array[(String, Long)] =
+      // e.g.: hierarchyStore = "ssd 200GB, hdd 100GB"
+      hierarchyStore.get.trim.split(",").map {
+        s => val x = s.trim.split(" +")
+          (x(0).toLowerCase, Utils.byteStringAsGb(x(1)))
--- End diff --

`Utils.byteStringAsBytes` instead?
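The two-level placement that `getFile` performs in the diff above (pick a top-level directory with `hash % n`, then a sub-directory with `(hash / n) % subDirsPerLocalDir`, named with `"%02x"`) can be tried out in isolation. This is a minimal sketch under stated assumptions: `DirHashing`, `locate`, and `subDirName` are hypothetical names, and `nonNegativeHash` here is a stand-in written for this example rather than Spark's `Utils.nonNegativeHash`.

```scala
object DirHashing {
  // Stand-in for Spark's Utils.nonNegativeHash (assumption: a hashCode forced to be >= 0).
  def nonNegativeHash(s: String): Int = {
    val h = s.hashCode
    if (h == Int.MinValue) 0 else math.abs(h)
  }

  /** Returns (index of the top-level directory, index of the sub-directory inside it). */
  def locate(filename: String, numDirs: Int, subDirsPerDir: Int): (Int, Int) = {
    require(numDirs > 0 && subDirsPerDir > 0, "need at least one directory and one sub-directory")
    val hash = nonNegativeHash(filename)
    val dirId = hash % numDirs                      // which local directory
    val subDirId = (hash / numDirs) % subDirsPerDir // which sub-directory within it
    (dirId, subDirId)
  }

  /** Sub-directory name as a two-digit lowercase hex string, as in the diff. */
  def subDirName(subDirId: Int): String = "%02x".format(subDirId)
}
```

Dividing by the directory count before taking the second modulus means the sub-directory index is derived from the part of the hash not already consumed by the directory choice, so files that land in the same directory still spread across its sub-directories.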
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10561
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168561314 Merging - thanks.
[GitHub] spark pull request: [SPARK-12611][SQL][PYSPARK][TESTS] Fix test_in...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10564#issuecomment-168561635 cc @davies I'm going to merge this first so pull requests don't just fail. We should look into whether the behavior change is desired.
[GitHub] spark pull request: [SPARK-12537] [SQL] Add option to accept quoti...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10497
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10515#issuecomment-168564909 @marmbrus Can we trigger a test for this?
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/10567

[SPARK-12594] [SQL] Outer Join Elimination by Filter Conditions

Convert outer joins when the predicates in filter conditions restrict the result set so that all null-supplying rows are eliminated:
- `full outer` -> `inner` if both sides have such predicates
- `left outer` -> `inner` if the right side has such predicates
- `right outer` -> `inner` if the left side has such predicates
- `full outer` -> `left outer` if only the left side has such predicates
- `full outer` -> `right outer` if only the right side has such predicates

If applicable, this can greatly improve performance, since an outer join is much slower than an inner join, and a full outer join is much slower than a left/right outer join.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark outerJoinEliminationByFilterCond

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10567.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10567

commit d516ed443b043261005c39917922f79094385358
Author: gatorsmile
Date: 2016-01-04T03:32:39Z

    outer join elimination by Filter condition
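Why a filter can turn an outer join into an inner join is easy to check on plain collections. The sketch below is an illustration only and does not use Spark: `OuterJoinDemo` and its data are made up, a missing right-hand row is modelled as `None` in place of SQL `NULL`, and the filter `value > 5` is assumed to reject `NULL`, as a SQL comparison would.

```scala
object OuterJoinDemo {
  val left: Seq[(Int, String)] = Seq(1 -> "a", 2 -> "b", 3 -> "c")
  val right: Seq[(Int, Int)] = Seq(1 -> 10, 3 -> 30, 4 -> 40)

  // Left outer join on the key; unmatched left rows get None for the right column.
  def leftOuter: Seq[(Int, String, Option[Int])] =
    left.flatMap { case (k, s) =>
      val matches = right.collect { case (rk, v) if rk == k => v }
      if (matches.isEmpty) Seq((k, s, Option.empty[Int]))
      else matches.map(v => (k, s, Option(v)))
    }

  // Inner join on the key.
  def inner: Seq[(Int, String, Int)] =
    for ((k, s) <- left; (rk, v) <- right if rk == k) yield (k, s, v)

  // Filter on a right-side column; None (NULL) never satisfies the comparison.
  def filteredOuter: Seq[(Int, String, Int)] =
    leftOuter.collect { case (k, s, Some(v)) if v > 5 => (k, s, v) }

  def filteredInner: Seq[(Int, String, Int)] =
    inner.filter(_._3 > 5)
}
```

Both `filteredOuter` and `filteredInner` contain only the rows for keys 1 and 3: the row for key 2, which the left outer join padded with `None`, is discarded by the filter, so the join could have been executed as an inner join from the start.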
[GitHub] spark pull request: Update MimaExcludes now Spark 1.6 is in Maven.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10561#issuecomment-168539177 **[Test build #48612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48612/consoleFull)** for PR 10561 at commit [`696296d`](https://github.com/apache/spark/commit/696296df412f9e64b656aa63431df13b8e4d52c9). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12300][SQL][PYSPARK] fix schema inferan...
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/10275#issuecomment-168543962 ok reproed and I've got a fix (seems just a test issue from a parallel change in how missing fields can be done).
[GitHub] spark pull request: [SPARK-7689][WIP] Remove TTL-based metadata cl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10534#issuecomment-168545451 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48616/ Test FAILed.
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168545428 Thank you! @holdenk
[GitHub] spark pull request: [SPARK-7689][WIP] Remove TTL-based metadata cl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10534#issuecomment-168545450 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7689][WIP] Remove TTL-based metadata cl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10534#issuecomment-168545406 **[Test build #48616 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48616/consoleFull)** for PR 10534 at commit [`1d96791`](https://github.com/apache/spark/commit/1d96791a0949ab8d79333ce0f38b08261a1f776a). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12340][SQL]fix Int overflow in the Spar...
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/10562#discussion_r48698010

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2028,4 +2028,18 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
       Row(false) :: Row(true) :: Nil)
   }
+  test("SPARK-12340: overstep the bounds of Int in SparkPlan.executeTake"){
+    val rdd = sqlContext.sparkContext.parallelize(1 to 3 , 3 )
+
+    rdd.toDF("key").registerTempTable("spark12340")
+    checkAnswer(
+      sql("select key from spark12340 limit 2147483638"),
+      Row(1) :: Row(2) :: Row(3) :: Nil
+    )
+
+    assert(rdd.take(2147483638).size === 3)
+
+    assert(rdd.takeAsync(2147483638).get.size === 3)
--- End diff --

Does this test case pass the compilation? `2147483638` is not within the range of `Int`. Should we say `Int.MaxValue` ?
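On the range question raised in this review comment: `Int.MaxValue` is 2147483647, so the literal `2147483638` is nine below the limit and is a valid `Int` literal in Scala. The short check below is a standalone sketch (the object and value names are hypothetical); it also shows how little arithmetic headroom such a value leaves, which is presumably why the test picks a limit this close to the maximum.

```scala
object IntRangeCheck {
  // Compiles: the literal fits in a 32-bit signed Int.
  val limit: Int = 2147483638

  // Distance to the largest representable Int.
  val headroom: Int = Int.MaxValue - limit

  // Adding more than the headroom wraps around silently in Int arithmetic.
  val wrapped: Int = limit + 10

  // Widening to Long before the addition avoids the wrap-around.
  val widened: Long = limit.toLong + 10
}
```

Here `wrapped` ends up negative while `widened` holds the mathematically expected sum, which is the kind of overflow an `Int`-based computation on a large `limit` can run into.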
[GitHub] spark pull request: [SPARK-12537] [SQL] Add option to accept quoti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10497#issuecomment-168548879 **[Test build #2308 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2308/consoleFull)** for PR 10497 at commit [`e75ff1d`](https://github.com/apache/spark/commit/e75ff1df0ad1297fd1990ed23b5bc5e659154888).
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.4] A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168556598 **[Test build #48619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48619/consoleFull)** for PR 10565 at commit [`426c3e2`](https://github.com/apache/spark/commit/426c3e2fdea7e9a40a3149974a3f9b9912a0bb8e).
[GitHub] spark pull request: [WIP][SPARK-12196][Core] Store/retrieve blocks...
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/10225#discussion_r48699647

--- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -53,35 +53,98 @@ private[spark] class DiskBlockManager(blockManager: BlockManager, conf: SparkCon
   private val shutdownHook = addShutdownHook()
 
+  private abstract class FileAllocationStrategy {
+    def apply(filename: String): File
+
+    protected def getFile(filename: String, storageDirs: Array[File]): File = {
+      require(storageDirs.nonEmpty, "could not find file when the directories are empty")
+
+      // Figure out which local directory it hashes to, and which subdirectory in that
+      val hash = Utils.nonNegativeHash(filename)
+      val dirId = localDirs.indexOf(storageDirs(hash % storageDirs.length))
+      val subDirId = (hash / storageDirs.length) % subDirsPerLocalDir
+
+      // Create the subdirectory if it doesn't already exist
+      val subDir = subDirs(dirId).synchronized {
+        val old = subDirs(dirId)(subDirId)
+        if (old != null) {
+          old
+        } else {
+          val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
+          if (!newDir.exists() && !newDir.mkdir()) {
+            throw new IOException(s"Failed to create local dir in $newDir.")
+          }
+          subDirs(dirId)(subDirId) = newDir
+          newDir
+        }
+      }
+
+      new File(subDir, filename)
+    }
+  }
+
   /** Looks up a file by hashing it into one of our local subdirectories. */
   // This method should be kept in sync with
   // org.apache.spark.network.shuffle.ExternalShuffleBlockResolver#getFile().
-  def getFile(filename: String): File = {
-    // Figure out which local directory it hashes to, and which subdirectory in that
-    val hash = Utils.nonNegativeHash(filename)
-    val dirId = hash % localDirs.length
-    val subDirId = (hash / localDirs.length) % subDirsPerLocalDir
-
-    // Create the subdirectory if it doesn't already exist
-    val subDir = subDirs(dirId).synchronized {
-      val old = subDirs(dirId)(subDirId)
-      if (old != null) {
-        old
-      } else {
-        val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
-        if (!newDir.exists() && !newDir.mkdir()) {
-          throw new IOException(s"Failed to create local dir in $newDir.")
-        }
-        subDirs(dirId)(subDirId) = newDir
-        newDir
+  private object hashAllocator extends FileAllocationStrategy {
+    def apply(filename: String): File = getFile(filename, localDirs)
+  }
+
+  /** Looks up a file by hierarchy way in different speed storage devices. */
+  private val hierarchyStore = conf.getOption("spark.storage.hierarchyStore")
+  private class HierarchyAllocator extends FileAllocationStrategy {
+    case class LayerInfo(key: String, threshold: Long, dirs: Array[File])
+    val hsSpecs: Array[(String, Long)] =
+      // e.g.: hierarchyStore = "ssd 200GB, hdd 100GB"
+      hierarchyStore.get.trim.split(",").map {
+        s => val x = s.trim.split(" +")
+          (x(0).toLowerCase, Utils.byteStringAsGb(x(1)))
       }
+    val hsLayers: Array[LayerInfo] = hsSpecs.map(
+      s => LayerInfo(s._1, s._2, localDirs.filter(_.getPath.toLowerCase.containsSlice(s._1)))
+    )
+    val lastLayerDirs = localDirs.filter(dir => !hsLayers.exists(_.dirs.contains(dir)))
+    val allLayers: Array[LayerInfo] = hsLayers :+
+      LayerInfo("Last Storage", 10.toLong, lastLayerDirs)
+    val finalLayers: Array[LayerInfo] = allLayers.filter(_.dirs.nonEmpty)
+    logInfo("Hierarchy store info:")
+    for (layer <- finalLayers) {
+      logInfo("Layer: %s, Threshold: %dGB".format(layer.key, layer.threshold))
+      layer.dirs.foreach { dir => logInfo("\t%s".format(dir.getCanonicalPath)) }
     }
 
-    new File(subDir, filename)
+    def apply(filename: String): File = {
+      var availableFile: File = null
+      for (layer <- finalLayers) {
+        val file = getFile(filename, layer.dirs)
+        if (file.exists()) return file
+
+        if (availableFile == null && file.getParentFile.getUsableSpace>>30 >= layer.threshold) {
+          availableFile = file
+        }
+      }
+
+      if (availableFile == null) {
+        throw new IOException(s"No enough disk space.")
+      }
+      availableFile
+    }
   }
 
-  def getFile(blockId: BlockId): File = getFile(blockId.name)
+  private val fileAllocator: FileAllocationStrategy =
+    if (hierarchyStore.isDefined &&
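The directory hashing and layer selection in the diff above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the PR's implementation: `nonNegativeHash` is a local stand-in for Spark's internal `Utils.nonNegativeHash`, `parseSpec` only handles a plain `<n>GB` suffix (the diff uses `Utils.byteStringAsGb`), free space is passed in as a function instead of being read from disk, and `HierarchySketch`, `locate`, `parseSpec`, and `firstLayerWithSpace` are hypothetical names.

```scala
object HierarchySketch {
  // Hypothetical stand-in for Spark's internal Utils.nonNegativeHash.
  def nonNegativeHash(s: String): Int = {
    val h = s.hashCode
    if (h == Int.MinValue) 0 else math.abs(h)
  }

  // Map a file name to (directory index, subdirectory index),
  // mirroring the hash % n and (hash / n) % subDirs arithmetic in the diff.
  def locate(filename: String, numDirs: Int, subDirsPerDir: Int): (Int, Int) = {
    require(numDirs > 0, "need at least one directory")
    val hash = nonNegativeHash(filename)
    (hash % numDirs, (hash / numDirs) % subDirsPerDir)
  }

  // Parse a spec such as "ssd 200GB, hdd 100GB" into (key, threshold in GB).
  // Assumption: every size is written as "<n>GB".
  def parseSpec(spec: String): Seq[(String, Long)] =
    spec.trim.split(",").toSeq.map { s =>
      val parts = s.trim.split(" +")
      (parts(0).toLowerCase, parts(1).toUpperCase.stripSuffix("GB").toLong)
    }

  // Return the first layer whose free space (in GB) meets its threshold.
  def firstLayerWithSpace(
      layers: Seq[(String, Long)],
      freeGb: String => Long): Option[String] =
    layers.collectFirst { case (key, threshold) if freeGb(key) >= threshold => key }
}
```

With the spec `"ssd 200GB, hdd 100GB"`, a lookup that reports 50 GB free on `ssd` and 500 GB free on `hdd` would fall through to `hdd`, which is the fallback behaviour the layered `apply` in the diff appears to aim for.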
[GitHub] spark pull request: [SPARK-12537] [SQL] Add option to accept quoti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10497#issuecomment-168560762 **[Test build #2308 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2308/consoleFull)** for PR 10497 at commit [`e75ff1d`](https://github.com/apache/spark/commit/e75ff1df0ad1297fd1990ed23b5bc5e659154888). * This patch passes all tests. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12580][SQL] Remove string concatenation...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10524#issuecomment-168561948 How about 80 characters? Let's make sure it is easy to read for both users (who run the `describe function` command) and developers (who read the code).
[GitHub] spark pull request: [SPARK-12600][SQL] Remove deprecated methods i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10559#issuecomment-168561886 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12600][SQL] Remove deprecated methods i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10559#issuecomment-168561883 **[Test build #48622 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48622/consoleFull)** for PR 10559 at commit [`c4e37ae`](https://github.com/apache/spark/commit/c4e37ae43d0675c3c34b293368a955e6f4cdf979). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12600][SQL] Remove deprecated methods i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10559#issuecomment-168561887 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48622/ Test FAILed.
[GitHub] spark pull request: [SPARK-12600][SQL] Remove deprecated methods i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10559#issuecomment-168562467 **[Test build #48623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48623/consoleFull)** for PR 10559 at commit [`4df345e`](https://github.com/apache/spark/commit/4df345e58cf8582f7a93a8e00e715854d5e2148d).
[GitHub] spark pull request: Adding zipPartitions to PySpark
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168566477 retest this please
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/10542
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168571033 Let me close it and then resubmit the PR.
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10515#issuecomment-168572335 **[Test build #2309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2309/consoleFull)** for PR 10515 at commit [`d01ecac`](https://github.com/apache/spark/commit/d01ecac4966896cdd05698006c032d58f511c36b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168543158 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168543160 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48613/ Test FAILed.
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.4] A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168557133 **[Test build #48619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48619/consoleFull)** for PR 10565 at commit [`426c3e2`](https://github.com/apache/spark/commit/426c3e2fdea7e9a40a3149974a3f9b9912a0bb8e). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.4] A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168557143 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48619/ Test FAILed.
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.4] A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168557142 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.3] A...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168557245 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.3] A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168557882 **[Test build #48620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48620/consoleFull)** for PR 10565 at commit [`426c3e2`](https://github.com/apache/spark/commit/426c3e2fdea7e9a40a3149974a3f9b9912a0bb8e).
[GitHub] spark pull request: [SPARK-12486] Worker should kill the executors...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10438#issuecomment-168561998 test this please
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.3] A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168564758 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48620/ Test FAILed.
[GitHub] spark pull request: Adding zipPartitions to PySpark
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10550#issuecomment-168567422 **[Test build #48626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48626/consoleFull)** for PR 10550 at commit [`c9bf4a0`](https://github.com/apache/spark/commit/c9bf4a0aa2ce412d700c29bd9c1b850dd93b211a).
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168567303 retest this please.
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10567#issuecomment-168574126 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168538163 **[Test build #48613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48613/consoleFull)** for PR 10542 at commit [`ee29dd2`](https://github.com/apache/spark/commit/ee29dd230702413238abed77fc6a7691bd4ff6a5).
[GitHub] spark pull request: [SPARK-10359][PROJECT-INFRA] Use more random n...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10558#issuecomment-168539962 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-7689][WIP] Remove TTL-based metadata cl...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10534#issuecomment-168539982 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-7689][WIP] Remove TTL-based metadata cl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10534#issuecomment-168540401 **[Test build #48616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48616/consoleFull)** for PR 10534 at commit [`1d96791`](https://github.com/apache/spark/commit/1d96791a0949ab8d79333ce0f38b08261a1f776a).
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10542#issuecomment-168543107 **[Test build #48613 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48613/consoleFull)** for PR 10542 at commit [`ee29dd2`](https://github.com/apache/spark/commit/ee29dd230702413238abed77fc6a7691bd4ff6a5). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12340][SQL]fix Int overflow in the Spar...
Github user sarutak commented on a diff in the pull request:

https://github.com/apache/spark/pull/10562#discussion_r48697977

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2028,4 +2028,18 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
       Row(false) :: Row(true) :: Nil)
   }
+  test("SPARK-12340: overstep the bounds of Int in SparkPlan.executeTake"){
+    val rdd = sqlContext.sparkContext.parallelize(1 to 3 , 3 )
+
+    rdd.toDF("key").registerTempTable("spark12340")
+    checkAnswer(
+      sql("select key from spark12340 limit 2147483638"),
+      Row(1) :: Row(2) :: Row(3) :: Nil
+    )
+
+    assert(rdd.take(2147483638).size === 3)
+
--- End diff --

I don't think a blank line is needed here.
[GitHub] spark pull request: [SPARK-10359][PROJECT-INFRA] Use more random n...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10558#issuecomment-168546673 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10359][PROJECT-INFRA] Use more random n...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10558#issuecomment-168546188 **[Test build #48615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48615/consoleFull)** for PR 10558 at commit [`a2d59e5`](https://github.com/apache/spark/commit/a2d59e53fd02286b5e99546a759732dd687950eb). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12537] [SQL] Add option to accept quoti...
Github user Cazen commented on the pull request: https://github.com/apache/spark/pull/10497#issuecomment-168546132 Could I ask someone to run the tests?
[GitHub] spark pull request: [SPARK-12612][PROJECT-INFRA][test-hadoop2.3] A...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10565#issuecomment-168558078 It occurred to me that we should probably also add these profiles to the `dev/deps` lists as well, esp. since the `hadoop-2.6` profile has some dependency overrides. I might as well do that as part of this PR to collect all of the Hadoop profile parity changes in one place.