[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...
Github user tianyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3847#discussion_r22376322

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/ddl.scala ---
    @@ -70,13 +73,33 @@ private[sql] class DDLParser extends StandardTokenParsers with PackratParsers wi
        * CREATE TEMPORARY TABLE avroTable
        * USING org.apache.spark.sql.avro
        * OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
    +   * OR,
    +   * For other external datasources not only a kind of file like: avro, parquet, json, but a cluster database, like: cassandra and hbase etc...
    +   * DDL like this:
    +   * CREATE TEMPORARY TABLE cassandraTable
    +   * USING org.apache.spark.sql.cassandra
    +   * WITH SERDEPROP(serialization.format=1, cassandra.columns.mapping=key,data)
    +   * TBLPROPERTIES(cassandra.keyspace.name = cassandra_keyspace)
        */
      protected lazy val createTable: Parser[LogicalPlan] =
    -    CREATE ~ TEMPORARY ~ TABLE ~> ident ~ (USING ~> className) ~ (OPTIONS ~> options) ^^ {
    -      case tableName ~ provider ~ opts =>
    -        CreateTableUsing(tableName, provider, opts)
    +    CREATE ~ TEMPORARY ~ TABLE ~> ident ~ (USING ~> className) ~
    +      (OPTIONS ~> options).? ~ (WITH ~ SERDEPROP ~> properties).? ~
    +      (TBLPROP ~> properties).? ^^ {
    +      case tableName ~ provider ~ opts ~ serdeprop ~ tblprop =>
    +        val optionParams = opts.getOrElse(Map[String,String]())
    +        val serdeParams = serdeprop.getOrElse(Map[String,String]())
    +        val tblParams = tblprop.getOrElse(Map[String,String]())
    +        //TODO: in order to not break current interface, simple union them, if interface changes, also change this

--- End diff --

exceeds limitation of line length

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...
Github user tianyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3847#discussion_r22376331

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/ddl.scala ---
    @@ -70,13 +73,33 @@ private[sql] class DDLParser extends StandardTokenParsers with PackratParsers wi
        * CREATE TEMPORARY TABLE avroTable
        * USING org.apache.spark.sql.avro
        * OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
    +   * OR,
    +   * For other external datasources not only a kind of file like: avro, parquet, json, but a cluster database, like: cassandra and hbase etc...
    +   * DDL like this:
    +   * CREATE TEMPORARY TABLE cassandraTable
    +   * USING org.apache.spark.sql.cassandra
    +   * WITH SERDEPROP(serialization.format=1, cassandra.columns.mapping=key,data)
    +   * TBLPROPERTIES(cassandra.keyspace.name = cassandra_keyspace)
        */
      protected lazy val createTable: Parser[LogicalPlan] =
    -    CREATE ~ TEMPORARY ~ TABLE ~> ident ~ (USING ~> className) ~ (OPTIONS ~> options) ^^ {
    -      case tableName ~ provider ~ opts =>
    -        CreateTableUsing(tableName, provider, opts)
    +    CREATE ~ TEMPORARY ~ TABLE ~> ident ~ (USING ~> className) ~
    +      (OPTIONS ~> options).? ~ (WITH ~ SERDEPROP ~> properties).? ~
    +      (TBLPROP ~> properties).? ^^ {
    +      case tableName ~ provider ~ opts ~ serdeprop ~ tblprop =>
    +        val optionParams = opts.getOrElse(Map[String,String]())
    +        val serdeParams = serdeprop.getOrElse(Map[String,String]())
    +        val tblParams = tblprop.getOrElse(Map[String,String]())
    +        //TODO: in order to not break current interface, simple union them, if interface changes, also change this

--- End diff --

and need a space after //
[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...
Github user tianyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3847#discussion_r22376333

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/ddl.scala ---
    @@ -70,13 +73,33 @@ private[sql] class DDLParser extends StandardTokenParsers with PackratParsers wi
        * CREATE TEMPORARY TABLE avroTable
        * USING org.apache.spark.sql.avro
        * OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
    +   * OR,
    +   * For other external datasources not only a kind of file like: avro, parquet, json, but a cluster database, like: cassandra and hbase etc...
    +   * DDL like this:
    +   * CREATE TEMPORARY TABLE cassandraTable
    +   * USING org.apache.spark.sql.cassandra
    +   * WITH SERDEPROP(serialization.format=1, cassandra.columns.mapping=key,data)
    +   * TBLPROPERTIES(cassandra.keyspace.name = cassandra_keyspace)
        */
      protected lazy val createTable: Parser[LogicalPlan] =
    -    CREATE ~ TEMPORARY ~ TABLE ~> ident ~ (USING ~> className) ~ (OPTIONS ~> options) ^^ {
    -      case tableName ~ provider ~ opts =>
    -        CreateTableUsing(tableName, provider, opts)
    +    CREATE ~ TEMPORARY ~ TABLE ~> ident ~ (USING ~> className) ~
    +      (OPTIONS ~> options).? ~ (WITH ~ SERDEPROP ~> properties).? ~
    +      (TBLPROP ~> properties).? ^^ {
    +      case tableName ~ provider ~ opts ~ serdeprop ~ tblprop =>
    +        val optionParams = opts.getOrElse(Map[String,String]())
    +        val serdeParams = serdeprop.getOrElse(Map[String,String]())
    +        val tblParams = tblprop.getOrElse(Map[String,String]())
    +        //TODO: in order to not break current interface, simple union them, if interface changes, also change this
    +        val passedParams = optionParams ++ serdeParams ++ tblParams
    +        passedParams.foreach(println)
    +        CreateTableUsing(tableName, provider, passedParams)
    +    }
    +
    +  protected lazy val properties: Parser[Map[String,String]] =
    +    ("(" ~> repsep(equalStrKVPair, ",") <~ ")") ^^ { case s: Seq[(String, String)] => s.toMap }
    +
    +  protected lazy val equalStrKVPair: Parser[(String, String)] = stringLit ~ "=" ~ stringLit ^^ { case k ~ "=" ~ v => (k,v) }

--- End diff --

exceeds limitation of line length
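The review comments above are about style, but the grammar change itself is easy to miss in the flattened diff. Below is a dependency-free sketch (illustrative names, not Spark's actual parser, which is built on StandardTokenParsers/PackratParsers) of what the new `properties` rule produces and what the TODO's "simple union" of OPTIONS / SERDEPROP / TBLPROPERTIES amounts to:

```scala
// Sketch only: parse "(k = v, k2 = v2)" into a Map, then union the three
// optional property groups the way the diff's TODO describes.
def parseProperties(input: String): Map[String, String] = {
  // Strip the surrounding parentheses, then split on commas.
  val inner = input.trim.stripPrefix("(").stripSuffix(")")
  if (inner.trim.isEmpty) Map.empty
  else inner.split(",").map { pair =>
    // Split each pair on the first '=' only.
    val Array(k, v) = pair.split("=", 2).map(_.trim)
    k -> v
  }.toMap
}

// The diff unions the three optional maps before calling CreateTableUsing;
// later groups win on key collisions.
def unionParams(opts: Option[Map[String, String]],
                serde: Option[Map[String, String]],
                tbl: Option[Map[String, String]]): Map[String, String] =
  opts.getOrElse(Map.empty) ++ serde.getOrElse(Map.empty) ++ tbl.getOrElse(Map.empty)
```

Note that a naive comma split like this cannot handle values that themselves contain commas (e.g. `cassandra.columns.mapping=key,data` from the doc comment), which is one reason the real grammar parses quoted string literals instead. The union also means a TBLPROPERTIES key silently overrides an identically named OPTIONS key.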
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68428985 [Test build #24948 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24948/consoleFull) for PR 3844 at commit [`b1ac4ad`](https://github.com/apache/spark/commit/b1ac4ad62ff6d537f669699d5da49bc4ee1ab154). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class MQTTStreamSuite extends FunSuite with Eventually with BeforeAndAfter `
[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...
Github user tianyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3847#discussion_r22376350

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/ddl.scala ---
    @@ -70,13 +73,33 @@ private[sql] class DDLParser extends StandardTokenParsers with PackratParsers wi
        * CREATE TEMPORARY TABLE avroTable
        * USING org.apache.spark.sql.avro
        * OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
    +   * OR,
    +   * For other external datasources not only a kind of file like: avro, parquet, json, but a cluster database, like: cassandra and hbase etc...

--- End diff --

exceeds limitation of line length
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68428989 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24948/ Test FAILed.
[GitHub] spark pull request: SPARK-4963 [SQL] Add copy to SQL's Sample oper...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3827#issuecomment-68429063 [Test build #24947 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24947/consoleFull) for PR 3827 at commit [`65c4e7c`](https://github.com/apache/spark/commit/65c4e7cdb906412abc154bb25cdfd49b6d53e9f9). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-4963 [SQL] Add copy to SQL's Sample oper...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3827#issuecomment-68429064 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24947/ Test PASSed.
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
Github user harishreedharan commented on the pull request:

    https://github.com/apache/spark/pull/3842#issuecomment-68429122

    I believe we need to track only the Avro classes and the SparkSink class (not sure if we need binary compat for this either -- I think API compat is all we need for even the SparkSink class). Other than that we should be fine, since the jar that contains the sink itself has the other classes, so binary compat isn't an issue.

    On Tuesday, December 30, 2014, Sean Owen notificati...@github.com wrote:

    > @harishreedharan https://github.com/harishreedharan might know better; it was his suggestion to track this with MiMa. I suppose it can be enabled, to err on the side of tracking these things, and exclude/disable later when needed?
    >
    > Reply to this email directly or view it on GitHub
    > https://github.com/apache/spark/pull/3842#issuecomment-68426102.
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68429312 [Test build #24949 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24949/consoleFull) for PR 3844 at commit [`4b58094`](https://github.com/apache/spark/commit/4b580943de5137e947d1a6cdadd054020932ed8e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class MQTTStreamSuite extends FunSuite with Eventually with BeforeAndAfter `
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68429316 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24949/ Test FAILed.
[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3849#issuecomment-68429438 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24945/ Test FAILed.
[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3849#issuecomment-68429433 [Test build #24945 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24945/consoleFull) for PR 3849 at commit [`8c387ce`](https://github.com/apache/spark/commit/8c387ce76850caf2163dd82dc5c79b079788a921). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4991][CORE] Worker should reconnect to ...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3825#issuecomment-68429462 I'd rather use fewer Akka features than more, since this will make it easier to replace Akka with our own RPC layer in the future. Therefore, I'd much prefer to not allow exceptions to trigger actor restarts / state clearing. I think that adding an experimental Akka feature like persistence would be a huge risk for little obvious gain. I'm not sure if the heartbeat from unknown worker can ever occur if we don't clear the master's state, because I think that workers only begin sending heartbeats once a master has ack'd their registration, in which case the master would know that it was a previously-registered worker and instruct it to reconnect.
[GitHub] spark pull request: [SPARK-3325][Streaming] Add a parameter to the...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3237#issuecomment-68429851 [Test build #24951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24951/consoleFull) for PR 3237 at commit [`bb35d1a`](https://github.com/apache/spark/commit/bb35d1a3e14703c1ca71a8b3b463a65b15cace3d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3325][Streaming] Add a parameter to the...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3237#issuecomment-68429853 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24951/ Test FAILed.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user YanTangZhai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3794#discussion_r22376680

    --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -178,7 +178,7 @@ abstract class RDD[T: ClassTag](
       // Our dependencies and partitions will be gotten by calling subclass's methods below, and will
       // be overwritten when we're checkpointed
       private var dependencies_ : Seq[Dependency[_]] = null
    -  @transient private var partitions_ : Array[Partition] = null
    +  @transient private var partitions_ : Array[Partition] = getPartitions

--- End diff --

Sorry. This approach may cause an error as follows:

    Exception in thread "main" java.lang.NullPointerException
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
        at com.google.common.collect.MapMakerInternalMap.put(MapMakerInternalMap.java:3499)
        at org.apache.spark.rdd.HadoopRDD$.putCachedMetadata(HadoopRDD.scala:273)
        at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:151)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:173)
        at org.apache.spark.rdd.RDD.<init>(RDD.scala:181)
        at org.apache.spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:97)
        at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:561)
        at org.apache.spark.SparkContext.textFile(SparkContext.scala:471)

since jobConfCacheKey has not been initialized at that time.
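The stack trace above is an instance of a general Scala initialization-order pitfall: a superclass `val` initialized by calling an overridable method runs during the superclass constructor, before the subclass's own fields have been assigned. A minimal sketch (illustrative names, not the actual RDD/HadoopRDD code):

```scala
// Base's constructor calls the overridable getParts() to initialize a field.
class Base {
  // Runs while Sub's fields are still null.
  val parts: Array[String] = getParts()
  def getParts(): Array[String] = Array.empty
}

class Sub extends Base {
  // Assigned only AFTER Base's constructor has finished.
  val cacheKey: String = "jobConfCacheKey"
  override def getParts(): Array[String] = {
    // NPE: cacheKey is still null when Base's constructor invokes this override.
    Array(cacheKey.toUpperCase)
  }
}
```

Constructing `new Sub` throws a NullPointerException for exactly the reason described in the comment: the subclass field (`jobConfCacheKey` in the real code) has not been initialized when the superclass constructor triggers `getPartitions`. Making such fields `lazy val`s, or deferring the call, avoids the problem.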
[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...
GitHub user jkbradley opened a pull request:

    https://github.com/apache/spark/pull/3856

    [SPARK-5032] [graphx] Remove GraphX MIMA exclude for 1.3

    Since GraphX is no longer alpha as of 1.2, MimaExcludes should not exclude GraphX for 1.3.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jkbradley/spark graphx-mima

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3856.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3856

commit a3fea4282f9f96d6b5bb5d378ba6198160d84c31
Author: Joseph K. Bradley jos...@databricks.com
Date:   2014-12-31T08:30:51Z

    removed graphx mima exclude for 1.3
[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-68430068 [Test build #24953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24953/consoleFull) for PR 3856 at commit [`a3fea42`](https://github.com/apache/spark/commit/a3fea4282f9f96d6b5bb5d378ba6198160d84c31). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-68430096 This is not fixed yet; I need to include some Mima excludes for GraphX, it seems. I'll update this within a day once I track down the JIRAs to associate with the excludes.
[GitHub] spark pull request: [SPARK-4789] [SPARK-4942] [SPARK-5031] [mllib]...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3637#issuecomment-68431205 [Test build #24952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24952/consoleFull) for PR 3637 at commit [`dc5647e`](https://github.com/apache/spark/commit/dc5647ee1438228f53f79037e7b47a8d1ac2d61b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `// where index i corresponds to class i (i = 0, 1).` * `new Param(this, "probabilityCol", "column name for predicted class conditional probabilities",` * `class VectorUDT extends UserDefinedType[Vector] `
[GitHub] spark pull request: [SPARK-4789] [SPARK-4942] [SPARK-5031] [mllib]...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3637#issuecomment-68431209 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24952/ Test PASSed.
[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...
GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/3857

    [SPARK-5035] [Streaming] ReceiverMessage trait should extend Serializable

    Spark Streaming's ReceiverMessage trait should extend Serializable in order to fix a subtle bug that only occurs when running on a real cluster: if you attempt to send a fire-and-forget message to a remote Akka actor and that message cannot be serialized, then this seems to lead to more-or-less silent failures. As an optimization, Akka skips message serialization for messages sent within the same JVM. As a result, Spark's unit tests will never fail due to non-serializable Akka messages, but these will cause mostly-silent failures when running on a real cluster.

    Before this patch, here was the code for ReceiverMessage:

    ```
    /** Messages sent to the NetworkReceiver. */
    private[streaming] sealed trait ReceiverMessage
    private[streaming] object StopReceiver extends ReceiverMessage
    ```

    Since ReceiverMessage does not extend Serializable and StopReceiver is a regular `object`, not a `case object`, StopReceiver will throw serialization errors. As a result, graceful receiver shutdown is broken on real clusters but works in local and local-cluster modes.

    If you want to reproduce this, try running the word count example from the Streaming Programming Guide in the Spark shell:

    ```
    import org.apache.spark._
    import org.apache.spark.streaming._
    import org.apache.spark.streaming.StreamingContext._

    val ssc = new StreamingContext(sc, Seconds(10))
    // Create a DStream that will connect to hostname:port, like localhost:9999
    val lines = ssc.socketTextStream("localhost", 9999)
    // Split each line into words
    val words = lines.flatMap(_.split(" "))
    // Count each word in each batch
    val pairs = words.map(word => (word, 1))
    val wordCounts = pairs.reduceByKey(_ + _)
    // Print the first ten elements of each RDD generated in this DStream to the console
    wordCounts.print()
    ssc.start()
    Thread.sleep(1)
    ssc.stop(true, true)
    ```

    Prior to this patch, this would work correctly in local mode but fail when running against a real cluster.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark SPARK-5035

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3857.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3857

commit 71d0eae7658641b9a820b86e8017dc9c7d3c6029
Author: Josh Rosen joshro...@databricks.com
Date:   2014-12-31T09:20:37Z

    [SPARK-5035] ReceiverMessage trait should extend Serializable.
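The `object` vs `case object` distinction called out in the PR description can be checked directly with plain Java serialization, which is what Akka relies on for remote sends. A small sketch (illustrative message names, not Spark's actual classes):

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Like ReceiverMessage before the patch: the trait does NOT extend Serializable.
sealed trait Msg
object PlainStop extends Msg          // plain object: fails Java serialization
case object CaseStop extends Msg      // case object: Serializable comes for free

// Like the patch: the trait itself extends Serializable.
sealed trait SerMsg extends Serializable
object SerStop extends SerMsg         // serializable via the trait

def canSerialize(obj: AnyRef): Boolean =
  try {
    new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
    true
  } catch { case _: NotSerializableException => false }
```

Here `canSerialize(PlainStop)` is false while both `CaseStop` and `SerStop` serialize fine, which matches the description: either declaring StopReceiver as a `case object` or making the trait extend Serializable would fix the remote-send failure.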
[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3857#issuecomment-68431882 [Test build #24954 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24954/consoleFull) for PR 3857 at commit [`71d0eae`](https://github.com/apache/spark/commit/71d0eae7658641b9a820b86e8017dc9c7d3c6029). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3857#issuecomment-68431968 /cc @tdas. This might fix both https://issues.apache.org/jira/browse/SPARK-4986 and https://issues.apache.org/jira/browse/SPARK-2892, although there could possibly be more pieces to solving those (e.g. replace the 10 second timeout with a configurable timeout). I want to give a huge thanks to @cleaton for filing SPARK-4986 and for coming up with a workaround patch for SPARK-4986 which helped to spot this issue.
[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3857#issuecomment-68432275 Also, this was a really nasty bug because it seems very hard to test for this in Spark's own unit tests. Akka has a configuration option to force all messages to be serialized, even between local actors, but unfortunately this breaks Spark core because we send some non-serializable SparkContext references when initializing the DAGScheduler actor. Can we force serialization by spinning up separate actor systems for the master / worker / executor processes when running in local-cluster mode? Or is there some other way that we can selectively force serialization in order to uncover these sorts of issues? We can definitely reproduce these sorts of issues in my spark-integration-tests system, since that uses multiple JVMs, but for completeness's sake I guess we'd need that tool's suites to send all of the remote messages (so this could be a lot of test code duplication). Maybe the simplest (general) preventative test would have been something that just tries to call the Java serializer on an instance of each message class, so we just test serializability independent of Akka. Checking for serializability (either manually or through handwritten tests) should be part of our review checklist when adding new Akka messages.
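The "call the Java serializer on an instance of each message class" idea above could be sketched as a plain round-trip through Java serialization, independent of Akka. The `StopReceiver` class below is a hypothetical stand-in for a real message class, not Spark's actual type:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationCheck {
    // Hypothetical stand-in for an Akka message class under review.
    static class StopReceiver implements Serializable {
        final String reason;
        StopReceiver(String reason) { this.reason = reason; }
    }

    // Round-trips an object through Java serialization; throws
    // NotSerializableException if the message drags in a
    // non-serializable reference.
    static Object roundTrip(Object msg) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(msg);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        StopReceiver restored = (StopReceiver) roundTrip(new StopReceiver("shutdown"));
        System.out.println(restored.reason); // prints "shutdown"
    }
}
```

A test suite could simply call `roundTrip` once per message class; a failure surfaces at test time instead of only when the message first crosses a JVM boundary.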
[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3857#issuecomment-68432566 Also, just to sanity check and make sure I haven't overlooked something, at least one other person besides me should run the `spark-shell` reproduction listed in the PR description.
[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3849#issuecomment-68432871 [Test build #24955 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24955/consoleFull) for PR 3849 at commit [`0b10526`](https://github.com/apache/spark/commit/0b1052611f7fd7f71232ba3f2c0505e5711080e1). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68433088 [Test build #24956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24956/consoleFull) for PR 3844 at commit [`fc8eb28`](https://github.com/apache/spark/commit/fc8eb286db6aa8e78a567537996011f554eed969). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3847] Use portable hashcode for Java en...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3795#issuecomment-68433103 For Enums, this patch seems like a strict improvement over the status quo. The strengthening of the array checks is the only potentially controversial change, but I think it's extremely unlikely to break user programs (it could only affect users who tried to use `combineByKey` with array keys and a custom serializer, which seems like an unlikely use case); besides, any program that this breaks was likely producing wrong results, so it's better to fail loudly. I guess there are still a few cases that could slip through the cracks: - Java users who use custom serializers - Cases where the Java API uses the wrong manifest and can't tell that we've passed an array. I think both of these cases can only be detected with runtime checks on the first record being shuffled. Maybe we should add those as part of a separate PR, though, if we think they're worthwhile.
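For context on why array keys need these checks at all: JVM arrays inherit identity-based `hashCode` from `Object`, so two arrays with identical contents hash (and compare) differently, which would scatter "equal" keys across partitions in a hash-partitioned shuffle. A minimal illustration:

```java
import java.util.Arrays;

public class ArrayKeyDemo {
    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};

        // Arrays use Object's identity-based equals/hashCode, so equal
        // contents do not imply equal keys from a shuffle's point of view.
        System.out.println(a.equals(b));                              // prints "false"

        // Content-based hashing requires Arrays.hashCode instead.
        System.out.println(Arrays.hashCode(a) == Arrays.hashCode(b)); // prints "true"
    }
}
```

This is why failing loudly on array keys is preferable to silently returning wrong groupings.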
[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-68433311 [Test build #24953 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24953/consoleFull) for PR 3856 at commit [`a3fea42`](https://github.com/apache/spark/commit/a3fea4282f9f96d6b5bb5d378ba6198160d84c31). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-68433315 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24953/ Test FAILed.
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68434376 @mengxr The implementation is renamed and moved to `linalg.Vectors`. Would you like to test it again?
[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3857#issuecomment-68435325 [Test build #24954 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24954/consoleFull) for PR 3857 at commit [`71d0eae`](https://github.com/apache/spark/commit/71d0eae7658641b9a820b86e8017dc9c7d3c6029). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3857#issuecomment-68435331 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24954/ Test PASSed.
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68435869 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24956/ Test FAILed.
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68435866 [Test build #24956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24956/consoleFull) for PR 3844 at commit [`fc8eb28`](https://github.com/apache/spark/commit/fc8eb286db6aa8e78a567537996011f554eed969). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class MQTTStreamSuite extends FunSuite with Eventually with BeforeAndAfter `
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68436420 [Test build #24957 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24957/consoleFull) for PR 3794 at commit [`fd87518`](https://github.com/apache/spark/commit/fd87518d7f81de1a122cfad25a88956a596ccd4f). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3849#issuecomment-68437209 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24955/ Test FAILed.
[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3849#issuecomment-68437206 [Test build #24955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24955/consoleFull) for PR 3849 at commit [`0b10526`](https://github.com/apache/spark/commit/0b1052611f7fd7f71232ba3f2c0505e5711080e1). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68438167 @JoshRosen Thanks for your comments. I've updated it according to your comments and contrived a simple example as follows:
```scala
val inputfile1 = "./testin/in_1.txt"
val inputfile2 = "./testin/in_2.txt"
val tempfile = "./testtmp"
val outputfile = "./testout"
val sc = new SparkContext(new SparkConf())
sc.textFile(inputfile1)
  .flatMap(line => line.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _, 1)
  .map { kv => kv._1 + "," + kv._2.toString }
  .saveAsTextFile(tempfile)
val wordCounts1 = sc.textFile(tempfile)
val wordCounts2 = sc.textFile(inputfile2)
val wordCounts = wordCounts1.union(wordCounts2)
wordCounts.map { line =>
    val kv = line.split(",")
    (kv(0), Integer.parseInt(kv(1)))
  }
  .reduceByKey(_ + _, 1)
  .map { kv => kv._1 + "," + kv._2.toString }
  .saveAsTextFile(outputfile)
```
./testin/in_1.txt (23 bytes) and ./testin/in_2.txt (19 bytes) are both local files.
- Before optimization:
  - job1: New stage creation took 0.729638 s, of which HadoopRDD.getPartitions took 0.710247 s.
  - job2: New stage creation took 0.882241 s, of which HadoopRDD.getPartitions took 0.850668 + 0.023490 s.
- After optimization:
  - job1: HadoopRDD.getPartitions took 0.802133 s. New stage creation took 0.029328 s.
  - job2: HadoopRDD.getPartitions took 0.464713 + 0.022568 s. New stage creation took 0.001773 s.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68438540 [Test build #24958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24958/consoleFull) for PR 3794 at commit [`74c1dec`](https://github.com/apache/spark/commit/74c1dec31ba9ded5a82640f7354aa6231169281c). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68441198 **[Test build #24957 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24957/consoleFull)** for PR 3794 at commit [`fd87518`](https://github.com/apache/spark/commit/fd87518d7f81de1a122cfad25a88956a596ccd4f) after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68441201 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24957/ Test FAILed.
[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-68441848 [Test build #24959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24959/consoleFull) for PR 3222 at commit [`2948c58`](https://github.com/apache/spark/commit/2948c583e77d636b001d484872abf4e76a2f02dd). * This patch merges cleanly.
[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-68442070 [Test build #24960 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24960/consoleFull) for PR 3222 at commit [`ebd07ad`](https://github.com/apache/spark/commit/ebd07addf5883cfdeacc88a0b551fd5c9a2245e6). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68443755 **[Test build #24958 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24958/consoleFull)** for PR 3794 at commit [`74c1dec`](https://github.com/apache/spark/commit/74c1dec31ba9ded5a82640f7354aa6231169281c) after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68443757 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24958/ Test FAILed.
[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-68445236 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24959/ Test FAILed.
[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-68445231 [Test build #24959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24959/consoleFull) for PR 3222 at commit [`2948c58`](https://github.com/apache/spark/commit/2948c583e77d636b001d484872abf4e76a2f02dd). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class AdaGradUpdater(` * `class DBN(val stackedRBM: StackedRBM)` * `class MLP(` * `class MomentumUpdater(val momentum: Double) extends Updater ` * `class RBM(` * `class StackedRBM(val innerRBMs: Array[RBM])` * `case class MinstItem(label: Int, data: Array[Int]) ` * `class MinstDatasetReader(labelsFile: String, imagesFile: String)`
[GitHub] spark pull request: [SPARK-4991][CORE] Worker should reconnect to ...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/3825#issuecomment-68445280 It doesn't seem to me that usage of the newer Akka persistence API is called for, but it does seem that wrapping the `receive` in a try-catch is trying to do the job for which Akka's `SupervisorStrategy` is intended. I can't recommend the hand-rolled try-catch approach. http://doc.akka.io/docs/akka/2.3.4/general/supervision.html#supervision
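The distinction being drawn is between a component catching its own failures inside its message loop versus delegating failure handling to a supervisor that restarts it. The sketch below is not Akka's `SupervisorStrategy` API, just a plain-Java illustration of the restart-on-failure idea; all names here are invented for the example:

```java
import java.util.concurrent.Callable;

public class SupervisorSketch {
    // Runs a task, restarting it up to maxRestarts times on failure.
    // The task itself stays free of failure-handling logic; the
    // supervisor decides what to do with each exception (here: restart;
    // a fuller strategy could also stop or escalate).
    static <T> T supervise(Callable<T> task, int maxRestarts) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRestarts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // restart: fall through and try again
            }
        }
        throw last; // restarts exhausted: escalate
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // A flaky "worker" that fails on its first invocation only.
        String result = supervise(() -> {
            if (calls[0]++ == 0) throw new IllegalStateException("transient failure");
            return "reconnected";
        }, 3);
        System.out.println(result); // prints "reconnected"
    }
}
```

In Akka the same separation is what `supervisorStrategy` on the parent actor provides, which is why a hand-rolled try-catch inside `receive` duplicates machinery the framework already has.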
[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-68446626 [Test build #24960 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24960/consoleFull) for PR 3222 at commit [`ebd07ad`](https://github.com/apache/spark/commit/ebd07addf5883cfdeacc88a0b551fd5c9a2245e6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class AdaGradUpdater(` * `class DBN(val stackedRBM: StackedRBM)` * `class MLP(` * `class MomentumUpdater(val momentum: Double) extends Updater ` * `class RBM(` * `class StackedAutoEncoder(val stackedRBM: StackedRBM)` * `class StackedRBM(val innerRBMs: Array[RBM])` * `case class MinstItem(label: Int, data: Array[Int]) ` * `class MinstDatasetReader(labelsFile: String, imagesFile: String)`
[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3222#issuecomment-68446628 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24960/ Test PASSed.
[GitHub] spark pull request: [SPARK-4014] Change TaskContext.attemptId to r...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3849#issuecomment-68453172 Hmm, looks like this change somehow broke a PySpark Streaming test:
```
======================================================================
FAIL: Basic operation test for DStream.mapPartitions.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "pyspark/streaming/tests.py", line 228, in test_mapPartitions
    self._test_func(rdds, func, expected)
  File "pyspark/streaming/tests.py", line 114, in _test_func
    self.assertEqual(expected, result)
AssertionError: [[3, 7], [11, 15], [19, 23]] != []
```
[GitHub] spark pull request: [SPARK-5037] dynamically loaded DStreams imple...
GitHub user industrial-sloth opened a pull request: https://github.com/apache/spark/pull/3858 [SPARK-5037] dynamically loaded DStreams implementation and example This PR adds a new reflection-based method of creating input DStreams to the scala StreamingContext, and wires it through to the python streaming API. Trying to create DStream instances directly by reflection runs into trouble with unwanted stuff getting dragged into closures, so I worked around this by defining a new abstract serializable `ReflectedDStreamFactory` class. The idea is that one subclasses this with a concrete implementation that directly instantiates the desired InputDStream; then the StreamingContext uses reflection to dynamically load this new Factory implementation. This PR also has an example showing how this works with the existing ZeroMQ example code in both the scala and python streaming APIs. Parameters are passed into the input DStream indirectly by first putting them into the factory constructor, then requiring the factory implementation to pass them on into the DStream instance. At the moment these parameters are limited to String type, which I think should cover the majority of use cases, but I'd think it should be possible to generalize this further. Am throwing this out there for comment; suggestions and alternative approaches more than welcome. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/industrial-sloth/spark reflected-dstreams Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3858.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3858 commit 2ffec19c21348934911a56a14799a0ddcae5e4da Author: industrial-sloth industrial-sl...@users.noreply.github.com Date: 2014-12-31T16:54:48Z dynamically leaded DStreams implementation and example
[GitHub] spark pull request: [SPARK-5037] dynamically loaded DStreams imple...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3858#issuecomment-68457064 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4298][Core] - The spark-submit cannot r...
Github user brennonyork commented on the pull request: https://github.com/apache/spark/pull/3561#issuecomment-68457321 @JoshRosen took care of the minor edits for ya!
[GitHub] spark pull request: [SPARK-4298][Core] - The spark-submit cannot r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3561#issuecomment-68457422 [Test build #24961 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24961/consoleFull) for PR 3561 at commit [`5e0fce1`](https://github.com/apache/spark/commit/5e0fce1cd0a6b4b413c31e8ca214c11c569c6164). * This patch merges cleanly.
[GitHub] spark pull request: [branch-1.0][SPARK-4355] ColumnStatisticsAggre...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3850#issuecomment-68457889 test this please
[GitHub] spark pull request: [branch-1.0][SPARK-4355] ColumnStatisticsAggre...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3850#issuecomment-68458124 [Test build #24962 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24962/consoleFull) for PR 3850 at commit [`ae9b94a`](https://github.com/apache/spark/commit/ae9b94a3f817759ee6249af991beec7e19e52f12). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-68458684 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-68458773 [Test build #24963 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24963/consoleFull) for PR 3562 at commit [`6e97f72`](https://github.com/apache/spark/commit/6e97f72ca401e21e6ef81f7a0535b96801776e6f). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68459442 add to whitelist
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68459446 test this please
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68459776 [Test build #24964 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24964/consoleFull) for PR 3643 at commit [`f28b275`](https://github.com/apache/spark/commit/f28b275e153b3d093bf063c53efe1dea91084918). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-68462221 [Test build #24965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24965/consoleFull) for PR 3856 at commit [`30f8bb4`](https://github.com/apache/spark/commit/30f8bb4cbe472a536c9506a7365e76f736adcb33). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-5020 [MLlib] GaussianMixtureModel.predic...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3854#issuecomment-68462618 [Test build #558 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/558/consoleFull) for PR 3854 at commit [`0f1d96e`](https://github.com/apache/spark/commit/0f1d96e2b4292edbf0a4c9db82fc2969016b0587). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-5020 [MLlib] GaussianMixtureModel.predic...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3854#issuecomment-68462718 @tgaloppo Ok, thanks for the fix! LGTM once the tests are done. @tgaloppo @mengxr As long as this method is being edited, do you like the name ```predictMembership``` for soft clustering? I assume we may eventually use the same term for other clustering methods, including LDA.
[GitHub] spark pull request: [SPARK-4298][Core] - The spark-submit cannot r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3561#issuecomment-68462934 [Test build #24961 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24961/consoleFull) for PR 3561 at commit [`5e0fce1`](https://github.com/apache/spark/commit/5e0fce1cd0a6b4b413c31e8ca214c11c569c6164). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4298][Core] - The spark-submit cannot r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3561#issuecomment-68462940 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24961/
[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-68463177 Hey @jkbradley - are you sure those excludes are necessary? All of the patches you mentioned are in Spark 1.2 already; excludes should only be relevant to things that changed between 1.2 and the master branch. Ideally we haven't made any breaking changes since the 1.2 release.
[GitHub] spark pull request: [SPARK-5038][SQL] Add explicit return type for...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/3859 [SPARK-5038][SQL] Add explicit return type for implicit functions in Spark SQL You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark sql-implicits Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3859.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3859 commit 30c2c2463fd7fafbce2f072e9f81ec813b9e6589 Author: Reynold Xin r...@databricks.com Date: 2014-12-31T19:21:09Z [SPARK-5038] Add explicit return type for implicit functions in Spark SQL.
[GitHub] spark pull request: [SPARK-5038][SQL] Add explicit return type for...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3859#issuecomment-68463483 [Test build #24966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24966/consoleFull) for PR 3859 at commit [`30c2c24`](https://github.com/apache/spark/commit/30c2c2463fd7fafbce2f072e9f81ec813b9e6589). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5038] Add explicit return type for impl...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/3860 [SPARK-5038] Add explicit return type for implicit functions. This is a follow up PR for rest of Spark (outside Spark SQL). The original PR for Spark SQL can be found at https://github.com/apache/spark/pull/3859 You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark implicit Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3860.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3860 commit 73702f9ad2725c51fb1a58aa8c3b58eb1b0fd88d Author: Reynold Xin r...@databricks.com Date: 2014-12-31T19:29:45Z [SPARK-5038] Add explicit return type for implicit functions. This is a follow up PR for rest of Spark (outside Spark SQL). The original PR for Spark SQL can be found at https://github.com/apache/spark/pull/3859
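For context on what these two PRs change: a public `implicit def` without an explicit return type exposes whatever type the compiler infers from the right-hand side, freezing implementation details into the (binary) API. A minimal, self-contained illustration, unrelated to Spark's actual implicits:

```scala
// Why explicit return types on public implicits matter: the annotation
// pins the public signature instead of leaving it to type inference.
import scala.language.implicitConversions

class RichSeq[A](val underlying: Seq[A]) {
  def second: A = underlying(1)
}

object Implicits {
  // The explicit `: RichSeq[A]` fixes the conversion's result type.
  // Dropping it would still compile, but whatever type the right-hand
  // side infers would silently become part of the public API.
  implicit def seqToRichSeq[A](s: Seq[A]): RichSeq[A] = new RichSeq(s)
}
```

With `import Implicits._` in scope, `Seq(1, 2, 3).second` resolves through the conversion.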
[GitHub] spark pull request: [SPARK-5038] Add explicit return type for impl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3860#issuecomment-68464128 [Test build #24967 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24967/consoleFull) for PR 3860 at commit [`73702f9`](https://github.com/apache/spark/commit/73702f9ad2725c51fb1a58aa8c3b58eb1b0fd88d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-68464188 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24963/
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-68464184 [Test build #24963 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24963/consoleFull) for PR 3562 at commit [`6e97f72`](https://github.com/apache/spark/commit/6e97f72ca401e21e6ef81f7a0535b96801776e6f). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68464357 I ran some quick tests with random sparsity patterns. Averaged over 1000 iterations, it's definitely faster:

length | v1 sparsity | v2 sparsity | new time | old time | speedup
--- | --- | --- | --- | --- | ---
1000 | 1 | 0.5 | 9.42E-06 | 6.73E-04 | 71.44
1000 | 1 | 0.1 | 1.69E-06 | 5.50E-05 | 32.43
1000 | 1 | 0.01 | 1.90E-06 | 3.30E-05 | 17.40
1000 | 0.5 | 0.1 | 9.89E-06 | 7.17E-05 | 7.25
1000 | 0.5 | 0.01 | 2.54E-06 | 5.80E-05 | 22.80
1000 | 0.1 | 0.01 | 1.95E-06 | 5.82E-05 | 29.84
1 | 1 | 0.5 | 1.11E-05 | 2.30E-04 | 20.73
1 | 1 | 0.1 | 1.03E-05 | 2.54E-04 | 24.54
1 | 1 | 0.01 | 8.69E-06 | 3.90E-04 | 44.92
1 | 0.5 | 0.1 | 1.47E-05 | 3.90E-04 | 26.63
1 | 0.5 | 0.01 | 8.63E-06 | 4.03E-04 | 46.76
1 | 0.1 | 0.01 | 1.81E-06 | 5.96E-04 | 329.01
10 | 1 | 0.5 | 9.27E-05 | 0.004039351 | 43.60
10 | 1 | 0.1 | 9.06E-05 | 0.001540544 | 17.01
10 | 1 | 0.01 | 8.71E-05 | 0.002636216 | 30.25
10 | 0.5 | 0.1 | 1.15E-04 | 0.003777669 | 32.76
10 | 0.5 | 0.01 | 9.61E-05 | 0.004879063 | 50.79
10 | 0.1 | 0.01 | 1.89E-05 | 0.003148419 | 166.29
100 | 1 | 0.5 | 0.001017196 | 0.05418411 | 53.27
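As a rough illustration of why a sparsity-aware distance wins in these benchmarks: a merge over the two sorted index arrays touches only the nonzero entries. This is a simplified sketch of that idea, not Spark's actual `Vectors.sqdist` implementation:

```scala
// Squared Euclidean distance between two sparse vectors, each given as
// parallel arrays of sorted indices and values. Cost is proportional to
// the number of nonzeros, not the vector length.
def sqdist(ia: Array[Int], va: Array[Double],
           ib: Array[Int], vb: Array[Double]): Double = {
  var i = 0; var j = 0; var sum = 0.0
  while (i < ia.length && j < ib.length) {
    if (ia(i) == ib(j)) {          // index present in both vectors
      val d = va(i) - vb(j); sum += d * d; i += 1; j += 1
    } else if (ia(i) < ib(j)) {    // nonzero only in the first vector
      sum += va(i) * va(i); i += 1
    } else {                       // nonzero only in the second vector
      sum += vb(j) * vb(j); j += 1
    }
  }
  while (i < ia.length) { sum += va(i) * va(i); i += 1 }  // leftovers
  while (j < ib.length) { sum += vb(j) * vb(j); j += 1 }
  sum
}
```

The speedup factors above grow as the vectors get sparser, which is what this access pattern predicts.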
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68464642 @viirya Thanks for the updates! LGTM pending Jenkins
[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-68464735 @pwendell Check out the MIMA failures before the excludes:
```
[error] * abstract method unpersist(Boolean)org.apache.spark.graphx.Graph in class org.apache.spark.graphx.Graph does not have a correspondent in old version
[error]    filter with: ProblemFilters.exclude[MissingMethodProblem](org.apache.spark.graphx.Graph.unpersist)
[error] * abstract method checkpoint()Unit in class org.apache.spark.graphx.Graph does not have a correspondent in old version
[error]    filter with: ProblemFilters.exclude[MissingMethodProblem](org.apache.spark.graphx.Graph.checkpoint)
[error] * method fromEdges(org.apache.spark.rdd.RDD,scala.reflect.ClassTag,scala.reflect.ClassTag)org.apache.spark.graphx.EdgeRDD in object org.apache.spark.graphx.EdgeRDD has now a different result type; was: org.apache.spark.graphx.EdgeRDD, is now: org.apache.spark.graphx.impl.EdgeRDDImpl
[error]    filter with: ProblemFilters.exclude[IncompatibleResultTypeProblem](org.apache.spark.graphx.EdgeRDD.fromEdges)
[error] * abstract method filter(scala.Function1,scala.Function2)org.apache.spark.graphx.EdgeRDD in class org.apache.spark.graphx.EdgeRDD does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingMethodProblem](org.apache.spark.graphx.EdgeRDD.filter)
[error] * method filter(scala.Function1,scala.Function2)org.apache.spark.graphx.EdgeRDD in class org.apache.spark.graphx.impl.EdgeRDDImpl has now a different result type; was: org.apache.spark.graphx.EdgeRDD, is now: org.apache.spark.graphx.impl.EdgeRDDImpl
[error]    filter with: ProblemFilters.exclude[IncompatibleResultTypeProblem](org.apache.spark.graphx.impl.EdgeRDDImpl.filter)
```
Were the version tags not centered at the right commits?
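Each `filter with:` hint in output like the above maps one-to-one onto an entry in the project's MIMA exclusion list (in Spark this lives in `project/MimaExcludes.scala`); a sketch of the corresponding entries, with the surrounding structure of the real file omitted:

```scala
import com.typesafe.tools.mima.core._

// Exclusions matching the reported GraphX problems.
Seq(
  ProblemFilters.exclude[MissingMethodProblem](
    "org.apache.spark.graphx.Graph.unpersist"),
  ProblemFilters.exclude[MissingMethodProblem](
    "org.apache.spark.graphx.Graph.checkpoint"),
  ProblemFilters.exclude[IncompatibleResultTypeProblem](
    "org.apache.spark.graphx.EdgeRDD.fromEdges"),
  ProblemFilters.exclude[MissingMethodProblem](
    "org.apache.spark.graphx.EdgeRDD.filter"),
  ProblemFilters.exclude[IncompatibleResultTypeProblem](
    "org.apache.spark.graphx.impl.EdgeRDDImpl.filter")
)
```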
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68465045 [Test build #24964 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24964/consoleFull) for PR 3643 at commit [`f28b275`](https://github.com/apache/spark/commit/f28b275e153b3d093bf063c53efe1dea91084918). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68465046 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24964/
[GitHub] spark pull request: [SPARK-1010] Clean up uses of System.setProper...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3739#issuecomment-68465243 It doesn't look like there are any new test failures / flakiness that can be attributed to this patch, so I've finished backporting this to `branch-1.2` (1.2.1), `branch-1.1` (1.1.2), and `branch-1.0` (1.0.3).
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3643
[GitHub] spark pull request: SPARK-5020 [MLlib] GaussianMixtureModel.predic...
Github user tgaloppo commented on the pull request: https://github.com/apache/spark/pull/3854#issuecomment-68465266 @jkbradley I am not crazy about the name predictMembership() ... to me it implies the hard assignment; a simple change like predictMemberships() might be more clear, or predictSoft(), or (thinking from a slightly different direction) allocate(). Any of those should be robust enough to reuse for soft k-means or LDA (or other such partial assignment algorithms).
[GitHub] spark pull request: [SPARK-4797] Replace breezeSquaredDistance
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3643#issuecomment-68465304 Merged into master. Thanks! (minor TODO: Though `sqdist` is touched in MLUtilsSuite, it would be nice to add unit tests to `VectorsSuite`.)
[GitHub] spark pull request: [SPARK-4298][Core] - The spark-submit cannot r...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3561#issuecomment-68465381 I finished my backports of the other patch, so I'm going to merge this now. Thanks!
[GitHub] spark pull request: [SPARK-4298][Core] - The spark-submit cannot r...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3561
[GitHub] spark pull request: SPARK-5020 [MLlib] GaussianMixtureModel.predic...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3854#issuecomment-68465590 It may be hard for users to tell the difference between `predict` and `predictMembership`, because `predict` is also predicting the membership. `predictFuzzy` or `predictSoft` sounds better to me.
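The distinction being debated can be made concrete with a toy model. This is purely hypothetical, not the GaussianMixtureModel API: `predict` returns a hard cluster index, while the soft variant returns the full per-component membership vector.

```scala
// Toy illustration of hard vs. soft cluster prediction. The unit-variance
// Gaussian density is a stand-in for the real per-component likelihoods.
case class ToyMixture(weights: Array[Double], means: Array[Double]) {
  // Soft membership: normalized responsibilities across components.
  def predictSoft(x: Double): Array[Double] = {
    val scores = weights.zip(means).map { case (w, m) =>
      w * math.exp(-(x - m) * (x - m) / 2)
    }
    val total = scores.sum
    scores.map(_ / total)
  }
  // Hard assignment: index of the most responsible component.
  def predict(x: Double): Int = {
    val p = predictSoft(x)
    p.indexOf(p.max)
  }
}
```

Framed this way, the hard `predict` is just an argmax over the soft memberships, which is why a paired naming such as `predict`/`predictSoft` reads naturally.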
[GitHub] spark pull request: [branch-1.0][SPARK-4355] ColumnStatisticsAggre...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3850#issuecomment-68465589 **[Test build #24962 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24962/consoleFull)** for PR 3850 at commit [`ae9b94a`](https://github.com/apache/spark/commit/ae9b94a3f817759ee6249af991beec7e19e52f12) after a configured wait of `120m`.
[GitHub] spark pull request: [branch-1.0][SPARK-4355] ColumnStatisticsAggre...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3850#issuecomment-68465596 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24962/
[GitHub] spark pull request: [SPARK-4298][Core] - The spark-submit cannot r...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3561#issuecomment-68465783 Alright, I've merged this to `master` (1.3.0), `branch-1.2` (1.2.1), `branch-1.1` (1.1.2), and `branch-1.0` (1.0.3).
[GitHub] spark pull request: [SPARK-794][Core] Remove sleep() in ClusterSch...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3851#issuecomment-68465933 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-794][Core] Remove sleep() in ClusterSch...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3851#issuecomment-68466150 [Test build #24968 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24968/consoleFull) for PR 3851 at commit [`04c3e64`](https://github.com/apache/spark/commit/04c3e648021fa38acdde0745d4f7f961ef125dc1). * This patch merges cleanly.
[GitHub] spark pull request: Integrate external shuffle service to coarse g...
GitHub user tnachen opened a pull request: https://github.com/apache/spark/pull/3861 Integrate external shuffle service to coarse grained Mesos mode You can merge this pull request into a Git repository by running: $ git pull https://github.com/tnachen/spark mesos_shuffle Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3861.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3861 commit 60df548387412ae12e6bd8439d48931aa564a22b Author: Timothy Chen tnac...@apache.org Date: 2014-12-05T05:55:42Z Launch External Shuffle Service with mesos commit 145bf5b9578d8087fb926e8fd73a8b04c34d07aa Author: Timothy Chen tnac...@gmail.com Date: 2014-12-13T20:17:47Z Support total and kill executors in coarse grained mesos mode. commit 03ee4f79c86c846c923ec83e17dd5ea7805091f6 Author: Timothy Chen tnac...@gmail.com Date: 2014-12-17T01:36:56Z Propogate the shuffle service setting. commit 7434bb22858899a8808edee16c11c7bd68263828 Author: Timothy Chen tnac...@gmail.com Date: 2014-12-20T01:27:42Z Implement a new executor for coarse grained mesos mode. commit 25331b1216889a9abfb40e583b56657ba45f840e Author: Timothy Chen tnac...@gmail.com Date: 2014-12-24T05:28:09Z Launch executor with shell and add traces commit 1aca094a1a20e961c76f65c98139ead4de8e4eab Author: Timothy Chen tnac...@gmail.com Date: 2014-12-30T08:12:42Z Fix destroying executor. commit 5c9fd75b2ae48f8a2c8c0b6cf8ebd7ab58e84b18 Author: Timothy Chen tnac...@gmail.com Date: 2014-12-31T20:10:19Z Only process status update if task is still tracked.
[GitHub] spark pull request: SPARK-5020 [MLlib] GaussianMixtureModel.predic...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3854#issuecomment-68466378 +1 for predictSoft
[GitHub] spark pull request: Integrate external shuffle service to coarse g...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3861#issuecomment-68466436 [Test build #24969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24969/consoleFull) for PR 3861 at commit [`5c9fd75`](https://github.com/apache/spark/commit/5c9fd75b2ae48f8a2c8c0b6cf8ebd7ab58e84b18). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-5020 [MLlib] GaussianMixtureModel.predic...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3854#issuecomment-68466535 [Test build #558 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/558/consoleFull) for PR 3854 at commit [`0f1d96e`](https://github.com/apache/spark/commit/0f1d96e2b4292edbf0a4c9db82fc2969016b0587). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4835] [WIP] Disable validateOutputSpecs...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3832#issuecomment-68466612 I'd be glad to add a test here, although this might be a little tricky since the old behavior resulted in silent failures; I should be able to come up with a test though. Regarding the streaming-specific `spark.streaming.hadoop.validateOutputSpecs` setting, which of the following behaviors is more intuitive? 1. Streaming jobs always respect the Streaming version of the setting and non-streaming jobs respect the regular version. If the streaming checks are enabled but the core checks are disabled, then we do output spec validation for streaming. 2. The Streaming version is just a gate which controls whether the core setting also applies to streaming jobs. If the streaming setting is true but the core setting is false, then the checks are not applied. Which of these makes more sense? I think that option 2 is a better backwards-compatibility escape hatch / flag.
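The two option semantics described above can be sketched as a small predicate. This is an illustrative sketch only, not the PR's actual implementation: the `shouldValidate` helper and its shape are assumptions, and only the two setting names come from the comment (the core setting is written here as `spark.hadoop.validateOutputSpecs`, defaulting to `true`).

```scala
// Hypothetical sketch of the two behaviors discussed for output spec validation.
// Setting names come from the comment; the helper itself is an assumption.
object OutputSpecValidationSketch {
  private val CoreKey = "spark.hadoop.validateOutputSpecs"
  private val StreamingKey = "spark.streaming.hadoop.validateOutputSpecs"

  def shouldValidate(conf: Map[String, String], isStreaming: Boolean): Boolean = {
    val coreEnabled = conf.getOrElse(CoreKey, "true").toBoolean
    val streamingEnabled = conf.getOrElse(StreamingKey, "true").toBoolean
    if (!isStreaming) {
      // Non-streaming jobs always respect only the core setting.
      coreEnabled
    } else {
      // Option 2 semantics: the streaming setting is a gate on the core one.
      // If the core checks are disabled, streaming jobs skip validation even
      // when the streaming flag is true; setting the streaming flag to false
      // is the backwards-compatibility escape hatch for streaming jobs only.
      streamingEnabled && coreEnabled
    }
  }
}
```

Under option 1, by contrast, the streaming branch would return `streamingEnabled` alone, so a streaming job could validate output specs even with the core checks turned off.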
[GitHub] spark pull request: SPARK-5020 [MLlib] GaussianMixtureModel.predic...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3854#issuecomment-68466757 streaming failure