[GitHub] spark pull request: SPARK-4963 [SQL] HiveTableScan return mutable ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3827#issuecomment-68337920 [Test build #24888 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24888/consoleFull) for PR 3827 at commit [`cea7e2e`](https://github.com/apache/spark/commit/cea7e2ec42c44f81965f8adf462faa887e2dae89). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5003][SQL]cast support date data type
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3839#issuecomment-68338141 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-5003][SQL]cast support date data type
GitHub user haiyangsea opened a pull request: https://github.com/apache/spark/pull/3839 [SPARK-5003][SQL] cast support date data type Enable cast to support the date data type, e.g.: select * from tableX where dateField > cast('2014-12-30' as date) You can merge this pull request into a Git repository by running: $ git pull https://github.com/haiyangsea/spark datatype Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3839.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3839 commit 8bdd08f4fb08d6442861d855f015544a3d9c96a4 Author: haiyang huhaiy...@huawei.com Date: 2014-12-29T11:05:58Z support date datatype commit 81bd51da01813ade238fee65f4f2accdf3b6eda9 Author: haiyang huhaiy...@huawei.com Date: 2014-12-30T07:22:27Z add test case
[GitHub] spark pull request: [SPARK-5002][SQL] Using ascending by default w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3838#issuecomment-68338232 [Test build #24886 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24886/consoleFull) for PR 3838 at commit [`114b64a`](https://github.com/apache/spark/commit/114b64a9b8dba469c44a455cb6f239ea1e8c0d2a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class GaussianMixtureModel(`
[GitHub] spark pull request: [SPARK-5002][SQL] Using ascending by default w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3838#issuecomment-68338233 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24886/ Test PASSed.
[GitHub] spark pull request: SPARK-4963 [SQL] HiveTableScan return mutable ...
Github user yanbohappy commented on the pull request: https://github.com/apache/spark/pull/3827#issuecomment-68339079 @liancheng I agree with moving the copy call to execution.Sample.execute and have added new commits. It has no effect on HiveTableScan.
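For context on why the copy call matters: Spark SQL operators may reuse a single mutable row object across output rows, so any operator that buffers rows (such as Sample) must copy them first. A minimal sketch of the failure mode, using an illustrative `DemoRow` stand-in rather than Spark's actual row classes:

```scala
// Illustrative stand-in for a reused mutable row (not Spark's MutableRow).
class DemoRow { var value: Int = 0 }

object CopyDemo {
  val shared = new DemoRow

  // Buffering the reused row directly: every buffered element is the same
  // object, so all of them end up showing the last value written.
  val leaked = (1 to 3).map { i => shared.value = i; shared }

  // Copying before buffering, as the patch does in Sample.execute:
  val copied = (1 to 3).map { i =>
    shared.value = i
    val c = new DemoRow; c.value = shared.value; c
  }
}
```

`leaked` yields the values `3, 3, 3`, while `copied` preserves `1, 2, 3`.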
[GitHub] spark pull request: [SPARK-4988][SQL] Fix: 'Create table ..as sele...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3821#issuecomment-68339552 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24887/ Test PASSed.
[GitHub] spark pull request: [SPARK-4988][SQL] Fix: 'Create table ..as sele...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3821#issuecomment-68339547 [Test build #24887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24887/consoleFull) for PR 3821 at commit [`1bab9e4`](https://github.com/apache/spark/commit/1bab9e4b782e62485f01f4f650a54c5ccb86f2a1). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-4963 [SQL] Add copy to SQL's Sample oper...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3827#issuecomment-68341428 [Test build #24888 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24888/consoleFull) for PR 3827 at commit [`cea7e2e`](https://github.com/apache/spark/commit/cea7e2ec42c44f81965f8adf462faa887e2dae89). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-4963 [SQL] Add copy to SQL's Sample oper...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3827#issuecomment-68341432 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24888/ Test PASSed.
[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/3246#issuecomment-68342204 Would anyone like to review this PR?
[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3246#discussion_r22343345 --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala --- @@ -25,6 +25,91 @@ import org.apache.spark.streaming.api.java.{JavaReceiverInputDStream, JavaDStrea import org.apache.spark.streaming.dstream.{ReceiverInputDStream, DStream} object TwitterUtils { + + // For implicit parameter used to avoid to have same type after erasure + case class Ignore(value: String ) { --- End diff -- This looks like a big hack. Just use different method names if you are trying to avoid type conflict after erasure.
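For context on the erasure conflict srowen refers to: after type erasure, overloads taking `Seq[String]` and `Seq[Seq[Double]]` compile to the same JVM signature, so they cannot coexist under one name. A minimal sketch of the suggested fix, with hypothetical method names (not the actual `TwitterUtils` API):

```scala
object StreamApi {
  // These two would NOT compile as overloads of one name: after erasure
  // both become createStream(Seq), an identical JVM signature.
  //   def createStream(filters: Seq[String]) = ...
  //   def createStream(locations: Seq[Seq[Double]]) = ...

  // Distinct names sidestep erasure without an implicit-parameter hack:
  def createFilteredStream(filters: Seq[String]): Int = filters.size
  def createLocatedStream(locations: Seq[Seq[Double]]): Int = locations.size
}
```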
[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3246#discussion_r22343369 --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala --- @@ -60,6 +61,7 @@ private[streaming] class TwitterReceiver( twitterAuth: Authorization, filters: Seq[String], +locations: Seq[Seq[Double]], --- End diff -- These are supposed to be a bunch of (lat, lon) pairs, right? Why not `Seq[(Double,Double)]`?
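A sketch of the typed alternative being suggested: carry the bounding-box corners as pairs, and flatten only at the twitter4j boundary (the `Array[Array[Double]]` shape is what twitter4j's `FilterQuery.locations` takes; the conversion here is illustrative):

```scala
// Bounding-box corners as typed pairs instead of untyped inner Seqs.
// Twitter's streaming API expects longitude, latitude order.
val locations: Seq[(Double, Double)] = Seq((-122.75, 36.8), (-121.75, 37.8))

// Flatten the pairs only where the Array[Array[Double]] shape is required:
val forQuery: Array[Array[Double]] =
  locations.map { case (lon, lat) => Array(lon, lat) }.toArray
```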
[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3246#discussion_r22343378 --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala --- @@ -88,6 +90,11 @@ class TwitterReceiver( val query = new FilterQuery if (filters.size > 0) { query.track(filters.toArray) + } + if (locations.size > 0) { +query.locations(locations.map(_.toArray).toArray) + } + if (filters.size > 0 || locations.size > 0) { --- End diff -- It seems like this could be rewritten to avoid the redundant checks and the creation of `FilterQuery` when there is no filtering.
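One way the restructuring could look: a single guard, and no `FilterQuery` allocated when nothing is being filtered. This is a self-contained sketch, so a minimal stand-in replaces twitter4j's `FilterQuery` (whose real `track`/`locations` methods the stub mirrors):

```scala
// Minimal stand-in for twitter4j.FilterQuery to keep the sketch runnable.
class FilterQuery {
  var tracked: Array[String] = Array.empty
  var boxes: Array[Array[Double]] = Array.empty
  def track(fs: Array[String]): FilterQuery = { tracked = fs; this }
  def locations(ls: Array[Array[Double]]): FilterQuery = { boxes = ls; this }
}

// One guard instead of three repeated size checks: the caller can then
// choose filter(query) vs. sample() on the None case.
def buildQuery(filters: Seq[String], locations: Seq[Seq[Double]]): Option[FilterQuery] =
  if (filters.isEmpty && locations.isEmpty) None
  else {
    val q = new FilterQuery
    if (filters.nonEmpty) q.track(filters.toArray)
    if (locations.nonEmpty) q.locations(locations.map(_.toArray).toArray)
    Some(q)
  }
```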
[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3246#discussion_r22343430 --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala --- @@ -112,20 +269,91 @@ object TwitterUtils { ): JavaReceiverInputDStream[Status] = { createStream(jssc.ssc, Some(twitterAuth), filters) } - + + /** + * Create a input stream that returns tweets received from Twitter. + * Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2. + * @param jsscJavaStreamingContext object + * @param twitterAuth Twitter4J Authorization + * @param filters Set of filter strings to get only those tweets that match them + * @param locations Set of longitude, latitude pairs to get only those tweets + *that falling within the requested bounding boxes + */ + def createStream( + jssc: JavaStreamingContext, + twitterAuth: Authorization, + filters: Array[String], + locations: Array[Array[Double]] +): JavaReceiverInputDStream[Status] = { +createStream(jssc.ssc, Some(twitterAuth), filters, locations.map(_.toSeq).toSeq) + } + + /** + * Create a input stream that returns tweets received from Twitter. + * Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2. + * @param jsscJavaStreamingContext object + * @param twitterAuth Twitter4J Authorization + * @param locations Set of longitude, latitude pairs to get only those tweets + *that falling within the requested bounding boxes + */ + def createStream( + jssc: JavaStreamingContext, + twitterAuth: Authorization, + locations: Array[Array[Double]] +): JavaReceiverInputDStream[Status] = { +createStream(jssc.ssc, Some(twitterAuth), Nil, locations.map(_.toSeq).toSeq) + } + + /** + * Create a input stream that returns tweets received from Twitter. 
+ * @param jssc JavaStreamingContext object + * @param twitterAuth Twitter4J Authorization object + * @param filters Set of filter strings to get only those tweets that match them + * @param storageLevel Storage level to use for storing the received objects + */ + def createStream( + jssc: JavaStreamingContext, + twitterAuth: Authorization, + filters: Array[String], + storageLevel: StorageLevel +): JavaReceiverInputDStream[Status] = { +createStream(jssc.ssc, Some(twitterAuth), filters, Nil, storageLevel) + } + /** * Create a input stream that returns tweets received from Twitter. * @param jssc JavaStreamingContext object * @param twitterAuth Twitter4J Authorization object * @param filters Set of filter strings to get only those tweets that match them + * @param locations Set of longitude, latitude pairs to get only those tweets + *that falling within the requested bounding boxes * @param storageLevel Storage level to use for storing the received objects */ def createStream( jssc: JavaStreamingContext, twitterAuth: Authorization, filters: Array[String], + locations: Array[Array[Double]], + storageLevel: StorageLevel +): JavaReceiverInputDStream[Status] = { +createStream(jssc.ssc, Some(twitterAuth), filters, locations.map(_.toSeq).toSeq, storageLevel) + } + + /** + * Create a input stream that returns tweets received from Twitter. + * @param jssc JavaStreamingContext object + * @param twitterAuth Twitter4J Authorization object + * @param locations Set of longitude, latitude pairs to get only those tweets + *that falling within the requested bounding boxes + * @param storageLevel Storage level to use for storing the received objects + */ + def createStream( --- End diff -- There are *12* new overloads of `createStream` on top of the existing 4. This seems like big overkill. There should be one version in Java/Scala that takes all arguments, one each that takes minimal arguments, and any others needed to retain binary compatibility. The rest seem superfluous. 
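srowen's suggested structure can be sketched as follows (signatures and names are illustrative, not Spark's actual `TwitterUtils` API): one method takes every argument, and the few minimal-argument conveniences delegate to it instead of multiplying out the overload matrix:

```scala
object TwitterStreamSketch {
  // The single full-argument version:
  def createStream(
      filters: Seq[String],
      locations: Seq[Seq[Double]],
      storageLevel: String): String =
    s"filters=${filters.size},locations=${locations.size},level=$storageLevel"

  // Minimal-argument conveniences delegate rather than re-implement:
  def createStream(): String =
    createStream(Nil, Nil, "MEMORY_AND_DISK_SER_2")
  def createStream(filters: Seq[String]): String =
    createStream(filters, Nil, "MEMORY_AND_DISK_SER_2")
}
```

Any further overloads would then exist only where binary compatibility demands them.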
[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/3246#discussion_r22343534 --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala --- @@ -60,6 +61,7 @@ private[streaming] class TwitterReceiver( twitterAuth: Authorization, filters: Seq[String], +locations: Seq[Seq[Double]], --- End diff -- I remember that was for use from both Scala and Java, so we could simply pass a (lat, lon) pair as an array.
[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3246#discussion_r22344218 --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala --- @@ -60,6 +61,7 @@ private[streaming] class TwitterReceiver( twitterAuth: Authorization, filters: Seq[String], +locations: Seq[Seq[Double]], --- End diff -- There is already a separate set of methods for Java, no? Ones that use `Array` rather than things like `Seq`. This one is private to the Scala-based Spark code.
[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-68346947 retest this please.
[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/3246#discussion_r22344303 --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala --- @@ -60,6 +61,7 @@ private[streaming] class TwitterReceiver( twitterAuth: Authorization, filters: Seq[String], +locations: Seq[Seq[Double]], --- End diff -- There is. I meant that the main problem is the inconsistency between the Scala and Java APIs: one uses `Seq[(Double,Double)]` and the other uses `Array[Array[Double]]`. If that is OK, I will revise it.
[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-68347060 [Test build #24889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24889/consoleFull) for PR 3820 at commit [`dc6eaba`](https://github.com/apache/spark/commit/dc6eaba7db957eb9038532c7c57282c040e870d4). * This patch merges cleanly.
[GitHub] spark pull request: Changes to illustrate the principles of functi...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3835#issuecomment-68347289 @yujunliang If you're not working on a change that you want to be considered for merging later, then I would not open a pull request at all. Just work in your local branch.
[GitHub] spark pull request: [MLlib]delete the train function
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3836#issuecomment-68348124 Although this is an experimental class, and so API methods could be removed, I think you'd still want a decent reason to remove this method at this point, even if it's deprecated. Here, there is no method with the same signature in the object, though, so I'm not sure what the problem is. It's common to have many methods with the same name in a class anyway; with reflection you differentiate them by their method signature, so this shouldn't be an obstacle. (I think the object even appears as a separate class `DecisionTree$` in the JVM.) Lastly, you'd generally want to make a JIRA for this too, but first I think the points above need to be answered.
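Both of srowen's reflection points can be checked directly. A small sketch (with a hypothetical `DemoTree` object standing in for `DecisionTree`): same-name methods are resolved by their parameter types, and a Scala `object` compiles to a separate JVM class whose name ends in `$`:

```scala
object DemoTree {
  // Two same-name methods, distinguished by parameter types:
  def train(x: Int): Int = x
  def train(x: Int, y: Int): Int = x + y
}

// Reflection selects an overload by its full signature, not just its name:
val m1 = DemoTree.getClass.getMethod("train", classOf[Int])
val m2 = DemoTree.getClass.getMethod("train", classOf[Int], classOf[Int])

// The companion/singleton object is a separate JVM class named with a `$`:
val isObjectClass = DemoTree.getClass.getName.endsWith("$")
```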
[GitHub] spark pull request: SPARK-4660: Use correct default classloader in...
GitHub user pkolaczk opened a pull request: https://github.com/apache/spark/pull/3840 SPARK-4660: Use correct default classloader in JavaSerializer. You can merge this pull request into a Git repository by running: $ git pull https://github.com/pkolaczk/spark SPARK-4660 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3840.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3840 commit 86bc5ebdfb2c5f0d58ffeaf184f94f60923fe676 Author: Piotr Kolaczkowski pkola...@datastax.com Date: 2014-12-30T11:01:47Z SPARK-4660: Use correct default classloader in JavaSerializer.
[GitHub] spark pull request: SPARK-4660: Use correct default classloader in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3840#issuecomment-68349098 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-68351142 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24889/ Test PASSed.
[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-68351139 [Test build #24889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24889/consoleFull) for PR 3820 at commit [`dc6eaba`](https://github.com/apache/spark/commit/dc6eaba7db957eb9038532c7c57282c040e870d4). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/3841 [SPARK-5006][Deploy] spark.port.maxRetries doesn't work https://issues.apache.org/jira/browse/SPARK-5006 I think the issue was introduced in https://github.com/apache/spark/pull/1777. I haven't dug into the Mesos backend yet; maybe the same logic should be added there as well. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK-5006 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3841.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3841 commit 62ec336fd3c600a5646d3614287cbb1de72e930d Author: WangTaoTheTonic barneystin...@aliyun.com Date: 2014-12-30T12:12:39Z spark.port.maxRetries doesn't work
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-68352288 [Test build #24890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24890/consoleFull) for PR 3841 at commit [`62ec336`](https://github.com/apache/spark/commit/62ec336fd3c600a5646d3614287cbb1de72e930d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-68352388 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24890/ Test FAILed.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-68352385 [Test build #24890 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24890/consoleFull) for PR 3841 at commit [`62ec336`](https://github.com/apache/spark/commit/62ec336fd3c600a5646d3614287cbb1de72e930d). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-68353060 [Test build #24891 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24891/consoleFull) for PR 3841 at commit [`191face`](https://github.com/apache/spark/commit/191face9291c8d455223858882ef509406a8826d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68353372 [Test build #24892 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24892/consoleFull) for PR 3794 at commit [`6e95955`](https://github.com/apache/spark/commit/6e95955c9c67ce509372fe08f9ced962eb251593). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/3842 SPARK-2757 [BUILD] [STREAMING] Add Mima test for Spark Sink after 1.10 is released

Re-enable MiMa for Streaming Flume Sink module, now that 1.1.0 is released, per the JIRA TO-DO. That's pretty much all there is to this.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/srowen/spark SPARK-2757 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3842.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3842

commit 0e5ba5cefaca04c188aadf5309ca6d5dffe1c63f Author: Sean Owen so...@cloudera.com Date: 2014-12-30T12:46:10Z Re-enable MiMa for Streaming Flume Sink module
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3842#issuecomment-68354029 [Test build #24893 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24893/consoleFull) for PR 3842 at commit [`0e5ba5c`](https://github.com/apache/spark/commit/0e5ba5cefaca04c188aadf5309ca6d5dffe1c63f). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...
GitHub user MickDavies opened a pull request: https://github.com/apache/spark/pull/3843 [SPARK-4386] Improve performance when writing Parquet files

Convert the type of RowWriteSupport.attributes to Array. Profiling of writes to very wide tables shows that time is spent predominantly in the apply method on the attributes var. attributes was previously a LinearSeqOptimized, whose apply is O(N), which made each row write O(N²). Measurements on a 575-column table showed this change gave a 6x improvement in write times.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/MickDavies/spark SPARK-4386 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3843.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3843

commit 892519d3bb7166ea184f0c070759b8a3b679e2c4 Author: Michael Davies michael.belldav...@gmail.com Date: 2014-12-30T13:00:25Z [SPARK-4386] Improve performance when writing Parquet files Convert type of RowWriteSupport.attributes to Array. Analysis of performance for writing very wide tables shows that time is spent predominantly in apply method on attributes var. Type of attributes previously was LinearSeqOptimized and apply is O(N) which made write O(N squared). Measurements on 575 column table showed this change showed a 6x improvement in write times.
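The complexity argument in this PR description can be sketched in a few lines. This is a hypothetical stand-in for the hot loop, not Spark's actual RowWriteSupport: each row write indexes the attribute collection once per column, so an O(N) `apply` (as on `List`/`LinearSeqOptimized`) makes the whole row write O(N²), while `Array` keeps it O(N).

```scala
// Illustrative sketch only: per row, we index into the attribute collection
// once per column. attributes(i) is O(1) on an Array; on a List the same
// call walks i elements, turning this loop quadratic in the column count.
def writeRow(attributes: Array[String], row: Array[Int]): Int = {
  var checksum = 0
  var i = 0
  while (i < row.length) {
    // stand-in for "write column i using its attribute metadata"
    checksum += attributes(i).length + row(i)
    i += 1
  }
  checksum
}
```

With 575 columns the difference between 575 and 575² attribute lookups per row is exactly the kind of gap the reported 6x speedup reflects.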
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
GitHub user Bilna opened a pull request: https://github.com/apache/spark/pull/3844 [SPARK-4631] unit test for MQTT

You can merge this pull request into a Git repository by running: $ git pull https://github.com/Bilna/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3844.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3844

commit 86164950acfc794c6c9b1db3663716ac4626c55b Author: bilna bil...@am.amrita.edu Date: 2014-12-30T13:06:09Z [SPARK-4631] unit test for MQTT
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68355440 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3843#issuecomment-68355445 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...
Github user MickDavies commented on the pull request: https://github.com/apache/spark/pull/3254#issuecomment-68355607 @jimfcarroll sorry, I misunderstood your comment. Good that you have verified the performance gain. I have added a PR: it is number 3843.
[GitHub] spark pull request: [SPARK-4631] unit test for MQTT
Github user prabeesh commented on the pull request: https://github.com/apache/spark/pull/3844#issuecomment-68355645 @tdas verify this patch
[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3845 [SPARK-5007] [CORE] Try random port when startServiceOnPort to reduce the chance of port collision

When multiple Spark programs are submitted from the same node (a so-called springboard machine), the ports (default 4040) of their SparkUIs range from 4040 to 4056, so Spark programs submitted later fail because of SparkUI port collisions. The chance of collision can be reduced by setting spark.ui.port or spark.port.maxRetries. However, I think it's better to try a random port in startServiceOnPort to reduce the chance of port collision.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/YanTangZhai/spark SPARK-5007 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3845.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3845

commit cdef539abc5d2d42d4661373939bdd52ca8ee8e6 Author: YanTangZhai hakeemz...@tencent.com Date: 2014-08-06T13:07:08Z Merge pull request #1 from apache/master update
commit cbcba66ad77b96720e58f9d893e87ae5f13b2a95 Author: YanTangZhai hakeemz...@tencent.com Date: 2014-08-20T13:14:08Z Merge pull request #3 from apache/master Update
commit 8a0010691b669495b4c327cf83124cabb7da1405 Author: YanTangZhai hakeemz...@tencent.com Date: 2014-09-12T06:54:58Z Merge pull request #6 from apache/master Update
commit 03b62b043ab7fd39300677df61c3d93bb9beb9e3 Author: YanTangZhai hakeemz...@tencent.com Date: 2014-09-16T12:03:22Z Merge pull request #7 from apache/master Update
commit 76d40277d51f709247df1d3734093bf2c047737d Author: YanTangZhai hakeemz...@tencent.com Date: 2014-10-20T12:52:22Z Merge pull request #8 from apache/master update
commit d26d98248a1a4d0eb15336726b6f44e05dd7a05a Author: YanTangZhai hakeemz...@tencent.com Date: 2014-11-04T09:00:31Z Merge pull request #9 from apache/master Update
commit e249846d9b7967ae52ec3df0fb09e42ffd911a8a Author: YanTangZhai hakeemz...@tencent.com Date: 2014-11-11T03:18:24Z Merge pull request #10 from apache/master Update
commit 6e643f81555d75ec8ef3eb57bf5ecb6520485588 Author: YanTangZhai hakeemz...@tencent.com Date: 2014-12-01T11:23:56Z Merge pull request #11 from apache/master Update
commit 718afebe364bd54ac33be425e24183eb1c76b5d3 Author: YanTangZhai hakeemz...@tencent.com Date: 2014-12-05T11:08:31Z Merge pull request #12 from apache/master update
commit e4c2c0a18bdc78cc17823cbc2adf3926944e6bc5 Author: YanTangZhai hakeemz...@tencent.com Date: 2014-12-24T03:15:22Z Merge pull request #15 from apache/master update
commit 2fb4f4450230fee09ff8932eb107f09ef72f2402 Author: yantangzhai tyz0...@163.com Date: 2014-12-30T13:41:59Z [SPARK-5007] [CORE] Try random port when startServiceOnPort to reduce the chance of port collision
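The idea in this PR can be sketched roughly as follows. The names here are illustrative, not Spark's actual startServiceOnPort API: instead of probing startPort, startPort+1, ... sequentially (so concurrent submissions all race for the same low ports), each retry picks a random candidate near the configured base, making collisions between concurrent jobs much less likely.

```scala
import scala.util.Random

// Hypothetical sketch: generate the sequence of ports a service would try.
// A sequential strategy yields startPort, startPort+1, ...; the random
// strategy spreads candidates over a window so concurrent jobs rarely
// pick the same port. The window size (1000) is an assumption, not a
// value from the patch.
def candidatePorts(startPort: Int, maxRetries: Int, rng: Random): Seq[Int] =
  (0 until maxRetries).map(_ => startPort + rng.nextInt(1000))
```

Binding would still need to handle the occasional collision (two random draws can coincide), but the expected number of retries drops sharply compared with every job starting its probe at 4040.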
[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3845#issuecomment-68357320 [Test build #24894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24894/consoleFull) for PR 3845 at commit [`2fb4f44`](https://github.com/apache/spark/commit/2fb4f4450230fee09ff8932eb107f09ef72f2402). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3845#issuecomment-68357384 [Test build #24894 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24894/consoleFull) for PR 3845 at commit [`2fb4f44`](https://github.com/apache/spark/commit/2fb4f4450230fee09ff8932eb107f09ef72f2402). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3845#issuecomment-68357385 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24894/ Test FAILed.
[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...
GitHub user james64 opened a pull request: https://github.com/apache/spark/pull/3846 [Spark-4995] Replace Vector.toBreeze.activeIterator with foreachActive

The new foreachActive method on Vector was introduced by SPARK-4431 as a more efficient alternative to vector.toBreeze.activeIterator. There are some parts of the codebase where it has not yet been adopted. @dbtsai

You can merge this pull request into a Git repository by running: $ git pull https://github.com/james64/spark SPARK-4995-foreachActive Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3846.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3846

commit 90a7d982e298863d16455108208fdb0765fe2ec6 Author: Jakub Dubovsky dubov...@avast.com Date: 2014-12-30T13:00:23Z activeIterator removed in MLUtils.saveAsLibSVMFile
commit 47a49c2e13a4828ce633b3080e2ff7a92f6a Author: Jakub Dubovsky dubov...@avast.com Date: 2014-12-30T13:22:29Z activeIterator removed in RowMatrix.toBreeze
commit 32fe6c67e46837f6625ad8ed5ed5eee20c3793d2 Author: Jakub Dubovsky dubov...@avast.com Date: 2014-12-30T13:29:35Z activeIterator removed - IndexedRowMatrix.toBreeze
commit 3eb7e3711fcae74031a94708233db0d8da348ea4 Author: Jakub Dubovsky dubov...@avast.com Date: 2014-12-30T13:35:17Z Scalastyle fix
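The foreachActive pattern this PR rolls out can be illustrated with a minimal stand-in class. This is not MLlib's actual Vector; it is a sketch showing the shape of the API: iterate only the stored (index, value) pairs of a sparse vector, with no conversion to a Breeze vector and no iterator allocation.

```scala
// Minimal stand-in for a sparse vector supporting foreachActive.
// Only the explicitly stored entries are visited; zeros are skipped.
final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double]) {
  def foreachActive(f: (Int, Double) => Unit): Unit = {
    var k = 0
    while (k < indices.length) {
      f(indices(k), values(k))
      k += 1
    }
  }
}

// Example consumer, analogous to the call sites the PR rewrites:
// sum the active values without building an activeIterator.
def sumActive(v: SparseVec): Double = {
  var s = 0.0
  v.foreachActive((_, value) => s += value)
  s
}
```

The callback style avoids the per-element tuple and iterator overhead of `toBreeze.activeIterator`, which is why SPARK-4431 introduced it.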
[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3846#issuecomment-68357963 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-68358227 [Test build #24891 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24891/consoleFull) for PR 3841 at commit [`191face`](https://github.com/apache/spark/commit/191face9291c8d455223858882ef509406a8826d). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-68358232 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24891/ Test PASSed.
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3842#issuecomment-68359334 [Test build #24893 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24893/consoleFull) for PR 3842 at commit [`0e5ba5c`](https://github.com/apache/spark/commit/0e5ba5cefaca04c188aadf5309ca6d5dffe1c63f). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3842#issuecomment-68359340 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24893/ Test FAILed.
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3842#issuecomment-68360108 Interesting situation. There is a MiMa failure because SPARK-3154 / https://github.com/apache/spark/commit/bcb5cdad614d4fce43725dfec3ce88172d2f8c11 changed a method after 1.2.0, but it's `private[sink]`. I believe I should just exclude this failure.
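The exclusion being proposed would, in broad strokes, be a MiMa ProblemFilters entry in the build's exclusion list. The problem type and fully-qualified member name below are placeholders for illustration, not taken from the actual fix:

```scala
// Hedged sketch of a MiMa exclusion for a binary-incompatible change to a
// private[sink] member. private[sink] members are not public API, so the
// reported incompatibility can be safely filtered out.
import com.typesafe.tools.mima.core._

val streamingFlumeSinkExcludes = Seq(
  ProblemFilters.exclude[MissingMethodProblem](
    // hypothetical member name; the real exclusion would name the method
    // that SPARK-3154 changed
    "org.apache.spark.streaming.flume.sink.SparkAvroCallbackHandler.someMethod")
)
```

Spark keeps such filters centralized so each release's intentional internal changes are documented alongside the JIRA that caused them.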
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68361338 **[Test build #24892 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24892/consoleFull)** for PR 3794 at commit [`6e95955`](https://github.com/apache/spark/commit/6e95955c9c67ce509372fe08f9ced962eb251593) after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68361347 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24892/ Test FAILed.
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3842#issuecomment-68361334 [Test build #24895 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24895/consoleFull) for PR 3842 at commit [`50ff80e`](https://github.com/apache/spark/commit/50ff80e4498c2cb0a30793fb41fa2d20942811d6). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...
Github user jimfcarroll commented on the pull request: https://github.com/apache/spark/pull/3254#issuecomment-68361866 @MickDavies thanks. I needed the change and was beginning the process of profiling again. 5.5 million rows with 2000+ columns took over 15 hours to write to a Parquet file for me, so I incorporated your change when I saw your description.
[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3433#issuecomment-68363448 [Test build #24896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24896/consoleFull) for PR 3433 at commit [`6b47555`](https://github.com/apache/spark/commit/6b4755503e64153fb05425b2085076450a7cbe4a). * This patch merges cleanly.
[GitHub] spark pull request: [YARN][SPARK-4929] Bug fix: fix the yarn-clien...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3771#issuecomment-68367226 @SaintBacchus so I'm still a bit unclear about the exact scenario. I just want to make sure we are handling everything properly, so I want to make sure I understand fully. So this is when the RM goes down and is being brought back up, or fails over to a standby. At that point it restarts the applications to start a new attempt. The shutdown hook is run and the code you mention above runs and unregisters. I understand client mode can't set it because the SparkContext is not in the same process. The thing that is unclear to me is: how does cluster mode set the finalStatus to something other than SUCCEEDED? Is the SparkContext being signalled and then throwing an exception, so that startUserClass catches it and marks it as failed?
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3842#issuecomment-68369026 [Test build #24895 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24895/consoleFull) for PR 3842 at commit [`50ff80e`](https://github.com/apache/spark/commit/50ff80e4498c2cb0a30793fb41fa2d20942811d6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3842#issuecomment-68369035 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24895/ Test PASSed.
[GitHub] spark pull request: [YARN][SPARK-4929] Bug fix: fix the yarn-clien...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3771#discussion_r22352924 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala --- @@ -153,6 +153,19 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments, } /** + * we should distinct the default final status between client and cluster, + * because the SUCCEEDED status may cause the HA failed in client mode and + * UNDEFINED may cause the error reporter in cluster when using sys.exit. + */ + final def getDefaultFinalStatus() = { --- End diff -- I assume we are hitting the logic on line 108 above in if (!finished) {... I think that comment and code is based on the final status defaulting to success. In the very least we should update that comment explaining what is going to happen in client vs cluster mode. Since the DisassociatedEvent exits with success for client mode I think making the default as undefined for client mode is fine.
[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3433#issuecomment-68370094 [Test build #24896 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24896/consoleFull) for PR 3433 at commit [`6b47555`](https://github.com/apache/spark/commit/6b4755503e64153fb05425b2085076450a7cbe4a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class Concat(left: Expression, right: Expression) extends BinaryExpression `
[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3433#issuecomment-68370097 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24896/ Test PASSed.
[GitHub] spark pull request: [YARN][SPARK-4929] Bug fix: fix the yarn-clien...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3771#discussion_r22353308 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala --- @@ -153,6 +153,19 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments, } /** + * we should distinct the default final status between client and cluster, --- End diff -- can we clarify this comment a little. Perhaps something more like below (feel free to reword) Set the default final application status for client mode to UNDEFINED to handle if YARN HA restarts the application so that it properly retries. Set the final status to SUCCEEDED in cluster mode to handle if the user calls System.exit from the application code.
[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3431#issuecomment-68371410 [Test build #24897 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24897/consoleFull) for PR 3431 at commit [`44eb70c`](https://github.com/apache/spark/commit/44eb70cda9049a68d7a3a4a4ca74e5bc41f04991). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3431#issuecomment-68371497 [Test build #24897 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24897/consoleFull) for PR 3431 at commit [`44eb70c`](https://github.com/apache/spark/commit/44eb70cda9049a68d7a3a4a4ca74e5bc41f04991). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class DefaultSource extends SchemaRelationProvider ` * `case class ParquetRelation2(` * `trait SchemaRelationProvider `
[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3431#issuecomment-68371499 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24897/ Test FAILed.
[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3431#issuecomment-68372387 [Test build #24898 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24898/consoleFull) for PR 3431 at commit [`02a662c`](https://github.com/apache/spark/commit/02a662c4cb3605b3abc7033ad14e3b7400c30964). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4920][UI] add version on master and wor...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3769#issuecomment-68375782 I've made the SPARK_VERSION change in the maintenance branches, so I'm now going to merge this into `master` (1.3.0), `branch-1.2` (1.2.1), `branch-1.1` (1.1.2), and `branch-1.0` (1.0.3). Thanks!
[GitHub] spark pull request: [SPARK-4920][UI] add version on master and wor...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3769
[GitHub] spark pull request: [SPARK-4920][UI] add version on master and wor...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3769#issuecomment-68376239 Actually, it looks like there are other patches that need to be cherry-picked before this can be pulled into `branch-1.1` (1.1.2) and `branch-1.0` (1.0.3); I'll tag this in JIRA for followup and handle it myself.
[GitHub] spark pull request: [SPARK-4920][UI]:current spark version in UI i...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3763#issuecomment-68376619 Branch-1.1 backport is here: #3768
[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3768#issuecomment-68376638 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3768#issuecomment-68376710 LGTM, pending Jenkins.
[GitHub] spark pull request: [SPARK-4882] Register PythonBroadcast with Kry...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3831
[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3768#issuecomment-68377078 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24899/consoleFull) for PR 3768 at commit [`ec2365f`](https://github.com/apache/spark/commit/ec2365fafa159bd4cc1d3a62a125ac76d4e0dd16). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4882] Register PythonBroadcast with Kry...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3831#issuecomment-68377184 I've merged this into `master` (1.3.0) and `branch-1.2` (1.2.1).
[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3768#issuecomment-68377345 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24899/consoleFull) for PR 3768 at commit [`ec2365f`](https://github.com/apache/spark/commit/ec2365fafa159bd4cc1d3a62a125ac76d4e0dd16). * This patch **fails** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3768#issuecomment-68377346 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24899/ Test FAILed.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3841#discussion_r22356478 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -176,6 +176,10 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli logInfo(s"Running Spark version $SPARK_VERSION") private[spark] val conf = config.clone() + val portRetriesConf = conf.getOption("spark.port.maxRetries") --- End diff -- You could use `conf.getOption(...).foreach { portRetriesConf = [...] }` but I'm not sure that it's a huge win.
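The `Option.foreach` idiom suggested in the review can be sketched as follows. This is a toy example, not the actual Spark code: the stand-in `getOption` and the map of settings are hypothetical, only the `spark.port.maxRetries` key mirrors the diff.

```scala
// Minimal sketch of the Option.foreach idiom: act on a config value only
// if it is present, without a separate isDefined/get pair.
object PortRetriesExample {
  // Stand-in for SparkConf.getOption (hypothetical; not Spark's API)
  val settings = Map("spark.port.maxRetries" -> "16")
  def getOption(key: String): Option[String] = settings.get(key)

  def main(args: Array[String]): Unit = {
    // Instead of: val v = getOption(k); if (v.isDefined) setProperty(k, v.get)
    getOption("spark.port.maxRetries").foreach { value =>
      System.setProperty("spark.port.maxRetries", value)
    }
    println(System.getProperty("spark.port.maxRetries"))
  }
}
```

The `foreach` form avoids the unsafe `get` call entirely; when the key is absent, the body simply never runs.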
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3841#discussion_r22356523 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1691,15 +1691,12 @@ private[spark] object Utils extends Logging { /** * Default maximum number of retries when binding to a port before giving up. */ - val portMaxRetries: Int = { + lazy val portMaxRetries: Int = { --- End diff -- Why is this lazy? ---
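The "why is this lazy?" question comes down to initialization order: a plain `val` in a Scala object is evaluated when the object initializes, while a `lazy val` is evaluated on first access, so it can pick up configuration set in between. A toy sketch (the `demo.key` property is hypothetical, chosen only to illustrate the ordering):

```scala
// A plain val captures the property at object-initialization time;
// a lazy val defers the read until first access.
object LazyInitDemo {
  val eager: String = sys.props.getOrElse("demo.key", "default")
  lazy val deferred: String = sys.props.getOrElse("demo.key", "default")

  def main(args: Array[String]): Unit = {
    // Touching `eager` forces object initialization before the property is set.
    val before = eager
    System.setProperty("demo.key", "configured")
    println(before)   // prints "default": captured at init time
    println(deferred) // prints "configured": read lazily, after setProperty
  }
}
```

This is exactly why the reviewer asks for an explanation: an unexplained `lazy` usually signals a hidden initialization-order dependency, which is fragile under refactoring.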
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3841#discussion_r22356578 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1719,6 +1716,7 @@ private[spark] object Utils extends Logging { serviceName: String = "", maxRetries: Int = portMaxRetries): (T, Int) = { val serviceString = if (serviceName.isEmpty) "" else s" '$serviceName'" +logInfo(s"Starting service$serviceString on port $startPort with maximum $maxRetries retries.") --- End diff -- Typo: need an extra space in `service$serviceString`.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3841#discussion_r22356620 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnableUtil.scala --- @@ -76,7 +76,9 @@ trait ExecutorRunnableUtil extends Logging { // uses Akka to connect to the scheduler, the akka settings are needed as well as the // authentication settings. sparkConf.getAll. - filter { case (k, v) => k.startsWith("spark.auth") || k.startsWith("spark.akka") }. + filter { case (k, v) => + k.startsWith("spark.auth") || k.startsWith("spark.akka") || k.equals("spark.port.maxRetries") --- End diff -- This line is underindented relative to `filter`; I'd move the `filter { case (k, v) =>` to the previous line, and the matching brace to the next line.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3841#discussion_r22356634 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -176,6 +176,10 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli logInfo(s"Running Spark version $SPARK_VERSION") private[spark] val conf = config.clone() + val portRetriesConf = conf.getOption("spark.port.maxRetries") + if (portRetriesConf.isDefined) { + System.setProperty("spark.port.maxRetries", portRetriesConf.get) --- End diff -- Won't changing from SparkConf to system properties break the ability to set this configuration via SparkConf?
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-68378792 I'm a bit confused about this change, since it seems like changing the code to read that value from system properties instead of SparkConf breaks our ability to configure it via SparkConf. Can you add a failing unit test which demonstrates the problem / bug that this patch addresses? If this issue has to do with initialization ordering, I'd like to see if we can come up with a cleaner approach which doesn't involve things like unexplained `lazy` keywords (since I'm concerned that such approaches will inevitably break when the code is modified).
[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3431#issuecomment-68378891 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24898/ Test PASSed.
[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3431#issuecomment-68378887 [Test build #24898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24898/consoleFull) for PR 3431 at commit [`02a662c`](https://github.com/apache/spark/commit/02a662c4cb3605b3abc7033ad14e3b7400c30964). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class DefaultSource extends SchemaRelationProvider ` * `case class ParquetRelation2(` * `trait SchemaRelationProvider `
[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3768#issuecomment-68380658 That latest failure is my fault (bad merge that I've reverted).
[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3768#issuecomment-68380668 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3768#issuecomment-68380897 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24900/consoleFull) for PR 3768 at commit [`ec2365f`](https://github.com/apache/spark/commit/ec2365fafa159bd4cc1d3a62a125ac76d4e0dd16). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3794#discussion_r22357607 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -202,9 +202,6 @@ abstract class RDD[T: ClassTag]( */ final def partitions: Array[Partition] = { checkpointRDD.map(_.partitions).getOrElse { - if (partitions_ == null) { --- End diff -- Won't this now throw an NPE if we call `partitions` from a worker, since now this will return `null` after the RDD is serialized and deserialized? I guess maybe we never do that?
[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...
GitHub user OopsOutOfMemory opened a pull request: https://github.com/apache/spark/pull/3847 [SPARK-5011][SQL] Add support for WITH SERDEPROPERTIES, TBLPROPERTIES in CREATE TEMPORARY TABLE The issue is here: https://issues.apache.org/jira/browse/SPARK-5011 Currently this PR is blocked by a bug I found: https://issues.apache.org/jira/browse/SPARK-5009 . As a temporary workaround, I replace `SERDEPROPERTIES` with `SERDEPROP` and `TBLPROPERTIES` with `TBLPROP`; after the bug above is fixed, I will rename them back. The final version will look like this:
```
val hbaseDDL = s"""
  |CREATE TEMPORARY TABLE hbase_people(row_key string, name string, age int, job string)
  |USING com.shengli.spark.hbase
  |OPTIONS (
  |  someOptions 'abcdefg'
  |)
  |WITH SERDEPROPERTIES (
  |  'hbase.columns.mapping'=':key, profile:name, profile:age, career:job'
  |)
  |TBLPROPERTIES (
  |  'hbase.table.name' = 'people'
  |)
""".stripMargin
```
You can merge this pull request into a Git repository by running: $ git pull https://github.com/OopsOutOfMemory/spark params Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3847.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3847 commit 0657df4a95eb0d5db8bcbfe87eedfe1477ffa1a4 Author: OopsOutOfMemory victorshen...@126.com Date: 2014-12-30T17:50:52Z add support for SERDEPROPERTIES TBLPROPERTIES
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3794#discussion_r22357840 --- Diff: core/src/main/scala/org/apache/spark/rdd/BinaryFileRDD.scala --- @@ -46,6 +47,7 @@ private[spark] class BinaryFileRDD[T]( for (i <- 0 until rawSplits.size) { result(i) = new NewHadoopPartition(id, i, rawSplits(i).asInstanceOf[InputSplit with Writable]) } +logDebug("Get these partitions took %f s".format((System.nanoTime - start) / 1e9)) --- End diff -- Since this `getPartitions` method is guaranteed to only be called once, I think we can just move this logging to its call site in `RDD.scala` (e.g. add a block near where we assign to `partitions_`).
[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3847#issuecomment-68381796 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4501][Core] - Create build/mvn to autom...
Github user brennonyork commented on the pull request: https://github.com/apache/spark/pull/3707#issuecomment-68381889 @witgo just for clarity, does this mean you aren't seeing this issue anymore? Want to ensure you aren't having any more troubles! :)
[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68382391

@markhamstra How would this interact with the idea of @erikerlandson to defer partition computation? #3079

Maybe I'm overlooking something, but #3079 seems kind of orthogonal. It seems like that issue is concerned with making the `sortByKey` transformation lazy so that it does not eagerly trigger a Spark job to compute the range partition boundaries, whereas this pull request is related to eager vs. lazy evaluation of what's effectively a Hadoop filesystem metadata call. Maybe eager vs. lazy is the wrong way to think about this PR's issue, though, since I guess we're more concerned with _where_ the call is performed (blocking DAGScheduler's event loop vs. a driver user-code thread) than when it's performed.

I suppose that maybe you could contrive an example where this patch changes the behavior of a user job, since maybe someone defines some transformations up-front, runs jobs to generate output, then reads it back in another RDD, in which case the data to be read might not exist at the time that the RDD is defined but will exist when the first action on it is invoked. So, maybe we should consider moving the first `partitions` call closer to the DAGScheduler's job submission methods, but not inside of the actor (e.g. don't change any code in `RDD`, but just add a call that traverses the lineage chain and calls `partitions` on each RDD, making sure that this call occurs before the job submitter sends a message into the DAGScheduler actor).
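A hedged sketch of that last idea, using Spark's public `partitions`/`dependencies` API (the helper name `eagerlyComputePartitions` and its call site are assumptions for illustration, not code from this PR):

```scala
import org.apache.spark.rdd.RDD
import scala.collection.mutable

// Walk the lineage chain and force partition computation on every RDD up
// front, so any slow filesystem metadata calls happen on the job-submitting
// thread instead of blocking the DAGScheduler's event loop.
def eagerlyComputePartitions(rdd: RDD[_]): Unit = {
  val visited = mutable.Set[Int]()
  def visit(r: RDD[_]): Unit = {
    if (visited.add(r.id)) {
      r.partitions // first access triggers (and caches) getPartitions
      r.dependencies.foreach(dep => visit(dep.rdd))
    }
  }
  visit(rdd)
}
```

The job submitter would call this before sending the submission message into the DAGScheduler actor.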
[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3846#issuecomment-68382800 add to whitelist
[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3846#issuecomment-68382810 ok to test
[GitHub] spark pull request: [SPARK-4998][MLlib]delete the train function
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3836#issuecomment-68383731 @srowen The Scala compiler doesn't generate the static `train` method under `DecisionTree` if there is a `train` method under `class DecisionTree`, regardless of the method signature. That's why we deprecated this method. From javap:
~~~
public class org.apache.spark.mllib.tree.DecisionTree implements scala.Serializable,org.apache.spark.Logging {
  public static scala.Option<org.apache.spark.mllib.tree.impl.NodeIdCache> findBestSplits$default$10();
  public static org.apache.spark.mllib.tree.impl.TimeTracker findBestSplits$default$9();
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainRegressor(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint>, java.util.Map<java.lang.Integer, java.lang.Integer>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainClassifier(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint>, int, java.util.Map<java.lang.Integer, java.lang.Integer>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int);
  public org.slf4j.Logger org$apache$spark$Logging$$log_();
  public void org$apache$spark$Logging$$log__$eq(org.slf4j.Logger);
  public java.lang.String logName();
  public org.slf4j.Logger log();
  public void logInfo(scala.Function0<java.lang.String>);
  public void logDebug(scala.Function0<java.lang.String>);
  public void logTrace(scala.Function0<java.lang.String>);
  public void logWarning(scala.Function0<java.lang.String>);
  public void logError(scala.Function0<java.lang.String>);
  public void logInfo(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logDebug(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logTrace(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logWarning(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logError(scala.Function0<java.lang.String>, java.lang.Throwable);
  public boolean isTraceEnabled();
  public org.apache.spark.mllib.tree.model.DecisionTreeModel run(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>);
  public org.apache.spark.mllib.tree.model.DecisionTreeModel train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>);
  public org.apache.spark.mllib.tree.DecisionTree(org.apache.spark.mllib.tree.configuration.Strategy);
}
~~~
One way to call those methods from Java is quite ugly:
~~~
DecisionTree$.MODULE$.train(...)
~~~
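A small self-contained Scala illustration of the forwarder behaviour described above (the `Trainer` class and its methods are made up for the example):

```scala
import java.lang.reflect.Modifier

// Because class Trainer defines a method named `train`, scalac does not
// emit static forwarders for the companion object's `train`, even though
// the signatures differ.
class Trainer {
  def train(data: Seq[Int]): Int = data.sum
}

object Trainer {
  def train(data: Seq[Int], iterations: Int): Int = data.sum * iterations
}

object ForwarderDemo extends App {
  val hasStaticTrain = classOf[Trainer].getMethods
    .exists(m => Modifier.isStatic(m.getModifiers) && m.getName == "train")
  println(s"static train forwarder generated: $hasStaticTrain")
  // Java callers therefore have to go through Trainer$.MODULE$.train(...)
}
```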
[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3846#issuecomment-68383326 [Test build #24901 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24901/consoleFull) for PR 3846 at commit [`3eb7e37`](https://github.com/apache/spark/commit/3eb7e3711fcae74031a94708233db0d8da348ea4). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3847#issuecomment-68384166 What is the rationale behind this change? You already have options for passing key/value pairs to the library. Also, there is nothing called a `SerDe` in the external datasources API. Why not just pass them all as options?