[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r52576421 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -272,15 +285,14 @@ class Word2Vec extends Serializable with Logging { /** * Computes the vector representation of each word in vocabulary. - * @param dataset an RDD of words + * @param dataset a RDD of sentences, --- End diff -- we should use "an" here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12982][SQL] Add table name validation i...
Github user jayadevanmurali commented on the pull request: https://github.com/apache/spark/pull/11051#issuecomment-182760091 @hvanhovell Thanks for the quick fix. I think we can retest this PR now. What do you think?
[GitHub] spark pull request: [SPARK-13279] Remove unnecessary duplicate che...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/11167#issuecomment-182778235 I don't quite see why this solves a lock problem. Should this be a set?
[GitHub] spark pull request: [SPARK-11714][Mesos] Make Spark on Mesos honor...
Github user skonto commented on the pull request: https://github.com/apache/spark/pull/11157#issuecomment-182778238 Done. Git fetch didn't work for the test build. Could someone re-launch it?
[GitHub] spark pull request: [SPARK-13264][Doc] Removed multi-byte characte...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/11149#issuecomment-182781119 Merged to master
[GitHub] spark pull request: [SPARK-13264][Doc] Removed multi-byte characte...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11149
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r52575653 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -272,15 +285,14 @@ class Word2Vec extends Serializable with Logging { /** * Computes the vector representation of each word in vocabulary. - * @param dataset an RDD of words + * @param dataset a RDD of sentences, --- End diff -- That's right, though RDD effectively starts with a vowel sound: arr-dee-dee. A native speaker would certainly say "an RDD", like "an hour". In a similar way, people disagree over "a SQL database" vs "an SQL database", but it's really a disagreement over whether you say "a _sequel_ database" or "an _ess-cyoo-ell_ database".
[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics for whole ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/11170#issuecomment-182777260 cc @zsxwing
[GitHub] spark pull request: [SPARK-11102] [SQL] Uninformative exception wh...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/9490#issuecomment-182785426 Please close this PR
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user ygcao commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r52574236 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -272,15 +285,14 @@ class Word2Vec extends Serializable with Logging { /** * Computes the vector representation of each word in vocabulary. - * @param dataset an RDD of words + * @param dataset a RDD of sentences, --- End diff -- This is an interesting topic. It seems "r" is not a vowel and does not sound like a vowel either, so why "an"? I found this on the web: You use the article "a" before words that start with a consonant sound and "an" before words that start with a vowel sound.
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user ygcao commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r52574384 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -551,12 +551,17 @@ class Word2VecModel private[spark] ( } ind += 1 } -wordList.zip(cosVec) +var topResults = wordList.zip(cosVec) .toSeq - .sortBy(- _._2) + .sortBy(-_._2) .take(num + 1) .tail - .toArray +if (vecNorm != 0.0f) { + topResults = topResults.map { case (word, cosVec) => --- End diff -- Good point!
[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9#issuecomment-182772277 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9#issuecomment-182772167 **[Test build #51090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51090/consoleFull)** for PR 9 at commit [`166a6ff`](https://github.com/apache/spark/commit/166a6fffcfb9ec8aacdcc91ce827450fca0e79d2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9#issuecomment-182772279 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51090/ Test PASSed.
[GitHub] spark pull request: [SPARK-13278][CORE] Launcher fails to start wi...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/11160#issuecomment-182779157 I think `SparkBuild.scala` has a similar computation that needs a similar treatment. Also `test("Kill process")` in `UtilsSuite.scala`.
[GitHub] spark pull request: [SPARK-13277][SQL] ANTLR ignores other rule us...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11168#issuecomment-182781741 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51089/ Test PASSed.
[GitHub] spark pull request: [SPARK-13277][SQL] ANTLR ignores other rule us...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11168#issuecomment-182758075 **[Test build #51089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51089/consoleFull)** for PR 11168 at commit [`4cb9d2a`](https://github.com/apache/spark/commit/4cb9d2a0401d10277195c7853999cc89a0853abd).
[GitHub] spark pull request: [SPARK-13260][SQL] count(*) does not work with...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11169#issuecomment-182772770 **[Test build #51091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51091/consoleFull)** for PR 11169 at commit [`b52e156`](https://github.com/apache/spark/commit/b52e1564b578cddb35931af2dd0e9c2c9d97b6f3).
[GitHub] spark pull request: [SPARK-13278][CORE] Launcher fails to start wi...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/11160#discussion_r52577418 --- Diff: launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java --- @@ -336,4 +334,18 @@ static void addPermGenSizeOpt(List cmd) { cmd.add("-XX:MaxPermSize=256m"); } + /** + * Get the major version of the java.version string supplied. + */ + static int javaMajorVersion(String javaVersion) { +String[] version = javaVersion.split("[+.\\-]+"); --- End diff -- How about just splitting on non-numbers? It's all kind of a theoretical difference though. This looks OK.
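The suggestion to split on non-numbers can be illustrated with a minimal Scala sketch. This is not the code from the pull request: `JavaVersionSketch` is a hypothetical name, and the rule that a leading `1` means the major version is the second component (as in `1.8.0_66`) is an assumption, since the diff above only shows the `split` call.

```scala
object JavaVersionSketch {
  // Hypothetical helper for illustration only; not the method from the pull request.
  def javaMajorVersion(javaVersion: String): Int = {
    // Split on every run of non-digit characters and drop empty pieces,
    // so "1.8.0_66" becomes 1, 8, 0, 66 and "9-ea" becomes 9.
    val parts = javaVersion.split("[^0-9]+").filter(_.nonEmpty).map(_.toInt)
    require(parts.nonEmpty, s"no numeric component in '$javaVersion'")
    // Assumption: legacy version strings ("1.x") carry the major version second.
    if (parts.head == 1 && parts.length > 1) parts(1) else parts.head
  }
}
```

Splitting on `[^0-9]+` treats `_`, `+`, `-`, `.` and suffixes such as `ea` uniformly, which is the practical difference from enumerating separators explicitly.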
[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics for whole ...
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/11170 [SPARK-12915] [SQL] add SQL metrics for whole stage codegen This PR adds SQL metrics for generated operators; the cost is about 0.2 nanoseconds per row. You can merge this pull request into a Git repository by running: $ git pull https://github.com/davies/spark gen_metric Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11170.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11170 commit cf21f0538338dd14623b3aa8f93ad120182a0cd6 Author: Davies Liu Date: 2016-02-11T09:14:51Z add SQL metrics for whole stage codegen
[GitHub] spark pull request: [SPARK-13074][Core] Add JavaSparkContext. getP...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10978#issuecomment-182781589 Merged to master
[GitHub] spark pull request: [SPARK-13074][Core] Add JavaSparkContext. getP...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10978
[GitHub] spark pull request: [SPARK-13277][SQL] ANTLR ignores other rule us...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11168#issuecomment-182781740 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12915] [SQL] add SQL metrics for whole ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11170#issuecomment-182781676 **[Test build #51092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51092/consoleFull)** for PR 11170 at commit [`cf21f05`](https://github.com/apache/spark/commit/cf21f0538338dd14623b3aa8f93ad120182a0cd6).
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user ygcao commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r52573698 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -289,24 +301,20 @@ class Word2Vec extends Serializable with Logging { val expTable = sc.broadcast(createExpTable()) val bcVocab = sc.broadcast(vocab) val bcVocabHash = sc.broadcast(vocabHash) - -val sentences: RDD[Array[Int]] = words.mapPartitions { iter => - new Iterator[Array[Int]] { -def hasNext: Boolean = iter.hasNext - -def next(): Array[Int] = { - val sentence = ArrayBuilder.make[Int] - var sentenceLength = 0 - while (iter.hasNext && sentenceLength < MAX_SENTENCE_LENGTH) { -val word = bcVocabHash.value.get(iter.next()) -word match { - case Some(w) => -sentence += w -sentenceLength += 1 - case None => -} +// each partition is a collection of sentences, +// will be translated into arrays of Index integer +val sentences: RDD[Array[Int]] = dataset.mapPartitions { sentenceIter => + // Each sentence will map to 0 or more Array[Int] + sentenceIter.flatMap { sentence => { + // Sentence of words, some of which map to a word index + val wordIndexes = sentence.flatMap(bcVocabHash.value.get) + if (wordIndexes.nonEmpty) { --- End diff -- Will an empty iterator make flatMap skip it, just like skipping None?
[GitHub] spark pull request: [SPARK-13260][SQL] count(*) does not work with...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/11169 [SPARK-13260][SQL] count(*) does not work with CSV data source https://issues.apache.org/jira/browse/SPARK-13260 This is a quick fix for `count(*)`. When `requiredColumns` is empty, it currently returns `sqlContext.sparkContext.emptyRDD[Row]`, which does not have the count. Just like the JSON datasource, this PR lets the CSV datasource count the rows without parsing each token. You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark SPARK-13260 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11169.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11169 commit b52e1564b578cddb35931af2dd0e9c2c9d97b6f3 Author: hyukjinkwon Date: 2016-02-11T08:42:00Z count(*) does not work with CSV data source
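The idea in the description above (keep the row count when no columns are requested, but skip tokenizing) can be sketched in plain Scala without Spark. Everything here is hypothetical: `CsvCountSketch`, the `Row` alias, and the use of integer column positions for `requiredColumns` are stand-ins for illustration, not the data source's actual types or code.

```scala
object CsvCountSketch {
  // Hypothetical stand-in for a row; this is not Spark's Row class.
  type Row = Seq[String]

  def scan(lines: Seq[String], requiredColumns: Seq[Int]): Seq[Row] =
    if (requiredColumns.isEmpty) {
      // Emit one empty row per input line: the count survives and no line is split.
      lines.map(_ => Seq.empty[String])
    } else {
      lines.map { line =>
        // Naive comma split, for illustration only (no quoting or escaping).
        val tokens = line.split(",", -1)
        requiredColumns.map(i => tokens(i))
      }
    }
}
```

Returning an empty collection in the first branch would make a count over the result come out as zero, which is the symptom the pull request describes.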
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r52575825 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -289,24 +301,20 @@ class Word2Vec extends Serializable with Logging { val expTable = sc.broadcast(createExpTable()) val bcVocab = sc.broadcast(vocab) val bcVocabHash = sc.broadcast(vocabHash) - -val sentences: RDD[Array[Int]] = words.mapPartitions { iter => - new Iterator[Array[Int]] { -def hasNext: Boolean = iter.hasNext - -def next(): Array[Int] = { - val sentence = ArrayBuilder.make[Int] - var sentenceLength = 0 - while (iter.hasNext && sentenceLength < MAX_SENTENCE_LENGTH) { -val word = bcVocabHash.value.get(iter.next()) -word match { - case Some(w) => -sentence += w -sentenceLength += 1 - case None => -} +// each partition is a collection of sentences, +// will be translated into arrays of Index integer +val sentences: RDD[Array[Int]] = dataset.mapPartitions { sentenceIter => + // Each sentence will map to 0 or more Array[Int] + sentenceIter.flatMap { sentence => { + // Sentence of words, some of which map to a word index + val wordIndexes = sentence.flatMap(bcVocabHash.value.get) + if (wordIndexes.nonEmpty) { --- End diff -- Yes, `flatMap` would flatten an empty iterator to nothing.
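The `flatMap` behaviour discussed above can be checked with a small standalone Scala sketch. It uses ordinary collections rather than an RDD, and the `vocabHash` contents and the `FlatMapSketch` name are made up for illustration; it mirrors only the shape of the diff, not the full Word2Vec code.

```scala
object FlatMapSketch {
  // Hypothetical vocabulary mapping each known word to an index.
  val vocabHash: Map[String, Int] = Map("spark" -> 0, "rdd" -> 1)

  def toIndexes(sentences: Iterator[Seq[String]]): List[Array[Int]] =
    sentences.flatMap { sentence =>
      // Map.get returns an Option; flatMap drops None and unwraps Some.
      val wordIndexes = sentence.flatMap(w => vocabHash.get(w)).toArray
      // An empty iterator contributes nothing to the flattened result.
      if (wordIndexes.nonEmpty) Iterator(wordIndexes) else Iterator.empty
    }.toList
}
```

A sentence made up only of unknown words therefore disappears from the output in the same way an unknown word disappears from a sentence.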
[GitHub] spark pull request: [SPARK-13277][SQL] ANTLR ignores other rule us...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11168#issuecomment-182781527 **[Test build #51089 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51089/consoleFull)** for PR 11168 at commit [`4cb9d2a`](https://github.com/apache/spark/commit/4cb9d2a0401d10277195c7853999cc89a0853abd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13139][SQL][WIP] Create native DDL comm...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11048#issuecomment-182784626 btw @viirya can we create an execution.commands package for this?
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183191566 @yanboliang Could you take a look?
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
GitHub user NarineK opened a pull request: https://github.com/apache/spark/pull/11179 [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegression.AFTAggregator improvements - Avoids creating new instances of arrays/vectors for each record As also noted by the TODO in AFTAggregator.AFTAggregator.add(data: AFTPoint), a new array is created for the intercept value and concatenated with another array that contains the betas; the resulting array is converted into a dense vector, which in turn is converted into a Breeze vector. This is expensive and not necessarily beautiful. I've tried to solve the above-mentioned problem with a simple algebraic decomposition: keeping and treating the intercept independently. Please let me know what you think and if you have any questions. Thanks, Narine You can merge this pull request into a Git repository by running: $ git pull https://github.com/NarineK/spark survivaloptim Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11179.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11179 commit 8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7 Author: Narine Kokhlikyan Date: 2016-02-12T02:42:08Z Initial commit - AFTSurvivalRegression improvements
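The algebraic decomposition mentioned in the description can be illustrated with a short Scala sketch. This is not the AFTAggregator code: `InterceptSketch`, `marginConcat` and `marginSplit` are hypothetical names, and the assumption that the concatenated coefficients are dotted against features prefixed with a constant 1.0 is mine, made only to show why the two forms agree.

```scala
object InterceptSketch {
  // Plain dot product over primitive arrays.
  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    var sum = 0.0
    var i = 0
    while (i < a.length) { sum += a(i) * b(i); i += 1 }
    sum
  }

  // Allocating form: builds (intercept, betas) and (1.0, x) anew for every record.
  def marginConcat(intercept: Double, betas: Array[Double], x: Array[Double]): Double =
    dot(Array(intercept) ++ betas, Array(1.0) ++ x)

  // Decomposed form: the intercept is kept separate, so nothing is allocated per record.
  def marginSplit(intercept: Double, betas: Array[Double], x: Array[Double]): Double =
    intercept + dot(betas, x)
}
```

Both functions compute the same quantity, intercept plus the sum of beta_i * x_i; the second simply avoids building two temporary arrays for each record.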
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-183194619 **[Test build #51169 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51169/consoleFull)** for PR 11100 at commit [`79c11de`](https://github.com/apache/spark/commit/79c11de8954e137e134d3a8645b6936cd625f38e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-183195272 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51169/ Test PASSed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183195289 **[Test build #51174 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51174/consoleFull)** for PR 9893 at commit [`fe79873`](https://github.com/apache/spark/commit/fe79873ef416f3fd4ca29b6970cc2991fb43d017).
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-183195269 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183191594 ok to test
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183197817 **[Test build #51173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51173/consoleFull)** for PR 11179 at commit [`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7).
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user ygcao commented on the pull request: https://github.com/apache/spark/pull/10152#issuecomment-183197942 Addressed the new comments. I still kept the if statement, for the reason I explained with the sample code. Reran the tests and the lint check; Jenkins should still be happy :fireworks:
[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/8#issuecomment-183198008 Just saw this got merged. I'm probably missing some context, but can somebody explain to me why something so conceptually simple leads to such a big patch?
[GitHub] spark pull request: SPARK-12729 PhantomReferences to replace Final...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/11140#issuecomment-183198830 retest this please
[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/10838#issuecomment-183200242 Merging to master. Thanks, @redsanket
[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10838
[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/10838#issuecomment-183200705 @redsanket what's your JIRA account name? I want to assign it to you.
[GitHub] spark pull request: SPARK-12729 PhantomReferences to replace Final...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11140#issuecomment-183201556 **[Test build #51175 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51175/consoleFull)** for PR 11140 at commit [`837252a`](https://github.com/apache/spark/commit/837252a74ec87e8f1ac07e80406bf0410c9088d7).
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183205820 **[Test build #51173 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51173/consoleFull)** for PR 11179 at commit [`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183206177 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51173/ Test PASSed.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183206102 **[Test build #51176 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51176/consoleFull)** for PR 11178 at commit [`bef62eb`](https://github.com/apache/spark/commit/bef62ebb8ec5065061ff0ca49a4cb7e0182c47b6).
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183206172 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183211270 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183211245 **[Test build #51176 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51176/consoleFull)** for PR 11178 at commit [`bef62eb`](https://github.com/apache/spark/commit/bef62ebb8ec5065061ff0ca49a4cb7e0182c47b6). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13294] [PROJECT INFRA] Don't build full...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11178#issuecomment-183211271 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51176/ Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183216221 **[Test build #51172 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51172/consoleFull)** for PR 9893 at commit [`e61ec6a`](https://github.com/apache/spark/commit/e61ec6a4a3b603d34c6f7de697d61ee559786337). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183216521 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51172/ Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183216520 Merged build finished. Test FAILed.
[GitHub] spark pull request: Added pygments.rb dependancy
GitHub user amitdev opened a pull request: https://github.com/apache/spark/pull/11180 Added pygments.rb dependancy It looks like the pygments.rb gem is also required for the jekyll build to work. At least on Ubuntu/RHEL I could not build without this dependency, so I added it to the steps. You can merge this pull request into a Git repository by running: $ git pull https://github.com/amitdev/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11180.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11180 commit f705e9bbe7f1e6a6393062c07e239b23ebf53ac8 Author: Amit Dev Date: 2016-02-12T07:43:13Z Added pygments.rb dependancy
[GitHub] spark pull request: [Documentation] Added pygments.rb dependancy
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11180#issuecomment-183219234 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183219699 **[Test build #51174 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51174/consoleFull)** for PR 9893 at commit [`fe79873`](https://github.com/apache/spark/commit/fe79873ef416f3fd4ca29b6970cc2991fb43d017). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183219843 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51174/ Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183219842 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/11153#issuecomment-183197064 LGTM
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183193865 **[Test build #51170 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51170/consoleFull)** for PR 11179 at commit [`8d443e9`](https://github.com/apache/spark/commit/8d443e9d7cd4b8b4cf7a4e14bec8287b7db6aff7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183194055 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user ygcao commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r52708705
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -289,24 +301,20 @@ class Word2Vec extends Serializable with Logging {
     val expTable = sc.broadcast(createExpTable())
     val bcVocab = sc.broadcast(vocab)
     val bcVocabHash = sc.broadcast(vocabHash)
-
-    val sentences: RDD[Array[Int]] = words.mapPartitions { iter =>
-      new Iterator[Array[Int]] {
-        def hasNext: Boolean = iter.hasNext
-
-        def next(): Array[Int] = {
-          val sentence = ArrayBuilder.make[Int]
-          var sentenceLength = 0
-          while (iter.hasNext && sentenceLength < MAX_SENTENCE_LENGTH) {
-            val word = bcVocabHash.value.get(iter.next())
-            word match {
-              case Some(w) =>
-                sentence += w
-                sentenceLength += 1
-              case None =>
-            }
+    // each partition is a collection of sentences,
+    // will be translated into arrays of Index integer
+    val sentences: RDD[Array[Int]] = dataset.mapPartitions { sentenceIter =>
+      // Each sentence will map to 0 or more Array[Int]
+      sentenceIter.flatMap { sentence => {
+        // Sentence of words, some of which map to a word index
+        val wordIndexes = sentence.flatMap(bcVocabHash.value.get)
+        if (wordIndexes.nonEmpty) {
--- End diff --
Sorry, I am still not quite sure about this. I ran a test, and it turns out I am right :grinning:
    scala> val sentences=List("test sen 1","","testsen 2")
    sentences: List[String] = List(test sen 1, "", testsen 2)
    scala> val rdd=sc.parallelize(sentences)
    rdd: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at :23
    scala> val results=rdd.flatMap(sen=>sen.split(" ").grouped(1))
    results: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[1] at flatMap at :25
    scala> results.collect
    res0: Array[Array[String]] = Array(Array(test), Array(sen), Array(1), **Array("")**, Array(testsen), Array(2))
Without the if statement we end up with empty entries, which could cause trouble in the following steps. I'd like to be on the safe side; the if statement is cheap enough.
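To make the two cases in this thread concrete, here is a minimal plain-Scala sketch that needs no Spark. The `vocab` map, the object name, and the group size of 1000 are made up for illustration; `vocab` only stands in for `bcVocabHash.value`. It checks what splitting an empty sentence returns, and separately what `grouped` returns once the vocabulary lookup has dropped every word.

```scala
object EmptySentenceSketch {
  // Hypothetical vocabulary standing in for bcVocabHash.value.
  val vocab: Map[String, Int] = Map("test" -> 0, "sen" -> 1, "testsen" -> 2)

  // Split a sentence on spaces and keep only the words found in the vocabulary.
  def toIndexes(sentence: String): List[Int] =
    sentence.split(" ").toList.flatMap(w => vocab.get(w))

  def main(args: Array[String]): Unit = {
    // Splitting an empty string gives one empty token, not an empty array.
    println("".split(" ").length)
    // The empty token is not in the vocabulary, so no indexes remain.
    println(toIndexes("").isEmpty)
    // grouped on an empty collection yields no groups at all.
    println(toIndexes("").grouped(1000).isEmpty)
    // Words missing from the vocabulary are simply dropped.
    println(toIndexes("test sen 1"))
  }
}
```

Whether the guard is needed in the PR therefore depends on whether an empty array can actually reach the grouping step; this sketch does not settle how the real code behaves.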
[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11179#issuecomment-183194060 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51170/ Test PASSed.
[GitHub] spark pull request: [SPARK-13196] [MLlib] Optimize the iterator in...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11078#issuecomment-183194684 @hhbyyh Did you test it? `Iterator` is lazy. I think the new version would consume more memory because `modified` would store all the values.
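The laziness point can be illustrated with a small standalone sketch. It assumes nothing about the actual code in PR 11078: the object, the `touched` counter, and the `doubled` helper are hypothetical, and `toArray` merely stands in for any step that stores all the values at once.

```scala
object LazyIteratorSketch {
  // Counts how many elements the mapping function has processed so far.
  var touched = 0

  // Iterator.map is lazy: the function runs only when an element is requested.
  def doubled(input: Iterator[Int]): Iterator[Int] =
    input.map { x => touched += 1; x * 2 }

  def main(args: Array[String]): Unit = {
    val lazyIt = doubled(Iterator(1, 2, 3))
    println(touched) // no element has been processed yet
    lazyIt.next()
    println(touched) // only the requested element was processed
    // Materializing forces every element and holds them all in memory.
    val all = doubled(Iterator(1, 2, 3)).toArray
    println(all.length)
  }
}
```

A lazily mapped iterator keeps one element in flight at a time, while a materialized collection holds every value, which is the memory concern raised in the comment.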
[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11153#discussion_r52707357
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -572,98 +572,64 @@ class Analyzer(
       // Skip sort with aggregate. This will be handled in ResolveAggregateFunctions
       case sa @ Sort(_, _, child: Aggregate) => sa
-      case s @ Sort(_, _, child) if !s.resolved && child.resolved =>
-        val (newOrdering, missingResolvableAttrs) = collectResolvableMissingAttrs(s.order, child)
-
-        if (missingResolvableAttrs.isEmpty) {
-          val unresolvableAttrs = s.order.filterNot(_.resolved)
-          logDebug(s"Failed to find $unresolvableAttrs in ${child.output.mkString(", ")}")
-          s // Nothing we can do here. Return original plan.
-        } else {
-          // Add the missing attributes into projectList of Project/Window or
-          // aggregateExpressions of Aggregate, if they are in the inputSet
-          // but not in the outputSet of the plan.
-          val newChild = child transformUp {
-            case p: Project =>
-              p.copy(projectList = p.projectList ++
-                missingResolvableAttrs.filter((p.inputSet -- p.outputSet).contains))
-            case w: Window =>
-              w.copy(projectList = w.projectList ++
-                missingResolvableAttrs.filter((w.inputSet -- w.outputSet).contains))
-            case a: Aggregate =>
-              val resolvableAttrs = missingResolvableAttrs.filter(a.groupingExpressions.contains)
-              val notResolvedAttrs = resolvableAttrs.filterNot(a.aggregateExpressions.contains)
-              val newAggregateExpressions = a.aggregateExpressions ++ notResolvedAttrs
-              a.copy(aggregateExpressions = newAggregateExpressions)
-            case o => o
-          }
-
+      case s @ Sort(order, _, child) if !s.resolved && child.resolved =>
+        val newOrder = order.map(resolveExpressionRecursively(_, child).asInstanceOf[SortOrder])
+        val requiredAttrs = AttributeSet(newOrder).filter(_.resolved)
+        val missingAttrs = requiredAttrs -- child.outputSet
+        if (missingAttrs.nonEmpty) {
           // Add missing attributes and then project them away after the sort.
           Project(child.output,
-            Sort(newOrdering, s.global, newChild))
+            Sort(newOrder, s.global, addMissingAttr(child, missingAttrs)))
+        } else if (newOrder != order) {
+          s.copy(order = newOrder)
+        } else {
+          s
         }
     }
     /**
-     * Traverse the tree until resolving the sorting attributes
-     * Return all the resolvable missing sorting attributes
-     */
-    @tailrec
-    private def collectResolvableMissingAttrs(
-        ordering: Seq[SortOrder],
-        plan: LogicalPlan): (Seq[SortOrder], Seq[Attribute]) = {
+     * Add the missing attributes into projectList of Project/Window or aggregateExpressions of
+     * Aggregate.
+     */
+    private def addMissingAttr(plan: LogicalPlan, missingAttrs: AttributeSet): LogicalPlan = {
+      if (missingAttrs.isEmpty) {
+        return plan
+      }
       plan match {
-        // Only Windows and Project have projectList-like attribute.
-        case un: UnaryNode if un.isInstanceOf[Project] || un.isInstanceOf[Window] =>
-          val (newOrdering, missingAttrs) = resolveAndFindMissing(ordering, un, un.child)
-          // If missingAttrs is non empty, that means we got it and return it;
-          // Otherwise, continue to traverse the tree.
-          if (missingAttrs.nonEmpty) {
-            (newOrdering, missingAttrs)
-          } else {
-            collectResolvableMissingAttrs(ordering, un.child)
-          }
+        case p: Project =>
+          val missing = missingAttrs -- p.child.outputSet
+          Project(p.projectList ++ missingAttrs, addMissingAttr(p.child, missing))
+        case w: Window =>
+          val missing = missingAttrs -- w.child.outputSet
+          w.copy(projectList = w.projectList ++ missingAttrs,
+            child = addMissingAttr(w.child, missing))
         case a: Aggregate =>
-          val (newOrdering, missingAttrs) = resolveAndFindMissing(ordering, a, a.child)
-          // For Aggregate, all the order by columns must be specified in group by clauses
-          if (missingAttrs.nonEmpty &&
-              missingAttrs.forall(ar => a.groupingExpressions.exists(_.semanticEquals(ar)))) {
-            (newOrdering, missingAttrs)
-          } else {
-            // If missingAttrs is empty, we are unable to
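The idea in the new `Sort` case (add the attributes the sort needs to the child, sort, then project them away again) can be sketched on a toy plan model. Everything below is a simplified assumption, not Catalyst: the `Plan`, `Relation`, `Project`, and `Sort` classes are hypothetical stand-ins, attributes are plain strings, and `Window`, `Aggregate`, and expression resolution are left out.

```scala
object SortMissingAttrSketch {
  // Toy stand-ins for plan nodes; an attribute is just a column name here.
  sealed trait Plan { def output: Seq[String] }
  case class Relation(output: Seq[String]) extends Plan
  case class Project(projectList: Seq[String], child: Plan) extends Plan {
    def output: Seq[String] = projectList
  }
  case class Sort(order: Seq[String], child: Plan) extends Plan {
    def output: Seq[String] = child.output
  }

  // Push the attributes the sort needs down through nested Projects.
  def addMissingAttr(plan: Plan, missing: Set[String]): Plan =
    if (missing.isEmpty) plan
    else plan match {
      case p: Project =>
        val stillMissing = missing -- p.child.output
        Project(p.projectList ++ missing, addMissingAttr(p.child, stillMissing))
      case other => other
    }

  // Sort on columns the child does not expose, then project them away again.
  def resolveSort(s: Sort): Plan = {
    val missing: Set[String] = s.order.toSet -- s.child.output
    if (missing.isEmpty) s
    else Project(s.child.output, Sort(s.order, addMissingAttr(s.child, missing)))
  }
}
```

For a query shaped like `SELECT a FROM t ORDER BY b`, `resolveSort` widens the inner projection to expose `b` for the sort and wraps the result in an outer projection that keeps only `a`.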
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183189627 **[Test build #51172 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51172/consoleFull)** for PR 9893 at commit [`e61ec6a`](https://github.com/apache/spark/commit/e61ec6a4a3b603d34c6f7de697d61ee559786337).
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183189659 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51171/ Test FAILed.
[GitHub] spark pull request: [SPARK-10521][SQL] Utilize Docker for test DB2...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9893#issuecomment-183189658 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12705] [SQL] push missing attributes fo...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11153#discussion_r52707329 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -572,98 +572,64 @@ class Analyzer( // Skip sort with aggregate. This will be handled in ResolveAggregateFunctions case sa @ Sort(_, _, child: Aggregate) => sa - case s @ Sort(_, _, child) if !s.resolved && child.resolved => -val (newOrdering, missingResolvableAttrs) = collectResolvableMissingAttrs(s.order, child) - -if (missingResolvableAttrs.isEmpty) { - val unresolvableAttrs = s.order.filterNot(_.resolved) - logDebug(s"Failed to find $unresolvableAttrs in ${child.output.mkString(", ")}") - s // Nothing we can do here. Return original plan. -} else { - // Add the missing attributes into projectList of Project/Window or - // aggregateExpressions of Aggregate, if they are in the inputSet - // but not in the outputSet of the plan. - val newChild = child transformUp { -case p: Project => - p.copy(projectList = p.projectList ++ -missingResolvableAttrs.filter((p.inputSet -- p.outputSet).contains)) -case w: Window => - w.copy(projectList = w.projectList ++ -missingResolvableAttrs.filter((w.inputSet -- w.outputSet).contains)) -case a: Aggregate => - val resolvableAttrs = missingResolvableAttrs.filter(a.groupingExpressions.contains) - val notResolvedAttrs = resolvableAttrs.filterNot(a.aggregateExpressions.contains) - val newAggregateExpressions = a.aggregateExpressions ++ notResolvedAttrs - a.copy(aggregateExpressions = newAggregateExpressions) -case o => o - } - + case s @ Sort(order, _, child) if !s.resolved && child.resolved => +val newOrder = order.map(resolveExpressionRecursively(_, child).asInstanceOf[SortOrder]) +val requiredAttrs = AttributeSet(newOrder).filter(_.resolved) +val missingAttrs = requiredAttrs -- child.outputSet +if (missingAttrs.nonEmpty) { // Add missing attributes and then project them away after the sort. 
 Project(child.output, -Sort(newOrdering, s.global, newChild)) +Sort(newOrder, s.global, addMissingAttr(child, missingAttrs))) +} else if (newOrder != order) { + s.copy(order = newOrder) +} else { + s } } /** - * Traverse the tree until resolving the sorting attributes - * Return all the resolvable missing sorting attributes - */ -@tailrec -private def collectResolvableMissingAttrs( -ordering: Seq[SortOrder], -plan: LogicalPlan): (Seq[SortOrder], Seq[Attribute]) = { + * Add the missing attributes into projectList of Project/Window or aggregateExpressions of + * Aggregate. + */ +private def addMissingAttr(plan: LogicalPlan, missingAttrs: AttributeSet): LogicalPlan = { --- End diff -- It makes sense to me. Thanks!
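The rewrite in the diff above (add the attributes the sort needs to the child, sort, then project them away again) can be illustrated with a toy plan model. This is only a sketch: `Plan`, `Relation`, `Project`, `Sort` and `SortResolution` below are hypothetical simplifications that use plain strings as attributes, not Catalyst's classes, and only `Project` is modeled (the real patch also handles `Window` and `Aggregate`).

```scala
// Toy plan nodes; hypothetical simplifications, not Catalyst's classes.
sealed trait Plan { def output: Seq[String] }
final case class Relation(output: Seq[String]) extends Plan
final case class Project(projectList: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = projectList
}
final case class Sort(order: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = child.output
}

object SortResolution {
  // Push the missing attributes down through Project nodes, recursing only
  // for attributes that the child does not already output.
  def addMissingAttr(plan: Plan, missing: Set[String]): Plan =
    if (missing.isEmpty) plan
    else plan match {
      case p: Project =>
        val stillMissing = missing -- p.child.output
        Project(p.projectList ++ missing, addMissingAttr(p.child, stillMissing))
      case other => other
    }

  // If the sort needs attributes its child does not output, widen the child,
  // sort, and then project back to the original output.
  def resolveSort(s: Sort): Plan = {
    val missing = s.order.toSet -- s.child.output
    if (missing.isEmpty) s
    else Project(s.child.output, Sort(s.order, addMissingAttr(s.child, missing)))
  }
}
```

For example, sorting `Project(a)` over a relation with columns `a, b` by `b` yields an outer `Project(a)` over `Sort(b)` over `Project(a, b)`, so the visible output is unchanged.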
[GitHub] spark pull request: [SPARK-13282][SQL] LogicalPlan toSql should ju...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11171#issuecomment-182792045 **[Test build #51093 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51093/consoleFull)** for PR 11171 at commit [`9fd34fc`](https://github.com/apache/spark/commit/9fd34fc1fa27c09cfa5426a53be85cbc5e0460c3).
[GitHub] spark pull request: [SPARK-10780][ML][WIP] Add initial model to km...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9#issuecomment-182758831 **[Test build #51090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51090/consoleFull)** for PR 9 at commit [`166a6ff`](https://github.com/apache/spark/commit/166a6fffcfb9ec8aacdcc91ce827450fca0e79d2).
[GitHub] spark pull request: [SPARK-13252] [KAFKA] Bump up Kafka to 0.9.0.0
Github user mariobriggs commented on the pull request: https://github.com/apache/spark/pull/11143#issuecomment-182769749 FWIW, the [IBM Cloud Message Hub service](https://www.ng.bluemix.net/docs/services/MessageHub/index.html#messagehub050), which is Kafka, has already moved to 0.9.0, so I support option 1 that @markgrover suggests.
[GitHub] spark pull request: [SPARK-12414] [CORE] Remove closure serializer
Github user srowen closed the pull request at: https://github.com/apache/spark/pull/11150
[GitHub] spark pull request: [SPARK-12982][SQL] Add table name validation i...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/11051#issuecomment-182892635 @jayadevanmurali issuing `retest this please` should normally do the trick.
[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r52610253 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,472 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.regression + +import breeze.stats.distributions.{Gaussian => GD} + +import org.apache.spark.Logging +import org.apache.spark.annotation.{Experimental, Since} +import org.apache.spark.ml.PredictorParams +import org.apache.spark.ml.feature.Instance +import org.apache.spark.ml.optim._ +import org.apache.spark.ml.param._ +import org.apache.spark.ml.param.shared._ +import org.apache.spark.ml.util.Identifiable +import org.apache.spark.mllib.linalg.{BLAS, Vector} +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.{DataFrame, Row} +import org.apache.spark.sql.functions._ + +/** + * Params for Generalized Linear Regression. 
+ */ +private[regression] trait GeneralizedLinearRegressionParams extends PredictorParams + with HasFitIntercept with HasMaxIter with HasTol with HasRegParam with HasWeightCol + with HasSolver with Logging { + + /** + * Param for the name of family which is a description of the error distribution + * to be used in the model. + * Supported options: "gaussian", "binomial", "poisson" and "gamma". + * @group param + */ + @Since("2.0.0") + final val family: Param[String] = new Param(this, "family", +"the name of family which is a description of the error distribution to be used in the model", + ParamValidators.inArray[String](GeneralizedLinearRegression.supportedFamilies.toArray)) + + /** @group getParam */ + @Since("2.0.0") + def getFamily: String = $(family) + + /** + * Param for the name of the model link function. + * Supported options: "identity", "log", "inverse", "logit", "probit", "cloglog" and "sqrt". + * @group param + */ + @Since("2.0.0") + final val link: Param[String] = new Param(this, "link", "the name of the model link function", + ParamValidators.inArray[String](GeneralizedLinearRegression.supportedLinks.toArray)) + + /** @group getParam */ + @Since("2.0.0") + def getLink: String = $(link) + + @Since("2.0.0") + override def validateParams(): Unit = { + require(GeneralizedLinearRegression.supportedFamilyLinkPairs.contains($(family) -> $(link)), + s"Generalized Linear Regression with ${$(family)} family does not support ${$(link)} " + +s"link function.") + } +} + +/** + * :: Experimental :: + * + * Fit a Generalized Linear Model ([[https://en.wikipedia.org/wiki/Generalized_linear_model]]) + * specified by giving a symbolic description of the linear predictor and + * a description of the error distribution. 
+ */ +@Experimental +@Since("2.0.0") +class GeneralizedLinearRegression @Since("2.0.0") (@Since("2.0.0") override val uid: String) + extends Regressor[Vector, GeneralizedLinearRegression, GeneralizedLinearRegressionModel] + with GeneralizedLinearRegressionParams with Logging { + + @Since("2.0.0") + def this() = this(Identifiable.randomUID("genLinReg")) + + /** + * Set the name of family which is a description of the error distribution + * to be used in the model. + * @group setParam + */ + @Since("2.0.0") + def setFamily(value: String): this.type = set(family, value) + + /** + * Set the name of the model link function. + * @group setParam + */ + @Since("2.0.0") + def setLink(value: String): this.type = set(link, value) + + /** + * Set if we should fit the intercept. + * Default is true. + * @group setParam + */ + @Since("2.0.0") + def setFitIntercept(value: Boolean): this.type = set(fitIntercept, value) +
[GitHub] spark pull request: [SPARK-13124] [Web UI] Fixed CSS and JS issues...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/11038#issuecomment-182895149 +1
[GitHub] spark pull request: [SPARK-13124] [Web UI] Fixed CSS and JS issues...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11038
[GitHub] spark pull request: [SPARK-11701][SPARK-13054] dynamic allocation ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/10951#issuecomment-182894494 Jenkins, test this please
[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r52611868 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,472 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.regression + +import breeze.stats.distributions.{Gaussian => GD} + +import org.apache.spark.Logging +import org.apache.spark.annotation.{Experimental, Since} +import org.apache.spark.ml.PredictorParams +import org.apache.spark.ml.feature.Instance +import org.apache.spark.ml.optim._ +import org.apache.spark.ml.param._ +import org.apache.spark.ml.param.shared._ +import org.apache.spark.ml.util.Identifiable +import org.apache.spark.mllib.linalg.{BLAS, Vector} +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.{DataFrame, Row} +import org.apache.spark.sql.functions._ + +/** + * Params for Generalized Linear Regression. 
+ */ +private[regression] trait GeneralizedLinearRegressionParams extends PredictorParams + with HasFitIntercept with HasMaxIter with HasTol with HasRegParam with HasWeightCol + with HasSolver with Logging { + + /** + * Param for the name of family which is a description of the error distribution + * to be used in the model. + * Supported options: "gaussian", "binomial", "poisson" and "gamma". + * @group param + */ + @Since("2.0.0") + final val family: Param[String] = new Param(this, "family", +"the name of family which is a description of the error distribution to be used in the model", + ParamValidators.inArray[String](GeneralizedLinearRegression.supportedFamilies.toArray)) + + /** @group getParam */ + @Since("2.0.0") + def getFamily: String = $(family) + + /** + * Param for the name of the model link function. + * Supported options: "identity", "log", "inverse", "logit", "probit", "cloglog" and "sqrt". + * @group param + */ + @Since("2.0.0") + final val link: Param[String] = new Param(this, "link", "the name of the model link function", + ParamValidators.inArray[String](GeneralizedLinearRegression.supportedLinks.toArray)) + + /** @group getParam */ + @Since("2.0.0") + def getLink: String = $(link) + + @Since("2.0.0") + override def validateParams(): Unit = { + require(GeneralizedLinearRegression.supportedFamilyLinkPairs.contains($(family) -> $(link)), + s"Generalized Linear Regression with ${$(family)} family does not support ${$(link)} " + +s"link function.") + } +} + +/** + * :: Experimental :: + * + * Fit a Generalized Linear Model ([[https://en.wikipedia.org/wiki/Generalized_linear_model]]) + * specified by giving a symbolic description of the linear predictor and + * a description of the error distribution. 
+ */ +@Experimental +@Since("2.0.0") +class GeneralizedLinearRegression @Since("2.0.0") (@Since("2.0.0") override val uid: String) + extends Regressor[Vector, GeneralizedLinearRegression, GeneralizedLinearRegressionModel] + with GeneralizedLinearRegressionParams with Logging { + + @Since("2.0.0") + def this() = this(Identifiable.randomUID("genLinReg")) + + /** + * Set the name of family which is a description of the error distribution + * to be used in the model. + * @group setParam + */ + @Since("2.0.0") + def setFamily(value: String): this.type = set(family, value) + + /** + * Set the name of the model link function. + * @group setParam + */ + @Since("2.0.0") + def setLink(value: String): this.type = set(link, value) + + /** + * Set if we should fit the intercept. + * Default is true. + * @group setParam + */ + @Since("2.0.0") + def setFitIntercept(value: Boolean): this.type = set(fitIntercept, value) +
[GitHub] spark pull request: [SPARK-12594] [SQL] Outer Join Elimination by ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10567#issuecomment-182898263 **[Test build #51098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51098/consoleFull)** for PR 10567 at commit [`e7fa63f`](https://github.com/apache/spark/commit/e7fa63f2581b77fdbb1437ed0bf21b8fde137db0).
[GitHub] spark pull request: [SPARK-13139][SQL][WIP] Create native DDL comm...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11048#discussion_r52612493 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala --- @@ -52,7 +56,7 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly getClauses(Seq("TOK_CREATETABLE", "FORMATTED", "EXTENDED"), explainArgs) ExplainCommand(nodeToPlan(crtTbl), extended = extended.isDefined) - case Token("TOK_EXPLAIN", explainArgs) => + case Token("TOK_EXPLAIN", explainArgs) if "TOK_QUERY" == explainArgs.head.text => --- End diff -- Why not `Token("TOK_EXPLAIN", Token("TOK_QUERY", query) :: explainArgs) =>` ?
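The difference between the guard in the diff and the nested pattern suggested in the review can be shown with a small stand-alone sketch. `Node` and the `Token` extractor below are hypothetical stand-ins that only mirror the shape of the patterns quoted above; they are not Spark's parser classes.

```scala
// Hypothetical stand-in for the parser's AST node; not Spark's actual class.
final case class Node(text: String, children: List[Node] = Nil)

// Extractor mirroring the shape of the `Token(text, children)` patterns in the diff.
object Token {
  def unapply(n: Node): Option[(String, List[Node])] = Some((n.text, n.children))
}

object ExplainMatch {
  // Guard-based form, as in the diff: `head` throws if the argument list is empty.
  def withGuard(n: Node): String = n match {
    case Token("TOK_EXPLAIN", explainArgs) if "TOK_QUERY" == explainArgs.head.text =>
      "explain-query"
    case _ => "other"
  }

  // Nested-pattern form, as suggested in the review: an empty argument list
  // simply fails to match and falls through to the next case.
  def withNestedPattern(n: Node): String = n match {
    case Token("TOK_EXPLAIN", Token("TOK_QUERY", _) :: _) => "explain-query"
    case _ => "other"
  }
}
```

Besides being shorter, the nested pattern binds the query node directly and avoids the `head` call, which is the one place where the guard version can throw instead of falling through.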
[GitHub] spark pull request: [SPARK-12792][SPARKR] Refactor RRDD to support...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10947#issuecomment-182890890 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51094/ Test PASSed.
[GitHub] spark pull request: [SPARK-12792][SPARKR] Refactor RRDD to support...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10947#issuecomment-182890889 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13277][SQL] ANTLR ignores other rule us...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/11168#issuecomment-182895732 @hvanhovell there are two alternatives that can match the `tableProvider` rule: `tableProvider tableOpts? (KW_AS selectStatementWithCTE)?` and `(LPAREN columnNameTypeList RPAREN)? (p=tableProvider?) ...`. Because `(LPAREN columnNameTypeList RPAREN)` is optional, the input `KW_USING Identifier` can be matched by both paths. So the warning is emitted, path 1 is chosen, and path 2 is disabled. Actually it doesn't affect the functionality we need.
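The ambiguity described above can be sketched with two hand-written matchers over a token list. This is a toy model only: the token names are taken from the comment, but `AmbiguityDemo`, its helpers, and the "first matching alternative wins" rule are simplifying assumptions, and ANTLR's actual prediction machinery is not modeled.

```scala
object AmbiguityDemo {
  type Tokens = List[String]

  // tableProvider, simplified to: KW_USING Identifier
  private def tableProvider(ts: Tokens): Option[Tokens] = ts match {
    case "KW_USING" :: "Identifier" :: rest => Some(rest)
    case _ => None
  }

  // Alternative 1: tableProvider tableOpts? (KW_AS selectStatementWithCTE)?
  // The optional tails are left out; the whole input must be the provider.
  def alt1(ts: Tokens): Boolean = tableProvider(ts).contains(Nil)

  // Alternative 2: (LPAREN columnNameTypeList RPAREN)? (p=tableProvider?) ...
  // Both parts are optional, so input without a column list is accepted too.
  def alt2(ts: Tokens): Boolean = {
    val afterColumns = ts match {
      case "LPAREN" :: "columnNameTypeList" :: "RPAREN" :: rest => rest
      case other => other
    }
    afterColumns.isEmpty || tableProvider(afterColumns).contains(Nil)
  }

  // Numbers of all alternatives that accept the input, in declaration order.
  def matchingAlternatives(ts: Tokens): List[Int] =
    List(1 -> alt1(ts), 2 -> alt2(ts)).collect { case (n, true) => n }

  // Assumed resolution rule for this sketch: the first matching alternative wins.
  def chosen(ts: Tokens): Option[Int] = matchingAlternatives(ts).headOption
}
```

With the input `KW_USING Identifier`, both alternatives accept and alternative 1 is picked; once a column list precedes the provider, only alternative 2 accepts, so the behavior that is actually needed is unaffected.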
[GitHub] spark pull request: [SPARK-11701][SPARK-13054] dynamic allocation ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10951#issuecomment-182902087 **[Test build #51099 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51099/consoleFull)** for PR 10951 at commit [`5fc19c7`](https://github.com/apache/spark/commit/5fc19c7b292365644e8e615227f2cfa0b211d261).
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-182902581 **[Test build #51100 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51100/consoleFull)** for PR 11100 at commit [`e62c3d0`](https://github.com/apache/spark/commit/e62c3d0f908eb219798c958a6af731ce2750fbb8).
[GitHub] spark pull request: [SPARK-13277][SQL] ANTLR ignores other rule us...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/11168#issuecomment-182913207 LGTM
[GitHub] spark pull request: [SPARK-12792][SPARKR] Refactor RRDD to support...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10947#issuecomment-182890645 **[Test build #51094 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51094/consoleFull)** for PR 10947 at commit [`e4d6b5f`](https://github.com/apache/spark/commit/e4d6b5fe233464a35524b3686d896351b0481f84). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...
Github user aray commented on the pull request: https://github.com/apache/spark/pull/11100#issuecomment-182891207 LGTM
[GitHub] spark pull request: [SPARK-12982][SQL] Add table name validation i...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/11051#issuecomment-182891796 ok to test
[GitHub] spark pull request: [SPARK-11714][Mesos] Make Spark on Mesos honor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11157#issuecomment-182893361 **[Test build #51096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51096/consoleFull)** for PR 11157 at commit [`a4e575d`](https://github.com/apache/spark/commit/a4e575d79dbbe7ec8935932ff284ebb0164c9971).
[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11136#issuecomment-182897395 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51097/ Test FAILed.
[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11136#issuecomment-182897389 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13139][SQL][WIP] Create native DDL comm...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11048#discussion_r52611305 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SparkQlSuite.scala --- @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution + +import org.apache.spark.sql.catalyst.plans.PlanTest + +class SparkQlSuite extends PlanTest { --- End diff -- We really should test the resulting plans here, and not wait for an `AnalysisException` to be thrown. I know this is a PITA, but it will save us a lot of headaches in the future.
[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r52611593 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,472 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.regression + +import breeze.stats.distributions.{Gaussian => GD} + +import org.apache.spark.Logging +import org.apache.spark.annotation.{Experimental, Since} +import org.apache.spark.ml.PredictorParams +import org.apache.spark.ml.feature.Instance +import org.apache.spark.ml.optim._ +import org.apache.spark.ml.param._ +import org.apache.spark.ml.param.shared._ +import org.apache.spark.ml.util.Identifiable +import org.apache.spark.mllib.linalg.{BLAS, Vector} +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.{DataFrame, Row} +import org.apache.spark.sql.functions._ + +/** + * Params for Generalized Linear Regression. 
+ */ +private[regression] trait GeneralizedLinearRegressionParams extends PredictorParams + with HasFitIntercept with HasMaxIter with HasTol with HasRegParam with HasWeightCol + with HasSolver with Logging { + + /** + * Param for the name of family which is a description of the error distribution + * to be used in the model. + * Supported options: "gaussian", "binomial", "poisson" and "gamma". + * @group param + */ + @Since("2.0.0") + final val family: Param[String] = new Param(this, "family", +"the name of family which is a description of the error distribution to be used in the model", + ParamValidators.inArray[String](GeneralizedLinearRegression.supportedFamilies.toArray)) + + /** @group getParam */ + @Since("2.0.0") + def getFamily: String = $(family) + + /** + * Param for the name of the model link function. + * Supported options: "identity", "log", "inverse", "logit", "probit", "cloglog" and "sqrt". + * @group param + */ + @Since("2.0.0") + final val link: Param[String] = new Param(this, "link", "the name of the model link function", + ParamValidators.inArray[String](GeneralizedLinearRegression.supportedLinks.toArray)) + + /** @group getParam */ + @Since("2.0.0") + def getLink: String = $(link) + + @Since("2.0.0") + override def validateParams(): Unit = { + require(GeneralizedLinearRegression.supportedFamilyLinkPairs.contains($(family) -> $(link)), --- End diff -- Good point! But we cannot check ```isSet(link)``` in the setter for family, because users may set family before they set link, and the check would then give the wrong result. We can check ```isSet(link)``` at the start of train.
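The ordering problem described in the comment above can be sketched as follows. This is a toy model under stated assumptions, not the PR's code: `ToyGlm`, `defaultLink`, and the contents of `supportedPairs` are hypothetical, and the pair list is only an illustrative subset.

```scala
// Toy illustration only: cross-parameter validation deferred until train().
object DeferredValidation {
  // Assumed, illustrative subset of valid (family, link) pairs.
  val supportedPairs: Set[(String, String)] = Set(
    "gaussian" -> "identity", "gaussian" -> "log",
    "binomial" -> "logit", "poisson" -> "log", "gamma" -> "inverse")

  // Canonical link used when the user never set one explicitly.
  val defaultLink: Map[String, String] = Map(
    "gaussian" -> "identity", "binomial" -> "logit",
    "poisson" -> "log", "gamma" -> "inverse")

  final class ToyGlm {
    private var family: Option[String] = None
    private var link: Option[String] = None

    // Setters only record values; neither may assume the other was already called.
    def setFamily(value: String): this.type = { family = Some(value); this }
    def setLink(value: String): this.type = { link = Some(value); this }

    // The "is link set?" decision happens once, when training starts.
    def train(): String = {
      val f = family.getOrElse("gaussian")
      val l = link match {
        case Some(explicit) =>
          require(supportedPairs.contains(f -> explicit),
            s"family '$f' does not support link '$explicit'")
          explicit
        case None => defaultLink(f)
      }
      s"$f/$l"
    }
  }
}
```

Because nothing is checked in the setters, `setFamily(...).setLink(...)` and `setLink(...).setFamily(...)` behave identically, which is the property the comment is after.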
[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11136#issuecomment-182905997 Jenkins, test this please.
[GitHub] spark pull request: [SPARK-6166] Limit number of in flight outboun...
Github user redsanket commented on the pull request: https://github.com/apache/spark/pull/10838#issuecomment-182911828 @zsxwing rebased and changed ArrayBuffer to HashSet. @tgravescs, you might want to take a look at it one more time.
[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/8#discussion_r52614670 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -511,6 +545,14 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) bus: ReplayListenerBus): Option[FsApplicationAttemptInfo] = { val logPath = eventLog.getPath() logInfo(s"Replaying log path: $logPath") +// Note that the eventLog may have *increased* in size since when we grabbed the filestatus, +// and when we read the file here. That is OK -- it may result in an unnecessary refresh +// when there is no update, but will not result in missing an update. We *must* prevent +// an error the other way -- if we report a size bigger (ie later) than the file that is +// actually read, we may never refresh the app +// we expect FileStatus to return the file size when it was initially created, but the api +// is not explicit about this so lets be extra-safe. +val eventLogLength = eventLog.getLen() --- End diff -- Ah, I see. I expected it to behave that way but couldn't find any documentation that made it explicit. I guess you're saying it's guaranteed by the post-conditions for getFileStatus()? I've updated the comment now.
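The invariant discussed in the diff comment above (the recorded size must never exceed what was actually read) can be sketched on a local file. This is a hedged illustration using `java.nio.file` rather than the Hadoop `FileStatus` API the PR works with; `ReplaySketch`, `Replayed`, and `needsRefresh` are hypothetical names.

```scala
import java.nio.file.{Files, Path}

// Toy illustration only: record the length *before* reading an append-only log.
object ReplaySketch {
  final case class Replayed(recordedLength: Long, bytesRead: Long)

  def replay(log: Path): Replayed = {
    // Capture the size first. If the log grows before or during the read,
    // the recorded size is too small, which at worst causes one extra refresh.
    val recordedLength = Files.size(log)
    val bytes = Files.readAllBytes(log)
    Replayed(recordedLength, bytes.length.toLong)
  }

  // A refresh is due whenever the file is now larger than the recorded size.
  def needsRefresh(log: Path, last: Replayed): Boolean =
    Files.size(log) > last.recordedLength
}
```

Reversing the two steps in `replay` would allow the recorded size to be larger than the data read, and a later size comparison could then miss an update.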
[GitHub] spark pull request: [SPARK-12832][MESOS] mesos scheduler respect a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10949#issuecomment-182912125 **[Test build #51103 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51103/consoleFull)** for PR 10949 at commit [`6c934bd`](https://github.com/apache/spark/commit/6c934bd23f2df13481262ac9506ae8ab1548027e).