[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184564464
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51348/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/11186#issuecomment-184564628
  
@BryanCutler I made a quick pass. While we're changing the format, we may 
as well make a few small doc clean-ups, as per my comments.





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184564462
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184564001
  
**[Test build #51349 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51349/consoleFull)**
 for PR 10757 at commit 
[`78f156c`](https://github.com/apache/spark/commit/78f156ce0ff51fbd994e15b14bda982e9fcd0868).





[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11186#discussion_r52977115
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -234,11 +238,35 @@ def _prepare(cls, ratings):
 def train(cls, ratings, rank, iterations=5, lambda_=0.01, blocks=-1, nonnegative=False,
   seed=None):
 """
-Train a matrix factorization model given an RDD of ratings given by users to some products,
-in the form of (userID, productID, rating) pairs. We approximate the ratings matrix as the
-product of two lower-rank matrices of a given rank (number of features). To solve for these
-features, we run a given number of iterations of ALS. This is done using a level of
-parallelism given by `blocks`.
+Train a matrix factorization model given an RDD of ratings given by
+users to some products, in the form of (userID, productID, rating)
+pairs. We approximate the ratings matrix as the product of two
+lower-rank matrices of a given rank (number of features). To solve
+for these features, we run a given number of iterations of ALS. This
+is done using a level of parallelism given by `blocks`.
+
+:param ratings:
+  RDD of `Rating` or (userID, productID, rating) tuple.
+:param rank:
+  Rank of the feature matrices computed (number of features).
+:param iterations:
+  Number of iterations run for each batch of data.
--- End diff --

This is a little unclear - what is meant by "for each batch of data"? 
Perhaps this should simply be `Number of ALS iterations to run`?
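
To make the suggested wording concrete, here is a minimal sketch of what "number of ALS iterations" means: one iteration is one pair of alternating least-squares solves. This is a toy rank-1 version on a small, fully observed matrix with no regularization; the function names and data are assumptions for illustration, not Spark's implementation.

```python
def als_rank1(R, iterations=5):
    """Rank-1 ALS on a small, fully observed matrix (no regularization)."""
    m, n = len(R), len(R[0])
    u, v = [1.0] * m, [1.0] * n
    for _ in range(iterations):
        # Fix v and solve the least-squares problem for each u[i].
        vv = sum(x * x for x in v)
        u = [sum(R[i][j] * v[j] for j in range(n)) / vv for i in range(m)]
        # Fix u and solve the least-squares problem for each v[j].
        uu = sum(x * x for x in u)
        v = [sum(R[i][j] * u[i] for i in range(m)) / uu for j in range(n)]
    return u, v


def squared_error(R, u, v):
    """Sum of squared differences between R and the outer product u v^T."""
    return sum((R[i][j] - u[i] * v[j]) ** 2
               for i in range(len(R)) for j in range(len(R[0])))


# Hypothetical ratings matrix, used only to exercise the sketch.
R = [[5.0, 3.0, 1.0], [4.0, 2.0, 1.0], [1.0, 1.0, 5.0]]
err_1 = squared_error(R, *als_rank1(R, iterations=1))
err_5 = squared_error(R, *als_rank1(R, iterations=5))
```

Here `iterations` counts full alternating sweeps over the whole input, which is why "for each batch of data" reads oddly.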





[GitHub] spark pull request: [SPARK-13334] [ML] ML KMeansModel / BisectingK...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11214#issuecomment-184563668
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13334] [ML] ML KMeansModel / BisectingK...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11214#issuecomment-184563669
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51346/





[GitHub] spark pull request: [SPARK-13334] [ML] ML KMeansModel / BisectingK...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11214#issuecomment-184563558
  
**[Test build #51346 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51346/consoleFull)**
 for PR 11214 at commit 
[`6fb0b4d`](https://github.com/apache/spark/commit/6fb0b4dc8d7d608f9e394fc1cac896cf645dc423).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11186#discussion_r52976995
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -234,11 +238,35 @@ def _prepare(cls, ratings):
 def train(cls, ratings, rank, iterations=5, lambda_=0.01, blocks=-1, nonnegative=False,
   seed=None):
 """
-Train a matrix factorization model given an RDD of ratings given by users to some products,
-in the form of (userID, productID, rating) pairs. We approximate the ratings matrix as the
-product of two lower-rank matrices of a given rank (number of features). To solve for these
-features, we run a given number of iterations of ALS. This is done using a level of
-parallelism given by `blocks`.
+Train a matrix factorization model given an RDD of ratings given by
+users to some products, in the form of (userID, productID, rating)
+pairs. We approximate the ratings matrix as the product of two
+lower-rank matrices of a given rank (number of features). To solve
+for these features, we run a given number of iterations of ALS. This
--- End diff --

I wonder if the last sentence `This is done using ...` is really necessary 
(it's better explained in the param doc string below)?





[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11186#discussion_r52976896
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -249,11 +277,39 @@ def train(cls, ratings, rank, iterations=5, lambda_=0.01, blocks=-1, nonnegative
 def trainImplicit(cls, ratings, rank, iterations=5, lambda_=0.01, blocks=-1, alpha=0.01,
   nonnegative=False, seed=None):
 """
-Train a matrix factorization model given an RDD of 'implicit preferences' given by users
-to some products, in the form of (userID, productID, preference) pairs. We approximate the
-ratings matrix as the product of two lower-rank matrices of a given rank (number of
-features).  To solve for these features, we run a given number of iterations of ALS.
-This is done using a level of parallelism given by `blocks`.
+Train a matrix factorization model given an RDD of 'implicit
+preferences' given by users to some products, in the form of
+(userID, productID, preference) pairs. We approximate the ratings
--- End diff --

Same comment as above applies





[GitHub] spark pull request: [SPARK-12583][Mesos] Mesos shuffle service: Do...

2016-02-15 Thread bbossy
Github user bbossy commented on the pull request:

https://github.com/apache/spark/pull/11207#issuecomment-184563434
  
@JoshRosen I changed the title to a more descriptive one and added a more 
detailed problem description.





[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11186#discussion_r52976861
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -234,11 +238,35 @@ def _prepare(cls, ratings):
 def train(cls, ratings, rank, iterations=5, lambda_=0.01, blocks=-1, nonnegative=False,
   seed=None):
 """
-Train a matrix factorization model given an RDD of ratings given by users to some products,
-in the form of (userID, productID, rating) pairs. We approximate the ratings matrix as the
-product of two lower-rank matrices of a given rank (number of features). To solve for these
-features, we run a given number of iterations of ALS. This is done using a level of
-parallelism given by `blocks`.
+Train a matrix factorization model given an RDD of ratings given by
+users to some products, in the form of (userID, productID, rating)
+pairs. We approximate the ratings matrix as the product of two
--- End diff --

We refer to `pairs` here but `tuple` below. Perhaps this should be 
consistent (`tuple`, since it is not in fact a pair).
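
To illustrate the terminology point, here is a minimal sketch. The `Rating` type below is a stand-in built with `collections.namedtuple` purely for illustration (an assumption, not imported from PySpark); it only shows that a (userID, productID, rating) record has three fields.

```python
from collections import namedtuple

# Hypothetical stand-in for a rating record; not imported from PySpark.
Rating = namedtuple("Rating", ["user", "product", "rating"])

r = Rating(user=1, product=42, rating=4.5)
plain = (1, 42, 4.5)

# Both forms carry three fields, so neither is a "pair".
n_fields = (len(r), len(plain))

# A namedtuple compares equal to a plain tuple holding the same values.
same = (r == plain)
```

Either form is a 3-tuple, so "tuple" is the accurate word in both places.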





[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11186#discussion_r52976763
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -165,28 +165,32 @@ def productFeatures(self):
 @since("1.4.0")
 def recommendUsers(self, product, num):
 """
-Recommends the top "num" number of users for a given product and returns a list
-of Rating objects sorted by the predicted rating in descending order.
+Recommends the top "num" number of users for a given product and
+returns a list of Rating objects sorted by the predicted rating in
+descending order.
 """
 return list(self.call("recommendUsers", product, num))
 
 @since("1.4.0")
 def recommendProducts(self, user, num):
 """
-Recommends the top "num" number of products for a given user and returns a list
-of Rating objects sorted by the predicted rating in descending order.
+Recommends the top "num" number of products for a given user and
+returns a list of Rating objects sorted by the predicted rating in
+descending order.
 """
 return list(self.call("recommendProducts", user, num))
 
 def recommendProductsForUsers(self, num):
 """
-Recommends top "num" products for all users. The number returned may be less than this.
+Recommends top "num" products for all users. The number returned may be
+less than this.
 """
 return self.call("wrappedRecommendProductsForUsers", num)
 
 def recommendUsersForProducts(self, num):
 """
-Recommends top "num" users for all products. The number returned may be less than this.
+Recommends top "num" users for all products. The number returned may be
--- End diff --

same comment applies as above





[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11186#discussion_r52976749
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -165,28 +165,32 @@ def productFeatures(self):
 @since("1.4.0")
 def recommendUsers(self, product, num):
 """
-Recommends the top "num" number of users for a given product and returns a list
-of Rating objects sorted by the predicted rating in descending order.
+Recommends the top "num" number of users for a given product and
+returns a list of Rating objects sorted by the predicted rating in
+descending order.
 """
 return list(self.call("recommendUsers", product, num))
 
 @since("1.4.0")
 def recommendProducts(self, user, num):
 """
-Recommends the top "num" number of products for a given user and returns a list
-of Rating objects sorted by the predicted rating in descending order.
+Recommends the top "num" number of products for a given user and
+returns a list of Rating objects sorted by the predicted rating in
+descending order.
 """
 return list(self.call("recommendProducts", user, num))
 
 def recommendProductsForUsers(self, num):
 """
-Recommends top "num" products for all users. The number returned may be less than this.
+Recommends top "num" products for all users. The number returned may be
--- End diff --

While we're at this, can we say something like `... the number of 
recommendations returned per user may be less than this`
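
A small sketch of why the per-user wording is clearer: when a user has fewer candidate products than `num`, that user's list is simply shorter. The helper name and the score data are hypothetical, and this is plain Python rather than the Spark code path.

```python
import heapq


def top_n_per_user(scores, num):
    """Return up to `num` (product, score) entries per user, best first."""
    return {
        user: heapq.nlargest(num, items.items(), key=lambda kv: kv[1])
        for user, items in scores.items()
    }


# Hypothetical predicted scores: user 2 has only two candidate products.
scores = {
    1: {"a": 0.9, "b": 0.4, "c": 0.7},
    2: {"a": 0.2, "b": 0.8},
}
recs = top_n_per_user(scores, 3)
```

With `num=3`, user 1 gets three recommendations and user 2 gets two, so the shortfall is per user rather than for the result as a whole.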





[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11186#discussion_r52976638
  
--- Diff: python/pyspark/mllib/fpm.py ---
@@ -128,17 +131,27 @@ class PrefixSpan(object):
 @since("1.6.0")
 def train(cls, data, minSupport=0.1, maxPatternLength=10, maxLocalProjDBSize=3200):
 """
-Finds the complete set of frequent sequential patterns in the input sequences of itemsets.
-
-:param data: The input data set, each element contains a sequnce of itemsets.
-:param minSupport: the minimal support level of the sequential pattern, any pattern appears
-more than  (minSupport * size-of-the-dataset) times will be output (default: `0.1`)
-:param maxPatternLength: the maximal length of the sequential pattern, any pattern appears
-less than maxPatternLength will be output. (default: `10`)
-:param maxLocalProjDBSize: The maximum number of items (including delimiters used in
-the internal storage format) allowed in a projected database before local
-processing. If a projected database exceeds this size, another
-iteration of distributed prefix growth is run. (default: `3200`)
+Finds the complete set of frequent sequential patterns in the
+input sequences of itemsets.
+
+:param data:
+  The input data set, each element contains a sequence of
+  itemsets.
+:param minSupport:
+  The minimal support level of the sequential pattern, any
+  pattern appears more than (minSupport * size-of-the-dataset)
+  times will be output.
+  (default: 0.1)
+:param maxPatternLength:
+  The maximal length of the sequential pattern, any pattern
+  appears less than maxPatternLength will be output.
--- End diff --

same here





[GitHub] spark pull request: [SPARK-12632][PYSPARK][DOC] PySpark fpm and al...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11186#discussion_r52976590
  
--- Diff: python/pyspark/mllib/fpm.py ---
@@ -128,17 +131,27 @@ class PrefixSpan(object):
 @since("1.6.0")
 def train(cls, data, minSupport=0.1, maxPatternLength=10, maxLocalProjDBSize=3200):
 """
-Finds the complete set of frequent sequential patterns in the input sequences of itemsets.
-
-:param data: The input data set, each element contains a sequnce of itemsets.
-:param minSupport: the minimal support level of the sequential pattern, any pattern appears
-more than  (minSupport * size-of-the-dataset) times will be output (default: `0.1`)
-:param maxPatternLength: the maximal length of the sequential pattern, any pattern appears
-less than maxPatternLength will be output. (default: `10`)
-:param maxLocalProjDBSize: The maximum number of items (including delimiters used in
-the internal storage format) allowed in a projected database before local
-processing. If a projected database exceeds this size, another
-iteration of distributed prefix growth is run. (default: `3200`)
+Finds the complete set of frequent sequential patterns in the
+input sequences of itemsets.
+
+:param data:
+  The input data set, each element contains a sequence of
+  itemsets.
+:param minSupport:
+  The minimal support level of the sequential pattern, any
+  pattern appears more than (minSupport * size-of-the-dataset)
--- End diff --

Can we change this from `appears` to `appearing` (or `... pattern that 
appears ...`)?
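
For readers unfamiliar with the parameter, the filter that the `minSupport` docstring describes can be sketched as below. It follows the docstring wording ("more than") literally; whether the real implementation treats the boundary as inclusive is not stated in this thread. The function name and the counts are hypothetical.

```python
def frequent_patterns(counts, num_sequences, min_support=0.1):
    """Keep patterns whose count exceeds min_support * num_sequences.

    Mirrors the docstring wording only; this is not PrefixSpan itself,
    and the boundary handling is an assumption.
    """
    threshold = min_support * num_sequences
    return sorted(p for p, c in counts.items() if c > threshold)


# Hypothetical pattern counts over a data set of 8 sequences.
counts = {"a": 5, "ab": 3, "abc": 2}
result = frequent_patterns(counts, num_sequences=8, min_support=0.25)
```

The threshold here is 0.25 * 8 = 2, so a pattern appearing exactly twice is not output under the "more than" reading.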





[GitHub] spark pull request: [SPARK-12379][ML][MLLIB] Copy GBT implementati...

2016-02-15 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/10607#issuecomment-184557820
  
@sethah I did find the perf-test results very difficult to read. Would it 
be ok to summarize into a readable table to make it easier to compare the 
*before* and *after* numbers (for posterity)?





[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...

2016-02-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11100





[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...

2016-02-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/10152#discussion_r52975344
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -289,24 +301,19 @@ class Word2Vec extends Serializable with Logging {
 val expTable = sc.broadcast(createExpTable())
 val bcVocab = sc.broadcast(vocab)
 val bcVocabHash = sc.broadcast(vocabHash)
-
-val sentences: RDD[Array[Int]] = words.mapPartitions { iter =>
-  new Iterator[Array[Int]] {
-def hasNext: Boolean = iter.hasNext
-
-def next(): Array[Int] = {
-  val sentence = ArrayBuilder.make[Int]
-  var sentenceLength = 0
-  while (iter.hasNext && sentenceLength < MAX_SENTENCE_LENGTH) {
-val word = bcVocabHash.value.get(iter.next())
-word match {
-  case Some(w) =>
-sentence += w
-sentenceLength += 1
-  case None =>
-}
-  }
-  sentence.result()
+// each partition is a collection of sentences,
+// will be translated into arrays of Index integer
+val sentences: RDD[Array[Int]] = dataset.mapPartitions { sentenceIter =>
+  // Each sentence will map to 0 or more Array[Int]
+  sentenceIter.flatMap { sentence =>
+// Sentence of words, some of which map to a word index
+val wordIndexes = sentence.flatMap(bcVocabHash.value.get)
+if (wordIndexes.nonEmpty) {
--- End diff --

@ygcao you have kept the if statement here, which I believe both @mengxr 
and @srowen have shown is not necessary.
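
One way to see why the check can be dropped, sketched in Python rather than Scala. The body of the `if` is not shown in the diff excerpt, so this assumes it splits the word indexes into chunks of bounded length; the helper names and the vocabulary are hypothetical. Chunking an empty list yields nothing, so a flat-map over it contributes no output without any guard.

```python
def chunks(indexes, max_len):
    """Yield consecutive slices of at most max_len items; nothing if empty."""
    for start in range(0, len(indexes), max_len):
        yield indexes[start:start + max_len]


def sentences_to_index_arrays(sentences, vocab, max_len):
    """Flat-map each sentence to zero or more lists of word indexes."""
    out = []
    for sentence in sentences:
        # Words missing from the vocabulary are dropped.
        indexes = [vocab[w] for w in sentence if w in vocab]
        # No emptiness check needed: chunks() yields nothing for [].
        out.extend(chunks(indexes, max_len))
    return out


# Hypothetical vocabulary and input; the second sentence has no known words.
vocab = {"spark": 0, "rdd": 1, "ml": 2}
sentences = [["spark", "rdd", "ml"], ["unknown", "words"], ["ml"]]
result = sentences_to_index_arrays(sentences, vocab, max_len=2)
```

The sentence with no in-vocabulary words produces no output arrays even though nothing tests for emptiness.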





[GitHub] spark pull request: [SPARK-13325][SQL] Create a 64-bit hashcode ex...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11209#issuecomment-184556811
  
**[Test build #51347 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51347/consoleFull)**
 for PR 11209 at commit 
[`54c818b`](https://github.com/apache/spark/commit/54c818b4cd66f8108a90d7cf350f8c31b2cd8caa).





[GitHub] spark pull request: [SPARK-13221] [SQL] Fixing GroupingSets when A...

2016-02-15 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/11100#issuecomment-184556107
  
LGTM, we keep the `VirtualColumn` to show a better error message, merging 
this into master, thanks!





[GitHub] spark pull request: [SPARK-13334] [ML] ML KMeansModel / BisectingK...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11214#issuecomment-184555139
  
**[Test build #51346 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51346/consoleFull)**
 for PR 11214 at commit 
[`6fb0b4d`](https://github.com/apache/spark/commit/6fb0b4dc8d7d608f9e394fc1cac896cf645dc423).





[GitHub] spark pull request: [SPARK-13325][SQL] Create a 64-bit hashcode ex...

2016-02-15 Thread hvanhovell
Github user hvanhovell commented on the pull request:

https://github.com/apache/spark/pull/11209#issuecomment-184554119
  
@jodersky / @cloud-fan The tests also failed on my machine. It turns out I 
messed up the initialization order during some cleaning up. This is fixed and 
the tests should pass now.





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184552644
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184552648
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51343/





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184552364
  
**[Test build #51343 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51343/consoleFull)**
 for PR 10757 at commit 
[`f0eb991`](https://github.com/apache/spark/commit/f0eb9917f276a2f6f7690b9b48739d0bd2624433).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13329] [SQL] considering output for sta...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11210#issuecomment-184552013
  
**[Test build #51345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51345/consoleFull)** for PR 11210 at commit [`f431fd8`](https://github.com/apache/spark/commit/f431fd87b0a6deb02d0e19f3310cc58eed04fa3a).





[GitHub] spark pull request: [SPARK-13334] [ML] ML KMeansModel / BisectingK...

2016-02-15 Thread yanboliang
GitHub user yanboliang opened a pull request:

https://github.com/apache/spark/pull/11214

[SPARK-13334] [ML] ML KMeansModel / BisectingKMeansModel / 
QuantileDiscretizer should be set parent

ML KMeansModel / BisectingKMeansModel / QuantileDiscretizer should be set 
parent.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanboliang/spark spark-13334

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11214.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11214


commit 6fb0b4dc8d7d608f9e394fc1cac896cf645dc423
Author: Yanbo Liang 
Date:   2016-02-16T06:49:08Z

ML KMeansModel / BisectingKMeansModel / QuantileDiscretizer should be set 
parent.







[GitHub] spark pull request: [SPARK-13310] [SQL] Resolve Missing Sorting Co...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11198#issuecomment-184549831
  
**[Test build #51344 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51344/consoleFull)** for PR 11198 at commit [`07de4bc`](https://github.com/apache/spark/commit/07de4bcaafdad13fa5528ad280781247aa40f63e).





[GitHub] spark pull request: [SPARK-13308] ManagedBuffers passed to OneToOn...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11193#issuecomment-184547728
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51340/
Test PASSed.





[GitHub] spark pull request: [SPARK-13308] ManagedBuffers passed to OneToOn...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11193#issuecomment-184547724
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13308] ManagedBuffers passed to OneToOn...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11193#issuecomment-184547086
  
**[Test build #51340 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51340/consoleFull)** for PR 11193 at commit [`2c00f29`](https://github.com/apache/spark/commit/2c00f29272051b8092b6a8a976392e32eeb5488b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13302][PYSPARK][TESTS] Move the temp fi...

2016-02-15 Thread holdenk
Github user holdenk commented on the pull request:

https://github.com/apache/spark/pull/11197#issuecomment-184539074
  
great :)





[GitHub] spark pull request: [SPARK-13012] [Documentation] Replace example ...

2016-02-15 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11053#issuecomment-184538336
  
Thanks for the review, @yinxusen. I have configured the code format in my IDE
and am using it to format the code. I will address these comments and update
the PR.





[GitHub] spark pull request: [SPARK-13136][SQL] Create a dedicated Broadcas...

2016-02-15 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11083#discussion_r52972187
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Broadcast.scala ---
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution
+
+import scala.concurrent._
+import scala.concurrent.duration._
+
+import org.apache.spark.broadcast
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Attribute
+import org.apache.spark.sql.catalyst.plans.physical.BroadcastMode
+import org.apache.spark.util.ThreadUtils
+
+/**
+ * A broadcast collects, transforms and finally broadcasts the result of a transformed SparkPlan.
+ */
+case class Broadcast(
--- End diff --

Do we need to merge this class with Exchange?





[GitHub] spark pull request: [SPARK-13136][SQL] Create a dedicated Broadcas...

2016-02-15 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11083#discussion_r52971787
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Broadcast.scala ---
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution
+
+import scala.concurrent._
+import scala.concurrent.duration._
+
+import org.apache.spark.broadcast
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Attribute
+import org.apache.spark.sql.catalyst.plans.physical.BroadcastMode
+import org.apache.spark.util.ThreadUtils
+
+/**
+ * A broadcast collects, transforms and finally broadcasts the result of a transformed SparkPlan.
+ */
+case class Broadcast(
+    mode: BroadcastMode,
+    child: SparkPlan) extends UnaryNode {
+
+  override def output: Seq[Attribute] = child.output
+
+  val timeout: Duration = {
+    val timeoutValue = sqlContext.conf.broadcastTimeout
+    if (timeoutValue < 0) {
+      Duration.Inf
+    } else {
+      timeoutValue.seconds
+    }
+  }
+
+  @transient
+  private lazy val relationFuture: Future[broadcast.Broadcast[Any]] = {
+    // broadcastFuture is used in "doExecute". Therefore we can get the execution id correctly here.
+    val executionId = sparkContext.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
+    Future {
+      // This will run in another thread. Set the execution id so that we can connect these jobs
+      // with the correct execution.
+      SQLExecution.withExecutionId(sparkContext, executionId) {
+        // Note that we use .execute().collect() because we don't want to convert data to Scala
+        // types
+        val input: Array[InternalRow] = child.execute().map { row =>
+          row.copy()
+        }.collect()
+
+        // Construct and broadcast the relation.
+        sparkContext.broadcast(mode(input))
+      }
+    }(Broadcast.executionContext)
+  }
+
+  override protected def doPrepare(): Unit = {
+    // Materialize the future.
+    relationFuture
+  }
+
+  override protected def doExecute(): RDD[InternalRow] = {
+    child.execute() // TODO throw an Exception here?
--- End diff --

Throw an UnsupportedOperationException?





[GitHub] spark pull request: [SPARK-13302][PYSPARK][TESTS] Move the temp fi...

2016-02-15 Thread yanboliang
Github user yanboliang commented on the pull request:

https://github.com/apache/spark/pull/11197#issuecomment-184534787
  
LGTM





[GitHub] spark pull request: [SPARK-13136][SQL] Create a dedicated Broadcas...

2016-02-15 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11083#discussion_r52971719
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala ---
@@ -395,18 +395,31 @@ private[sql] case class EnsureRequirements(sqlContext: SQLContext) extends Rule[
     assert(requiredChildOrderings.length == children.length)
 
     // Ensure that the operator's children satisfy their output distribution requirements:
-    children = children.zip(requiredChildDistributions).map { case (child, distribution) =>
-      if (child.outputPartitioning.satisfies(distribution)) {
+    children = children.zip(requiredChildDistributions).map {
+      case (child, distribution) if child.outputPartitioning.satisfies(distribution) =>
         child
-      } else {
+      case (child, BroadcastDistribution(m1)) =>
+        child match {
+          // The child is broadcasting the same variable: keep the child.
+          case Broadcast(m2, _) if m1 == m2 => child
--- End diff --

I have the same question. If we had a `BroadcastPartitioning`, it seems we
could avoid these changes?





[GitHub] spark pull request: [SPARK-13302][PYSPARK][TESTS] Move the temp fi...

2016-02-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11197#discussion_r52971579
  
--- Diff: python/pyspark/ml/clustering.py ---
@@ -310,7 +303,17 @@ def _create_model(self, java_model):
     sqlContext = SQLContext(sc)
     globs['sc'] = sc
     globs['sqlContext'] = sqlContext
-    (failure_count, test_count) = doctest.testmod(globs=globs, optionflags=doctest.ELLIPSIS)
-    sc.stop()
+    import tempfile
+    temp_path = tempfile.mkdtemp()
+    globs['temp_path'] = temp_path
+    try:
+        (failure_count, test_count) = doctest.testmod(globs=globs, optionflags=doctest.ELLIPSIS)
+        sc.stop()
+    finally:
--- End diff --

Sorry for the misunderstanding, I think you are right.





[GitHub] spark pull request: [SPARK-13302][PYSPARK][TESTS] Move the temp fi...

2016-02-15 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/11197#discussion_r52971131
  
--- Diff: python/pyspark/ml/clustering.py ---
@@ -310,7 +303,17 @@ def _create_model(self, java_model):
     sqlContext = SQLContext(sc)
     globs['sc'] = sc
     globs['sqlContext'] = sqlContext
-    (failure_count, test_count) = doctest.testmod(globs=globs, optionflags=doctest.ELLIPSIS)
-    sc.stop()
+    import tempfile
+    temp_path = tempfile.mkdtemp()
+    globs['temp_path'] = temp_path
+    try:
+        (failure_count, test_count) = doctest.testmod(globs=globs, optionflags=doctest.ELLIPSIS)
+        sc.stop()
+    finally:
--- End diff --

So `finally` is still useful even if we don't explicitly catch/handle any
exceptions. Are you saying that sc.stop and doctest will never throw any
exceptions?
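
To illustrate the point about `finally`, here is a minimal, standalone sketch
(not the code from the PR). The helper names `run_with_temp_dir`,
`failing_work` and `record_and_fail` are hypothetical, and the use of
`shutil.rmtree` for cleanup is an assumption, since the quoted diff is cut off
at the `finally:` line. It shows that the temp directory is removed even when
the work raises, without catching anything inside the helper:

```python
import os
import shutil
import tempfile


def run_with_temp_dir(work):
    """Run work(temp_path) and always remove the temp directory afterwards."""
    temp_path = tempfile.mkdtemp()
    try:
        return work(temp_path)
    finally:
        # Runs on normal return and when work() raises; an exception,
        # if any, still propagates to the caller after the cleanup.
        shutil.rmtree(temp_path, ignore_errors=True)


def failing_work(path):
    # Hypothetical stand-in for doctest.testmod or sc.stop raising.
    raise RuntimeError("simulated failure in " + path)


seen = []


def record_and_fail(path):
    # Remember the directory so we can check it afterwards, then fail.
    seen.append(path)
    failing_work(path)


try:
    run_with_temp_dir(record_and_fail)
except RuntimeError:
    pass

print(os.path.exists(seen[0]))  # False: cleanup ran despite the exception
```

Without the `try`/`finally`, the same failure would leave the directory behind.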





[GitHub] spark pull request: Correct SparseVector.parse documentation

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11213#issuecomment-184530007
  
Can one of the admins verify this patch?





[GitHub] spark pull request: Correct SparseVector.parse documentation

2016-02-15 Thread mgyucht
GitHub user mgyucht opened a pull request:

https://github.com/apache/spark/pull/11213

Correct SparseVector.parse documentation

There's a small typo in the SparseVector.parse docstring: it says that the
method returns a DenseVector rather than a SparseVector, which seems to be
incorrect.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mgyucht/spark fix-sparsevector-docs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11213.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11213


commit 1e73745d5f97161a4084f2a838f1a1144b221aad
Author: Miles Yucht 
Date:   2016-02-16T05:39:29Z

Correct SparseVector.parse documentation







[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...

2016-02-15 Thread ygcao
Github user ygcao commented on the pull request:

https://github.com/apache/spark/pull/10152#issuecomment-184528879
  
Addressed the 'final' comment, and checked lint and test cases. Shall we do
the merge then? Thanks!





[GitHub] spark pull request: [SPARK-13330][PYSPARK] PYTHONHASHSEED is not p...

2016-02-15 Thread zjffdu
Github user zjffdu commented on the pull request:

https://github.com/apache/spark/pull/11211#issuecomment-184526617
  
PYTHONHASHSEED is set in the spark-submit script regardless of the Python
version, but it is only set in the executor when the Python version is greater
than 3.3. PYTHONHASHSEED was introduced in Python 3.2.3
(https://docs.python.org/3.3/using/cmdline.html). I am not sure what the
purpose of disabling hash randomization is; I just feel that we can set
PYTHONHASHSEED to 0 in all cases, since there seems to be no case where we
want hash randomization enabled. It is also fine to set it in Python 2,
because it was only introduced in 3.2.3.
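
For context, a small standalone sketch (not part of the PR) of what a fixed
PYTHONHASHSEED buys: separate interpreter processes started with the same seed
agree on the hash of a string. The helper name `string_hash_in_child` and the
sample string are made up for illustration. Presumably this matters for PySpark
because the driver and the worker processes need to agree on key hashes, but
that motivation is an assumption here.

```python
import os
import subprocess
import sys


def string_hash_in_child(seed):
    """Return hash('spark') as computed by a fresh interpreter with this seed."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash('spark'))"], env=env)
    return int(out.decode().strip())


# With a fixed seed, two independent processes compute the same string hash.
first = string_hash_in_child("0")
second = string_hash_in_child("0")
print(first == second)
```

With the variable unset on Python 3.3+, string hashes are randomized per
process, so the two values would generally differ.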





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184525577
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184525578
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51337/
Test FAILed.





[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-15 Thread NarineK
Github user NarineK commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-184524951
  
Thank you for the review comments, @yanboliang 
I've added your suggestions. Let me know if you have more comments.






[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184525108
  
**[Test build #51337 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51337/consoleFull)** for PR 10757 at commit [`21c94d2`](https://github.com/apache/spark/commit/21c94d2224609ce3171e62c7cb58ee64cca683e7).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13013][Docs] Replace example code in ml...

2016-02-15 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/6#issuecomment-184524570
  
@keypointt Please reformat the other Java code files with 2-space indentation,
not only the lines that I pointed out.

As for reusing the example code: even though the examples are not identical,
they are very similar in functionality, since both show the usage of those
classes. Take `PowerIterationClusteringExample` as an example: rather than
replacing the previous example code with the code from the markdown file, I
would prefer to change it as follows:

```scala
  def run(params: Params) {
    val conf = new SparkConf()
      .setMaster("local")
      .setAppName(s"PowerIterationClustering with $params")
    val sc = new SparkContext(conf)

    Logger.getRootLogger.setLevel(Level.WARN)

    // $example on$
    val circlesRdd = generateCirclesRdd(sc, params.k, params.numPoints)
    val model = new PowerIterationClustering()
      .setK(params.k)
      .setMaxIterations(params.maxIterations)
      .setInitializationMode("degree")
      .run(circlesRdd)

    val clusters = model.assignments.collect().groupBy(_.cluster).mapValues(_.map(_.id))
    val assignments = clusters.toList.sortBy { case (k, v) => v.length }
    val assignmentsStr = assignments
      .map { case (k, v) =>
        s"$k -> ${v.sorted.mkString("[", ",", "]")}"
      }.mkString(", ")
    val sizesStr = assignments.map {
      _._2.length
    }.sorted.mkString("(", ",", ")")
    println(s"Cluster assignments: $assignmentsStr\ncluster sizes: $sizesStr")
    // $example off$

    sc.stop()
  }
```





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184524502
  
**[Test build #51343 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51343/consoleFull)** for PR 10757 at commit [`f0eb991`](https://github.com/apache/spark/commit/f0eb9917f276a2f6f7690b9b48739d0bd2624433).





[GitHub] spark pull request: [SPARK-12375] [ML] add handleinvalid for vecto...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10466#issuecomment-184524069
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51342/
Test PASSed.





[GitHub] spark pull request: [SPARK-12375] [ML] add handleinvalid for vecto...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10466#issuecomment-184524066
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-12375] [ML] add handleinvalid for vecto...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10466#issuecomment-184523963
  
**[Test build #51342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51342/consoleFull)** for PR 10466 at commit [`6a0efed`](https://github.com/apache/spark/commit/6a0efede2b99a315895b1d3cccb9262ea845476c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184522878
  
retest this please





[GitHub] spark pull request: [SPARK-13018][Docs] Replace example code in ml...

2016-02-15 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/11126#discussion_r52969279
  
--- Diff: docs/mllib-pmml-model-export.md ---
@@ -45,41 +45,12 @@ The table below outlines the `spark.mllib` models that 
can be exported to PMML a
 
 To export a supported `model` (see table above) to PMML, simply call 
`model.toPMML`.
 
+As well as exporting the PMML model to a String (`model.toPMML` as in the 
example above), you can export the PMML model to other formats.
--- End diff --

Let's wrap it next time.





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-15 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r52968764
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
 ---
@@ -523,11 +523,45 @@ case class Atan2(left: Expression, right: Expression)
 
 case class Pow(left: Expression, right: Expression)
   extends BinaryMathExpression(math.pow, "POWER") {
-  override def genCode(ctx: CodegenContext, ev: ExprCode): String = {
-defineCodeGen(ctx, ev, (c1, c2) => s"java.lang.Math.pow($c1, $c2)")
-  }
-}
+  override def inputTypes: Seq[AbstractDataType] = Seq(NumericType, 
NumericType)
+
+  override def dataType: DataType = (left.dataType, right.dataType) match {
+case (dt: DecimalType, ByteType | ShortType | IntegerType) => dt
+case _ => DoubleType
+  }
+
+  protected override def nullSafeEval(input1: Any, input2: Any): Any =
+(left.dataType, right.dataType) match {
+  case (dt: DecimalType, ByteType) =>
+input1.asInstanceOf[Decimal].pow(input2.asInstanceOf[Byte])
+  case (dt: DecimalType, ShortType) =>
+input1.asInstanceOf[Decimal].pow(input2.asInstanceOf[Short])
+  case (dt: DecimalType, IntegerType) =>
+input1.asInstanceOf[Decimal].pow(input2.asInstanceOf[Int])
+  case (dt: DecimalType, FloatType) =>
+math.pow(input1.asInstanceOf[Decimal].toDouble, 
input2.asInstanceOf[Float])
+  case (dt: DecimalType, DoubleType) =>
+math.pow(input1.asInstanceOf[Decimal].toDouble, 
input2.asInstanceOf[Double])
+  case (dt1: DecimalType, dt2: DecimalType) =>
+math.pow(input1.asInstanceOf[Decimal].toDouble, 
input2.asInstanceOf[Decimal].toDouble)
--- End diff --

Shall we cast the result of `math.pow` back to `DecimalType` for these 
three cases?
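
A minimal sketch of the two paths in question, written in plain Java with `java.math.BigDecimal` standing in for Spark's `Decimal` (the class and method names below are hypothetical and are not part of the patch). It only illustrates the idea: an integral exponent can stay in decimal arithmetic, while a fractional exponent goes through `double` and would then need to be cast back if a decimal result is wanted.

```java
import java.math.BigDecimal;

// Hypothetical illustration; this is not Spark's Decimal or the code under review.
class DecimalPowSketch {
    // Integral exponent: the computation stays in decimal arithmetic, so it is exact.
    // BigDecimal.pow(int) only accepts exponents in the range 0..999999999.
    static BigDecimal powExact(BigDecimal base, int exponent) {
        return base.pow(exponent);
    }

    // Fractional exponent: compute in double, then cast the result back to a decimal.
    // BigDecimal.valueOf(double) throws NumberFormatException for NaN or infinite results.
    static BigDecimal powViaDouble(BigDecimal base, double exponent) {
        return BigDecimal.valueOf(Math.pow(base.doubleValue(), exponent));
    }

    public static void main(String[] args) {
        System.out.println(powExact(new BigDecimal("1.5"), 2));
        System.out.println(powViaDouble(new BigDecimal("4"), 0.5));
    }
}
```

Whether casting back is desirable depends on what result type callers expect; the round trip through `double` cannot restore precision that was already lost.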





[GitHub] spark pull request: [Spark-12260][wip][Streaming]Graceful Shutdown...

2016-02-15 Thread zzcclp
Github user zzcclp commented on the pull request:

https://github.com/apache/spark/pull/10252#issuecomment-184515856
  
@chenghao-intel @mwws , sorry for my late reply.

Currently, we just record the Kafka offsets and accumulators to a third-party 
storage system after each batch, and then restore them from the window's 
earliest start time. For stateful data, we have no good way to recover for now, 
so some statistical data will be lost. 
Moreover, one of our business systems must ensure data integrity after a software 
upgrade or even an application logic update, so we urgently hope that Spark can 
natively support this feature. 






[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-184515193
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51339/
Test PASSed.





[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-184515192
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-184515101
  
**[Test build #51339 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51339/consoleFull)**
 for PR 11179 at commit 
[`e4707e7`](https://github.com/apache/spark/commit/e4707e775f34c0018f74451d048fb28a9c08ef48).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13321][SQL] Support nested UNION in par...

2016-02-15 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11204#discussion_r52968390
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystQlSuite.scala 
---
@@ -201,4 +201,68 @@ class CatalystQlSuite extends PlanTest {
 parser.parsePlan("select sum(product + 1) over (partition by (product 
+ (1)) order by 2) " +
   "from windowData")
   }
+
+  test("nesting UNION") {
+val parsed = parser.parsePlan(
+  """
+   |SELECT  `u_1`.`id` FROM (((SELECT  `t0`.`id` FROM `default`.`t0`)
+   |UNION ALL (SELECT  `t0`.`id` FROM `default`.`t0`)) UNION ALL
+   |(SELECT  `t0`.`id` FROM `default`.`t0`)) AS u_1
+  """.stripMargin)
+
+val expected = Project(
+  UnresolvedAlias(UnresolvedAttribute("u_1.id"), None) :: Nil,
+  Subquery("u_1",
+Union(
+  Union(
+Project(
+  UnresolvedAlias(UnresolvedAttribute("t0.id"), None) :: Nil,
+  UnresolvedRelation(TableIdentifier("t0", Some("default")), 
None)),
+Project(
+  UnresolvedAlias(UnresolvedAttribute("t0.id"), None) :: Nil,
+  UnresolvedRelation(TableIdentifier("t0", Some("default")), 
None))),
+  Project(
+UnresolvedAlias(UnresolvedAttribute("t0.id"), None) :: Nil,
+UnresolvedRelation(TableIdentifier("t0", Some("default")), 
None)
+
+comparePlans(parsed, expected)
+
+val parsedSame = parser.parsePlan(
+  """
+   |SELECT  `u_1`.`id` FROM ((SELECT  `t0`.`id` FROM `default`.`t0`)
+   |UNION ALL (SELECT  `t0`.`id` FROM `default`.`t0`) UNION ALL
+   |(SELECT  `t0`.`id` FROM `default`.`t0`)) AS u_1
+  """.stripMargin)
+
+comparePlans(parsedSame, expected)
+
+val parsed2 = parser.parsePlan(
--- End diff --

Recursively nested UNION.





[GitHub] spark pull request: [SPARK-13321][SQL] Support nested UNION in par...

2016-02-15 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11204#discussion_r52968367
  
--- Diff: 
sql/catalyst/src/main/antlr3/org/apache/spark/sql/catalyst/parser/SparkSqlParser.g
 ---
@@ -2320,6 +2320,19 @@ regularBody[boolean topLevel]
)
|
selectStatement[topLevel]
+   |
+   (LPAREN selectStatement[true]) => nestedSetOpSelectStatement[topLevel]
+   ;
+
+nestedSetOpSelectStatement[boolean topLevel]
+   :
+   (
+   LPAREN s=selectStatement[topLevel] RPAREN -> {$s.tree}
+   )
+   (set=setOpSelectStatement[$nestedSetOpSelectStatement.tree, topLevel])
--- End diff --

I think it might be the simplest approach to support recursively nested 
UNION queries.





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-15 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r52968376
  
--- Diff: 
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java
 ---
@@ -170,6 +170,7 @@ public void write(int ordinal, double value) {
   }
 
   public void write(int ordinal, Decimal input, int precision, int scale) {
+input = input.clone();
--- End diff --

Better add a comment that explains why we need to clone before write.
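
The diff does not say why the clone is needed. One plausible reason, and it is only an assumption here, is that the write path adjusts the value in place (for example to fit the target precision and scale), which would otherwise leak back to the caller. A minimal sketch of that hazard, using a hypothetical `MutableAmount` class rather than Spark's `Decimal`:

```java
// Hypothetical stand-in for a mutable decimal; this is not Spark's Decimal class.
class CloneBeforeWriteSketch {
    static final class MutableAmount {
        long unscaled;
        int scale;

        MutableAmount(long unscaled, int scale) {
            this.unscaled = unscaled;
            this.scale = scale;
        }

        MutableAmount copy() {
            return new MutableAmount(unscaled, scale);
        }

        // Mutates this object in place; for brevity it only supports increasing the scale.
        void rescale(int newScale) {
            while (scale < newScale) {
                unscaled *= 10;
                scale++;
            }
        }
    }

    // Rescales a private copy, so the caller's object is left untouched.
    static long write(MutableAmount input, int scale) {
        MutableAmount local = input.copy();
        local.rescale(scale);
        return local.unscaled;
    }

    public static void main(String[] args) {
        MutableAmount amount = new MutableAmount(15, 1); // represents 1.5
        long written = write(amount, 3);                 // unscaled value of the rescaled copy
        System.out.println(written + " " + amount.unscaled + " " + amount.scale);
    }
}
```

A one-line source comment along the lines of "copy first because the write adjusts the value in place" would capture this, if that is indeed the reason.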





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184513440
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51341/
Test FAILed.





[GitHub] spark pull request: [SPARK-13308] ManagedBuffers passed to OneToOn...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11193#issuecomment-184512866
  
**[Test build #51340 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51340/consoleFull)**
 for PR 11193 at commit 
[`2c00f29`](https://github.com/apache/spark/commit/2c00f29272051b8092b6a8a976392e32eeb5488b).





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-15 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r52968287
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala
 ---
@@ -351,6 +350,20 @@ class MathFunctionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
   }
 
   test("pow") {
+testBinary(Pow, (d: Decimal, n: Byte) => d.pow(n),
+  (-5 to 5).map(v => (Decimal(v * 1.0), v.toByte)))
--- End diff --

maybe `v.toDouble` is better





[GitHub] spark pull request: [SPARK-13321][SQL] Support nested UNION in par...

2016-02-15 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11204#discussion_r52968331
  
--- Diff: 
sql/catalyst/src/main/antlr3/org/apache/spark/sql/catalyst/parser/SparkSqlParser.g
 ---
@@ -2320,6 +2320,19 @@ regularBody[boolean topLevel]
)
|
selectStatement[topLevel]
+   |
+   (LPAREN selectStatement[true]) => nestedSetOpSelectStatement[topLevel]
+   ;
+
+nestedSetOpSelectStatement[boolean topLevel]
+   :
+   (
+   LPAREN s=selectStatement[topLevel] RPAREN -> {$s.tree}
+   )
+   (set=setOpSelectStatement[$nestedSetOpSelectStatement.tree, topLevel])
--- End diff --

I made a small change to support recursively nested UNION. I also updated the 
test. But it is basically the same approach.





[GitHub] spark pull request: SPARK-12729 PhantomReferences to replace Final...

2016-02-15 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/11140#issuecomment-184512589
  
No comment on the contents of this PR (since I haven't looked at them), but 
I did want to note that I think that the pull request description is a little 
thin here. Could you add a concise summary of the changes here, their impact on 
the code, and motivation for why we're doing this? This helps reviewers / 
readers know what to focus on and also helps future readers by allowing them to 
understand the gist of this change without having to read the entire JIRA / 
discussion.





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184513438
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [WIP][SQL] Decimal datatype support for pow

2016-02-15 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r52968214
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala
 ---
@@ -103,8 +103,7 @@ class MathFunctionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
   }
 } else {
   domain.foreach { case (v1, v2) =>
-checkEvaluation(c(Literal(v1), Literal(v2)), f(v1 + 0.0, v2 + 
0.0), EmptyRow)
-checkEvaluation(c(Literal(v2), Literal(v1)), f(v2 + 0.0, v1 + 
0.0), EmptyRow)
+checkEvaluation(c(Literal(v1), Literal(v2)), f(v1, v2), EmptyRow)
--- End diff --

keep the test of `f(v2, v1)`
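
A small sketch of why the swapped-argument check carries weight for binary math functions: operations such as `pow` and `atan2` are not commutative, so evaluating only `f(v1, v2)` leaves the other argument order unexercised. The helper name `orderMatters` is hypothetical and unrelated to the test suite's own helpers.

```java
import java.util.function.DoubleBinaryOperator;

class ArgumentOrderSketch {
    // Returns true when swapping the two arguments changes the result.
    static boolean orderMatters(DoubleBinaryOperator f, double a, double b) {
        return f.applyAsDouble(a, b) != f.applyAsDouble(b, a);
    }

    public static void main(String[] args) {
        System.out.println(orderMatters(Math::pow, 2.0, 3.0));   // compares 2^3 with 3^2
        System.out.println(orderMatters(Math::atan2, 1.0, 2.0)); // compares atan2(1, 2) with atan2(2, 1)
    }
}
```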





[GitHub] spark pull request: [SPARK-12375] [ML] add handleinvalid for vecto...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10466#issuecomment-184511393
  
**[Test build #51342 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51342/consoleFull)**
 for PR 10466 at commit 
[`6a0efed`](https://github.com/apache/spark/commit/6a0efede2b99a315895b1d3cccb9262ea845476c).





[GitHub] spark pull request: [SPARK-12583][Mesos] Fix mesos shuffle service

2016-02-15 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/11207#issuecomment-184510940
  
No comment on the contents of this PR (since I haven't looked at it yet), 
but would you mind changing the PR title to something more descriptive? As it stands 
now, "Fix Mesos shuffle service" is a lot less descriptive than, say, "Delete 
shuffle files after Mesos shuffle service exits" or something similar.

Could you also edit the description to include a concise one-sentence 
description of the user-facing bug / symptom that this fixes? Right now this 
describes a lot of mechanism, but I feel like the description is a bit thin on 
context for newcomers who are trying to understand what this patch is doing.





[GitHub] spark pull request: [SPARK-13018][Docs] Replace example code in ml...

2016-02-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11126





[GitHub] spark pull request: [SPARK-13330][PYSPARK] PYTHONHASHSEED is not p...

2016-02-15 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/11211#issuecomment-184509476
  
How do we handle this in Python 2? If we're running Python 2.x, do we 
currently propagate `PYTHONHASHSEED` to the worker?

Also, how are we going to ensure that this change isn't accidentally rolled 
back? This seems subtle, so adding an explanatory paragraph comment into the 
source code near this line would make sense.





[GitHub] spark pull request: [SPARK-13097][ML] Binarizer allowing Double AN...

2016-02-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10976





[GitHub] spark pull request: [SPARK-13018][Docs] Replace example code in ml...

2016-02-15 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/11126#discussion_r52967782
  
--- Diff: docs/mllib-pmml-model-export.md ---
@@ -45,41 +45,12 @@ The table below outlines the `spark.mllib` models that 
can be exported to PMML a
 
 To export a supported `model` (see table above) to PMML, simply call 
`model.toPMML`.
 
+As well as exporting the PMML model to a String (`model.toPMML` as in the 
example above), you can export the PMML model to other formats.
--- End diff --

minor: please wrap lines at 100 chars





[GitHub] spark pull request: [SPARK-13018][Docs] Replace example code in ml...

2016-02-15 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/11126#issuecomment-184509386
  
Merged into master. Thanks!





[GitHub] spark pull request: [SPARK-13308] ManagedBuffers passed to OneToOn...

2016-02-15 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/11193#issuecomment-184509228
  
Thanks for the careful review, @zsxwing. I agree with your feedback and 
also think that it makes a lot more sense to have `convertToNetty()` increment 
the reference count. I've gone ahead and updated the patch to do this and have 
rolled back a confusing `retain()` call in the test code (which you pointed out 
earlier).

Take a look at the `refCnt()` assertions that I added in the test suites to 
see whether they match up with what you had in mind.

Note that as of today `convertToNetty` is only called in one place in 
`MessageEncoder` and the result of this is passed to `MessageWithHeader` 
alongside the message that the buffer came from, so it should be easy to verify that 
`MessageWithHeader.deallocate()` will free all of the references.
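
The ownership rule described above can be sketched with a toy reference counter. Everything here is a hypothetical stand-in (`CountedBuffer`, `convert`, `Wrapper`); it does not use Netty's or Spark's classes and only illustrates the counting: the conversion takes its own reference, and the wrapper that receives both the source and the converted view releases each exactly once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins; these are not Spark's ManagedBuffer or Netty's ReferenceCounted.
class RefCountSketch {
    static final class CountedBuffer {
        private final AtomicInteger refCnt = new AtomicInteger(1); // the creator holds one reference

        int refCnt() {
            return refCnt.get();
        }

        CountedBuffer retain() {
            refCnt.incrementAndGet();
            return this;
        }

        // Returns true when the last reference has been dropped.
        boolean release() {
            int remaining = refCnt.decrementAndGet();
            if (remaining < 0) {
                throw new IllegalStateException("released more times than retained");
            }
            return remaining == 0;
        }

        // The conversion takes its own reference, so the returned view must be released separately.
        CountedBuffer convert() {
            return retain();
        }
    }

    // Receives the source buffer and the converted view, and frees both references on deallocate.
    static final class Wrapper {
        private final CountedBuffer source;
        private final CountedBuffer view;

        Wrapper(CountedBuffer source, CountedBuffer view) {
            this.source = source;
            this.view = view;
        }

        void deallocate() {
            view.release();
            source.release();
        }
    }

    public static void main(String[] args) {
        CountedBuffer buf = new CountedBuffer();
        CountedBuffer view = buf.convert();
        System.out.println("after convert: " + buf.refCnt());
        new Wrapper(buf, view).deallocate();
        System.out.println("after deallocate: " + buf.refCnt());
    }
}
```

Assertions on the count before conversion, after conversion, and after deallocation are the kind of check that can catch a missing or extra `retain()`.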





[GitHub] spark pull request: [SPARK-13097][ML] Binarizer allowing Double AN...

2016-02-15 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/10976#issuecomment-184509207
  
Merged into master. Thanks!





[GitHub] spark pull request: [SPARK-13295] [ ML, MLlib ] AFTSurvivalRegress...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11179#issuecomment-184508061
  
**[Test build #51339 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51339/consoleFull)**
 for PR 11179 at commit 
[`e4707e7`](https://github.com/apache/spark/commit/e4707e775f34c0018f74451d048fb28a9c08ef48).





[GitHub] spark pull request: [SPARK-13013][Docs] Replace example code in ml...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6#issuecomment-184507173
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51338/
Test FAILed.





[GitHub] spark pull request: [SPARK-13013][Docs] Replace example code in ml...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6#issuecomment-184507169
  
Build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13013][Docs] Replace example code in ml...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6#issuecomment-184507162
  
**[Test build #51338 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51338/consoleFull)**
 for PR 6 at commit 
[`8195cdf`](https://github.com/apache/spark/commit/8195cdf6052ad226b8102c2d40d2341d409596e1).
 * This patch **fails Python style tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13013][Docs] Replace example code in ml...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6#issuecomment-184506760
  
**[Test build #51338 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51338/consoleFull)**
 for PR 6 at commit 
[`8195cdf`](https://github.com/apache/spark/commit/8195cdf6052ad226b8102c2d40d2341d409596e1).





[GitHub] spark pull request: [SPARK-13310] [SQL] Resolve Missing Sorting Co...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11198#issuecomment-184501312
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51332/
Test PASSed.





[GitHub] spark pull request: [SPARK-13310] [SQL] Resolve Missing Sorting Co...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11198#issuecomment-184501309
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13310] [SQL] Resolve Missing Sorting Co...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11198#issuecomment-184501104
  
**[Test build #51332 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51332/consoleFull)**
 for PR 11198 at commit 
[`49a2d6e`](https://github.com/apache/spark/commit/49a2d6e8c153609901ed79035cd1abe236f1d39c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13013][Docs] Replace example code in ml...

2016-02-15 Thread keypointt
Github user keypointt commented on a diff in the pull request:

https://github.com/apache/spark/pull/6#discussion_r52966146
  
--- Diff: 
examples/src/main/scala/org/apache/spark/examples/mllib/PowerIterationClusteringExample.scala
 ---
@@ -18,141 +18,42 @@
 // scalastyle:off println
 package org.apache.spark.examples.mllib
 
-import org.apache.log4j.{Level, Logger}
-import scopt.OptionParser
-
 import org.apache.spark.{SparkConf, SparkContext}
-import org.apache.spark.mllib.clustering.PowerIterationClustering
-import org.apache.spark.rdd.RDD
+// $example on$
+import org.apache.spark.mllib.clustering.{PowerIterationClustering, PowerIterationClusteringModel}
+// $example off$
 
-/**
--- End diff --

@yinxusen could you please explain in more detail how to reuse? 
The previous examples are quite different from what is shown inside the 
{highlight} block.





[GitHub] spark pull request: [SPARK-13329] [SQL] considering output for sta...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11210#issuecomment-184497912
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51336/
Test FAILed.





[GitHub] spark pull request: [SPARK-13329] [SQL] considering output for sta...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11210#issuecomment-184497911
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-12966][SQL] ArrayType(DecimalType) supp...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10928#issuecomment-184497847
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51330/
Test PASSed.





[GitHub] spark pull request: [SPARK-13329] [SQL] considering output for sta...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11210#issuecomment-184497831
  
**[Test build #51336 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51336/consoleFull)**
 for PR 11210 at commit 
[`2738737`](https://github.com/apache/spark/commit/273873753fb97721864ee9e85d9dc9f16edab8ce).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12966][SQL] ArrayType(DecimalType) supp...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10928#issuecomment-184497845
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-12966][SQL] ArrayType(DecimalType) supp...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10928#issuecomment-184497738
  
**[Test build #51330 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51330/consoleFull)**
 for PR 10928 at commit 
[`68952b6`](https://github.com/apache/spark/commit/68952b65c65aebfe6bc5a41a80518b8fc2288c8b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12799] Simplify various string output f...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10757#issuecomment-184495970
  
**[Test build #51337 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51337/consoleFull)**
 for PR 10757 at commit 
[`21c94d2`](https://github.com/apache/spark/commit/21c94d2224609ce3171e62c7cb58ee64cca683e7).





[GitHub] spark pull request: [SPARK-13237] [SQL] generated broadcast outer ...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11130#issuecomment-184495758
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51333/
Test PASSed.





[GitHub] spark pull request: [SPARK-13237] [SQL] generated broadcast outer ...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11130#issuecomment-184495757
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13237] [SQL] generated broadcast outer ...

2016-02-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11130#issuecomment-184495643
  
**[Test build #51333 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51333/consoleFull)**
 for PR 11130 at commit 
[`5744941`](https://github.com/apache/spark/commit/5744941063ba05b07e4a7265277162c331a9c48c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [WIP][SQL] Decimal datatype support for pow

2016-02-15 Thread yucai
Github user yucai commented on the pull request:

https://github.com/apache/spark/pull/11212#issuecomment-184495551
  
@adrian-wang could you help review? Many thanks!





[GitHub] spark pull request: [SPARK-13329] [SQL] considering output for sta...

2016-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11210#issuecomment-184492883
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51334/
Test FAILed.




