[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-13 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/16030
  
Okay, I'll try to fix it that way, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16263: [SPARK-18281][SQL][PySpark] Consumes the returned...

2016-12-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/16263#discussion_r92337930
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -400,10 +402,19 @@ def toLocalIterator(self):
 
 >>> list(df.toLocalIterator())
 [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
+>>> it = df.toLocalIterator()
+>>> import time
+>>> time.sleep(5)
+>>> next(it)
--- End diff --

Ok. Thanks. I will update like that.





[GitHub] spark issue #16142: [SPARK-18716][CORE] Restrict the disk usage of spark eve...

2016-12-13 Thread uncleGen
Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/16142
  
cc @vanzin 





[GitHub] spark issue #16274: [SPARK-18853][SQL] Project (UnaryNode) is way too aggres...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16274
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16274: [SPARK-18853][SQL] Project (UnaryNode) is way too aggres...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16274
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70123/
Test PASSed.





[GitHub] spark issue #16274: [SPARK-18853][SQL] Project (UnaryNode) is way too aggres...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16274
  
**[Test build #70123 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70123/consoleFull)**
 for PR 16274 at commit 
[`21570a7`](https://github.com/apache/spark/commit/21570a70572248428390ba0d36b335e1af0f5aa2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in cla...

2016-12-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16266
  
@srowen, I think it is ready for a second look.





[GitHub] spark issue #16277: [SPARK-18854][SQL] numberedTreeString and apply(i) incon...

2016-12-13 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/16277
  
LGTM





[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/13909#discussion_r92335449
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -56,33 +58,93 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
   }
 
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
-val arrayClass = classOf[GenericArrayData].getName
-val values = ctx.freshName("values")
-ctx.addMutableState("Object[]", values, s"this.$values = null;")
+val array = ctx.freshName("array")
 
-ev.copy(code = s"""
-  this.$values = new Object[${children.size}];""" +
+val et = dataType.elementType
+val evals = children.map(e => e.genCode(ctx))
+val isPrimitiveArray = ctx.isPrimitiveType(et)
+val primitiveTypeName = if (isPrimitiveArray) 
ctx.primitiveTypeName(et) else ""
+val (preprocess, arrayData, arrayWriter) =
+  GenArrayData.getCodeArrayData(ctx, et, children.size, 
isPrimitiveArray, array)
+
+ev.copy(code =
+  preprocess +
   ctx.splitExpressions(
 ctx.INPUT_ROW,
-children.zipWithIndex.map { case (e, i) =>
-  val eval = e.genCode(ctx)
-  eval.code + s"""
-if (${eval.isNull}) {
-  $values[$i] = null;
+evals.zipWithIndex.map { case (eval, i) =>
+  eval.code +
+(if (isPrimitiveArray) {
+  (if (!children(i).nullable) {
+s"\n$arrayWriter.write($i, ${eval.value});"
+  } else {
+s"""
+if (${eval.isNull}) {
--- End diff --

@kiszk I think what @cloud-fan means is that we don't need to check 
`!children(i).nullable` to decide whether or not to generate the `setNull` code.






[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14079
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70122/
Test PASSed.





[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14079
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14079
  
**[Test build #70122 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70122/consoleFull)**
 for PR 14079 at commit 
[`c95462f`](https://github.com/apache/spark/commit/c95462fe5c25d37b8658955304f739cc10ccf1f9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #16119: [SPARK-18687][Pyspark][SQL]Backward compatibility...

2016-12-13 Thread vijoshi
Github user vijoshi commented on a diff in the pull request:

https://github.com/apache/spark/pull/16119#discussion_r92333556
  
--- Diff: python/pyspark/sql/context.py ---
@@ -72,8 +72,13 @@ def __init__(self, sparkContext, sparkSession=None, 
jsqlContext=None):
 self._sc = sparkContext
 self._jsc = self._sc._jsc
 self._jvm = self._sc._jvm
+
 if sparkSession is None:
-sparkSession = SparkSession(sparkContext)
+if sparkContext is SparkContext._active_spark_context:
+sparkSession = SparkSession.builder.getOrCreate()
--- End diff --

Okay. I wanted to avoid adding code to the new SparkSession class to 
handle this compatibility issue, which arises from the now-deprecated SQLContext 
class. It looks like the Python implementation of the SparkSession builder does 
not allow a SparkContext to be passed in. Do we want to change the public 
builder interface for this?





[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-13 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16030
  
After an offline discussion with @liancheng, here is the result:

**Why does the test fail?**
1. We write a parquet file with schema `[a: long, b: int]` to path `/data/a=1`.
2. When we read it back, we infer the data schema as `[a: long, b: int]` and the partition schema as `[a: int]`.
3. In [`HadoopFsRelation.schema`](https://github.com/apache/spark/blob/branch-2.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala#L51-L56), we merge the data schema and the partition schema, and announce to users that the output schema will be `[a: long, b: int]`.
4. In [`FileSourceScanExec`](https://github.com/apache/spark/blob/branch-2.1/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L252-L260), we build a parquet reader and tell it that the data schema is `[a: long, b: int]`, the required schema is `[b: int]`, and the partition schema is `[a: int]`.
5. In [vectorized parquet read](https://github.com/apache/spark/blob/branch-2.1/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java#L181-L188), we read the data from parquet files according to the required schema `[b: int]` and append partition values according to the partition schema `[a: int]`, so the schema of the physical row data is `[b: int, a: int]`.
6. In [`FileSourceStrategy`](https://github.com/apache/spark/blob/branch-2.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala#L74-L76), we mistakenly assume the parquet scan will output data of schema `[b: int, a: long]`, and read the second column as long type. The vectorized parquet reader cannot read an int column as long, so it throws an NPE.

**How to fix?**
The root question is: when the data schema includes partition columns, how do we determine the type of those columns? Currently, at the physical layer (the reader), we trust the partition schema, which is inferred from directory strings. At the logical layer (`HadoopFsRelation`), we trust the partition columns inside the data schema. This inconsistency caused the bug.

Given that we use partition values extracted from directory strings and ignore the partition columns inside the physical data files, we think it is more reasonable to trust the partition schema.

So the fix is simple: update `HadoopFsRelation.schema` to respect the position of partition columns in the data schema, while taking their types from the partition schema.
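
The merge rule described in this comment can be sketched in plain Python. This is only an illustration under assumptions: schemas are modeled as lists of `(name, type)` tuples, `merge_schema` is a hypothetical helper rather than the actual `HadoopFsRelation.schema` code, and column-name case sensitivity is ignored.

```python
def merge_schema(data_schema, partition_schema):
    """Merge two schemas given as lists of (name, type) tuples.

    Hypothetical helper: columns keep their position from the data schema,
    but a column that is also a partition column takes its type from the
    partition schema. Partition columns that are absent from the data
    schema are appended at the end.
    """
    partition_types = dict(partition_schema)
    data_names = {name for name, _ in data_schema}

    # Position comes from the data schema, type from the partition schema.
    merged = [(name, partition_types.get(name, dtype)) for name, dtype in data_schema]
    # Partition-only columns go last.
    merged += [(name, dtype) for name, dtype in partition_schema if name not in data_names]
    return merged


data_schema = [("a", "long"), ("b", "int")]
partition_schema = [("a", "int")]
print(merge_schema(data_schema, partition_schema))  # [('a', 'int'), ('b', 'int')]
```

With the schemas from the failing test, the announced output schema becomes `[a: int, b: int]`, which agrees with the type the reader actually produces for `a`.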






[GitHub] spark pull request #16263: [SPARK-18281][SQL][PySpark] Consumes the returned...

2016-12-13 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/16263#discussion_r92332668
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -400,10 +402,19 @@ def toLocalIterator(self):
 
 >>> list(df.toLocalIterator())
 [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
+>>> it = df.toLocalIterator()
+>>> import time
+>>> time.sleep(5)
+>>> next(it)
--- End diff --

It might make sense to test consuming the second element, given that the fix 
eagerly serves the first element. This additional test probably belongs in 
tests.py rather than in a doctest, though.





[GitHub] spark issue #16194: [SPARK-18767][ML] Unify Models' toString methods

2016-12-13 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/16194
  
Good point. This makes me think of the usage of 
`instr.logParams(params: _*)` in training instrumentation [https://github.com/apache/spark/pull/15671].
IMHO, a model's params are copied from its trainer, and some are not 
significant for the model.
For `KMeansModel`, I think only param `k` makes sense; the others are just the 
params used to create the model.
I am not sure whether we should print all shared params of a model.






[GitHub] spark issue #16277: [SPARK-18854][SQL] numberedTreeString and apply(i) incon...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16277
  
**[Test build #70125 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70125/consoleFull)**
 for PR 16277 at commit 
[`e79306f`](https://github.com/apache/spark/commit/e79306f2f187587e50c1300150df8ba6a94a4691).





[GitHub] spark pull request #16277: [SPARK-18854][SQL] numberedTreeString and apply(i...

2016-12-13 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16277#discussion_r92331760
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala 
---
@@ -557,9 +562,10 @@ abstract class TreeNode[BaseType <: 
TreeNode[BaseType]] extends Product {
 }
 
 if (children.nonEmpty) {
-  children.init.foreach(
-_.generateTreeString(depth + 1, lastChildren :+ false, builder, 
verbose, prefix))
-  children.last.generateTreeString(depth + 1, lastChildren :+ true, 
builder, verbose, prefix)
+  children.init.foreach(_.generateTreeString(
--- End diff --

I rearranged this so it looks exactly the same as the way we traverse 
innerChildren, to make it more obvious.






[GitHub] spark pull request #16277: [SPARK-18854][SQL] numberedTreeString and apply(i...

2016-12-13 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16277#discussion_r92331508
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala 
---
@@ -534,19 +539,16 @@ abstract class TreeNode[BaseType <: 
TreeNode[BaseType]] extends Product {
   builder: StringBuilder,
   verbose: Boolean,
   prefix: String = ""): StringBuilder = {
+
 if (depth > 0) {
   lastChildren.init.foreach { isLast =>
-val prefixFragment = if (isLast) "   " else ":  "
--- End diff --

all these variable names are pretty confusing, so I got rid of all of them.






[GitHub] spark issue #16277: [SPARK-18854][SQL] numberedTreeString and apply(i) incon...

2016-12-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16277
  
cc @yhuai @liancheng @davies 





[GitHub] spark pull request #16277: [SPARK-18854][SQL] numberedTreeString and apply(i...

2016-12-13 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/16277

[SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for 
subqueries

## What changes were proposed in this pull request?
This is a bug introduced by subquery handling. numberedTreeString (which 
uses generateTreeString under the hood) numbers trees including innerChildren 
(used to print subqueries), but apply (which uses getNodeNumbered) ignores 
innerChildren. As a result, apply(i) would return the wrong plan node if there 
are subqueries.

This patch fixes the bug.

## How was this patch tested?
Added a test case in SubquerySuite.scala to test both the depth-first 
traversal of numbering as well as making sure the two methods are consistent.
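
The inconsistency described above can be illustrated with a toy tree in Python. This is a sketch under assumptions, not the `TreeNode` implementation: `Node`, `inner_children`, and `_walk` are hypothetical names, and visiting inner children before regular children is assumed for the example. The point is that the printing method and the lookup method share one traversal, so their numbering cannot drift apart.

```python
class Node:
    """Toy plan node; `inner_children` stands in for subquery plans."""

    def __init__(self, name, children=(), inner_children=()):
        self.name = name
        self.children = list(children)
        self.inner_children = list(inner_children)

    def _walk(self):
        # Pre-order, depth-first: the node itself, then its inner children,
        # then its regular children (order assumed for this sketch).
        yield self
        for child in self.inner_children + self.children:
            yield from child._walk()

    def numbered_tree_string(self):
        # Number every node, including inner children.
        return "\n".join(f"{i:02d} {node.name}" for i, node in enumerate(self._walk()))

    def apply(self, number):
        # Look up a node using the same traversal as the printed numbering.
        for i, node in enumerate(self._walk()):
            if i == number:
                return node
        return None


plan = Node(
    "Project",
    children=[Node("Filter", children=[Node("Scan")], inner_children=[Node("Subquery")])],
)
print(plan.numbered_tree_string())
print(plan.apply(2).name)  # Subquery
```

If `apply` skipped `inner_children` while the printed numbering counted them, `apply(2)` would return `Scan` instead of `Subquery`, which is the kind of mismatch the PR describes.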

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark SPARK-18854

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16277.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16277


commit 28113223d0fd623705266599da2c8603eb7e26ab
Author: Reynold Xin 
Date:   2016-12-14T06:52:59Z

[SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for 
subqueries.







[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2016-12-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/15018#discussion_r92331409
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala 
---
@@ -344,27 +344,30 @@ class IsotonicRegression private (private var 
isotonic: Boolean) extends Seriali
 }
 
 var i = 0
-val len = input.length
-while (i < len) {
-  var j = i
-
-  // Find monotonicity violating sequence, if any.
-  while (j < len - 1 && input(j)._1 > input(j + 1)._1) {
-j = j + 1
-  }
+val n = input.length - 1
+var notFinished = true
+
+while (notFinished) {
+  i = 0
+  notFinished = false
+
+  // Iterate through the data, fix any monotonicity violations we find
+  // We may need to do this multiple times, as pooling can introduce 
violations
+  // at locations that were previously fine.
+  while (i < n) {
+var j = i
+
+// Find next monotonicity violating sequence, if any.
+while (j < n && input(j)._1 >= input(j + 1)._1) {
--- End diff --

I think the original one, i.e., `input(j)._1 > input(j + 1)._1`, is correct. 
Here it is going to select out-of-order blocks.

Quoted from the paper:
> We refer to two blocks [p, q] and [q + 1, r] as consecutive. We refer to two consecutive blocks [p, q] and [q + 1, r] as in-order if theta_{p,q} <= theta_{q+1,r} and out-of-order otherwise.

Lemma 1 shows how a merged block is also a single-valued block.
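
To make the in-order versus out-of-order distinction concrete, here is a minimal, unweighted pool-adjacent-violators sketch in Python. It is an illustration under assumptions, not the `IsotonicRegression` code from the PR: `pava` is a hypothetical function, weights are all taken to be 1, and blocks are merged only when the earlier block's mean is strictly greater than the later one's, so blocks with equal means count as in-order.

```python
def pava(values):
    """Unweighted pool-adjacent-violators sketch (non-decreasing fit)."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge while the last two blocks are out-of-order, i.e. the earlier
        # block's mean is strictly greater than the later block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    # Expand each block back into one fitted value per input point.
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


print(pava([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

Every merged block is represented by a single mean, which matches the single-valued property mentioned for Lemma 1.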





[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16266
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16266
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70121/
Test PASSed.





[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16266
  
**[Test build #70121 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70121/consoleFull)**
 for PR 16266 at commit 
[`8c17f3a`](https://github.com/apache/spark/commit/8c17f3add2f558b346a40fcabc6c9e5f6e6c416e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-13 Thread liancheng
Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/16030
  
@maropu @brkyvz Sorry for the delay, I was blocked by some other tasks 
during the last few days.

@cloud-fan and I just did some investigation and we think we came up with a 
minimal fix to this issue that also preserves the old behavior. Although we do 
agree that the new behavior makes more sense (moving all partition columns to 
the end of the output schema), it is a breaking change and we'd like to 
minimize risks for 2.1 release.

@cloud-fan will give a summary of the new fix and reasons behind it soon.





[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15717
  
Sorry for the last-minute comment. I did not realize it until I manually 
ran these test cases in Hive.





[GitHub] spark pull request #15717: [SPARK-17910][SQL] Allow users to update the comm...

2016-12-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15717#discussion_r92329696
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -877,6 +877,35 @@ class SparkSqlAstBuilder(conf: SQLConf) extends 
AstBuilder {
   }
 
   /**
+   * Create a [[AlterTableChangeColumnsCommand]] command.
+   *
+   * For example:
+   * {{{
+   *   ALTER TABLE table [PARTITION partition_spec]
+   *   CHANGE [COLUMN] `col` `col` dataType [COMMENT "comment"] [FIRST | 
AFTER `otherCol`]
+   *   [, `col2` `col2` dataType [COMMENT "comment"] [FIRST | AFTER 
`otherCol`], ...]
--- End diff --

When supporting the multiple-column syntax, we also need to consider 
edge cases, for example, duplicate column names.
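
A duplicate-name check of the kind mentioned here could look like the following Python sketch. It is only an illustration: `find_duplicate_columns` is a hypothetical helper, and treating names as case-insensitive by default is an assumption, not a statement about how the parser resolves column names.

```python
from collections import Counter


def find_duplicate_columns(column_names, case_sensitive=False):
    """Return column names that appear more than once, in first-seen order."""
    keys = [name if case_sensitive else name.lower() for name in column_names]
    counts = Counter(keys)
    seen = set()
    duplicates = []
    for key in keys:
        if counts[key] > 1 and key not in seen:
            seen.add(key)
            duplicates.append(key)
    return duplicates


print(find_duplicate_columns(["col1", "col2", "COL1"]))  # ['col1']
```

A command handling several columns at once could reject the statement whenever this list is non-empty.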





[GitHub] spark pull request #15717: [SPARK-17910][SQL] Allow users to update the comm...

2016-12-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15717#discussion_r92329570
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -877,6 +877,35 @@ class SparkSqlAstBuilder(conf: SQLConf) extends 
AstBuilder {
   }
 
   /**
+   * Create a [[AlterTableChangeColumnsCommand]] command.
+   *
+   * For example:
+   * {{{
+   *   ALTER TABLE table [PARTITION partition_spec]
+   *   CHANGE [COLUMN] `col` `col` dataType [COMMENT "comment"] [FIRST | 
AFTER `otherCol`]
+   *   [, `col2` `col2` dataType [COMMENT "comment"] [FIRST | AFTER 
`otherCol`], ...]
--- End diff --

Why do we allow users to modify multiple columns in the same DDL statement? 
This differs from what Hive supports. Should we do it? cc @cloud-fan 

In addition, based on the existing syntax, if we really want to support 
multiple columns, we should change the keywords to `CHANGE COLUMNS`.





[GitHub] spark issue #16273: [SPARK-18852][SS]StreamingQuery.lastProgress should be n...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16273
  
**[Test build #70124 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70124/consoleFull)**
 for PR 16273 at commit 
[`005204e`](https://github.com/apache/spark/commit/005204e7582fea5efa4e4dc70f5bb612a7d21a05).





[GitHub] spark issue #16273: [SPARK-18852][SS]StreamingQuery.lastProgress should be n...

2016-12-13 Thread zsxwing
Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/16273
  
retest this please





[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16134
  
Yeah, let me update it now. Thanks!





[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16263
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70119/
Test PASSed.





[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16263
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16263
  
**[Test build #70119 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70119/consoleFull)**
 for PR 16263 at commit 
[`fc02026`](https://github.com/apache/spark/commit/fc020267b0719ca7a350ffd766cd444eb011849f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16276: [SPARK-18855][CORE] Add RDD flatten function

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16276
  
Can one of the admins verify this patch?





[GitHub] spark pull request #16276: [SPARK-18855][CORE] Add RDD flatten function

2016-12-13 Thread linbojin
GitHub user linbojin opened a pull request:

https://github.com/apache/spark/pull/16276

[SPARK-18855][CORE] Add RDD flatten function

## What changes were proposed in this pull request?

Added a new flatten function for RDD.

## How was this patch tested?

Unit tests in RDDSuite and manual tests:
```
scala> val rdd = sc.makeRDD(List(List(1, 2, 3), List(4, 5), List(6)))
rdd: org.apache.spark.rdd.RDD[List[Int]] = ParallelCollectionRDD[0] at makeRDD at <console>:24

scala> rdd.flatten.collect
res0: Array[Int] = Array(1, 2, 3, 4, 5, 6)
```
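
For comparison, flattening a nested collection is equivalent to `flatMap(identity)`. The sketch below shows that equivalence on plain Scala collections, using the same data as the manual test above; it is not the PR's implementation, and the object name is hypothetical.

```scala
object FlattenSketch {
  val nested: List[List[Int]] = List(List(1, 2, 3), List(4, 5), List(6))

  // flatMap with the identity function concatenates the inner lists in order.
  val viaFlatMap: List[Int] = nested.flatMap(identity)

  // The standard-library flatten gives the same result on local collections.
  val viaFlatten: List[Int] = nested.flatten
}
```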


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/linbojin/spark SPARK-18855-add-rdd-flatten

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16276.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16276


commit 2c0903ac07367cf203e4b1ed6bf4ac1894976ec9
Author: linbojin 
Date:   2016-12-14T06:04:48Z

add RDD flatten function and tests







[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16253
  
@ajbozarth @vanzin 
Could either of you retest this, please?





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15915
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15915
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70118/
Test PASSed.





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15915
  
**[Test build #70118 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70118/consoleFull)**
 for PR 15915 at commit 
[`1567c73`](https://github.com/apache/spark/commit/1567c73e61e3761a9019a51b663f74de5f048d69).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16247: [SPARK-18817][SparkR] set default spark-warehouse path t...

2016-12-13 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/16247
  
But yes, _other_ Spark config properties would be set by the user in the 
`sparkConfig` parameter of the `sparkR.session` method. We would just add to that, like
```
sparkConfig[["spark.sql.warehouse.default.dir"]] <- tempdir()
```
without adding another parameter to sparkR.session()







[GitHub] spark issue #16274: [SPARK-18853][SQL] Project (UnaryNode) is way too aggres...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16274
  
**[Test build #70123 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70123/consoleFull)**
 for PR 16274 at commit 
[`21570a7`](https://github.com/apache/spark/commit/21570a70572248428390ba0d36b335e1af0f5aa2).





[GitHub] spark issue #16173: [SPARK-18742][CORE]readd spark.broadcast.factory conf to...

2016-12-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16173
  
Is there actually an implementation you want to do? I'd push back against 
opening this up unless we have a good reason to (e.g. more scalable broadcast).






[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/15915
  
Current change looks good to me. cc @JoshRosen @srowen to check it again.





[GitHub] spark pull request #15915: [SPARK-18485][CORE] Underlying integer overflow w...

2016-12-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/15915#discussion_r92323575
  
--- Diff: 
core/src/test/scala/org/apache/spark/storage/MemoryStoreSuite.scala ---
@@ -303,6 +303,34 @@ class MemoryStoreSuite
 assert(memoryStore.currentUnrollMemoryForThisTask === 0) // discard 
released the unroll memory
   }
 
+  test("set unrollMemoryThreshold a huge value larger than Int.MaxValue") {
+val tmpConf = conf.clone.set("spark.storage.unrollMemoryThreshold", 
s"${1L + Int.MaxValue}")
+val (memoryStore, blockInfoManager) = makeMemoryStore(12000L + 
Int.MaxValue, tmpConf)
+val smallList = List.fill(40)(new Array[Byte](100))
+def smallIterator: Iterator[Any] = 
smallList.iterator.asInstanceOf[Iterator[Any]]
+assert(memoryStore.currentUnrollMemoryForThisTask === 0)
+
+def putIteratorAsBytes[T](
+blockId: BlockId,
+iter: Iterator[T],
+classTag: ClassTag[T]): Either[PartiallySerializedBlock[T], Long] 
= {
+  assert(blockInfoManager.lockNewBlockForWriting(
+blockId,
+new BlockInfo(StorageLevel.MEMORY_ONLY_SER, classTag, tellMaster = 
false)))
+  val res = memoryStore.putIteratorAsBytes(blockId, iter, classTag, 
MemoryMode.ON_HEAP)
+  blockInfoManager.unlock(blockId)
+  res
+}
+
+// Unroll with plenty of space. This should succeed and cache both 
blocks.
+val result1 = putIteratorAsBytes("b1", smallIterator, ClassTag.Any)
+val result2 = putIteratorAsBytes("b2", smallIterator, ClassTag.Any)
+assert(memoryStore.contains("b1"))
+assert(memoryStore.contains("b2"))
+assert(result1.isRight) // unroll was successful
+assert(result2.isRight)
--- End diff --

Never mind. The unroll memory will be released once unrolling is successful.





[GitHub] spark pull request #15915: [SPARK-18485][CORE] Underlying integer overflow w...

2016-12-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/15915#discussion_r92322708
  
--- Diff: 
core/src/test/scala/org/apache/spark/storage/MemoryStoreSuite.scala ---
@@ -303,6 +303,34 @@ class MemoryStoreSuite
 assert(memoryStore.currentUnrollMemoryForThisTask === 0) // discard 
released the unroll memory
   }
 
+  test("set unrollMemoryThreshold a huge value larger than Int.MaxValue") {
+val tmpConf = conf.clone.set("spark.storage.unrollMemoryThreshold", 
s"${1L + Int.MaxValue}")
+val (memoryStore, blockInfoManager) = makeMemoryStore(12000L + 
Int.MaxValue, tmpConf)
+val smallList = List.fill(40)(new Array[Byte](100))
+def smallIterator: Iterator[Any] = 
smallList.iterator.asInstanceOf[Iterator[Any]]
+assert(memoryStore.currentUnrollMemoryForThisTask === 0)
+
+def putIteratorAsBytes[T](
+blockId: BlockId,
+iter: Iterator[T],
+classTag: ClassTag[T]): Either[PartiallySerializedBlock[T], Long] 
= {
+  assert(blockInfoManager.lockNewBlockForWriting(
+blockId,
+new BlockInfo(StorageLevel.MEMORY_ONLY_SER, classTag, tellMaster = 
false)))
+  val res = memoryStore.putIteratorAsBytes(blockId, iter, classTag, 
MemoryMode.ON_HEAP)
+  blockInfoManager.unlock(blockId)
+  res
+}
+
+// Unroll with plenty of space. This should succeed and cache both 
blocks.
+val result1 = putIteratorAsBytes("b1", smallIterator, ClassTag.Any)
+val result2 = putIteratorAsBytes("b2", smallIterator, ClassTag.Any)
+assert(memoryStore.contains("b1"))
+assert(memoryStore.contains("b2"))
+assert(result1.isRight) // unroll was successful
+assert(result2.isRight)
--- End diff --

Let's also check whether `currentUnrollMemoryForThisTask` is the default 
size of 4MB.
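
For context on why the test uses a threshold just above `Int.MaxValue`, here is a small sketch of what happens when such a value is narrowed to `Int`. It only illustrates the arithmetic; where (or whether) the code under review performs such a narrowing is not shown in this thread, and the object name is hypothetical.

```scala
object ThresholdOverflow {
  // The value the test writes into spark.storage.unrollMemoryThreshold.
  val threshold: Long = 1L + Int.MaxValue

  // Narrowing keeps only the low 32 bits, so the value wraps to Int.MinValue.
  val narrowed: Int = threshold.toInt

  // A checked conversion throws ArithmeticException instead of silently wrapping.
  def checked(v: Long): Int = java.lang.Math.toIntExact(v)
}
```

Keeping the value as a `Long` end to end avoids the wrap-around entirely.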





[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2016-12-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/15018#discussion_r92322282
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala 
---
@@ -344,27 +344,30 @@ class IsotonicRegression private (private var 
isotonic: Boolean) extends Seriali
 }
 
 var i = 0
-val len = input.length
-while (i < len) {
-  var j = i
-
-  // Find monotonicity violating sequence, if any.
-  while (j < len - 1 && input(j)._1 > input(j + 1)._1) {
-j = j + 1
-  }
+val n = input.length - 1
+var notFinished = true
+
+while (notFinished) {
+  i = 0
+  notFinished = false
+
+  // Iterate through the data, fix any monotonicity violations we find
+  // We may need to do this multiple times, as pooling can introduce 
violations
+  // at locations that were previously fine.
+  while (i < n) {
+var j = i
+
+// Find next monotonicity violating sequence, if any.
+while (j < n && input(j)._1 >= input(j + 1)._1) {
+  j = j + 1
+}
 
-  // If monotonicity was not violated, move to next data point.
-  if (i == j) {
-i = i + 1
-  } else {
-// Otherwise pool the violating sequence
-// and check if pooling caused monotonicity violation in 
previously processed points.
-while (i >= 0 && input(i)._1 > input(i + 1)._1) {
+// Pool the violating sequence with the data point preceding it
--- End diff --

Is this comment correct? It looks like you pool only the violating sequence [i, j]. 
The preceding data points are checked for pooling in the next iteration of the outer loop.
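
To make the distinction concrete, here is a minimal sketch of the textbook stack-based pool-adjacent-violators variant, in which a newly pooled block is merged backwards with preceding blocks right away. It is not the code in this PR (which, as noted, rescans in an outer loop instead); the names are hypothetical and unit weights are assumed.

```scala
import scala.collection.mutable.ArrayBuffer

object PavaSketch {
  // A pooled block: the sum of its labels and how many points it covers.
  final case class Block(sum: Double, count: Int) {
    def mean: Double = sum / count
  }

  /** Returns the non-decreasing least-squares fit for unit-weight labels. */
  def fit(labels: Seq[Double]): Seq[Double] = {
    val blocks = ArrayBuffer.empty[Block]
    for (y <- labels) {
      var cur = Block(y, 1)
      // Merge backwards while the preceding block still violates monotonicity.
      while (blocks.nonEmpty && blocks.last.mean > cur.mean) {
        val prev = blocks.remove(blocks.length - 1)
        cur = Block(prev.sum + cur.sum, prev.count + cur.count)
      }
      blocks += cur
    }
    // Expand each block back to one fitted value per original point.
    blocks.flatMap(b => Seq.fill(b.count)(b.mean)).toSeq
  }
}
```

In this variant the backward merge is what handles violations introduced by pooling, so no second pass over the data is needed.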





[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14079
  
**[Test build #70122 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70122/consoleFull)**
 for PR 14079 at commit 
[`c95462f`](https://github.com/apache/spark/commit/c95462fe5c25d37b8658955304f739cc10ccf1f9).





[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/14079
  
Jenkins, retest this please





[GitHub] spark issue #16173: [SPARK-18742][CORE]readd spark.broadcast.factory conf to...

2016-12-13 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/16173
  
cc @rxin could you give some advice? Thanks!





[GitHub] spark issue #16176: [SPARK-18746][SQL] Add implicit encoder for BigDecimal, ...

2016-12-13 Thread weiqingy
Github user weiqingy commented on the issue:

https://github.com/apache/spark/pull/16176
  
Thanks for the review. @cloud-fan 





[GitHub] spark issue #16142: [SPARK-18716][CORE] Restrict the disk usage of spark eve...

2016-12-13 Thread uncleGen
Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/16142
  
@AmplabJenkins  retest please





[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16266
  
**[Test build #70121 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70121/consoleFull)**
 for PR 16266 at commit 
[`8c17f3a`](https://github.com/apache/spark/commit/8c17f3add2f558b346a40fcabc6c9e5f6e6c416e).





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15915
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15915
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70115/
Test PASSed.





[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16266
  
Build started: [TESTS] `org.apache.spark.ShuffleSuite` 
[![PR-16266](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=CCF7EE3E-8554-4764-BA30-5A0606A3=true)](https://ci.appveyor.com/project/spark-test/spark/branch/CCF7EE3E-8554-4764-BA30-5A0606A3)
Build started: [TESTS] 
`org.apache.spark.sql.execution.joins.BroadcastJoinSuite` 
[![PR-16266](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=9DCF3A3E-29AE-4186-856D-C9E059F73BE1=true)](https://ci.appveyor.com/project/spark-test/spark/branch/9DCF3A3E-29AE-4186-856D-C9E059F73BE1)
Diff: 
https://github.com/apache/spark/compare/master...spark-test:CCF7EE3E-8554-4764-BA30-5A0606A3





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15915
  
**[Test build #70115 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70115/consoleFull)**
 for PR 15915 at commit 
[`d31d74f`](https://github.com/apache/spark/commit/d31d74f1b24273cc74484fc6cb12a2588831bb09).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16255: [SPARK-18609][SQL]Fix when CTE with Join between two tab...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16255
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70114/
Test PASSed.





[GitHub] spark issue #16255: [SPARK-18609][SQL]Fix when CTE with Join between two tab...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16255
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16255: [SPARK-18609][SQL]Fix when CTE with Join between two tab...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16255
  
**[Test build #70114 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70114/consoleFull)**
 for PR 16255 at commit 
[`0413f9d`](https://github.com/apache/spark/commit/0413f9dad4ad1294e3400dc0f42f66529b1b055b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16253
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70117/
Test FAILed.





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16253
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16266
  
retest this please





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16253
  
**[Test build #70117 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70117/consoleFull)**
 for PR 16253 at commit 
[`e115cda`](https://github.com/apache/spark/commit/e115cdad29ae90c7d0b7da6d2a2e90047dc87985).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14079
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14079
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70120/
Test FAILed.





[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14079
  
**[Test build #70120 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70120/consoleFull)**
 for PR 14079 at commit 
[`c95462f`](https://github.com/apache/spark/commit/c95462fe5c25d37b8658955304f739cc10ccf1f9).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16264: [SPARK-18793] [SPARK-18794] [R] add spark.randomForest/s...

2016-12-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16264
  
Yup, it seems to fail from time to time due to network problems. FYI, there was 
some discussion about this in 
https://github.com/apache/spark/pull/15686#issuecomment-257217562 and 
https://github.com/apache/spark/pull/15697#issuecomment-257721992

I have not found a way to re-trigger this so far, and it seems (I checked 
manually and asked privately whether this is true) that some Apache projects using 
Travis CI/AppVeyor also re-trigger builds by closing and reopening.





[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14079
  
**[Test build #70120 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70120/consoleFull)**
 for PR 14079 at commit 
[`c95462f`](https://github.com/apache/spark/commit/c95462fe5c25d37b8658955304f739cc10ccf1f9).





[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-13 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16134
  
#16104 merged, can you update it? thanks!





[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-13 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/15717
  
LGTM, @gatorsmile for final sign-off





[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16263
  
**[Test build #70119 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70119/consoleFull)**
 for PR 16263 at commit 
[`fc02026`](https://github.com/apache/spark/commit/fc020267b0719ca7a350ffd766cd444eb011849f).





[GitHub] spark issue #16274: [SPARK-18853][SQL] Project (UnaryNode) is way too aggres...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16274
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16274: [SPARK-18853][SQL] Project (UnaryNode) is way too aggres...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16274
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70113/
Test FAILed.





[GitHub] spark issue #16274: [SPARK-18853][SQL] Project (UnaryNode) is way too aggres...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16274
  
**[Test build #70113 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70113/consoleFull)**
 for PR 16274 at commit 
[`4d33dd8`](https://github.com/apache/spark/commit/4d33dd8211fc7279cdb2a90a40ce237838f27e25).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #15995: [SPARK-18566][SQL] remove OverwriteOptions

2016-12-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15995





[GitHub] spark issue #15995: [SPARK-18566][SQL] remove OverwriteOptions

2016-12-13 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/15995
  
thanks for the review, merging to master!





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread uncleGen
Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/15915
  
```
Discovery starting.
Discovery completed in 37 seconds, 984 milliseconds.
Run starting. Expected test count is: 10
MemoryStoreSuite:
- reserve/release unroll memory
- safely unroll blocks
- safely unroll blocks through putIteratorAsValues
- safely unroll blocks through putIteratorAsBytes
- set unrollMemoryThreshold a huge value larger than Int.MaxValue
- PartiallySerializedBlock.valuesIterator
- PartiallySerializedBlock.finishWritingToStream
- multiple unrolls by the same thread
- lazily create a big ByteBuffer to avoid OOM if it cannot be put into MemoryStore
- put a small ByteBuffer to MemoryStore
Run completed in 39 seconds, 91 milliseconds.
Total number of tests run: 10
Suites: completed 2, aborted 0
Tests: succeeded 10, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15915
  
**[Test build #70118 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70118/consoleFull)**
 for PR 15915 at commit 
[`1567c73`](https://github.com/apache/spark/commit/1567c73e61e3761a9019a51b663f74de5f048d69).





[GitHub] spark issue #16189: [SPARK-18761][CORE] Introduce "task reaper" to oversee t...

2016-12-13 Thread JoshRosen
Github user JoshRosen commented on the issue:

https://github.com/apache/spark/pull/16189
  
Jenkins retest this please
On Tue, Dec 13, 2016 at 6:13 PM UCB AMPLab  wrote:

> Test FAILed.
>
>
> Refer to this link for build results (access rights to CI server needed):
>
> https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70109/
> Test FAILed.






[GitHub] spark issue #16247: [SPARK-18817][SparkR] set default spark-warehouse path t...

2016-12-13 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/16247
  
But yes, other Spark config properties would be set by the user in the 
sparkConfig parameter of the sparkR.session method. We would just add to that 
without adding another parameter to sparkR.session.







[GitHub] spark issue #16264: [SPARK-18793] [SPARK-18794] [R] add spark.randomForest/s...

2016-12-13 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/16264
  
That looks like a network error when accessing GitHub.







[GitHub] spark issue #16247: [SPARK-18817][SparkR] set default spark-warehouse path t...

2016-12-13 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/16247
  
I mean it as something we set on the SparkContext or SparkSession, not as a 
parameter of sparkR.session().







[GitHub] spark issue #16264: [SPARK-18793] [SPARK-18794] [R] add spark.randomForest/s...

2016-12-13 Thread mengxr
Github user mengxr commented on the issue:

https://github.com/apache/spark/pull/16264
  
@HyukjinKwon Could you take a look at the AppVeyor error?





[GitHub] spark pull request #16259: [Minor][SparkR]:fix kstest example error and add ...

2016-12-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16259





[GitHub] spark issue #16275: [SPARK-18588][Tests]Ignore KafkaSourceStressForDontFailO...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16275
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70116/
Test PASSed.





[GitHub] spark issue #16275: [SPARK-18588][Tests]Ignore KafkaSourceStressForDontFailO...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16275
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16275: [SPARK-18588][Tests]Ignore KafkaSourceStressForDontFailO...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16275
  
**[Test build #70116 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70116/consoleFull)**
 for PR 16275 at commit 
[`3172db5`](https://github.com/apache/spark/commit/3172db5d4fa7a95bf0b3facc5b0d8e639ade0348).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16259: [Minor][SparkR]:fix kstest example error and add unit te...

2016-12-13 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/16259
  
Merged into master and branch-2.1. Thanks!





[GitHub] spark issue #16274: [SPARK-18853][SQL] Project (UnaryNode) is way too aggres...

2016-12-13 Thread davies
Github user davies commented on the issue:

https://github.com/apache/spark/pull/16274
  
lgtm





[GitHub] spark issue #16273: [SPARK-18852][SS]StreamingQuery.lastProgress should be n...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16273
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70112/
Test FAILed.





[GitHub] spark issue #16273: [SPARK-18852][SS]StreamingQuery.lastProgress should be n...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16273
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16273: [SPARK-18852][SS]StreamingQuery.lastProgress should be n...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16273
  
**[Test build #70112 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70112/consoleFull)**
 for PR 16273 at commit 
[`005204e`](https://github.com/apache/spark/commit/005204e7582fea5efa4e4dc70f5bb612a7d21a05).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #16275: [SPARK-18588][Tests]Ignore KafkaSourceStressForDo...

2016-12-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16275





[GitHub] spark issue #16275: [SPARK-18588][Tests]Ignore KafkaSourceStressForDontFailO...

2016-12-13 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16275
  
Merging in master/branch-2.1.






[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16266
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70111/
Test FAILed.





[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16266
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16266: [WIP][SPARK-18842][TESTS][LAUNCHER] De-duplicate paths i...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16266
  
**[Test build #70111 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70111/consoleFull)**
 for PR 16266 at commit 
[`8c17f3a`](https://github.com/apache/spark/commit/8c17f3add2f558b346a40fcabc6c9e5f6e6c416e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16253
  
**[Test build #70117 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70117/consoleFull)**
 for PR 16253 at commit 
[`e115cda`](https://github.com/apache/spark/commit/e115cdad29ae90c7d0b7da6d2a2e90047dc87985).




