date:20160701

[GitHub] spark issue #13834: [SPARK-16339] [CORE] ScriptTransform does not print stde...

2016-07-01 Thread tejasapatil

Github user tejasapatil commented on the issue:

https://github.com/apache/spark/pull/13834
  
@srowen: In case of an exception, we `destroy()` the `proc`, which cleans up 
all the associated streams: 
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/lang/UNIXProcess.java#428
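
For illustration only, here is a minimal sketch of that cleanup path in plain Scala. It is not the ScriptTransformation code: the object and method names are hypothetical, and it assumes a Unix-like system with a `sleep` command on the PATH.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical helper, not Spark code: starts a child process, simulates a
// failure while it is running, and destroys the process in the error path.
object DestroyOnFailure {
  // Returns true if the child process is still alive after the cleanup.
  def runAndFail(): Boolean = {
    // Assumption: a Unix-like `sleep` binary is available on the PATH.
    val proc = new ProcessBuilder("sleep", "30").start()
    try {
      throw new RuntimeException("simulated failure while the script is running")
    } catch {
      case _: RuntimeException =>
        // destroy() terminates the child; on the UNIXProcess implementation
        // linked above it also closes the process's stdin, stdout and stderr.
        proc.destroy()
    }
    // Give the OS a moment to reap the child before checking its state.
    proc.waitFor(5, TimeUnit.SECONDS)
    proc.isAlive
  }
}
```

Calling `DestroyOnFailure.runAndFail()` should return `false` once the child has been terminated, so no separate stream cleanup is needed in the error path of this sketch.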


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #13976: [SPARK-16288][SQL] Implement inline table generat...

2016-07-01 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13976#discussion_r69301324
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/GeneratorFunctionSuite.scala ---
@@ -89,4 +91,30 @@ class GeneratorFunctionSuite extends QueryTest with SharedSQLContext {
       exploded.join(exploded, exploded("i") === exploded("i")).agg(count("*")),
       Row(3) :: Nil)
   }
+
+  test("inline with empty table or empty array") {
--- End diff --

The test name is misleading: we do allow an empty array. The problem is 
that `array()` returns an array of null, which fails the type check.



[GitHub] spark pull request #13976: [SPARK-16288][SQL] Implement inline table generat...

2016-07-01 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13976#discussion_r69301056
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
@@ -68,4 +69,23 @@ class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
       PosExplode(CreateArray(str_array.map(Literal(_,
       str_correct_answer.map(InternalRow.fromSeq(_)))
   }
+
+  test("inline") {
+    val correct_answer = Seq(
+      Seq(0, UTF8String.fromString("a")),
--- End diff --

We can create a row directly in the test by calling `create_row(...)`.



[GitHub] spark pull request #13976: [SPARK-16288][SQL] Implement inline table generat...

2016-07-01 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13976#discussion_r69301097
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
@@ -68,4 +69,23 @@ class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
       PosExplode(CreateArray(str_array.map(Literal(_,
       str_correct_answer.map(InternalRow.fromSeq(_)))
   }
+
+  test("inline") {
+    val correct_answer = Seq(
+      Seq(0, UTF8String.fromString("a")),
--- End diff --

`create_row(...)` can also convert strings to `UTF8String` for us.



[GitHub] spark pull request #13976: [SPARK-16288][SQL] Implement inline table generat...

2016-07-01 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13976#discussion_r69300935
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
@@ -68,4 +69,23 @@ class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
       PosExplode(CreateArray(str_array.map(Literal(_,
       str_correct_answer.map(InternalRow.fromSeq(_)))
   }
+
+  test("inline") {
+    val correct_answer = Seq(
+      Seq(0, UTF8String.fromString("a")),
+      Seq(1, UTF8String.fromString("b")),
+      Seq(2, UTF8String.fromString("c")))
+
+    checkTuple(
+      Inline(Literal.create(Array(), ArrayType(StructType(Seq(StructField("id1", LongType)),
--- End diff --

We usually use `new StructType().add("id", LongType)` to create a struct type.



[GitHub] spark pull request #13976: [SPARK-16288][SQL] Implement inline table generat...

2016-07-01 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13976#discussion_r69300771
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
@@ -195,3 +195,42 @@ case class Explode(child: Expression) extends ExplodeBase(child, position = fals
   extended = "> SELECT _FUNC_(array(10,20));\n  0\t10\n  1\t20")
 // scalastyle:on line.size.limit
 case class PosExplode(child: Expression) extends ExplodeBase(child, position = true)
+
+/**
+ * Explodes an array of structs into a table.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(a) - Explodes an array of structs into a table.",
+  extended = "> SELECT _FUNC_(array(struct(1, 'a'), struct(2, 'b')));\n [1,a]\n[2,b]")
+case class Inline(child: Expression) extends UnaryExpression with Generator with CodegenFallback {
+
+  override def children: Seq[Expression] = child :: Nil
+
+  override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
+    case ArrayType(et, _) if et.isInstanceOf[StructType] =>
+      TypeCheckResult.TypeCheckSuccess
+    case _ =>
+      TypeCheckResult.TypeCheckFailure(
+        s"input to function inline should be array of struct type, not ${child.dataType}")
+  }
+
+  override def elementSchema: StructType = child.dataType match {
+    case ArrayType(et : StructType, _) =>
+      StructType(et.fields.zipWithIndex.map {
+        case (field, index) => StructField(field.name, field.dataType, nullable = field.nullable)
+      })
+  }
+
+  private lazy val ncol = elementSchema.fields.length
--- End diff --

I'd like to name it `numFields`



[GitHub] spark pull request #13976: [SPARK-16288][SQL] Implement inline table generat...

2016-07-01 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13976#discussion_r69300727
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
@@ -195,3 +195,42 @@ case class Explode(child: Expression) extends ExplodeBase(child, position = fals
   extended = "> SELECT _FUNC_(array(10,20));\n  0\t10\n  1\t20")
 // scalastyle:on line.size.limit
 case class PosExplode(child: Expression) extends ExplodeBase(child, position = true)
+
+/**
+ * Explodes an array of structs into a table.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(a) - Explodes an array of structs into a table.",
+  extended = "> SELECT _FUNC_(array(struct(1, 'a'), struct(2, 'b')));\n [1,a]\n[2,b]")
+case class Inline(child: Expression) extends UnaryExpression with Generator with CodegenFallback {
+
+  override def children: Seq[Expression] = child :: Nil
+
+  override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
+    case ArrayType(et, _) if et.isInstanceOf[StructType] =>
+      TypeCheckResult.TypeCheckSuccess
+    case _ =>
+      TypeCheckResult.TypeCheckFailure(
+        s"input to function inline should be array of struct type, not ${child.dataType}")
+  }
+
+  override def elementSchema: StructType = child.dataType match {
+    case ArrayType(et : StructType, _) =>
+      StructType(et.fields.zipWithIndex.map {
+        case (field, index) => StructField(field.name, field.dataType, nullable = field.nullable)
+      })
+  }
+
+  private lazy val ncol = elementSchema.fields.length
+
+  override def eval(input: InternalRow): TraversableOnce[InternalRow] = child.dataType match {
--- End diff --

Why do we pattern match here?



[GitHub] spark pull request #13976: [SPARK-16288][SQL] Implement inline table generat...

2016-07-01 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13976#discussion_r69300459
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
@@ -195,3 +195,42 @@ case class Explode(child: Expression) extends ExplodeBase(child, position = fals
   extended = "> SELECT _FUNC_(array(10,20));\n  0\t10\n  1\t20")
 // scalastyle:on line.size.limit
 case class PosExplode(child: Expression) extends ExplodeBase(child, position = true)
+
+/**
+ * Explodes an array of structs into a table.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(a) - Explodes an array of structs into a table.",
+  extended = "> SELECT _FUNC_(array(struct(1, 'a'), struct(2, 'b')));\n [1,a]\n[2,b]")
+case class Inline(child: Expression) extends UnaryExpression with Generator with CodegenFallback {
+
+  override def children: Seq[Expression] = child :: Nil
+
+  override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
+    case ArrayType(et, _) if et.isInstanceOf[StructType] =>
+      TypeCheckResult.TypeCheckSuccess
+    case _ =>
+      TypeCheckResult.TypeCheckFailure(
+        s"input to function inline should be array of struct type, not ${child.dataType}")
+  }
+
+  override def elementSchema: StructType = child.dataType match {
+    case ArrayType(et : StructType, _) =>
+      StructType(et.fields.zipWithIndex.map {
--- End diff --

hmm, so it's just `et` now?
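
The point of the question can be seen in a small stand-alone sketch. The `Field` and `Struct` case classes below are simplified, hypothetical stand-ins for Spark's `StructField` and `StructType` (they omit field metadata, for example); the sketch only shows that rebuilding each field from its own name, type and nullability yields a schema equal to the original.

```scala
// Simplified stand-ins for Spark's StructField/StructType (hypothetical;
// the real classes carry more information, such as field metadata).
final case class Field(name: String, dataType: String, nullable: Boolean = true)
final case class Struct(fields: Seq[Field])

object ElementSchemaSketch {
  // Mirrors the shape of the mapping in the diff: every field is rebuilt
  // from its own name, data type and nullability; the index is unused.
  def rebuilt(et: Struct): Struct =
    Struct(et.fields.zipWithIndex.map {
      case (field, _) => Field(field.name, field.dataType, nullable = field.nullable)
    })

  // Example schema with one non-nullable and one nullable field.
  val et: Struct = Struct(Seq(Field("id", "long", nullable = false), Field("name", "string")))
}
```

In this model `ElementSchemaSketch.rebuilt(et)` compares equal to `et`, so returning `et` directly would be the simpler choice. With the real `StructField`, the copy in the diff would additionally reset any field metadata to the default.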



[GitHub] spark issue #13981: [SPARK-16307] [ML] Add test to verify the predicted vari...

2016-07-01 Thread sethah

Github user sethah commented on the issue:

https://github.com/apache/spark/pull/13981
  
@MechCoder Thanks for adding this! I think it's a good test to protect 
against silent failures in the future. I just left a few small comments.



[GitHub] spark pull request #13981: [SPARK-16307] [ML] Add test to verify the predict...

2016-07-01 Thread sethah

Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/13981#discussion_r69298953
  
--- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/DecisionTreeRegressorSuite.scala ---
@@ -96,6 +108,15 @@ class DecisionTreeRegressorSuite
       assert(variance === expectedVariance,
         s"Expected variance $expectedVariance but got $variance.")
     }
+
+    val toyDF = TreeTests.setMetadata(toyData, Map.empty[Int, Int], 0)
+    dt.setMaxDepth(1)
+      .setMaxBins(6)
+      .setSeed(0)
+    val expectVariances = dt.fit(toyDF).transform(toyDF).select("variance").collect().map {
+      case Row(variance: Double) => variance }
+    val trueVariances = Array(0.667, 0.667, 0.667, 2.667, 2.667, 2.667)
+    trueVariances.zip(expectVariances).foreach(x => x._1 ~== x._2 absTol 1e-3)
--- End diff --

Although this technically works, it is less confusing if we use `assert` and 
unpack the tuple, like this:

```scala
...foreach { case (actual, expected) =>
  assert(actual ~== expected absTol 1e-3)
}
```
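
A self-contained version of this pattern, using only the Scala standard library, might look like the sketch below. The `withinAbsTol` helper is a hypothetical stand-in for the `~==` / `absTol` comparison used in Spark's tests, and the `actualVariances` values are invented for illustration; only the expected values are taken from the diff.

```scala
object VarianceCheckSketch {
  // Hypothetical stand-in for the `~== ... absTol` comparison used in Spark's tests.
  def withinAbsTol(actual: Double, expected: Double, tol: Double): Boolean =
    math.abs(actual - expected) <= tol

  // Expected values copied from the diff; the "actual" values are invented
  // here and would come from the fitted model in the real test.
  val expectedVariances: Array[Double] = Array(0.667, 0.667, 0.667, 2.667, 2.667, 2.667)
  val actualVariances: Array[Double] = Array(0.6667, 0.6667, 0.6667, 2.6667, 2.6667, 2.6667)

  // Name both halves of each pair and assert on every comparison, so a
  // mismatch fails with the offending values in the message.
  def check(): Unit =
    actualVariances.zip(expectedVariances).foreach { case (actual, expected) =>
      assert(withinAbsTol(actual, expected, 1e-3),
        s"Expected variance $expected but got $actual.")
    }
}
```

Naming the pair elements `actual` and `expected` also makes it harder to swap the two roles by accident.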



[GitHub] spark pull request #13981: [SPARK-16307] [ML] Add test to verify the predict...

2016-07-01 Thread sethah

Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/13981#discussion_r69298648
  
--- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/DecisionTreeRegressorSuite.scala ---
@@ -96,6 +108,15 @@ class DecisionTreeRegressorSuite
       assert(variance === expectedVariance,
         s"Expected variance $expectedVariance but got $variance.")
     }
+
+    val toyDF = TreeTests.setMetadata(toyData, Map.empty[Int, Int], 0)
+    dt.setMaxDepth(1)
+      .setMaxBins(6)
+      .setSeed(0)
+    val expectVariances = dt.fit(toyDF).transform(toyDF).select("variance").collect().map {
--- End diff --

`expectVariances` and `trueVariances` are mixed up here. The expected values 
should be the theoretical ones computed below, and the values produced by the 
model are the actual ones. Also, it would be good to leave a comment explaining 
where those expected values came from.



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread WeichenXu123

Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/14010
  
@srowen Done.



[GitHub] spark issue #13976: [SPARK-16288][SQL] Implement inline table generating fun...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13976
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13976: [SPARK-16288][SQL] Implement inline table generating fun...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13976
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61615/
Test PASSed.



[GitHub] spark pull request #13981: [SPARK-16307] [ML] Add test to verify the predict...

2016-07-01 Thread sethah

Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/13981#discussion_r69298384
  
--- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/DecisionTreeRegressorSuite.scala ---
@@ -96,6 +108,15 @@ class DecisionTreeRegressorSuite
       assert(variance === expectedVariance,
         s"Expected variance $expectedVariance but got $variance.")
     }
+
+    val toyDF = TreeTests.setMetadata(toyData, Map.empty[Int, Int], 0)
+    dt.setMaxDepth(1)
+      .setMaxBins(6)
--- End diff --

Not sure why we need to set maxBins here?



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14010
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61614/
Test PASSed.



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14010
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13976: [SPARK-16288][SQL] Implement inline table generating fun...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13976
  
**[Test build #61615 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61615/consoleFull)**
 for PR 13976 at commit 
[`9382f64`](https://github.com/apache/spark/commit/9382f64a19c9671a679a75ce22b801aa32576da5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Inline(child: Expression) extends UnaryExpression with 
Generator with CodegenFallback `



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14010
  
**[Test build #61614 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61614/consoleFull)**
 for PR 14010 at commit 
[`9fc83f6`](https://github.com/apache/spark/commit/9fc83f6c086eafcd58523234c4a95eb25158632b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #14014: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14014
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61616/
Test PASSed.



[GitHub] spark issue #14014: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14014
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14013: [SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of...

2016-07-01 Thread liancheng

Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/14013
  
@rdblue Verified that parquet-avro also suffers from this issue. Filed 
[PARQUET-651][1] to track it.

[1]: https://issues.apache.org/jira/browse/PARQUET-651



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread WeichenXu123

Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/14015
  
@srowen Yes, the example code is exactly the same as the code in the GraphX 
docs. I tested all of it, and it runs normally.



[GitHub] spark issue #14014: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14014
  
**[Test build #61616 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61616/consoleFull)**
 for PR 14014 at commit 
[`3bfe45f`](https://github.com/apache/spark/commit/3bfe45fe8b81f44141b737df6b292f12cd37d06a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-01 Thread janplus

Github user janplus commented on a diff in the pull request:

https://github.com/apache/spark/pull/14008#discussion_r69297640
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala ---
@@ -725,4 +725,43 @@ class StringExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     checkEvaluation(FindInSet(Literal("abf"), Literal("abc,b,ab,c,def")), 0)
     checkEvaluation(FindInSet(Literal("ab,"), Literal("abc,b,ab,c,def")), 0)
   }
+
+  test("ParseUrl") {
+    def checkParseUrl(expected: String, urlStr: String, partToExtract: String): Unit = {
+      checkEvaluation(
+        ParseUrl(Literal.create(urlStr, StringType), Literal.create(partToExtract, StringType)),
+        expected)
+    }
+    def checkParseUrlWithKey(expected: String, urlStr: String,
+      partToExtract: String, key: String): Unit = {
+      checkEvaluation(
+        ParseUrl(Literal.create(urlStr, StringType), Literal.create(partToExtract, StringType),
+         Literal.create(key, StringType)), expected)
+    }
+
+    checkParseUrl("spark.apache.org", "http://spark.apache.org/path?query=1", "HOST")
+    checkParseUrl("/path", "http://spark.apache.org/path?query=1", "PATH")
+    checkParseUrl("query=1", "http://spark.apache.org/path?query=1", "QUERY")
+    checkParseUrl("Ref", "http://spark.apache.org/path?query=1#Ref", "REF")
+    checkParseUrl("http", "http://spark.apache.org/path?query=1", "PROTOCOL")
+    checkParseUrl("/path?query=1", "http://spark.apache.org/path?query=1", "FILE")
+    checkParseUrl("spark.apache.org:8080", "http://spark.apache.org:8080/path?query=1", "AUTHORITY")
+    checkParseUrl("jian", "http://j...@spark.apache.org/path?query=1", "USERINFO")
+    checkParseUrlWithKey("1", "http://spark.apache.org/path?query=1", "QUERY", "query")
+
+    // Null checking
+    checkParseUrl(null, null, "HOST")
+    checkParseUrl(null, "http://spark.apache.org/path?query=1", null)
+    checkParseUrl(null, null, null)
+    checkParseUrl(null, "test", "HOST")
+    checkParseUrl(null, "http://spark.apache.org/path?query=1", "NO")
+    checkParseUrlWithKey(null, "http://spark.apache.org/path?query=1", "HOST", "query")
+    checkParseUrlWithKey(null, "http://spark.apache.org/path?query=1", "QUERY", "quer")
+    checkParseUrlWithKey(null, "http://spark.apache.org/path?query=1", "QUERY", null)
--- End diff --

I am not sure. Is there any exceptional case?
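
To make the behaviour exercised by these test cases concrete, here is a rough stand-alone sketch. It is not the `ParseUrl` expression from the pull request: it assumes `java.net.URL` as the underlying parser, uses `None` in place of SQL NULL, and the object name and the query-key regex are illustrative choices.

```scala
import java.net.URL
import java.util.regex.Pattern
import scala.util.control.NonFatal

// Hypothetical sketch, not the ParseUrl expression from the pull request.
// None plays the role of SQL NULL.
object ParseUrlSketch {
  def parseUrl(url: String, part: String): Option[String] =
    if (url == null || part == null) None
    else {
      try {
        val u = new URL(url) // throws for strings such as "test" (no protocol)
        part match {
          case "HOST" => Option(u.getHost)
          case "PATH" => Option(u.getPath)
          case "QUERY" => Option(u.getQuery)
          case "REF" => Option(u.getRef)
          case "PROTOCOL" => Option(u.getProtocol)
          case "FILE" => Option(u.getFile)
          case "AUTHORITY" => Option(u.getAuthority)
          case "USERINFO" => Option(u.getUserInfo)
          case _ => None // unknown part names yield NULL
        }
      } catch {
        case NonFatal(_) => None // malformed URLs yield NULL
      }
    }

  // A key is only meaningful for the QUERY part.
  def parseUrl(url: String, part: String, key: String): Option[String] =
    if (key == null || part != "QUERY") None
    else parseUrl(url, part).flatMap { query =>
      val m = Pattern.compile("(&|^)" + Pattern.quote(key) + "=([^&]*)").matcher(query)
      if (m.find()) Some(m.group(2)) else None
    }
}
```

In this sketch every failure mode (null input, malformed URL, unknown part, missing key) collapses to `None`, which mirrors the null expectations in the test cases above.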



[GitHub] spark issue #14008: [SPARK-16281][SQL] Implement parse_url SQL function

2016-07-01 Thread janplus

Github user janplus commented on the issue:

https://github.com/apache/spark/pull/14008
  
@rxin and @dongjoon-hyun Thanks for your review.
I have added a new commit that does the following:

1. Put the `parse_url` function in the right order.
2. Use `""" """` instead of `+` in the `extended` part so that it works with Scala 2.10.
3. Remove unnecessary `lazy`s.
4. Correct `REGEXPREFIX` and add a new null test case.
5. Use `NonFatal(_)` instead of the specific exception.
6. Fix the indentation problems.

I tried to avoid varargs, but a separate constructor that accepts two 
args does not help, because there is no magic key that would make 
`parse_url(url, partToExtract, magic key)` be treated as `parse_url(url, partToExtract)`.



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14010
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14010
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61613/
Test PASSed.



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14010
  
**[Test build #61613 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61613/consoleFull)**
 for PR 14010 at commit 
[`9fc83f6`](https://github.com/apache/spark/commit/9fc83f6c086eafcd58523234c4a95eb25158632b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #13136: [SPARK-15350][mllib]add unit test function for Lo...

2016-07-01 Thread WeichenXu123

Github user WeichenXu123 closed the pull request at:

https://github.com/apache/spark/pull/13136



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14015
  
No changes to the code itself (except perhaps style fixes)? OK.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13494
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61610/
Test PASSed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13494
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14013: [SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14013
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61612/
Test PASSed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13494
  
**[Test build #61610 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61610/consoleFull)**
 for PR 13494 at commit 
[`2568193`](https://github.com/apache/spark/commit/2568193f91b9ae129c19a67bfd514065215840ac).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MetadataOnlyOptimizerSuite extends QueryTest with SharedSQLContext`



[GitHub] spark issue #14013: [SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14013
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14013: [SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14013
  
**[Test build #61612 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61612/consoleFull)**
 for PR 14013 at commit 
[`c40bccb`](https://github.com/apache/spark/commit/c40bccb631c2175d375e7c2e6ba83d1b831768af).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #9207: [SPARK-11171][SPARK-11237][SPARK-11241][ML] Try adding PM...

2016-07-01 Thread MLnick

Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/9207
  
@holden that's true for the fully generic approach. But `DataFrameWriter` 
for example exposes `json` as a shortcut (type-safe in a way) for 
`format("json")`.

I think we can achieve something similar here, but since each impl of 
`MLWriter` is different for each model, I think we can have a `PMML` trait 
attached to those that support it, enabling the "type-safe" approach: 
`model.write.pmml.save("/path")`.

`model.write.save("/path")` does the default built-in format, 
`model.write.pmml.save("/path")` does pmml for those models that actually 
support it using the trait. For generics, it's possible to do 
`model.write.format("pmml").save(...)` and it would then fail at runtime if not 
supported, while `model.write.format("my.custom.format").save(...)` could allow 
plugging in writers similar to the datasource API...

Just thoughts; obviously more work will be required to see if it is 
feasible in practice.
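
A minimal sketch of the idea, under stated assumptions: `Writer`, `PMMLSupport`, `KMeansWriter` and `ALSWriter` below are hypothetical stand-ins rather than Spark's actual `MLWriter` classes, and `save` returns a description string instead of writing files so the example stays runnable on its own.

```scala
// All names here are hypothetical stand-ins, not Spark's real API.
trait Writer {
  protected var fmt: String = "default"

  // Generic, stringly-typed selection; unsupported formats fail at runtime.
  def format(name: String): this.type = { fmt = name; this }

  def supportedFormats: Set[String] = Set("default")

  // Returns a description instead of touching the file system.
  def save(path: String): String = {
    if (!supportedFormats.contains(fmt)) {
      throw new UnsupportedOperationException(s"format '$fmt' is not supported")
    }
    s"saved as $fmt to $path"
  }
}

// Mixed in only by writers whose models can be exported as PMML,
// so `.pmml` exists at compile time only where it is supported.
trait PMMLSupport extends Writer {
  override def supportedFormats: Set[String] = super.supportedFormats + "pmml"
  def pmml: this.type = format("pmml")
}

class KMeansWriter extends Writer with PMMLSupport // supports PMML
class ALSWriter extends Writer                     // does not
```

Here `new KMeansWriter().pmml.save("/path")` compiles and selects PMML, `new ALSWriter().pmml` would be rejected by the compiler, and `new ALSWriter().format("pmml").save("/path")` compiles but throws at runtime, which mirrors the trade-off described above.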



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14015
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14015
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61618/
Test PASSed.



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14015
  
**[Test build #61618 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61618/consoleFull)**
 for PR 14015 at commit 
[`e9a096c`](https://github.com/apache/spark/commit/e9a096c8c7d1600ff4000560bd195db9a77a1046).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14015
  
**[Test build #61618 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61618/consoleFull)**
 for PR 14015 at commit 
[`e9a096c`](https://github.com/apache/spark/commit/e9a096c8c7d1600ff4000560bd195db9a77a1046).



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14015
  
**[Test build #61617 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61617/consoleFull)**
 for PR 14015 at commit 
[`689d2f6`](https://github.com/apache/spark/commit/689d2f67e15c4e7d6b5b184712172bc46bce2128).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14015
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61617/
Test FAILed.



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14015
  
Merged build finished. Test FAILed.



[GitHub] spark issue #14015: [SPARK-16345][Documentation][Examples][GraphX] Extract g...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14015
  
**[Test build #61617 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61617/consoleFull)**
 for PR 14015 at commit 
[`689d2f6`](https://github.com/apache/spark/commit/689d2f67e15c4e7d6b5b184712172bc46bce2128).



[GitHub] spark pull request #14015: [SPARK-16345][Documentation][Examples][GraphX] Ex...

2016-07-01 Thread WeichenXu123

GitHub user WeichenXu123 opened a pull request:

https://github.com/apache/spark/pull/14015

[SPARK-16345][Documentation][Examples][GraphX] Extract graphx programming 
guide example snippets from source files instead of hard code them

## What changes were proposed in this pull request?

I extracted 6 example programs from the GraphX programming guide and replaced 
them with `include_example` labels.

The 6 example programs are:
- AggregateMessagesExample.scala
- SSSPExample.scala
- TriangleCountingExample.scala
- ConnectedComponentsExample.scala
- ComprehensiveExample.scala
- PageRankExample.scala

All the example code can be run using
`bin/run-example graphx.EXAMPLE_NAME`

## How was this patch tested?

Manual.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WeichenXu123/spark graphx_example_plugin

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14015.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14015


commit 689d2f67e15c4e7d6b5b184712172bc46bce2128
Author: WeichenXu 
Date:   2016-07-01T13:37:52Z

add graphx example.4





[GitHub] spark issue #14013: [SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of...

2016-07-01 Thread liancheng

Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/14013
  
@rdblue Would you mind helping review this one? My initial investigation 
suggested that parquet-avro probably suffers the same issue. Will file a 
parquet-mr JIRA ticket soon.



[GitHub] spark issue #14014: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14014
  
**[Test build #61616 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61616/consoleFull)**
 for PR 14014 at commit 
[`3bfe45f`](https://github.com/apache/spark/commit/3bfe45fe8b81f44141b737df6b292f12cd37d06a).



[GitHub] spark issue #14014: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread liancheng

Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/14014
  
cc @yhuai 



[GitHub] spark pull request #14014: [SPARK-16344][SQL] Decoding Parquet array of stru...

2016-07-01 Thread liancheng

GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/14014

[SPARK-16344][SQL] Decoding Parquet array of struct with a single field 
named "element"

## What changes were proposed in this pull request?

This PR ports #14013 to master and branch-2.0.

## How was this patch tested?

See #14013.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark spark-16344-for-master-and-2.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14014.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14014


commit 3bfe45fe8b81f44141b737df6b292f12cd37d06a
Author: Cheng Lian 
Date:   2016-07-01T11:32:52Z

Fixes SPARK-16344





[GitHub] spark issue #14004: [SPARK-16285][SQL] Implement sentences SQL functions

2016-07-01 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14004
  
cc @rxin and @cloud-fan 



[GitHub] spark issue #13976: [SPARK-16288][SQL] Implement inline table generating fun...

2016-07-01 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13976
  
Thank you, @cloud-fan . :)   I've learned a lot in this PR again.
The following changes have been applied:
- Use `inputArray.getStruct(i, ncol)`
- Keep the original field name
- Fix elementSchema generation style
- Add a column-based test, an `Array()` expression-level test, and an empty-row 
test.



[GitHub] spark issue #13976: [SPARK-16288][SQL] Implement inline table generating fun...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13976
  
**[Test build #61615 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61615/consoleFull)**
 for PR 13976 at commit 
[`9382f64`](https://github.com/apache/spark/commit/9382f64a19c9671a679a75ce22b801aa32576da5).



[GitHub] spark issue #13925: [SPARK-16226][SQL]change the way of JDBC commit

2016-07-01 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/13925
  
I think the important change regards the transaction isolation level that's 
in effect here, but yes that change is also a good one.



[GitHub] spark issue #14013: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14013
  
Merged build finished. Test FAILed.



[GitHub] spark issue #14013: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14013
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61611/
Test FAILed.



[GitHub] spark issue #14013: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14013
  
**[Test build #61611 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61611/consoleFull)**
 for PR 14013 at commit 
[`9620b48`](https://github.com/apache/spark/commit/9620b48d463ed2f2a8ede7397420050dc1e7d832).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14010
  
**[Test build #61614 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61614/consoleFull)**
 for PR 14010 at commit 
[`9fc83f6`](https://github.com/apache/spark/commit/9fc83f6c086eafcd58523234c4a95eb25158632b).



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14010
  
**[Test build #61613 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61613/consoleFull)**
 for PR 14010 at commit 
[`9fc83f6`](https://github.com/apache/spark/commit/9fc83f6c086eafcd58523234c4a95eb25158632b).



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread WeichenXu123

Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/14010
  
Jenkins, test this please.



[GitHub] spark issue #14013: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14013
  
**[Test build #61612 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61612/consoleFull)**
 for PR 14013 at commit 
[`c40bccb`](https://github.com/apache/spark/commit/c40bccb631c2175d375e7c2e6ba83d1b831768af).



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread WeichenXu123

Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/14010
  
Jenkins retest this please.



[GitHub] spark issue #14013: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14013
  
**[Test build #61611 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61611/consoleFull)**
 for PR 14013 at commit 
[`9620b48`](https://github.com/apache/spark/commit/9620b48d463ed2f2a8ede7397420050dc1e7d832).



[GitHub] spark issue #14013: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread liancheng

Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/14013
  
cc @yhuai 



[GitHub] spark pull request #14013: [SPARK-16344][SQL] Decoding Parquet array of stru...

2016-07-01 Thread liancheng

Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14013#discussion_r69283877
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala
 ---
@@ -481,13 +481,106 @@ private[parquet] class CatalystRowConverter(
  */
 // scalastyle:on
 private def isElementType(
-parquetRepeatedType: Type, catalystElementType: DataType, 
parentName: String): Boolean = {
+parquetRepeatedType: Type, catalystElementType: DataType, parent: 
GroupType): Boolean = {
+
+  def isStandardListLayout(t: GroupType): Boolean =
+Option(parent.getOriginalType) == Some(LIST) &&
+  t.getFieldCount == 1 &&
+  t.getName == "list" &&
+  t.getFieldName(0) == "element"
+
   (parquetRepeatedType, catalystElementType) match {
-case (t: PrimitiveType, _) => true
-case (t: GroupType, _) if t.getFieldCount > 1 => true
-case (t: GroupType, _) if t.getFieldCount == 1 && t.getName == 
"array" => true
-case (t: GroupType, _) if t.getFieldCount == 1 && t.getName == 
parentName + "_tuple" => true
-case (t: GroupType, StructType(Array(f))) if f.name == 
t.getFieldName(0) => true
+case (t: PrimitiveType, _) =>
+  // For legacy 2-level list types with primitive element type, 
e.g.:
+  //
+  //// List (nullable list, non-null elements)
+  //optional group my_list (LIST) {
+  //  repeated int32 element;
+  //}
+  true
+
+case (t: GroupType, _) if t.getFieldCount > 1 =>
+  // For legacy 2-level list types whose element type is a group 
type with 2 or more fields,
+  // e.g.:
+  //
+  //// List<Tuple<String, Integer>> (nullable list, non-null 
elements)
+  //optional group my_list (LIST) {
+  //  repeated group element {
+  //required binary str (UTF8);
+  //required int32 num;
+  //  };
+  //}
+  true
+
+case (t: GroupType, _) if t.getFieldCount == 1 && t.getName == 
"array" =>
+  // For Parquet data generated by parquet-thrift, e.g.:
+  //
+  //// List (nullable list, non-null 
elements)
+  //optional group my_list (LIST) {
+  //  repeated group my_list_tuple {
+  //required binary str (UTF8);
+  //  };
+  //}
+  true
+
+case (t: GroupType, _) if t.getFieldCount == 1 && t.getName == 
parent + "_tuple" =>
+  // For Parquet data generated by parquet-thrift, e.g.:
+  //
+  //// List (nullable list, non-null 
elements)
+  //optional group my_list (LIST) {
+  //  repeated group my_list_tuple {
+  //required binary str (UTF8);
+  //  };
+  //}
+  true
+
+case (t: GroupType, _) if isStandardListLayout(t) =>
+  // For standard 3-level list types, e.g.:
+  //
+  //// List (list nullable, elements non-null)
+  //optional group my_list (LIST) {
+  //  repeated group list {
+  //required binary element (UTF8);
+  //  }
+  //}
+  //
+  // This case branch must appear before the next one. See 
comments of the next case branch
+  // for details.
+  false
--- End diff --

This case branch is essential for the bug fix. Basically, it matches the 
standard 3-level layout first before trying to match the legacy 2-level layout, 
so that the "element" syntactic group in Parquet LIST won't be mistaken for the 
"element" field in the nested struct.

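The ordering argument above can be illustrated with a small self-contained model. This is only a sketch in Python, not the Scala code from the patch: `Group`, `is_standard_list_layout`, and `is_element_type` are hypothetical stand-ins for the Parquet `GroupType` API and for `isElementType`, and only the two competing branches are modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Group:
    """Hypothetical stand-in for a Parquet group type."""
    name: str
    field_names: List[str]
    original_type: Optional[str] = None  # e.g. "LIST" on the outer group


def is_standard_list_layout(parent: Group, repeated: Group) -> bool:
    # Standard 3-level layout: <parent> (LIST) { repeated group list { element } }
    return (
        parent.original_type == "LIST"
        and len(repeated.field_names) == 1
        and repeated.name == "list"
        and repeated.field_names[0] == "element"
    )


def is_element_type(parent: Group, repeated: Group, struct_fields: List[str],
                    check_standard_first: bool = True) -> bool:
    """Return True if `repeated` is itself the element (legacy 2-level layout)."""
    if check_standard_first and is_standard_list_layout(parent, repeated):
        # The repeated group is only the syntactic wrapper, not the element.
        return False
    # Legacy rule: a single-field group whose field name matches the only
    # field of the expected struct is treated as the element itself.
    if len(struct_fields) == 1 and len(repeated.field_names) == 1:
        return repeated.field_names[0] == struct_fields[0]
    return False


# An array of structs whose single field happens to be named "element".
parent = Group("my_list", ["list"], original_type="LIST")
repeated = Group("list", ["element"])
catalyst_struct = ["element"]

with_fix = is_element_type(parent, repeated, catalyst_struct)
without_fix = is_element_type(parent, repeated, catalyst_struct,
                              check_standard_first=False)
```

In this model, skipping the standard-layout check lets the legacy rule match on the name `element` alone, which is the confusion described in the comment.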


[GitHub] spark pull request #13374: [SPARK-13638][SQL] Add escapeAll option to CSV Da...

2016-07-01 Thread jurriaan

Github user jurriaan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13374#discussion_r69283761
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
 ---
@@ -366,6 +366,32 @@ class CSVSuite extends QueryTest with SharedSQLContext 
with SQLTestUtils {
 }
   }
 
+  test("save csv with quoteAll enabled") {
--- End diff --

Fixed :)



[GitHub] spark pull request #14013: [SPARK-16344][SQL] Decoding Parquet array of stru...

2016-07-01 Thread liancheng

GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/14013

[SPARK-16344][SQL] Decoding Parquet array of struct with a single field 
named "element"

## What changes were proposed in this pull request?

Please refer to [SPARK-16344][1] for details about this issue.

## How was this patch tested?

New test case added in `ParquetQuerySuite`.

[1]: https://issues.apache.org/jira/browse/SPARK-16344

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark 
spark-16344-parquet-schema-corner-case

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14013.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14013


commit 9620b48d463ed2f2a8ede7397420050dc1e7d832
Author: Cheng Lian 
Date:   2016-07-01T10:52:29Z

Fixes SPARK-16344





[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14010
  
Merged build finished. Test FAILed.



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14010
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61603/
Test FAILed.



[GitHub] spark issue #14010: [GRAPHX][EXAMPLES] move graphx test data directory and u...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14010
  
**[Test build #61603 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61603/consoleFull)**
 for PR 14010 at commit 
[`9fc83f6`](https://github.com/apache/spark/commit/9fc83f6c086eafcd58523234c4a95eb25158632b).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13494
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61607/
Test PASSed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13494
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13494
  
**[Test build #61607 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61607/consoleFull)**
 for PR 13494 at commit 
[`a22e962`](https://github.com/apache/spark/commit/a22e9626e6294671e0915822def6eb283a72a643).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13494
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13494
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61608/
Test PASSed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13494
  
**[Test build #61608 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61608/consoleFull)**
 for PR 13494 at commit 
[`41fef2c`](https://github.com/apache/spark/commit/41fef2c40f4929fd26476ecdfa3ee8160394a7d3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13494
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61609/
Test FAILed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13494
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13494
  
**[Test build #61609 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61609/consoleFull)**
 for PR 13494 at commit 
[`88f7308`](https://github.com/apache/spark/commit/88f7308173829ca2473690a0c409c438d3cd5cf4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13925: [SPARK-16226][SQL]change the way of JDBC commit

2016-07-01 Thread maver1ck

Github user maver1ck commented on the issue:

https://github.com/apache/spark/pull/13925
  
@srowen 
Maybe we should change this condition to 
`conn.getMetaData().supportsTransactions()`?
I can prepare a PR.




[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13494
  
**[Test build #61610 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61610/consoleFull)**
 for PR 13494 at commit 
[`2568193`](https://github.com/apache/spark/commit/2568193f91b9ae129c19a67bfd514065215840ac).



[GitHub] spark pull request #13894: [SPARK-15254][DOC] Improve ML pipeline Cross Vali...

2016-07-01 Thread MLnick

Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/13894#discussion_r69281183
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala ---
@@ -56,7 +56,10 @@ private[ml] trait CrossValidatorParams extends 
ValidatorParams {
 
 /**
  * :: Experimental ::
- * K-fold cross validation.
+ * CrossValidator begins by splitting the dataset into a set of 
non-overlapping randomly
+ * partitioned folds which are used as separate training and test datasets 
e.g., with k=3 folds,
+ * CrossValidator will generate 3 (training, test) dataset pairs, each of 
which uses 2/3 of
+ * the data for training and 1/3 for testing. Each fold is used in the 
testing set exactly once.
--- End diff --

"used in the testing set" -> "used as the test set"

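For readers of the doc comment under review, the k=3 example can be sketched in a few lines. This is a minimal illustration in plain Python, not Spark's implementation; `k_fold_pairs` and its `seed` parameter are hypothetical names chosen for the sketch.

```python
import random
from typing import List, Sequence, Tuple


def k_fold_pairs(data: Sequence[int], k: int,
                 seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Build k (training, test) pairs from k non-overlapping random folds."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled items into k folds, round-robin.
    folds = [shuffled[i::k] for i in range(k)]
    pairs = []
    for i in range(k):
        test = folds[i]  # fold i is the test set exactly once
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        pairs.append((train, test))
    return pairs


pairs = k_fold_pairs(range(9), k=3)
```

With 9 items and k=3, each pair trains on 6 items (2/3 of the data) and tests on the remaining 3 (1/3).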


[GitHub] spark pull request #13389: [SPARK-9876][SQL][FOLLOWUP] Enable string and bin...

2016-07-01 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/13389#discussion_r69280087
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystWriteSupport.scala
 ---
@@ -150,7 +150,8 @@ private[parquet] class CatalystWriteSupport extends 
WriteSupport[InternalRow] wi
 
   case StringType =>
 (row: SpecializedGetters, ordinal: Int) =>
-  
recordConsumer.addBinary(Binary.fromByteArray(row.getUTF8String(ordinal).getBytes))
+  recordConsumer.addBinary(
+
Binary.fromReusedByteArray(row.getUTF8String(ordinal).getBytes))
--- End diff --

Thank you for your review! (Actually, it is a `UTF8String`, so it would have 
to be converted into a `String` to use `Binary.fromString`.) That said, I am a 
bit worried that the byte array might be reused in the future (although I think 
it is not reused for now). 

This could write corrupt statistics if the array is reused. Is my 
understanding correct?
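The concern can be pictured with a toy model, assuming the risk is that a statistics object keeps a reference to a buffer that the writer later overwrites. This Python sketch does not use the Parquet API; `MinStats`, `record`, and the `copy_on_retain` flag are hypothetical names for illustration only.

```python
class MinStats:
    """Hypothetical statistics holder that tracks the smallest value seen."""

    def __init__(self, copy_on_retain: bool) -> None:
        self.copy_on_retain = copy_on_retain
        self.min_value = None

    def update(self, value: bytearray) -> None:
        if self.min_value is None or value < self.min_value:
            # Either keep a private copy or keep a reference to the caller's buffer.
            self.min_value = bytearray(value) if self.copy_on_retain else value


def record(values, copy_on_retain: bool) -> bytes:
    stats = MinStats(copy_on_retain)
    buffer = bytearray(1)  # one buffer reused for every value
    for v in values:
        buffer[:] = v  # overwrite the shared buffer in place
        stats.update(buffer)
    return bytes(stats.min_value)


aliased = record([b"a", b"z"], copy_on_retain=False)
copied = record([b"a", b"z"], copy_on_retain=True)
```

When the holder keeps only a reference, the recorded minimum silently follows the last value written into the buffer; copying on retain keeps the true minimum.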



[GitHub] spark pull request #13374: [SPARK-13638][SQL] Add quoteAll option to CSV Dat...

2016-07-01 Thread jurriaan

Github user jurriaan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13374#discussion_r69279736
  
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -745,6 +748,8 @@ def csv(self, path, mode=None, compression=None, 
sep=None, quote=None, escape=No
 self.option("nullValue", nullValue)
 if escapeQuotes is not None:
 self.option("escapeQuotes", nullValue)
+if escapeAll is not None:
+self.option("escapeAll", nullValue)
--- End diff --

Wow! We should fix this `escapeQuotes` thing too.
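The problem in the quoted diff is that both new branches pass `nullValue` instead of their own argument. A minimal sketch of the intended pattern follows; `SketchWriter` and `csv_options` are hypothetical stand-ins that only record options, not the real `DataFrameWriter`.

```python
from typing import Dict, Optional


class SketchWriter:
    """Hypothetical stand-in for a writer; it only records options."""

    def __init__(self) -> None:
        self.options: Dict[str, str] = {}

    def option(self, key: str, value: object) -> "SketchWriter":
        self.options[key] = str(value)
        return self

    def csv_options(self, nullValue: Optional[str] = None,
                    escapeQuotes: Optional[bool] = None,
                    escapeAll: Optional[bool] = None) -> "SketchWriter":
        if nullValue is not None:
            self.option("nullValue", nullValue)
        if escapeQuotes is not None:
            # Pass the option's own value, not nullValue.
            self.option("escapeQuotes", escapeQuotes)
        if escapeAll is not None:
            self.option("escapeAll", escapeAll)
        return self


writer = SketchWriter().csv_options(escapeQuotes=False, escapeAll=True)
```

With the copy-paste bug, both options would have been set to whatever `nullValue` held, including `None`.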



[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-01 Thread janplus

Github user janplus commented on a diff in the pull request:

https://github.com/apache/spark/pull/14008#discussion_r69276506
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
 ---
@@ -725,4 +725,43 @@ class StringExpressionsSuite extends SparkFunSuite 
with ExpressionEvalHelper {
 checkEvaluation(FindInSet(Literal("abf"), Literal("abc,b,ab,c,def")), 
0)
 checkEvaluation(FindInSet(Literal("ab,"), Literal("abc,b,ab,c,def")), 
0)
   }
+
+  test("ParseUrl") {
+def checkParseUrl(expected: String, urlStr: String, partToExtract: 
String): Unit = {
+  checkEvaluation(
+ParseUrl(Literal.create(urlStr, StringType), 
Literal.create(partToExtract, StringType)),
+expected)
+}
+def checkParseUrlWithKey(expected: String, urlStr: String,
+  partToExtract: String, key: String): Unit = {
+  checkEvaluation(
+ParseUrl(Literal.create(urlStr, StringType), 
Literal.create(partToExtract, StringType),
+ Literal.create(key, StringType)), expected)
+}
+
+checkParseUrl("spark.apache.org", 
"http://spark.apache.org/path?query=1", "HOST")
+checkParseUrl("/path", "http://spark.apache.org/path?query=1", "PATH")
+checkParseUrl("query=1", "http://spark.apache.org/path?query=1", 
"QUERY")
+checkParseUrl("Ref", "http://spark.apache.org/path?query=1#Ref", "REF")
+checkParseUrl("http", "http://spark.apache.org/path?query=1", 
"PROTOCOL")
+checkParseUrl("/path?query=1", "http://spark.apache.org/path?query=1", 
"FILE")
+checkParseUrl("spark.apache.org:8080", 
"http://spark.apache.org:8080/path?query=1", "AUTHORITY")
+checkParseUrl("jian", "http://j...@spark.apache.org/path?query=1", 
"USERINFO")
--- End diff --

OK



[GitHub] spark pull request #14008: [SPARK-16281][SQL] Implement parse_url SQL functi...

2016-07-01 Thread janplus

Github user janplus commented on a diff in the pull request:

https://github.com/apache/spark/pull/14008#discussion_r69276469
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -653,6 +655,128 @@ case class StringRPad(str: Expression, len: 
Expression, pad: Expression)
 }
 
 /**
+ * Extracts a part from a URL
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(url, partToExtract[, key]) - extracts a part from a URL",
+  extended = "Parts: HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, 
USERINFO\n"
+  + "key specifies which query to extract\n"
+  + "Examples:\n"
+  + "  > SELECT _FUNC_('http://spark.apache.org/path?query=1', "
+  + "'HOST') FROM src LIMIT 1;\n" + "  'spark.apache.org'\n"
+  + "  > SELECT _FUNC_('http://spark.apache.org/path?query=1', "
+  + "'QUERY') FROM src LIMIT 1;\n"  + "  'query=1'\n"
+  + "  > SELECT _FUNC_('http://spark.apache.org/path?query=1', "
+  + "'QUERY', 'query') FROM src LIMIT 1;\n" + "  '1'")
+case class ParseUrl(children: Expression*)
+  extends Expression with ImplicitCastInputTypes with CodegenFallback {
+
+  override def nullable: Boolean = true
+
+  override def inputTypes: Seq[DataType] = 
Seq.fill(children.size)(StringType)
+  override def dataType: DataType = StringType
+
+  private lazy val stringExprs = children.toArray
--- End diff --

Try to avoid Scala Seqs' potential performance problems.
https://github.com/apache/spark/pull/13966/files#r69184719


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #14012: [SPARK-16343][SQL] Improve the PushDownPredicate rule to...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14012
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #14012: [SPARK-16343][SQL] Improve the PushDownPredicate rule to...

2016-07-01 Thread jiangxb1987

Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/14012
  
cc @liancheng @cloud-fan 



[GitHub] spark pull request #14012: [SPARK-16343][SQL] Improve the PushDownPredicate ...

2016-07-01 Thread jiangxb1987

GitHub user jiangxb1987 opened a pull request:

https://github.com/apache/spark/pull/14012

[SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown pre…

## What changes were proposed in this pull request?

Currently our Optimizer may reorder predicates to run them more 
efficiently. When a condition contains non-deterministic parts, however, 
changing the order between the deterministic and non-deterministic parts may 
change the number of input rows. For example:
SELECT a FROM t WHERE rand() < 0.1 AND a = 1
and
SELECT a FROM t WHERE a = 1 AND rand() < 0.1
may call rand() a different number of times, and therefore the output rows 
differ.

This PR improves the rule by checking that a predicate is placed before any 
non-deterministic predicates before pushing it down.
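
The rule described above can be sketched as a split of the conjunct list at the first non-deterministic predicate. This is an illustrative Python model, not the Catalyst implementation; `Predicate` and `split_pushable` are hypothetical names.

```python
from typing import List, NamedTuple, Tuple


class Predicate(NamedTuple):
    sql: str
    deterministic: bool


def split_pushable(
    conjuncts: List[Predicate],
) -> Tuple[List[Predicate], List[Predicate]]:
    """Split AND-ed predicates into (pushable, kept_in_place).

    Only predicates that appear before the first non-deterministic one are
    pushable, so the non-deterministic predicate still sees the same rows.
    """
    for i, pred in enumerate(conjuncts):
        if not pred.deterministic:
            return conjuncts[:i], conjuncts[i:]
    return list(conjuncts), []


a_eq_1 = Predicate("a = 1", True)
rand_lt = Predicate("rand() < 0.1", False)

# WHERE rand() < 0.1 AND a = 1: nothing may move ahead of rand().
first = split_pushable([rand_lt, a_eq_1])
# WHERE a = 1 AND rand() < 0.1: a = 1 already precedes rand(), so it can be pushed.
second = split_pushable([a_eq_1, rand_lt])
```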

## How was this patch tested?

Expanded related testcases in FilterPushdownSuite.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiangxb1987/spark ppd

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14012.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14012


commit 856d86d788b318c2975a5318b181678f4b71f5bc
Author: èæå 
Date:   2016-07-01T09:10:50Z

[SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown 
predicates correctly in non-deterministic condition.





[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13494
  
**[Test build #61609 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61609/consoleFull)** for PR 13494 at commit [`88f7308`](https://github.com/apache/spark/commit/88f7308173829ca2473690a0c409c438d3cd5cf4).



[GitHub] spark issue #14002: [SPARK-16335][SQL] Structured streaming should fail if s...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14002
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61605/



[GitHub] spark issue #14002: [SPARK-16335][SQL] Structured streaming should fail if s...

2016-07-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14002
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread lianhuiwang

Github user lianhuiwang commented on the issue:

https://github.com/apache/spark/pull/13494
  
@cloud-fan I have updated with your branch code. Thanks a lot.



[GitHub] spark issue #14002: [SPARK-16335][SQL] Structured streaming should fail if s...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14002
  
**[Test build #61605 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61605/consoleFull)** for PR 14002 at commit [`2e5f2ef`](https://github.com/apache/spark/commit/2e5f2efb5481ae900c9c87fd9daf180a18347998).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #13919: [SPARK-16222] [SQL] JDBC Sources - Handling illeg...

2016-07-01 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13919



[GitHub] spark issue #13919: [SPARK-16222] [SQL] JDBC Sources - Handling illegal inpu...

2016-07-01 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/13919
  
Merged to master/2.0



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13494
  
**[Test build #61608 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61608/consoleFull)** for PR 13494 at commit [`41fef2c`](https://github.com/apache/spark/commit/41fef2c40f4929fd26476ecdfa3ee8160394a7d3).



[GitHub] spark pull request #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work ...

2016-07-01 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/12601#discussion_r69267676
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcRelationProvider.scala ---
@@ -19,37 +19,105 @@ package org.apache.spark.sql.execution.datasources.jdbc
 
 import java.util.Properties
 
-import org.apache.spark.sql.SQLContext
-import org.apache.spark.sql.sources.{BaseRelation, DataSourceRegister, RelationProvider}
+import org.apache.spark.sql.{DataFrame, SaveMode, SQLContext}
+import org.apache.spark.sql.sources.{BaseRelation, CreatableRelationProvider, DataSourceRegister, RelationProvider, SchemaRelationProvider}
+import org.apache.spark.sql.types.StructType
 
-class JdbcRelationProvider extends RelationProvider with DataSourceRegister {
+class JdbcRelationProvider extends CreatableRelationProvider
+  with SchemaRelationProvider with RelationProvider with DataSourceRegister {
 
   override def shortName(): String = "jdbc"
 
-  /** Returns a new base relation with the given parameters. */
   override def createRelation(
       sqlContext: SQLContext,
       parameters: Map[String, String]): BaseRelation = {
-    val jdbcOptions = new JDBCOptions(parameters)
-    if (jdbcOptions.partitionColumn != null
-      && (jdbcOptions.lowerBound == null
-        || jdbcOptions.upperBound == null
-        || jdbcOptions.numPartitions == null)) {
+    createRelation(sqlContext, parameters, null)
+  }
+
+  /** Returns a new base relation with the given parameters. */
+  override def createRelation(
+      sqlContext: SQLContext,
+      parameters: Map[String, String],
+      schema: StructType): BaseRelation = {
+    val url = parameters.getOrElse("url", sys.error("Option 'url' not specified"))
+    val table = parameters.getOrElse("dbtable", sys.error("Option 'dbtable' not specified"))
+    val partitionColumn = parameters.getOrElse("partitionColumn", null)
+    val lowerBound = parameters.getOrElse("lowerBound", null)
+    val upperBound = parameters.getOrElse("upperBound", null)
+    val numPartitions = parameters.getOrElse("numPartitions", null)
--- End diff --

I think the validation can be done together in `JDBCOptions`.
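
To make the suggestion concrete, here is a rough sketch of what gathering the checks in one options class could look like. `JdbcOptionsSketch` is a hypothetical stand-in written for this example, not the actual `JDBCOptions` class; the error messages and the use of `Option` instead of `null` are assumptions.

```scala
// Hypothetical stand-in for JDBCOptions; names and messages are illustrative only.
final class JdbcOptionsSketch(parameters: Map[String, String]) {
  // Required options: fail fast with a clear message when one is missing.
  val url: String = required("url")
  val table: String = required("dbtable")

  // Optional partitioning options, modelled as Option instead of null.
  val partitionColumn: Option[String] = parameters.get("partitionColumn")
  val lowerBound: Option[String] = parameters.get("lowerBound")
  val upperBound: Option[String] = parameters.get("upperBound")
  val numPartitions: Option[String] = parameters.get("numPartitions")

  // If a partition column is given, the bounds and partition count must be too.
  require(
    partitionColumn.isEmpty ||
      (lowerBound.isDefined && upperBound.isDefined && numPartitions.isDefined),
    "Partitioning incompletely specified")

  // Looks up a mandatory key or throws IllegalArgumentException.
  private def required(key: String): String =
    parameters.getOrElse(
      key, throw new IllegalArgumentException(s"Option '$key' not specified"))
}
```

With the checks living in the constructor, both `createRelation` overloads could build the options object once and rely on it being valid.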



[GitHub] spark issue #13494: [SPARK-15752] [SQL] support optimization for metadata on...

2016-07-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13494
  
**[Test build #61607 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61607/consoleFull)** for PR 13494 at commit [`a22e962`](https://github.com/apache/spark/commit/a22e9626e6294671e0915822def6eb283a72a643).


