[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17264 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17264 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74396/ Test PASSed.
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17264 **[Test build #74396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74396/testReport)** for PR 17264 at commit [`6468fde`](https://github.com/apache/spark/commit/6468fde7a9b726843e505b029cb5f7ac865690fe).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17232 Merged build finished. Test PASSed.
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17232 **[Test build #74395 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74395/testReport)** for PR 17232 at commit [`ace4f02`](https://github.com/apache/spark/commit/ace4f0224bf67d9143b07e8a9ca610568cc49ffb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17232 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74395/ Test PASSed.
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/17264 I did a quick check of the performance changes:

```java
public class TestGenericUDF extends GenericUDF {

  @Override
  public ObjectInspector initialize(ObjectInspector[] objectInspectors) throws UDFArgumentException {
    return PrimitiveObjectInspectorFactory.javaLongObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] args) throws HiveException {
    final long a1 = PrimitiveObjectInspectorFactory.javaLongObjectInspector.get(args[0].get());
    final long a2 = PrimitiveObjectInspectorFactory.javaLongObjectInspector.get(args[1].get());
    final long a3 = PrimitiveObjectInspectorFactory.javaLongObjectInspector.get(args[2].get());
    final long a4 = PrimitiveObjectInspectorFactory.javaLongObjectInspector.get(args[3].get());
    return a1 + a2 + a3 + a4;
  }
}
```

```
$ ./bin/spark-shell --master local[1] --conf spark.sql.shuffle.partitions=1 -v

scala> sql("CREATE TEMPORARY FUNCTION testUdf AS 'hivemall.ftvec.TestGenericUDF'")

scala> :paste
def timer[R](block: => R): R = {
  val t0 = System.nanoTime()
  val result = block
  val t1 = System.nanoTime()
  println("Elapsed time: " + ((t1 - t0 + 0.0) / 1000000000.0) + "s")
  result
}

scala> spark.range(300).createOrReplaceTempView("t")

scala> timer { sql("SELECT testUdf(id, id, id, id) FROM t").queryExecution.executedPlan.execute().foreach(x => {}) }
```

```
# performance w/ this patch
Elapsed time: 1.901269167s

# performance w/o this patch
Elapsed time: 0.492860666s
```
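The `timer` helper in the comment above measures wall-clock time with `System.nanoTime`, which returns nanoseconds. A self-contained sketch of the same idea in plain Scala (no Spark required; the names here are mine, not from the PR) that makes the unit conversion explicit:

```scala
object TimerExample {
  // Run a block once and return its result along with wall-clock seconds.
  def timer[R](block: => R): (R, Double) = {
    val t0 = System.nanoTime()
    val result = block
    val t1 = System.nanoTime()
    // nanoTime returns nanoseconds; divide by 1e9 to get seconds
    val seconds = (t1 - t0) / 1e9
    (result, seconds)
  }

  def main(args: Array[String]): Unit = {
    val (sum, elapsed) = timer { (1L to 1000000L).sum }
    println(s"sum=$sum elapsedSeconds=$elapsed")
  }
}
```

Note that a single timed run like this is sensitive to JIT warm-up; for stable numbers a benchmark would normally run the block a few times first and discard the early measurements.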
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17264 **[Test build #74396 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74396/testReport)** for PR 17264 at commit [`6468fde`](https://github.com/apache/spark/commit/6468fde7a9b726843e505b029cb5f7ac865690fe).
[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/17264 I'm not sure this makes sense, so could you check? cc: @hvanhovell
[GitHub] spark pull request #17264: [SPARK-19923][SQL] Remove unnecessary type conver...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/17264

[SPARK-19923][SQL] Remove unnecessary type conversions per call in Hive

## What changes were proposed in this pull request?

This PR removes unnecessary type conversions per call in Hive: https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala#L116

## How was this patch tested?

Existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-19923

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17264.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17264

commit 6468fde7a9b726843e505b029cb5f7ac865690fe
Author: Takeshi Yamamuro
Date: 2017-03-12T06:19:20Z

    Remove unnecessary type conversions per call
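For context on what "removing a type conversion per call" buys: a UDF's evaluation path runs once per row, so any lookup that does not depend on the row can be resolved once and reused. A minimal, hypothetical Scala illustration of that hoisting (the names below are invented for illustration and are not Spark's actual internals):

```scala
// Hypothetical sketch: hoisting a per-call conversion lookup out of the hot path.
object ConversionExample {
  type Converter = Any => Long

  // Stand-in for an expensive lookup that does not depend on the row.
  def lookupConverter(): Converter =
    (v: Any) => v.asInstanceOf[Number].longValue // unwrap a boxed value to a primitive

  // Before: resolves the converter on every call.
  def evalPerCall(arg: Any): Long = lookupConverter()(arg)

  // After: resolves the converter once at construction time and reuses it.
  private val cached: Converter = lookupConverter()
  def evalCached(arg: Any): Long = cached(arg)

  def main(args: Array[String]): Unit = {
    // Both forms compute the same result; only the lookup cost differs.
    println(evalPerCall(21L) + evalCached(21L)) // prints 42
  }
}
```

The cached form does the same work per row minus the repeated lookup, which is exactly the kind of saving that only shows up when the function is called once per row over a large input.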
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17232 **[Test build #74395 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74395/testReport)** for PR 17232 at commit [`ace4f02`](https://github.com/apache/spark/commit/ace4f0224bf67d9143b07e8a9ca610568cc49ffb).
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17260 cc @cloud-fan @yhuai @sameeragarwal
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105548957

--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
```
@@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column
   expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt"))
 })

+compare_list <- function(list1, list2) {
+  # get testthat to show the diff by first making the 2 lists equal in length
+  expect_equal(length(list1), length(list2))
+  l <- max(length(list1), length(list2))
```
--- End diff --

Got it - that sounds good
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17232 To show the impact of `boolean hasFollowingStatsTask`, we first need to deliver the fix in `VersionSuite.scala`. The PR has been submitted: https://github.com/apache/spark/pull/17260
[GitHub] spark pull request #17232: [SPARK-18112] [SQL] Support reading data from Hiv...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17232#discussion_r105548919

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
```
@@ -94,6 +94,10 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
     try {
       body
     } catch {
+      case i: InvocationTargetException if isClientException(i.getTargetException) =>
+        val e = i.getTargetException
+        throw new AnalysisException(
+          e.getClass.getCanonicalName + ": " + e.getMessage, cause = Some(e))
```
--- End diff --

This fix needs to be backported to the previous releases. Will submit a separate one for this.
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548820

--- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala ---
```
@@ -606,6 +607,36 @@ class KafkaSourceSuite extends KafkaSourceTest {
     assert(query.exception.isEmpty)
   }

+  for ((optionKey, optionValue, answer) <- Seq(
+    (STARTING_OFFSETS_OPTION_KEY, "earLiEst", EarliestOffsetRangeLimit),
+    (ENDING_OFFSETS_OPTION_KEY, "laTest", LatestOffsetRangeLimit),
+    (STARTING_OFFSETS_OPTION_KEY, """{"topic-A":{"0":23}}""",
+      SpecificOffsetRangeLimit(Map(new TopicPartition("topic-A", 0) -> 23))))) {
+    test(s"test offsets containing uppercase characters (${answer.getClass.getSimpleName})") {
+      val offset = getKafkaOffsetRangeLimit(
+        Map(optionKey -> optionValue),
+        optionKey,
+        answer
+      )
+
+      assert(offset == answer)
```
--- End diff --

nit: `==` => `===`
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548818

--- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala ---
```
@@ -606,6 +607,36 @@ class KafkaSourceSuite extends KafkaSourceTest {
     assert(query.exception.isEmpty)
   }

+  for ((optionKey, optionValue, answer) <- Seq(
```
--- End diff --

nit: move the `for` loop into the `test`. No need to create many tests here.
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548819

--- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala ---
```
@@ -606,6 +607,36 @@ class KafkaSourceSuite extends KafkaSourceTest {
     assert(query.exception.isEmpty)
   }

+  for ((optionKey, optionValue, answer) <- Seq(
+    (STARTING_OFFSETS_OPTION_KEY, "earLiEst", EarliestOffsetRangeLimit),
+    (ENDING_OFFSETS_OPTION_KEY, "laTest", LatestOffsetRangeLimit),
+    (STARTING_OFFSETS_OPTION_KEY, """{"topic-A":{"0":23}}""",
+      SpecificOffsetRangeLimit(Map(new TopicPartition("topic-A", 0) -> 23))))) {
+    test(s"test offsets containing uppercase characters (${answer.getClass.getSimpleName})") {
+      val offset = getKafkaOffsetRangeLimit(
+        Map(optionKey -> optionValue),
+        optionKey,
+        answer
+      )
+
+      assert(offset == answer)
+    }
+  }
+
+  for ((optionKey, answer) <- Seq(
```
--- End diff --

Same as above
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548749

--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala ---
```
@@ -128,18 +123,18 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
       .map { k => k.drop(6).toString -> parameters(k) }
       .toMap

-    val startingRelationOffsets =
-      caseInsensitiveParams.get(STARTING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-        case Some("earliest") => EarliestOffsetRangeLimit
-        case Some(json) => SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json))
-        case None => EarliestOffsetRangeLimit
+    val startingRelationOffsets = KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit) match {
+      case earliest @ EarliestOffsetRangeLimit => earliest
```
--- End diff --

`startingRelationOffsets` won't be `latest` since it's checked in `validateBatchOptions`. Why not just:

```Scala
val startingRelationOffsets = KafkaSourceProvider.getKafkaOffsetRangeLimit(
  caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit)
assert(startingRelationOffsets != LatestOffsetRangeLimit)
```
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548762

--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala ---
```
@@ -388,34 +383,34 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
   private def validateBatchOptions(caseInsensitiveParams: Map[String, String]) = {
     // Batch specific options
-    caseInsensitiveParams.get(STARTING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-      case Some("earliest") => // good to go
-      case Some("latest") =>
+    KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit) match {
+      case EarliestOffsetRangeLimit => // good to go
+      case LatestOffsetRangeLimit =>
         throw new IllegalArgumentException("starting offset can't be latest " +
           "for batch queries on Kafka")
-      case Some(json) => (SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json)))
-        .partitionOffsets.foreach {
+      case specific: SpecificOffsetRangeLimit =>
+        specific.partitionOffsets.foreach {
           case (tp, off) if off == KafkaOffsetRangeLimit.LATEST =>
             throw new IllegalArgumentException(s"startingOffsets for $tp can't " +
               "be latest for batch queries on Kafka")
           case _ => // ignore
         }
-      case _ => // default to earliest
     }

-    caseInsensitiveParams.get(ENDING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-      case Some("earliest") =>
+    KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, ENDING_OFFSETS_OPTION_KEY, LatestOffsetRangeLimit) match {
+      case EarliestOffsetRangeLimit =>
         throw new IllegalArgumentException("ending offset can't be earliest " +
           "for batch queries on Kafka")
-      case Some("latest") => // good to go
-      case Some(json) => (SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json)))
-        .partitionOffsets.foreach {
+      case LatestOffsetRangeLimit => // good to go
+      case specific: SpecificOffsetRangeLimit =>
```
--- End diff --

nit: `case SpecificOffsetRangeLimit(partitionOffsets) =>`
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548753

--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala ---
```
@@ -128,18 +123,18 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
       .map { k => k.drop(6).toString -> parameters(k) }
       .toMap

-    val startingRelationOffsets =
-      caseInsensitiveParams.get(STARTING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-        case Some("earliest") => EarliestOffsetRangeLimit
-        case Some(json) => SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json))
-        case None => EarliestOffsetRangeLimit
+    val startingRelationOffsets = KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit) match {
+      case earliest @ EarliestOffsetRangeLimit => earliest
+      case specific @ SpecificOffsetRangeLimit(_) => specific
+      case _ => EarliestOffsetRangeLimit
     }

-    val endingRelationOffsets =
-      caseInsensitiveParams.get(ENDING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-        case Some("latest") => LatestOffsetRangeLimit
-        case Some(json) => SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json))
-        case None => LatestOffsetRangeLimit
+    val endingRelationOffsets = KafkaSourceProvider.getKafkaOffsetRangeLimit(caseInsensitiveParams,
+      ENDING_OFFSETS_OPTION_KEY, LatestOffsetRangeLimit) match {
+      case latest @ LatestOffsetRangeLimit => latest
```
--- End diff --

Same as above
[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105548760

--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala ---
```
@@ -388,34 +383,34 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
   private def validateBatchOptions(caseInsensitiveParams: Map[String, String]) = {
     // Batch specific options
-    caseInsensitiveParams.get(STARTING_OFFSETS_OPTION_KEY).map(_.trim.toLowerCase) match {
-      case Some("earliest") => // good to go
-      case Some("latest") =>
+    KafkaSourceProvider.getKafkaOffsetRangeLimit(
+      caseInsensitiveParams, STARTING_OFFSETS_OPTION_KEY, EarliestOffsetRangeLimit) match {
+      case EarliestOffsetRangeLimit => // good to go
+      case LatestOffsetRangeLimit =>
         throw new IllegalArgumentException("starting offset can't be latest " +
           "for batch queries on Kafka")
-      case Some(json) => (SpecificOffsetRangeLimit(JsonUtils.partitionOffsets(json)))
-        .partitionOffsets.foreach {
+      case specific: SpecificOffsetRangeLimit =>
```
--- End diff --

nit: `case SpecificOffsetRangeLimit(partitionOffsets) =>`
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17263 Merged build finished. Test PASSed.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17263 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74394/ Test PASSed.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17263 **[Test build #74394 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74394/testReport)** for PR 17263 at commit [`63b7ae8`](https://github.com/apache/spark/commit/63b7ae8246f53a16dfbaf3763f73feb8488a1566).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17263 **[Test build #74394 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74394/testReport)** for PR 17263 at commit [`63b7ae8`](https://github.com/apache/spark/commit/63b7ae8246f53a16dfbaf3763f73feb8488a1566).
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user yanji84 commented on the issue: https://github.com/apache/spark/pull/17109 @mgummelt comments addressed, please take another look
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17263 Merged build finished. Test FAILed.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17263 **[Test build #74393 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74393/testReport)** for PR 17263 at commit [`f22f47f`](https://github.com/apache/spark/commit/f22f47f5b341f930b42ccea507a3697c0953abc1).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17263 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74393/ Test FAILed.
[GitHub] spark issue #17258: [SPARK-19807][Web UI]Add reason for cancellation when a ...
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/17258 Thanks, I like this change but I think the reason string could be simpler, like `"killed via the Web UI"`
[GitHub] spark issue #17263: [SPARK-19922][ML] small speedups to findSynonyms
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17263 **[Test build #74393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74393/testReport)** for PR 17263 at commit [`f22f47f`](https://github.com/apache/spark/commit/f22f47f5b341f930b42ccea507a3697c0953abc1).
[GitHub] spark pull request #17263: [SPARK-19922][ML] small speedups to findSynonyms
GitHub user Krimit opened a pull request: https://github.com/apache/spark/pull/17263

[SPARK-19922][ML] small speedups to findSynonyms

Currently generating synonyms using a large model (I've tested with 3m words) is very slow. These efficiencies have sped things up for us by ~17%. I wasn't sure if such small changes were worthy of a JIRA, but the guidelines seemed to suggest that that is the preferred approach.

## What changes were proposed in this pull request?

Address a few small issues in the findSynonyms logic:
1) remove usage of ``Array.fill`` to zero out the ``cosineVec`` array. The default float value in Scala and Java is 0.0f, so explicitly setting the values to zero is not needed
2) use Floats throughout. The conversion to Doubles before building the ``priorityQueue`` is totally superfluous, since all the similarity computations are done using Floats anyway. Creating a second large array just serves to put extra strain on the GC
3) convert the slow ``for (i <- cosVec.indices)`` to an ugly, but faster, ``while`` loop

These efficiencies are really only apparent when working with a large model.

## How was this patch tested?

Existing unit tests + some in-house tests to time the difference.

cc @jkbradley @MLNick @srowen

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Krimit/spark fasterFindSynonyms

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17263.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17263

commit f22f47f5b341f930b42ccea507a3697c0953abc1 Author: Asher Krim Date: 2017-03-12T01:19:24Z

small speedups to findSynonyms

Currently generating synonyms using a model with 3m words is painfully slow. These efficiencies have sped things up by more than 17%.
Address a few issues in the findSynonyms logic:
1) no need to zero out the cosineVec array each time, since the default value for float arrays is 0.0f. This should offer some nice speedups
2) use floats throughout. The conversion to Doubles before building the priorityQueue is totally superfluous, since all the computations are done using floats anyway
3) convert the slow for (i <- cosVec.indices), which combines a Scala closure with a Range, to an ugly but faster while loop
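The three optimizations read more concretely as code. Below is a minimal, self-contained Scala sketch of the pattern described in the PR; the names (`wordVectors`, `cosVec`) echo the PR description but this is not the actual `Word2VecModel` implementation, and normalization is omitted for brevity, so `cosVec` holds raw dot products rather than true cosine similarities.

```scala
// Toy flattened row-major word vectors for 4 words of dimension 2:
// word0=(1,0), word1=(0,1), word2=(1,1), word3=(-1,0)
val numWords = 4
val vectorSize = 2
val wordVectors = Array(1f, 0f, 0f, 1f, 1f, 1f, -1f, 0f)

def cosines(query: Array[Float]): Array[Float] = {
  // (1) a freshly allocated Float array is already zero-initialized,
  //     so Array.fill(numWords)(0.0f) would do redundant work
  val cosVec = new Array[Float](numWords)
  // (3) a while loop avoids the Range + closure overhead
  //     of for (i <- cosVec.indices)
  var i = 0
  while (i < numWords) {
    var dot = 0f
    var j = 0
    while (j < vectorSize) {
      dot += wordVectors(i * vectorSize + j) * query(j)
      j += 1
    }
    // (2) stay in Float end to end; no intermediate Double array
    //     is built before ranking the candidates
    cosVec(i) = dot
    i += 1
  }
  cosVec
}

println(cosines(Array(1f, 0f)).mkString(", "))
```

With a 3m-word vocabulary the avoided zeroing pass and the avoided Float-to-Double copy each save a full O(vocabulary) sweep per query, which is where the reported ~17% comes from.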
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17109 Merged build finished. Test PASSed.
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17109 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74392/ Test PASSed.
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17109 **[Test build #74392 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74392/testReport)** for PR 17109 at commit [`737acf0`](https://github.com/apache/spark/commit/737acf07ceea8f4bc92b9eaa8c572af19b2e0b88). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17109 **[Test build #74392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74392/testReport)** for PR 17109 at commit [`737acf0`](https://github.com/apache/spark/commit/737acf07ceea8f4bc92b9eaa8c572af19b2e0b88).
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105546348 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) --- End diff -- here's what it looks like ``` 1. Failure: No extra files are created in SPARK_HOME by starting session and making calls (@test_sparkSQL.R#2917) length(list1) not equal to length(list2). 1/1 mismatches [1] 22 - 23 == -1 2. Failure: No extra files are created in SPARK_HOME by starting session and making calls (@test_sparkSQL.R#2917) sort(list1, na.last = TRUE) not equal to sort(list2, na.last = TRUE). 3/23 mismatches x[21]: "unit-tests.out" y[21]: "spark-warehouse" x[22]: "WINDOWS.md" y[22]: "unit-tests.out" x[23]: NA y[23]: "WINDOWS.md" ```
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545762 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) + length(list1) <- l + length(list2) <- l + expect_equal(sort(list1, na.last = TRUE), sort(list2, na.last = TRUE)) +} + +# This should always be the last test in this test file. +test_that("No extra files are created in SPARK_HOME by starting session and making calls", { + # Check that it is not creating any extra file. + # Does not check the tempdir which would be cleaned up after. + filesAfter <- list.files(path = file.path(Sys.getenv("SPARK_HOME"), "R"), all.files = TRUE) + + expect_true(length(sparkHomeFileBefore) > 0) + compare_list(sparkHomeFileBefore, filesBefore) --- End diff -- I'm trying to catch a few things with this - will add some comment on. for instance, 1) what's created by calling `sparkR.session(enableHiveSupport = F)` (every tests except test_sparkSQL.R) 2) what's created by calling `sparkR.session(enableHiveSupport = T)` (test_sparkSQL.R) this unfortunately doesn't quite work as expected - it should have failed actually instead of passing - because we are running Scala tests before and they have caused spark-warehouse and metastore_db to be created already, before any R code is run. reworking that now.
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545724 --- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala --- @@ -127,6 +127,13 @@ private[r] object RRDD { sparkConf.setExecutorEnv(name.toString, value.toString) } +if (sparkEnvirMap.containsKey("spark.r.sql.default.derby.dir") && --- End diff -- well, in revisiting this I thought it would be easier to minimize the impact by making this R only. it would be much easier if we make the derby log going to tmp always for all lang binding
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545712 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) --- End diff -- the idea is to show enough information from the log without having to rerun the check manually. the first check will show the numeric values but it wouldn't say how exactly they are different. the next check (or moved to compare_list() here) will get testthat to dump the delta too, but first it must set the 2 lists into the same size etc.. in fact, all of these are well tested in "Check masked functions" test in test_context.R, just duplicated here.
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545530 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) --- End diff -- The lengths should be equal if we get to this line ? Or am I missing something ?
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545521 --- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala --- @@ -127,6 +127,13 @@ private[r] object RRDD { sparkConf.setExecutorEnv(name.toString, value.toString) } +if (sparkEnvirMap.containsKey("spark.r.sql.default.derby.dir") && --- End diff -- Its a little awkward that this is set in RRDD. Is there a more general place we can set this in across languages / runtimes (i.e. for Python / Scala as well) ? @cloud-fan @gatorsmile Any thoughts on this ?
[GitHub] spark pull request #16330: [SPARK-18817][SPARKR][SQL] change derby log outpu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16330#discussion_r105545556 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2897,6 +2898,27 @@ test_that("Collect on DataFrame when NAs exists at the top of a timestamp column expect_equal(class(ldf3$col3), c("POSIXct", "POSIXt")) }) +compare_list <- function(list1, list2) { + # get testthat to show the diff by first making the 2 lists equal in length + expect_equal(length(list1), length(list2)) + l <- max(length(list1), length(list2)) + length(list1) <- l + length(list2) <- l + expect_equal(sort(list1, na.last = TRUE), sort(list2, na.last = TRUE)) +} + +# This should always be the last test in this test file. +test_that("No extra files are created in SPARK_HOME by starting session and making calls", { + # Check that it is not creating any extra file. + # Does not check the tempdir which would be cleaned up after. + filesAfter <- list.files(path = file.path(Sys.getenv("SPARK_HOME"), "R"), all.files = TRUE) + + expect_true(length(sparkHomeFileBefore) > 0) + compare_list(sparkHomeFileBefore, filesBefore) --- End diff -- I'm not sure what we are checking by having both `sparkHomeFilesBefore` and `filesBefore` -- Wouldn't just one of them do the job and if not can we add a comment here ?
[GitHub] spark issue #16596: [SPARK-19237][SPARKR][WIP] R should check for java when ...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/16596 @felixcheung Any update on this ? Looking through the list of PRs I thought this might be a good one to add to a CRAN submission
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16330 Merged build finished. Test PASSed.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16330 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74391/ Test PASSed.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16330 **[Test build #74391 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74391/testReport)** for PR 16330 at commit [`e5b69ca`](https://github.com/apache/spark/commit/e5b69ca67230525c5819c52b581023475a7d7e5c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17260 Merged build finished. Test PASSed.
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17260 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74389/ Test PASSed.
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17260 **[Test build #74389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74389/testReport)** for PR 17260 at commit [`e0887d0`](https://github.com/apache/spark/commit/e0887d0568eb04392801f0d44901b4bb1d555cf6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16330 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74388/ Test PASSed.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16330 Merged build finished. Test PASSed.
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17171 Merged build finished. Test PASSed.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16330 **[Test build #74388 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74388/testReport)** for PR 16330 at commit [`8062ee1`](https://github.com/apache/spark/commit/8062ee1e953b2d4393a983c20ed80ab29d8aeffc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17171 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74390/ Test PASSed.
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17171 **[Test build #74390 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74390/testReport)** for PR 17171 at commit [`22b7db8`](https://github.com/apache/spark/commit/22b7db8bc013d5dcd23c3ef0f45483c47ea66b98). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17262: [SPARK-17262][SQL] Fixed missing closing bracket spark/s...
Github user elviento commented on the issue: https://github.com/apache/spark/pull/17262 closed
[GitHub] spark pull request #17262: [SPARK-17262][SQL] Fixed missing closing bracket ...
Github user elviento closed the pull request at: https://github.com/apache/spark/pull/17262
[GitHub] spark issue #17262: [SPARK-17262][SQL] Fixed missing closing bracket spark/s...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17262 @elviento this pull request is completely wrong. Close it
[GitHub] spark issue #17262: [SPARK-17262][SQL] Fixed missing closing bracket spark/s...
Github user elviento commented on the issue: https://github.com/apache/spark/pull/17262 it merges into branch-2.0 - added missing bracket in DataFrameSuite.scala line 1704.
[GitHub] spark issue #17262: [SPARK-17261][SQL] Fixed missing closing bracket spark/s...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17262 @elviento close this
[GitHub] spark pull request #17261: [SPARK-17261][SQL] Fixed missing closing bracket ...
Github user elviento closed the pull request at: https://github.com/apache/spark/pull/17261
[GitHub] spark issue #17262: [SPARK-17261][SQL] Fixed missing closing bracket spark/s...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17262 Can you close it please?
[GitHub] spark issue #17262: [SPARK-17261][SQL] Fixed missing closing bracket spark/s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17262 Can one of the admins verify this patch?
[GitHub] spark pull request #17262: [SPARK-17261][SQL] Fixed missing closing bracket ...
GitHub user elviento opened a pull request: https://github.com/apache/spark/pull/17262 [SPARK-17261][SQL] Fixed missing closing bracket spark/sql/DataFrameSuite.scala

## What changes were proposed in this pull request?

Fixed a missing closing bracket at line 1704 of DataFrameSuite.scala in branch-2.0, found during ./dev/make-distribution.sh:

    /spark/sql/core/target/scala-2.11/test-classes...
    /spark/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala:1704: Missing closing brace `}' assumed here
    [error] }
    [error] ^
    [error] one error found
    [error] Compile failed at Mar 11, 2017 2:36:12 PM [0.610s]

## How was this patch tested?

Tested: $SPARK_SRC/spark/sql/core/target/scala-2.11/test-classes...
Successful build: $SPARK_SRC/dev/make-distribution.sh --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pyarn

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/elviento/spark fix-dataframesuite

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17262.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17262

commit f4594900d86bb39358ff19047dfa8c1e4b78aa6b
Author: Andrew Mills
Date: 2016-09-26T20:41:10Z

    [Docs] Update spark-standalone.md to fix link

    Corrected a link to the configuration.html page; it was pointing to a page that does not exist (configurations.html). Documentation change, verified in preview.

    Author: Andrew Mills
    Closes #15244 from ammills01/master.
    (cherry picked from commit 00be16df642317137f17d2d7d2887c41edac3680)
    Signed-off-by: Andrew Or

commit 98bbc4410181741d903a703eac289408cb5b2c5e
Author: Josh Rosen
Date: 2016-09-27T21:14:27Z

    [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

    This patch ports changes from #15185 to Spark 2.x. That patch fixed a correctness bug in Spark 1.6.x caused by an invalid `equals()` comparison between an `UnsafeRow` and another row of a different format. Spark 2.x is not affected by that specific correctness bug, but it can still reap the error-prevention benefits of that patch's changes, which modify `UnsafeRow.equals()` to throw an IllegalArgumentException if it is called with an object that is not an `UnsafeRow`.

    Author: Josh Rosen
    Closes #15265 from JoshRosen/SPARK-17618-master.
    (cherry picked from commit 2f84a686604b298537bfd4d087b41594d2aa7ec6)
    Signed-off-by: Josh Rosen

commit 2cd327ef5e4c3f6b8468ebb2352479a1686b7888
Author: Liang-Chi Hsieh
Date: 2016-09-27T23:00:39Z

    [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

    There is an assert in MemoryStore's putIteratorAsValues method which is used to check that unroll memory is not released too much. This assert looks wrong. Tested with Jenkins.

    Author: Liang-Chi Hsieh
    Closes #14642 from viirya/fix-unroll-memory.
    (cherry picked from commit e7bce9e1876de6ee975ccc89351db58119674aef)
    Signed-off-by: Josh Rosen

commit 1b02f8820ddaf3f2a0e7acc9a7f27afc20683cca
Author: Josh Rosen
Date: 2016-09-28T07:59:00Z

    [SPARK-17666] Ensure that RecordReaders are closed by data source file scans (backport)

    This is a branch-2.0 backport of #15245. It addresses a potential cause of resource leaks in data source file scans. As reported in [SPARK-17666](https://issues.apache.org/jira/browse/SPARK-17666), tasks which do not fully consume their input may leak file handles or network connections (e.g. S3 connections). Spark's `NewHadoopRDD` uses a TaskContext callback to [close its record readers](https://github.com/apache/spark/blame/master/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala#L208), but the new data source file scans only close record readers once their iterators are fully consumed. This patch adds `close()` methods to `RecordReaderIterator` and `HadoopFileLinesReader` and modifies all six implementations of `FileFormat.buildReader()` to register TaskContext task-completion callbacks that guarantee cleanup is eventually performed. Tested manually for now.

    Author: Josh Rosen
    Closes #15271 from JoshRosen/SPARK-17666-backport.

commit 4d73d5cd82ebc980f996c78f9afb8a97418ab7ab
Author: hyukjinkwon
Date: 2016-09-28T10:19:04Z

    [MINOR][PYSPARK][DOCS] Fix examples in PySpark documentation

    T
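The defensive-equality pattern described in the SPARK-17618 commit above (refusing to compare rows of different formats instead of silently returning false) can be sketched in plain Scala. This is a minimal illustrative sketch: the `Row`/`BinaryRow` class names and fields are invented for the example and are not Spark's actual `UnsafeRow` implementation.

```scala
// A plain row format and a binary row format that must never be
// compared to each other via equals(). Illustrative only.
class Row(val values: Array[Int])

class BinaryRow(values: Array[Int]) extends Row(values) {
  override def equals(other: Any): Boolean = other match {
    // Same format: compare contents field by field.
    case that: BinaryRow => values.sameElements(that.values)
    // Different row format: this is a logic error in the caller,
    // not a legitimate "not equal" result, so fail loudly.
    case _: Row =>
      throw new IllegalArgumentException(
        s"Cannot compare BinaryRow to ${other.getClass.getName}")
    case _ => false
  }
  override def hashCode: Int = java.util.Arrays.hashCode(values)
}
```

The key design choice mirrors the commit message: silently returning `false` for a cross-format comparison had caused a correctness bug, so the guard turns that silent wrong answer into an immediate `IllegalArgumentException`.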
[GitHub] spark issue #17261: Branch 2.0
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17261 Can one of the admins verify this patch?
[GitHub] spark pull request #17261: Branch 2.0
GitHub user elviento opened a pull request: https://github.com/apache/spark/pull/17261 Branch 2.0

## What changes were proposed in this pull request?

Fixed a missing closing bracket at line 1704 of DataFrameSuite.scala in branch-2.0, found during ./dev/make-distribution.sh:

    [warn] Pruning sources from previous analysis, due to incompatible CompileSetup.
    [info] Compiling 174 Scala sources and 19 Java sources to /Users/wesleyfabella/Documents/projects/RLE-CloudTeam/spark/sql/core/target/scala-2.11/test-classes...
    [error] /Users/wesleyfabella/Documents/projects/RLE-CloudTeam/spark/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala:1704: Missing closing brace `}' assumed here
    [error] }
    [error] ^
    [error] one error found
    [error] Compile failed at Mar 11, 2017 2:36:12 PM [0.610s]

## How was this patch tested?

Tested: $SPARK_SRC/spark/sql/core/target/scala-2.11/test-classes...
Successful build: $SPARK_SRC/dev/make-distribution.sh --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pyarn

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/elviento/spark branch-2.0

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17261.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17261

commit f4594900d86bb39358ff19047dfa8c1e4b78aa6b
Author: Andrew Mills
Date: 2016-09-26T20:41:10Z

    [Docs] Update spark-standalone.md to fix link

    Corrected a link to the configuration.html page; it was pointing to a page that does not exist (configurations.html). Documentation change, verified in preview.

    Author: Andrew Mills
    Closes #15244 from ammills01/master.
    (cherry picked from commit 00be16df642317137f17d2d7d2887c41edac3680)
    Signed-off-by: Andrew Or

commit 98bbc4410181741d903a703eac289408cb5b2c5e
Author: Josh Rosen
Date: 2016-09-27T21:14:27Z

    [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

    This patch ports changes from #15185 to Spark 2.x. That patch fixed a correctness bug in Spark 1.6.x caused by an invalid `equals()` comparison between an `UnsafeRow` and another row of a different format. Spark 2.x is not affected by that specific correctness bug, but it can still reap the error-prevention benefits of that patch's changes, which modify `UnsafeRow.equals()` to throw an IllegalArgumentException if it is called with an object that is not an `UnsafeRow`.

    Author: Josh Rosen
    Closes #15265 from JoshRosen/SPARK-17618-master.
    (cherry picked from commit 2f84a686604b298537bfd4d087b41594d2aa7ec6)
    Signed-off-by: Josh Rosen

commit 2cd327ef5e4c3f6b8468ebb2352479a1686b7888
Author: Liang-Chi Hsieh
Date: 2016-09-27T23:00:39Z

    [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

    There is an assert in MemoryStore's putIteratorAsValues method which is used to check that unroll memory is not released too much. This assert looks wrong. Tested with Jenkins.

    Author: Liang-Chi Hsieh
    Closes #14642 from viirya/fix-unroll-memory.
    (cherry picked from commit e7bce9e1876de6ee975ccc89351db58119674aef)
    Signed-off-by: Josh Rosen

commit 1b02f8820ddaf3f2a0e7acc9a7f27afc20683cca
Author: Josh Rosen
Date: 2016-09-28T07:59:00Z

    [SPARK-17666] Ensure that RecordReaders are closed by data source file scans (backport)

    This is a branch-2.0 backport of #15245. It addresses a potential cause of resource leaks in data source file scans. As reported in [SPARK-17666](https://issues.apache.org/jira/browse/SPARK-17666), tasks which do not fully consume their input may leak file handles or network connections (e.g. S3 connections). Spark's `NewHadoopRDD` uses a TaskContext callback to [close its record readers](https://github.com/apache/spark/blame/master/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala#L208), but the new data source file scans only close record readers once their iterators are fully consumed. This patch adds `close()` methods to `RecordReaderIterator` and `HadoopFileLinesReader` and modifies all six implementations of `FileFormat.buildReader()` to register TaskContext task-completion callbacks that guarantee cleanup is eventually performed. Tested manually for now.

    Author: Josh Rosen
    Closes #
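The cleanup pattern described in the SPARK-17666 commit above (an iterator that owns a resource exposes `close()`, and the caller registers it as a task-completion callback so the resource is released even if the iterator is never fully consumed) can be sketched in self-contained Scala. The class names here (`ClosingIterator`, `TaskScope`) are invented for illustration; Spark's actual classes are `RecordReaderIterator` and `TaskContext`.

```scala
import java.io.Closeable
import scala.collection.mutable.ArrayBuffer

// An iterator that wraps a resource and knows how to release it.
class ClosingIterator[T](underlying: Iterator[T], resource: Closeable)
    extends Iterator[T] with Closeable {
  private var closed = false
  def hasNext: Boolean = !closed && underlying.hasNext
  def next(): T = underlying.next()
  // Idempotent close, safe to call from a completion callback.
  def close(): Unit = if (!closed) { closed = true; resource.close() }
}

// A minimal stand-in for TaskContext's completion-callback registry:
// callbacks run when the task finishes, whether or not the iterator
// was drained.
class TaskScope {
  private val callbacks = ArrayBuffer.empty[() => Unit]
  def addCompletionCallback(f: () => Unit): Unit = callbacks += f
  def complete(): Unit = callbacks.foreach(_.apply())
}
```

The point of the design is that relying on "close when the iterator is exhausted" leaks the resource whenever a task stops early (e.g. a `limit` query), whereas the completion callback fires unconditionally.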
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16330 **[Test build #74391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74391/testReport)** for PR 16330 at commit [`e5b69ca`](https://github.com/apache/spark/commit/e5b69ca67230525c5819c52b581023475a7d7e5c).
[GitHub] spark issue #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end testing usi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17260 **[Test build #74389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74389/testReport)** for PR 17260 at commit [`e0887d0`](https://github.com/apache/spark/commit/e0887d0568eb04392801f0d44901b4bb1d555cf6).
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17171 **[Test build #74390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74390/testReport)** for PR 17171 at commit [`22b7db8`](https://github.com/apache/spark/commit/22b7db8bc013d5dcd23c3ef0f45483c47ea66b98).
[GitHub] spark issue #16373: [SPARK-18961][SQL] Support `SHOW TABLE EXTENDED ... PART...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16373 Will review this today. :)
[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17171 retest this please
[GitHub] spark pull request #17260: [SPARK-19921] [SQL] [TEST] Enable end-to-end test...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/17260 [SPARK-19921] [SQL] [TEST] Enable end-to-end testing using different Hive metastore versions.

### What changes were proposed in this pull request?

To improve the quality of Spark SQL across different Hive metastore versions, this PR enables end-to-end testing using different versions. It allows the test cases in sql/hive to pass the existing Hive client to create a SparkSession.
- Since Derby does not allow concurrent connections, the pre-built Hive clients use a different database from the TestHive built-in 1.2.1 client.
- Since our test cases in sql/hive can only create a single Spark context in the same JVM, the newly created SparkSession shares the same Spark context with the existing TestHive SparkSession.

### How was this patch tested?

Fixed the existing test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark versionSuite

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17260.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17260

commit e0887d0568eb04392801f0d44901b4bb1d555cf6
Author: Xiao Li
Date: 2017-03-11T19:04:07Z

    fix.
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output to te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16330 **[Test build #74388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74388/testReport)** for PR 16330 at commit [`8062ee1`](https://github.com/apache/spark/commit/8062ee1e953b2d4393a983c20ed80ab29d8aeffc).
[GitHub] spark pull request #17259: Branch 2.0
Github user elviento closed the pull request at: https://github.com/apache/spark/pull/17259
[GitHub] spark pull request #17259: Branch 2.0
GitHub user elviento opened a pull request: https://github.com/apache/spark/pull/17259 Branch 2.0

## What changes were proposed in this pull request?

Missing closing bracket '}' at line 1704, found during the mvn build:

    diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
    index 6a9279f..3967d07 100644
    --- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
    +++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
    @@ -1701,4 +1701,5 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
         assert(e3.message.contains(
           "Cannot have map type columns in DataFrame which calls set operations"))
       }
    +  }
     }

## How was this patch tested?

Cloned the branch, applied the above fix, then successfully compiled using ./dev/make-distribution.sh.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/spark branch-2.0

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17259.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17259

commit 8a58f2e8ec413591ec00da1e37b91b1bf49e4d1d
Author: Sameer Agarwal
Date: 2016-09-26T20:21:08Z

    [SPARK-17652] Fix confusing exception message while reserving capacity

    This minor patch fixes a confusing exception message while reserving additional capacity in the vectorized parquet reader. Tested with existing unit tests.

    Author: Sameer Agarwal
    Closes #15225 from sameeragarwal/error-msg.
    (cherry picked from commit 7c7586aef9243081d02ea5065435234b5950ab66)
    Signed-off-by: Yin Huai

commit f4594900d86bb39358ff19047dfa8c1e4b78aa6b
Author: Andrew Mills
Date: 2016-09-26T20:41:10Z

    [Docs] Update spark-standalone.md to fix link

    Corrected a link to the configuration.html page; it was pointing to a page that does not exist (configurations.html). Documentation change, verified in preview.

    Author: Andrew Mills
    Closes #15244 from ammills01/master.
    (cherry picked from commit 00be16df642317137f17d2d7d2887c41edac3680)
    Signed-off-by: Andrew Or

commit 98bbc4410181741d903a703eac289408cb5b2c5e
Author: Josh Rosen
Date: 2016-09-27T21:14:27Z

    [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

    This patch ports changes from #15185 to Spark 2.x. That patch fixed a correctness bug in Spark 1.6.x caused by an invalid `equals()` comparison between an `UnsafeRow` and another row of a different format. Spark 2.x is not affected by that specific correctness bug, but it can still reap the error-prevention benefits of that patch's changes, which modify `UnsafeRow.equals()` to throw an IllegalArgumentException if it is called with an object that is not an `UnsafeRow`.

    Author: Josh Rosen
    Closes #15265 from JoshRosen/SPARK-17618-master.
    (cherry picked from commit 2f84a686604b298537bfd4d087b41594d2aa7ec6)
    Signed-off-by: Josh Rosen

commit 2cd327ef5e4c3f6b8468ebb2352479a1686b7888
Author: Liang-Chi Hsieh
Date: 2016-09-27T23:00:39Z

    [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

    There is an assert in MemoryStore's putIteratorAsValues method which is used to check that unroll memory is not released too much. This assert looks wrong. Tested with Jenkins.

    Author: Liang-Chi Hsieh
    Closes #14642 from viirya/fix-unroll-memory.
    (cherry picked from commit e7bce9e1876de6ee975ccc89351db58119674aef)
    Signed-off-by: Josh Rosen

commit 1b02f8820ddaf3f2a0e7acc9a7f27afc20683cca
Author: Josh Rosen
Date: 2016-09-28T07:59:00Z

    [SPARK-17666] Ensure that RecordReaders are closed by data source file scans (backport)

    This is a branch-2.0 backport of #15245. It addresses a potential cause of resource leaks in data source file scans. As reported in [SPARK-17666](https://issues.apache.org/jira/browse/SPARK-17666), tasks which do not fully consume their input may cause file handles / network connections
[GitHub] spark issue #16006: [SPARK-18580] [DStreams] [external/kafka-0-10] Use spark...
Github user omuravskiy commented on the issue: https://github.com/apache/spark/pull/16006 Yes
[GitHub] spark issue #17258: [SPARK-19807][Web UI]Add reason for cancellation when a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17258 Can one of the admins verify this patch?
[GitHub] spark pull request #17258: [SPARK-19807][Web UI]Add reason for cancellation ...
GitHub user shaolinliu opened a pull request: https://github.com/apache/spark/pull/17258 [SPARK-19807][Web UI] Add reason for cancellation when a stage is killed using web UI

## What changes were proposed in this pull request?

When a user kills a stage using the web UI (on the Stages page), StagesTab.handleKillRequest asks SparkContext to cancel the stage without giving a reason. SparkContext has cancelStage(stageId: Int, reason: String), which Spark could use to pass this information along for monitoring/debugging purposes.

## How was this patch tested?

Manual tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shaolinliu/spark SPARK-19807

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17258.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17258

commit f43d1d689800d4d6fabdd7c3c4a85065f93bc34c
Author: lvdongr
Date: 2017-03-11T11:37:00Z

    [SPARK-19807][Web UI] Add reason for cancellation when a stage is killed using web UI
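The change this PR proposes (passing an explicit, human-readable reason to `cancelStage` instead of cancelling without one) can be sketched as follows. `cancelStage(stageId: Int, reason: String)` is the SparkContext method named in the PR description; the `StubContext` class and the handler shape below are invented for illustration and are not Spark's actual StagesTab code.

```scala
// Stand-in for SparkContext that records the last cancellation request,
// so the reason-plumbing can be demonstrated without a running cluster.
class StubContext {
  var lastCancellation: Option[(Int, String)] = None
  def cancelStage(stageId: Int, reason: String): Unit =
    lastCancellation = Some((stageId, reason))
}

// Illustrative kill-request handler: instead of cancelling silently,
// it supplies a reason that then shows up in monitoring/debugging output.
def handleKillRequest(sc: StubContext, stageId: Int): Unit =
  sc.cancelStage(stageId, s"Stage $stageId was killed from the Web UI")
```

The benefit is purely observational: downstream listeners and the UI can now display why a stage ended, rather than a bare cancellation.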
[GitHub] spark issue #17242: [SPARK-19902][SQL] Support more expression canonicalizat...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17242 @cloud-fan I would like to defer the optimization part to another PR, if possible.
[GitHub] spark issue #17257: [DOCS][SS] fix structured streaming python example
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17257 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74387/ Test PASSed.
[GitHub] spark issue #17257: [DOCS][SS] fix structured streaming python example
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17257 Merged build finished. Test PASSed.
[GitHub] spark issue #17257: [DOCS][SS] fix structured streaming python example
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17257 **[Test build #74387 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74387/testReport)** for PR 17257 at commit [`cd82690`](https://github.com/apache/spark/commit/cd8269022e3066d07c6fc480305ccef89efd0993).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Merged build finished. Test PASSed.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74385/ Test PASSed.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17251 **[Test build #74385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74385/testReport)** for PR 17251 at commit [`0cd5d88`](https://github.com/apache/spark/commit/0cd5d88609a2e36459498a86caec5046d9ebe2b1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17254: [SPARK-19917][SQL] qualified partition path stored in cat...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/17254 cc @cloud-fan @gatorsmile
[GitHub] spark issue #17254: [SPARK-19917][SQL] qualified partition path stored in cat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17254 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74384/
[GitHub] spark issue #17254: [SPARK-19917][SQL] qualified partition path stored in cat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17254 Merged build finished. Test PASSed.
[GitHub] spark issue #17254: [SPARK-19917][SQL] qualified partition path stored in cat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17254 **[Test build #74384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74384/testReport)** for PR 17254 at commit [`36a3463`](https://github.com/apache/spark/commit/36a34632dbb000799c35727c00d1542d4bb1ce00). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17256 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74386/
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17256 Merged build finished. Test FAILed.
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17256 **[Test build #74386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74386/testReport)** for PR 17256 at commit [`9d91da1`](https://github.com/apache/spark/commit/9d91da124e0723adee7744a64999ea1c07acfe66). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/15435 Done. cc @sethah @jkbradley thanks!
[GitHub] spark issue #17257: [DOCS][SS] fix structured streaming python example
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17257 **[Test build #74387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74387/testReport)** for PR 17257 at commit [`cd82690`](https://github.com/apache/spark/commit/cd8269022e3066d07c6fc480305ccef89efd0993).
[GitHub] spark pull request #17257: [DOCS][SS] fix structured streaming python exampl...
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17257 [DOCS][SS] fix structured streaming python example

## What changes were proposed in this pull request?

- SS python example: `TypeError: 'xxx' object is not callable`
- some other doc issues.

## How was this patch tested?

Jenkins.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark docs-ss-python Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17257.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17257

commit cd8269022e3066d07c6fc480305ccef89efd0993 Author: uncleGen Date: 2017-03-11T09:27:40Z fix structured streaming python example code
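The PR summary above does not show the failing snippet, but a common cause of `TypeError: 'xxx' object is not callable` in PySpark example code is invoking a property (such as `DataFrame.isStreaming`) as if it were a method. A minimal, self-contained sketch of that mistake is below; `FakeDataFrame` is a hypothetical stand-in, not PySpark code:

```python
# Illustrative only: PySpark's DataFrame.isStreaming is a property, not a
# method. Calling its result as a function raises the "not callable"
# TypeError mentioned in the PR description. Simulated with a stand-in class.

class FakeDataFrame:
    @property
    def isStreaming(self):
        return True

df = FakeDataFrame()

# Correct: access as a property.
assert df.isStreaming is True

# Incorrect: df.isStreaming() calls the returned bool, which raises
# TypeError (e.g. "'bool' object is not callable").
try:
    df.isStreaming()
    error_message = ""
except TypeError as e:
    error_message = str(e)

print(error_message)
```

This kind of property-vs-method confusion is a frequent source of doc-example breakage when an API exposes read-only state as a Python property.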
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17256 **[Test build #74386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74386/testReport)** for PR 17256 at commit [`9d91da1`](https://github.com/apache/spark/commit/9d91da124e0723adee7744a64999ea1c07acfe66).
[GitHub] spark issue #17255: [SPARK-19918][SQL] Use TextFileFormat in implementation o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17255 cc @cloud-fan, @joshrosen and @NathanHowell could you take a look and see if it makes sense when you have some time?
[GitHub] spark issue #17256: [SPARK-19919][SQL] Defer throwing the exception for empt...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17256 retest this please