[GitHub] spark pull request #16684: [SPARK-16101][HOTFIX] Fix the build with Scala 2....
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16684 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16308: [SPARK-18936][SQL] Infrastructure for session loc...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/16308#discussion_r97490521

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala ---
@@ -475,6 +1164,45 @@ class DateFunctionsSuite extends QueryTest with SharedSQLContext {
       Row(ts1.getTime / 1000L), Row(ts2.getTime / 1000L)))
   }
+  test("to_unix_timestamp with session local timezone") {
--- End diff --

I agree that there are so many similar tests, but I have no idea how to generalize them. Would you please give me some code snippets? I'll be able to expand them.
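The generalization being asked about is usually a table-driven test: one helper plus a list of (input, format, timezone, expected) rows. A minimal sketch of the idea in plain Java (hypothetical names; the real suite would use ScalaTest and Spark's `to_unix_timestamp`):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class UnixTimestampSketch {
    // Parse a timestamp string in the given "session" timezone and return
    // epoch seconds, mirroring to_unix_timestamp's observable behavior.
    static long toUnixTimestamp(String s, String fmt, String tz) {
        SimpleDateFormat sdf = new SimpleDateFormat(fmt);
        sdf.setTimeZone(TimeZone.getTimeZone(tz));
        try {
            return sdf.parse(s).getTime() / 1000L;
        } catch (ParseException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // One row per former standalone test: input, format, timezone, expected seconds.
        Object[][] cases = {
            {"2015-01-01 00:00:00", "yyyy-MM-dd HH:mm:ss", "UTC", 1420070400L},
            {"2015-01-01 00:00:00", "yyyy-MM-dd HH:mm:ss", "America/Los_Angeles", 1420099200L},
        };
        for (Object[] c : cases) {
            long got = toUnixTimestamp((String) c[0], (String) c[1], (String) c[2]);
            if (got != (Long) c[3]) {
                throw new AssertionError(c[0] + " in " + c[2] + ": got " + got);
            }
        }
        System.out.println("all cases passed");
    }
}
```

The same shape transfers directly to ScalaTest: a `Seq` of tuples and one loop can replace N copy-pasted test bodies.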
[GitHub] spark pull request #16138: [SPARK-16609] Add to_date/to_timestamp with forma...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16138#discussion_r97490239

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ---
@@ -1047,6 +1048,64 @@ case class ToDate(child: Expression) extends UnaryExpression with ImplicitCastIn
 }

 /**
+ * Parses a column to a date based on the given format.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(date_str, fmt) - Parses the `left` expression with the `fmt` expression. Returns null with invalid input.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('2016-12-31', 'yyyy-MM-dd');
+       2016-12-31
+  """)
+// scalastyle:on line.size.limit
+case class ParseToDate(left: Expression, format: Expression, child: Expression)
+  extends RuntimeReplaceable {
+
+  def this(left: Expression, format: Expression) = {
+    this(left, format, Cast(Cast(new UnixTimestamp(left, format), TimestampType), DateType))
+  }
+
+  def this(left: Expression) = {
+    // RuntimeReplaceable forces the signature, the second value
+    // is ignored completely
+    this(left, Literal(""), ToDate(left))
+  }
+
+  override def flatArguments: Iterator[Any] = Iterator(left, format)
+  override def sql: String = s"$prettyName(${left.sql}, ${format.sql})"
+
+  override def prettyName: String = "to_date"
+  override def dataType: DataType = DateType
--- End diff --

this is already defined in `RuntimeReplaceable`
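For readers following along: `to_date(date_str, fmt)` parses the string with the given pattern and returns null for unparseable input. A rough Java sketch of that contract (a hypothetical helper, not Spark's implementation, which goes through `UnixTimestamp` and `Cast` as the diff shows):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class ToDateSketch {
    // Returns the "yyyy-MM-dd" rendering of the parsed input, or null when the
    // input does not match the pattern, mirroring to_date's null-on-invalid rule.
    static String toDate(String s, String fmt) {
        SimpleDateFormat in = new SimpleDateFormat(fmt);
        in.setLenient(false);
        try {
            return new SimpleDateFormat("yyyy-MM-dd").format(in.parse(s));
        } catch (ParseException e) {
            return null; // invalid input -> null, like the SQL function
        }
    }

    public static void main(String[] args) {
        System.out.println(toDate("2016-12-31", "yyyy-MM-dd")); // 2016-12-31
        System.out.println(toDate("not-a-date", "yyyy-MM-dd")); // null
    }
}
```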
[GitHub] spark issue #16687: [SPARK-19343][DStreams] Do once optimistic checkpoint be...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16687 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71914/
[GitHub] spark issue #16687: [SPARK-19343][DStreams] Do once optimistic checkpoint be...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16687

**[Test build #71914 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71914/testReport)** for PR 16687 at commit [`a63306e`](https://github.com/apache/spark/commit/a63306e53c19b0db6574260c9716c6a76cf223e0).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16687: [SPARK-19343][DStreams] Do once optimistic checkpoint be...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16687 Merged build finished. Test PASSed.
[GitHub] spark pull request #16138: [SPARK-16609] Add to_date/to_timestamp with forma...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16138#discussion_r97489950

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ---
@@ -1047,6 +1048,64 @@ case class ToDate(child: Expression) extends UnaryExpression with ImplicitCastIn
 }

 /**
+ * Parses a column to a date based on the given format.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(date_str, fmt) - Parses the `left` expression with the `fmt` expression. Returns null with invalid input.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('2016-12-31', 'yyyy-MM-dd');
+       2016-12-31
+  """)
+// scalastyle:on line.size.limit
+case class ParseToDate(left: Expression, format: Expression, child: Expression)
--- End diff --

we don't need to put `child` in the constructor, but can simply add a `def child` in the class.
[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16552 Merged build finished. Test PASSed.
[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16552 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71910/
[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16552

**[Test build #71910 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71910/testReport)** for PR 16552 at commit [`f34ab6d`](https://github.com/apache/spark/commit/f34ab6dab0bb7ce80d362c0c248bc2c735aeb60b).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16138: [SPARK-16609] Add to_date/to_timestamp with format funct...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16138 Merged build finished. Test FAILed.
[GitHub] spark issue #16138: [SPARK-16609] Add to_date/to_timestamp with format funct...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16138 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71913/
[GitHub] spark issue #16138: [SPARK-16609] Add to_date/to_timestamp with format funct...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16138

**[Test build #71913 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71913/testReport)** for PR 16138 at commit [`8fa4bfb`](https://github.com/apache/spark/commit/8fa4bfbb72c5c1de214b4a35ef3ed4585e33cf3a).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16594 **[Test build #71921 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71921/testReport)** for PR 16594 at commit [`bd45854`](https://github.com/apache/spark/commit/bd4585442209334e17b50efd2fdc88328ab78c7e).
[GitHub] spark issue #16269: [SPARK-19080][SQL] simplify data source analysis
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16269 Merged build finished. Test PASSed.
[GitHub] spark issue #16269: [SPARK-19080][SQL] simplify data source analysis
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16269 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71908/
[GitHub] spark issue #16269: [SPARK-19080][SQL] simplify data source analysis
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16269

**[Test build #71908 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71908/testReport)** for PR 16269 at commit [`4b68c16`](https://github.com/apache/spark/commit/4b68c168b0e16071b91c93fc7f2be8fabda46fbe).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16654

Existing metrics (WSSSE, log-likelihood) depend on the details of the algorithm. The computation of WSSSE for KMeans/BisectingKMeans uses the average vectors as the centers, but for KMedoids the medoids, rather than the averages, should be used. If we used the same logic as KMeans to compute the WSSSE for KMedoids, I think it would be a mistake. I also found that some supervised algorithms support an evaluate method in their models: LiR, LoR, GLR.
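A toy illustration of the point (assumed 1-D data, not Spark code): the mean minimizes squared error over all real numbers, while a medoid must be an actual cluster member, so reusing the KMeans center logic for KMedoids reports a cost that no medoid can actually achieve.

```java
public class WssseSketch {
    // Sum of squared distances from each point to the given center.
    static double wssse(double[] points, double center) {
        double s = 0;
        for (double p : points) {
            double d = p - center;
            s += d * d;
        }
        return s;
    }

    public static void main(String[] args) {
        double[] cluster = {0, 1, 5};          // toy 1-D cluster (assumed data)

        // KMeans-style center: the mean of the cluster.
        double mean = (0 + 1 + 5) / 3.0;       // = 2.0
        double byMean = wssse(cluster, mean);  // 4 + 1 + 9 = 14

        // KMedoids-style center: the cluster member minimizing the cost.
        double best = Double.MAX_VALUE;
        for (double c : cluster) {
            best = Math.min(best, wssse(cluster, c));
        }
        // best member is 1, with cost 1 + 0 + 16 = 17 > 14
        System.out.println(byMean + " " + best); // 14.0 17.0
    }
}
```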
[GitHub] spark issue #16553: [SPARK-9435][SQL] Reuse function in Java UDF to correctl...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16553 Thank you @gatorsmile
[GitHub] spark issue #16658: [DOCS] Fix typo in docs
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16658 Merged build finished. Test PASSed.
[GitHub] spark issue #16658: [DOCS] Fix typo in docs
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16658 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71915/
[GitHub] spark issue #16658: [DOCS] Fix typo in docs
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16658

**[Test build #71915 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71915/testReport)** for PR 16658 at commit [`9e1e32a`](https://github.com/apache/spark/commit/9e1e32ab2821503db5236d3c13c9904ad6a641a9).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #16687: [SPARK-19343][DStreams] Do once optimistic checkp...
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/16687#discussion_r97483845

--- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala ---
@@ -146,6 +147,11 @@ class JobGenerator(jobScheduler: JobScheduler) extends Logging {
       while (!hasTimedOut && !haveAllBatchesBeenProcessed) {
         Thread.sleep(pollTime)
       }
+      if (shouldCheckpoint
+          && !(lastProcessedBatch - graph.zeroTime).isMultipleOf(ssc.checkpointDuration)) {
+        ssc.graph.updateCheckpointData(lastProcessedBatch)
+        checkpointWriter.write(new Checkpoint(ssc, lastProcessedBatch), false)
+      }
--- End diff --

Do one more checkpoint before stop.
[GitHub] spark issue #16689: SPARK-19342 bug fixed in collect method for collecting t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16689 Can one of the admins verify this patch?
[GitHub] spark pull request #16687: [SPARK-19343][DStreams] Do once optimistic checkp...
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/16687#discussion_r97483687

--- Diff: streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala ---
@@ -837,6 +839,29 @@ class StreamingContextSuite extends SparkFunSuite with BeforeAndAfter with Timeo
     assert(latch.await(60, TimeUnit.SECONDS))
   }
+  test("SPARK-19343 Do once optimistic checkpoint before stop") {
+    val testDirectory = Utils.createTempDir().getAbsolutePath()
+    val checkpointDirectory = Utils.createTempDir().getAbsolutePath()
+    ssc = new StreamingContext(conf.clone.set("someKey", "someValue"), batchDuration)
+    ssc.checkpoint(checkpointDirectory)
+    val stream = ssc.textFileStream(testDirectory).checkpoint(batchDuration * 11)
+    stream.foreachRDD { rdd => rdd.count() }
+    ssc.start()
+    try {
+      Thread.sleep(batchDuration.milliseconds * 13)
+      ssc.stop(true, true)
--- End diff --

Sleep for 13 batch durations, so there should be only one checkpoint before this PR.
[GitHub] spark pull request #16689: SPARK-19342 bug fixed in collect method for colle...
GitHub user titicaca opened a pull request: https://github.com/apache/spark/pull/16689

SPARK-19342 bug fixed in collect method for collecting timestamp column

## What changes were proposed in this pull request?

Fix a bug in the collect method for collecting a timestamp column. The bug can be reproduced with the following code and outputs:

```
library(SparkR)
sparkR.session(master = "local")

df <- data.frame(col1 = c(0, 1, 2),
                 col2 = c(as.POSIXct("2017-01-01 00:00:01"), NA, as.POSIXct("2017-01-01 12:00:01")))

sdf1 <- createDataFrame(df)
print(dtypes(sdf1))
df1 <- collect(sdf1)
print(lapply(df1, class))

sdf2 <- filter(sdf1, "col1 > 0")
print(dtypes(sdf2))
df2 <- collect(sdf2)
print(lapply(df2, class))
```

As we can see from the printed output, the column type of col2 in df2 is unexpectedly converted to numeric when NA exists at the top of the column. This is caused by the method `do.call(c, list)`: if we combine a list, i.e. `do.call(c, list(NA, as.POSIXct("2017-01-01 12:00:01")))`, the class of the result is numeric instead of POSIXct. Therefore, we need to cast the data type of the vector explicitly.

## How was this patch tested?

The patch can be tested manually with the same code above.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/titicaca/spark sparkr-dev

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16689.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16689

commit a51c2eb54ca672ad63495d0709bd3ae7b254bd14
Author: titicaca
Date: 2017-01-24T06:24:47Z

    SPARK-19342 bug fixed in collect method for collecting timestamp column
[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16552 **[Test build #71920 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71920/testReport)** for PR 16552 at commit [`7bf5b50`](https://github.com/apache/spark/commit/7bf5b50c5cfba1ecb02b95c2fa9bb1ae7830ca99).
[GitHub] spark issue #16688: [TESTS][SQL] Setup testdata at the beginning for tests t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16688 **[Test build #71919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71919/testReport)** for PR 16688 at commit [`b71120d`](https://github.com/apache/spark/commit/b71120d562b28c94b8a1b0689b3c2fac11d84a37).
[GitHub] spark pull request #16688: [TESTS][SQL] Setup testdata at the beginning for ...
GitHub user dilipbiswal opened a pull request: https://github.com/apache/spark/pull/16688

[TESTS][SQL] Setup testdata at the beginning for tests to run independently

## What changes were proposed in this pull request?

In CachedTableSuite, we are not setting up the test data at the beginning. Some tests fail when run individually; when running the entire suite they run fine. Here are some of the tests that fail:

- test("SELECT star from cached table")
- test("Self-join cached")

As part of this, simplified a couple of tests by calling a support method to count the number of InMemoryRelations.

## How was this patch tested?

Ran the failing tests individually.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dilipbiswal/spark cachetablesuite_simple

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16688.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16688

commit b71120d562b28c94b8a1b0689b3c2fac11d84a37
Author: Dilip Biswal
Date: 2017-01-24T06:34:11Z

    Setup testdata at the beginning for tests to run independently
[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16552 **[Test build #71918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71918/testReport)** for PR 16552 at commit [`7bf5b50`](https://github.com/apache/spark/commit/7bf5b50c5cfba1ecb02b95c2fa9bb1ae7830ca99).
[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16552 retest this please
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15945 **[Test build #71917 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71917/testReport)** for PR 15945 at commit [`bea519f`](https://github.com/apache/spark/commit/bea519f2ba12312ec96884c3545f74b3bc28c4a2).
[GitHub] spark issue #16658: [DOCS] Fix typo in docs
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16658 **[Test build #71915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71915/testReport)** for PR 16658 at commit [`9e1e32a`](https://github.com/apache/spark/commit/9e1e32ab2821503db5236d3c13c9904ad6a641a9).
[GitHub] spark pull request #16594: [SPARK-17078] [SQL] Show stats when explain
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16594#discussion_r97482084 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -54,11 +56,32 @@ case class Statistics( /** Readable string representation for the Statistics. */ def simpleString: String = { -Seq(s"sizeInBytes=$sizeInBytes", - if (rowCount.isDefined) s"rowCount=${rowCount.get}" else "", +Seq(s"sizeInBytes=${format(sizeInBytes, isSize = true)}", + if (rowCount.isDefined) s"rowCount=${format(rowCount.get, isSize = false)}" else "", s"isBroadcastable=$isBroadcastable" ).filter(_.nonEmpty).mkString(", ") } + + /** Print the given number in a readable format. */ + def format(number: BigInt, isSize: Boolean): String = { --- End diff -- I'll try to use that method in combination with the current logic, thanks for the reminder.
[GitHub] spark pull request #16308: [SPARK-18936][SQL] Infrastructure for session loc...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/16308#discussion_r97482071
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala ---
@@ -103,6 +153,51 @@ class DateFunctionsSuite extends QueryTest with SharedSQLContext {
     Row("2015", "2015", "2013"))
   }

+  test("date format with session local timezone") {
+    val df = Seq((d, sdf.format(d), ts)).toDF("a", "b", "c")
+
+    // The child of date_format is implicitly casted to TimestampType with session local timezone.
+    //
+    // +---+-----------------------+---------------+-----------------------+
+    // |   | df                    | timestamp     | date_format           |
+    // +---+-----------------------+---------------+-----------------------+
+    // | a | 16533                 | 1428476400000 | "2015-04-08 00:00:00" |
+    // | b | "2015-04-08 13:10:15" | 1428523815000 | "2015-04-08 13:10:15" |
--- End diff --
Do you mean you are wondering why `sdf.format(d)` has the time info `13:10:15`? If so, `java.sql.Date` DOES have the time info if it was initialized with the constructor `Date(long date)`; even if it was initialized with the constructor `Date(int year, int month, int day)` or with `Date.valueOf(String s)`, it has the time info `00:00:00` of the day in the timezone `TimeZone.getDefault()`.
```scala
scala> TimeZone.setDefault(TimeZone.getTimeZone("GMT"))

scala> val gmtDate = Date.valueOf("2017-01-24")
gmtDate: java.sql.Date = 2017-01-24

scala> val gmtTime = gmtDate.getTime
gmtTime: Long = 1485216000000

scala> TimeZone.setDefault(TimeZone.getTimeZone("PST"))

scala> val pstDate = Date.valueOf("2017-01-24")
pstDate: java.sql.Date = 2017-01-24

scala> val pstTime = pstDate.getTime
pstTime: Long = 1485244800000

scala> val sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
sdf: java.text.SimpleDateFormat = java.text.SimpleDateFormat@4f76f1a0

scala> sdf.setTimeZone(TimeZone.getTimeZone("GMT"))

scala> sdf.format(gmtTime)
res12: String = 2017-01-24 00:00:00

scala> sdf.format(pstTime)
res13: String = 2017-01-24 08:00:00

scala> val d = new Date(sdf.parse("2015-04-08 13:10:15").getTime)
d: java.sql.Date = 2015-04-08

scala> sdf.format(d)
res14: String = 2015-04-08 13:10:15
```
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare atomic and string type column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15880 **[Test build #71916 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71916/testReport)** for PR 15880 at commit [`a11f89b`](https://github.com/apache/spark/commit/a11f89bf5ed13b4061a29daf007a608314465a94).
[GitHub] spark pull request #16683: [SPARK-19268][SS]Disallow adaptive query executio...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16683
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16171 cc @yanboliang @sethah @jkbradley
[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16672 It seems to me that following Hive is safer. Any other ideas?
[GitHub] spark issue #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema order a...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16606 LGTM, pending tests
[GitHub] spark issue #16683: [SPARK-19268][SS]Disallow adaptive query execution for s...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16683 Thanks. Merging to master and 2.1.
[GitHub] spark issue #16687: [SPARK-19343][DStreams] Do once optimistic checkpoint be...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16687 **[Test build #71914 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71914/testReport)** for PR 16687 at commit [`a63306e`](https://github.com/apache/spark/commit/a63306e53c19b0db6574260c9716c6a76cf223e0).
[GitHub] spark pull request #16594: [SPARK-17078] [SQL] Show stats when explain
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16594#discussion_r97481455 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -54,11 +56,32 @@ case class Statistics( /** Readable string representation for the Statistics. */ def simpleString: String = { -Seq(s"sizeInBytes=$sizeInBytes", - if (rowCount.isDefined) s"rowCount=${rowCount.get}" else "", +Seq(s"sizeInBytes=${format(sizeInBytes, isSize = true)}", + if (rowCount.isDefined) s"rowCount=${format(rowCount.get, isSize = false)}" else "", s"isBroadcastable=$isBroadcastable" ).filter(_.nonEmpty).mkString(", ") } + + /** Print the given number in a readable format. */ + def format(number: BigInt, isSize: Boolean): String = { --- End diff -- That method only accepts a Long parameter, and estimated stats can still be unreadable even when using TB as the unit.
[GitHub] spark pull request #16687: [SPARK-19343][DStreams] Do once optimistic checkp...
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/16687

[SPARK-19343][DStreams] Do once optimistic checkpoint before stop

## What changes were proposed in this pull request?

When a streaming job restarts from a checkpoint, it has to rebuild several batches until it finds the latest checkpointed RDD. We can do one optimistic checkpoint just before stop, reducing unnecessary recomputation.

## How was this patch tested?

Added a new unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uncleGen/spark SPARK-19343

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16687.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16687

commit a63306e53c19b0db6574260c9716c6a76cf223e0
Author: uncleGen
Date: 2017-01-24T06:24:08Z

    SPARK-19343: Do once optimistic checkpoint before stop
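The recomputation the description refers to can be illustrated with a toy model (plain Scala, not Spark code; all names here are illustrative): on restart, the driver replays every batch generated after the most recent checkpoint, so one extra checkpoint taken at stop time drives the replayed count to roughly zero.

```scala
// Toy model of restart cost: the number of batches to rebuild is the time
// elapsed between the last checkpoint and the stop, divided by the batch
// interval. These are hypothetical helpers, not Spark Streaming APIs.
def batchesToRebuild(stopTimeMs: Long, lastCheckpointMs: Long, batchIntervalMs: Long): Long =
  (stopTimeMs - lastCheckpointMs) / batchIntervalMs

// Periodic checkpoint every 10s with 1s batches, stopped 9s after the last
// checkpoint: 9 batches must be recomputed on restart.
val withoutFinalCheckpoint = batchesToRebuild(19000L, 10000L, 1000L)

// With one more checkpoint taken right at stop time, nothing is replayed.
val withFinalCheckpoint = batchesToRebuild(19000L, 19000L, 1000L)
```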
[GitHub] spark issue #14038: [SPARK-16317][SQL] Add a new interface to filter files i...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/14038 @liancheng ping
[GitHub] spark pull request #16553: [SPARK-9435][SQL] Reuse function in Java UDF to c...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16553
[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/16594 @gatorsmile I just did a quick fix to show what the improved stats look like. If @rxin @hvanhovell accept the change proposed in this PR, I'll update to remove the flag :)
[GitHub] spark pull request #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable sup...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16552#discussion_r97481030 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -1461,6 +1461,25 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton { }) } + test("run sql directly on files - hive") { +withTable("t") { --- End diff -- you don't need to create a table
```
withTempPath { path =>
  spark.range(100).toDF.write.parquet(path.getAbsolutePath)
  ...
  sql(s"select id from hive.`${path.getAbsolutePath}`")
}
```
[GitHub] spark issue #16553: [SPARK-9435][SQL] Reuse function in Java UDF to correctl...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16553 Thanks! Merging to master.
[GitHub] spark pull request #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable sup...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16552#discussion_r97480933 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala --- @@ -65,6 +65,10 @@ case class CreateTempViewUsing( } def run(sparkSession: SparkSession): Seq[Row] = { +if (provider.toLowerCase == DDLUtils.HIVE_PROVIDER) { + throw new AnalysisException("Currently Hive data source can not be created as a view") --- End diff -- `Hive data source can only be used with tables, you cannot use it with CREATE TEMP VIEW USING`
[GitHub] spark pull request #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable sup...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16552#discussion_r97480861 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -112,12 +112,6 @@ case class AnalyzeCreateTable(sparkSession: SparkSession) extends Rule[LogicalPl throw new AnalysisException("Saving data into a view is not allowed.") } - if (DDLUtils.isHiveTable(existingTable)) { -throw new AnalysisException(s"Saving data in the Hive serde table $tableName is " + - "not supported yet. Please use the insertInto() API as an alternative.") - } - - // Check if the specified data source match the data source of the existing table. --- End diff -- why remove this line?
[GitHub] spark pull request #16582: [SPARK-19220][UI] Make redirection to HTTPS apply...
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16582#discussion_r97478738 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -337,17 +350,20 @@ private[spark] object JettyUtils extends Logging { // The number of selectors always equals to the number of acceptors minThreads += connector.getAcceptors * 2 } - server.setConnectors(connectors.toArray) pool.setMaxThreads(math.max(pool.getMaxThreads, minThreads)) val errorHandler = new ErrorHandler() errorHandler.setShowStacks(true) errorHandler.setServer(server) server.addBean(errorHandler) + + gzipHandlers.foreach(collection.addHandler) server.setHandler(collection) + + server.setConnectors(connectors.toArray) --- End diff -- Why did you move `server.setConnectors(connectors.toArray)` and `gzipHandlers.foreach(collection.addHandler)`?
[GitHub] spark pull request #16582: [SPARK-19220][UI] Make redirection to HTTPS apply...
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16582#discussion_r97479049 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -274,25 +277,28 @@ private[spark] object JettyUtils extends Logging { conf: SparkConf, serverName: String = ""): ServerInfo = { -val collection = new ContextHandlerCollection addFilters(handlers, conf) val gzipHandlers = handlers.map { h => + h.setVirtualHosts(Array("@" + SPARK_CONNECTOR_NAME)) --- End diff -- Do we need this code here? `setVirtualHosts` should always be called in `addHandler` for each handler, right?
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16171 Merged build finished. Test PASSed.
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16171 **[Test build #71911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71911/testReport)** for PR 16171 at commit [`b6dd52c`](https://github.com/apache/spark/commit/b6dd52cda34051e5e76df55a76ff83d57fb8a51b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16171 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71911/ Test PASSed.
[GitHub] spark pull request #16661: [SPARK-19313][ML][MLLIB] GaussianMixture should l...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/16661#discussion_r97479326 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala --- @@ -486,6 +491,9 @@ class GaussianMixture @Since("2.0.0") ( @Since("2.0.0") object GaussianMixture extends DefaultParamsReadable[GaussianMixture] { + /** Limit number of features such that numFeatures^2^ < Integer.MaxValue */ + private[clustering] val MAX_NUM_FEATURES = 46000 --- End diff -- We have to unpack the covariance matrix to a full covariance matrix before returning the model.
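The 46000 cap in the quoted diff can be sanity-checked directly: a full covariance matrix stores numFeatures² doubles in one array, and JVM arrays are indexed by `Int`, so numFeatures² must stay below `Int.MaxValue`. A quick check (illustrative, independent of the Spark code):

```scala
// Largest n with n * n <= Int.MaxValue; (n + 1)^2 already exceeds it.
val maxIndexable = math.sqrt(Int.MaxValue.toDouble).toInt

// The limit from the diff, chosen with some headroom below that bound.
val maxNumFeatures = 46000

// Compute the squares as Longs to avoid Int overflow in the check itself.
val fits = maxNumFeatures.toLong * maxNumFeatures < Int.MaxValue.toLong
val nextOverflows = (maxIndexable + 1).toLong * (maxIndexable + 1) > Int.MaxValue.toLong
```

So 46000² (2,116,000,000) still fits in an `Int`-indexed array, while anything above 46340 features would not.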
[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16594 I still do not think an internal configuration is a user-friendly way to show the plan costs. Done that way, it looks as if we do not want users to see it.
[GitHub] spark pull request #16594: [SPARK-17078] [SQL] Show stats when explain
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16594#discussion_r97478978 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -54,11 +56,32 @@ case class Statistics( /** Readable string representation for the Statistics. */ def simpleString: String = { -Seq(s"sizeInBytes=$sizeInBytes", - if (rowCount.isDefined) s"rowCount=${rowCount.get}" else "", +Seq(s"sizeInBytes=${format(sizeInBytes, isSize = true)}", + if (rowCount.isDefined) s"rowCount=${format(rowCount.get, isSize = false)}" else "", s"isBroadcastable=$isBroadcastable" ).filter(_.nonEmpty).mkString(", ") } + + /** Print the given number in a readable format. */ + def format(number: BigInt, isSize: Boolean): String = { --- End diff -- We already have [`bytesToString`](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L1109-L1132) in Utils.scala
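For comparison, the `Utils.bytesToString` helper referenced above takes a `Long`; a `BigInt`-friendly variant along the same lines could look like the following sketch (illustrative only, not the helper under review; `bytesToReadable` is a hypothetical name):

```scala
// Format a possibly huge byte count with binary units, mirroring the
// KB/MB/... naming convention but accepting BigInt instead of Long.
def bytesToReadable(size: BigInt): String = {
  val units = Seq("B", "KB", "MB", "GB", "TB", "PB", "EB")
  var value = BigDecimal(size)
  var i = 0
  // Divide down by 1024 until the value fits the current unit.
  while (value.abs >= 1024 && i < units.length - 1) {
    value /= 1024
    i += 1
  }
  f"${value.toDouble}%.1f ${units(i)}"
}
```

For example, `bytesToReadable(BigInt(1536))` yields `"1.5 KB"`; values beyond the EB range simply stay in EB, which addresses the concern that TB alone is not enough for estimated stats.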
[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16594 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71906/ Test FAILed.
[GitHub] spark issue #16677: [WIP][SQL] Use map output statistices to improve global ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71905/ Test FAILed.
[GitHub] spark issue #16677: [WIP][SQL] Use map output statistices to improve global ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Merged build finished. Test FAILed.
[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16594 Merged build finished. Test FAILed.
[GitHub] spark issue #16638: [SPARK-19115] [SQL] Supporting Create External Table Lik...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16638 Merged build finished. Test PASSed.
[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16594 **[Test build #71906 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71906/testReport)** for PR 16594 at commit [`0af8d7f`](https://github.com/apache/spark/commit/0af8d7f410b36547727cb2e6445dccf9d12f2cef). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16638: [SPARK-19115] [SQL] Supporting Create External Table Lik...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71904/ Test PASSed.
[GitHub] spark issue #16638: [SPARK-19115] [SQL] Supporting Create External Table Lik...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16638 **[Test build #71904 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71904/testReport)** for PR 16638 at commit [`b80f8e6`](https://github.com/apache/spark/commit/b80f8e66e1cbb7111c090358cabc925c6af233d2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16677: [WIP][SQL] Use map output statistices to improve global ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16677 **[Test build #71905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71905/testReport)** for PR 16677 at commit [`0a2e96f`](https://github.com/apache/spark/commit/0a2e96fcb42a6fada315fc65a6610314c56ded58). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class FakePartitioning(orgPartition: Partitioning, numPartitions: Int) extends Partitioning ` * `case class LocalLimitExec(limit: Int, child: SparkPlan) extends UnaryExecNode with CodegenSupport ` * `case class GlobalLimitExec(limit: Int, child: SparkPlan) extends UnaryExecNode `
[GitHub] spark issue #16138: [SPARK-16609] Add to_date/to_timestamp with format funct...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16138 **[Test build #71913 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71913/testReport)** for PR 16138 at commit [`8fa4bfb`](https://github.com/apache/spark/commit/8fa4bfbb72c5c1de214b4a35ef3ed4585e33cf3a).
[GitHub] spark pull request #16661: [SPARK-19313][ML][MLLIB] GaussianMixture should l...
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/16661#discussion_r97478414 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala --- @@ -486,6 +491,9 @@ class GaussianMixture @Since("2.0.0") ( @Since("2.0.0") object GaussianMixture extends DefaultParamsReadable[GaussianMixture] { + /** Limit number of features such that numFeatures^2^ < Integer.MaxValue */ + private[clustering] val MAX_NUM_FEATURES = 46000 --- End diff -- In https://github.com/apache/spark/pull/15413, the symmetry of the covariance matrix is taken into account and only the upper triangular part is stored. So this number seems like it should be 65535? (`math.sqrt(Int.MaxValue.toDouble * 2)`)
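The arithmetic behind the two candidate limits can be checked directly. A quick sketch (the variable names are mine, not Spark's):

```python
# Java's Integer.MAX_VALUE, the cap on array lengths discussed above.
INT_MAX = 2**31 - 1

# Dense storage: a full covariance matrix needs numFeatures^2 doubles,
# so numFeatures must stay below sqrt(Int.MaxValue) ~= 46340 -- hence
# the conservative MAX_NUM_FEATURES = 46000 in the diff.
full_matrix_bound = int(INT_MAX ** 0.5)

# Packed storage (the symmetry argument from apache/spark#15413): only the
# upper triangle is kept, numFeatures * (numFeatures + 1) / 2 entries,
# which raises the bound to roughly sqrt(2 * Int.MaxValue) ~= 65535,
# the figure suggested in the comment.
packed_bound = int((2 * INT_MAX) ** 0.5)

print(full_matrix_bound, packed_bound)  # 46340 65535
```

Both bounds satisfy the array-size constraint: `46000**2` and `65535 * 65536 / 2` are each below `Int.MaxValue`.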
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15880 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71907/ Test FAILed.
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15880 Merged build finished. Test FAILed.
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15880 **[Test build #71907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71907/testReport)** for PR 15880 at commit [`32e4f52`](https://github.com/apache/spark/commit/32e4f52a7673d1d1f573b9d83177c093327d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16138: [SPARK-16609] Add to_date/to_timestamp with format funct...
Github user anabranch commented on the issue: https://github.com/apache/spark/pull/16138 @cloud-fan - Reynold referred me to you for this test failure. My two tests are failing because Hive tests *allegedly* cover something like this. ``` SELECT to_date('2001-10-30 10:30:00', '') ``` However, Hive doesn't support passing multiple parameters to `to_date`, as specified in the [language manual](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions). The only instance I see of `to_date` with multiple parameters is [when it talks to an Oracle DB as the metastore](https://github.com/apache/hive/blob/2d813f4d4a0bb42345d153c362f7416f05ab2749/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java#L1122), although I don't know the code base (I grepped for `to_date` and tried to find instances of this occurring). It seems like this test case should not be running in the first place. Do you have any suggestions?
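For context, the behavior the two-argument `to_date` is meant to have (parse with the given format, return null on invalid input, keep only the date part) can be sketched with the Python standard library. This is my rough analogue, not Spark code; Spark takes Java `SimpleDateFormat` patterns like `yyyy-MM-dd`, so Python's `strptime` codes (`%Y-%m-%d`) stand in for them here:

```python
from datetime import date, datetime

def to_date_with_format(date_str, fmt):
    # Illustrative analogue of two-argument to_date: parse the string with
    # the supplied format, keep only the date part, and return None (null)
    # instead of raising on input that does not match the format.
    try:
        return datetime.strptime(date_str, fmt).date()
    except ValueError:
        return None

print(to_date_with_format("2016-12-31", "%Y-%m-%d"))  # 2016-12-31
print(to_date_with_format("not a date", "%Y-%m-%d"))  # None
```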
[GitHub] spark issue #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema order a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16606 **[Test build #71912 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71912/testReport)** for PR 16606 at commit [`72164eb`](https://github.com/apache/spark/commit/72164eb02c1b7acd836a5038fddb8bcd8225a1c6).
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16171 Merged build finished. Test PASSed.
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16171 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71909/ Test PASSed.
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16171 **[Test build #71909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71909/testReport)** for PR 16171 at commit [`863c9f4`](https://github.com/apache/spark/commit/863c9f45b0ccf066e34d7539ca1f29baf0b49e85). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class GBTClassificationModel(TreeEnsembleModel, JavaProbabilisticClassificationModel,`
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16171 **[Test build #71911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71911/testReport)** for PR 16171 at commit [`b6dd52c`](https://github.com/apache/spark/commit/b6dd52cda34051e5e76df55a76ff83d57fb8a51b).
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15880 LGTM. The PR title is not right, though. BTW, we might need a release note for this PR, since it will change behavior.
[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/12135 @MLnick @jkbradley Would you mind making a final pass?
[GitHub] spark issue #16138: [SPARK-16609] Add to_date/to_timestamp with format funct...
Github user anabranch commented on the issue: https://github.com/apache/spark/pull/16138 @felixcheung Thank you for your feedback! Small request: can you tell me if my R test case is sufficient for this? It doesn't seem like there is extensive R testing right now for virtually any function. Obviously tests will pass soon; I'm running into strange edge cases.
[GitHub] spark issue #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema order a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16606 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71902/ Test PASSed.
[GitHub] spark issue #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema order a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16606 Merged build finished. Test PASSed.
[GitHub] spark issue #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema order a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16606 **[Test build #71902 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71902/testReport)** for PR 16606 at commit [`04d3940`](https://github.com/apache/spark/commit/04d39406cc5ce43e51adc931b5dc012d6e5fefa9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16552 **[Test build #71910 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71910/testReport)** for PR 16552 at commit [`f34ab6d`](https://github.com/apache/spark/commit/f34ab6dab0bb7ce80d362c0c248bc2c735aeb60b).
[GitHub] spark issue #14872: [SPARK-3162][MLlib][WIP] Add local tree training for dec...
Github user smurching commented on the issue: https://github.com/apache/spark/pull/14872 No worries, apologies for being busy on my end -- I'll leave the branch up and try to contribute in other ways when I have the time!
[GitHub] spark issue #16171: [SPARK-18739][ML][PYSPARK] Classification and regression...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16171 **[Test build #71909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71909/testReport)** for PR 16171 at commit [`863c9f4`](https://github.com/apache/spark/commit/863c9f45b0ccf066e34d7539ca1f29baf0b49e85).
[GitHub] spark pull request #16668: [SPARK-18788][SPARKR] Add API for getNumPartition...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16668#discussion_r97475352 --- Diff: R/pkg/R/DataFrame.R --- @@ -3406,3 +3406,28 @@ setMethod("randomSplit", } sapply(sdfs, dataFrame) }) + +#' getNumPartitions +#' +#' Return the number of partitions +#' Note: in order to compute the number of partition the SparkDataFrame has to be converted into a +#' RDD temporarily internally. +#' +#' @param x A SparkDataFrame +#' @family SparkDataFrame functions +#' @aliases getNumPartitions,SparkDataFrame-method +#' @rdname getNumPartitions +#' @name getNumPartitions +#' @export +#' @examples +#'\dontrun{ +#' sparkR.session() +#' df <- createDataFrame(cars, numPartitions = 2) +#' getNumPartitions(df) +#' } +#' @note getNumPartitions since 2.1.1 +setMethod("getNumPartitions", + signature(x = "SparkDataFrame"), + function(x) { +getNumPartitionsRDD(toRDD(x)) --- End diff -- you said this filled a hole for Spark 2.1; what's this hole? Is this SparkR only?
[GitHub] spark pull request #16668: [SPARK-18788][SPARKR] Add API for getNumPartition...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16668#discussion_r97475188 --- Diff: R/pkg/R/DataFrame.R --- @@ -3406,3 +3406,28 @@ setMethod("randomSplit", } sapply(sdfs, dataFrame) }) + +#' getNumPartitions +#' +#' Return the number of partitions +#' Note: in order to compute the number of partition the SparkDataFrame has to be converted into a +#' RDD temporarily internally. +#' +#' @param x A SparkDataFrame +#' @family SparkDataFrame functions +#' @aliases getNumPartitions,SparkDataFrame-method +#' @rdname getNumPartitions +#' @name getNumPartitions +#' @export +#' @examples +#'\dontrun{ +#' sparkR.session() +#' df <- createDataFrame(cars, numPartitions = 2) +#' getNumPartitions(df) +#' } +#' @note getNumPartitions since 2.1.1 +setMethod("getNumPartitions", + signature(x = "SparkDataFrame"), + function(x) { +getNumPartitionsRDD(toRDD(x)) --- End diff -- ah, that we could do easily. Is that something that's ok for Spark 2.1.1? If yes, I could go ahead with changes here for Scala, Python and R.
[GitHub] spark pull request #16668: [SPARK-18788][SPARKR] Add API for getNumPartition...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16668#discussion_r97474262 --- Diff: R/pkg/R/DataFrame.R --- @@ -3406,3 +3406,28 @@ setMethod("randomSplit", } sapply(sdfs, dataFrame) }) + +#' getNumPartitions +#' +#' Return the number of partitions +#' Note: in order to compute the number of partition the SparkDataFrame has to be converted into a +#' RDD temporarily internally. +#' +#' @param x A SparkDataFrame +#' @family SparkDataFrame functions +#' @aliases getNumPartitions,SparkDataFrame-method +#' @rdname getNumPartitions +#' @name getNumPartitions +#' @export +#' @examples +#'\dontrun{ +#' sparkR.session() +#' df <- createDataFrame(cars, numPartitions = 2) +#' getNumPartitions(df) +#' } +#' @note getNumPartitions since 2.1.1 +setMethod("getNumPartitions", + signature(x = "SparkDataFrame"), + function(x) { +getNumPartitionsRDD(toRDD(x)) --- End diff -- isn't it just calling `rdd.numPartitions`? We need to materialize the RDD inside the DataFrame anyway, but it's cheap on the Scala side.
[GitHub] spark pull request #16668: [SPARK-18788][SPARKR] Add API for getNumPartition...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16668#discussion_r97473647 --- Diff: R/pkg/R/DataFrame.R --- @@ -3406,3 +3406,28 @@ setMethod("randomSplit", } sapply(sdfs, dataFrame) }) + +#' getNumPartitions +#' +#' Return the number of partitions +#' Note: in order to compute the number of partition the SparkDataFrame has to be converted into a +#' RDD temporarily internally. +#' +#' @param x A SparkDataFrame +#' @family SparkDataFrame functions +#' @aliases getNumPartitions,SparkDataFrame-method +#' @rdname getNumPartitions +#' @name getNumPartitions +#' @export +#' @examples +#'\dontrun{ +#' sparkR.session() +#' df <- createDataFrame(cars, numPartitions = 2) +#' getNumPartitions(df) +#' } +#' @note getNumPartitions since 2.1.1 +setMethod("getNumPartitions", + signature(x = "SparkDataFrame"), + function(x) { +getNumPartitionsRDD(toRDD(x)) --- End diff -- Given this is a bit of a hole, I think it would be worthwhile to consider whether there is a reasonable workaround for the 2.1.1 release (say, a JVM wrapper for `.rdd.getNumPartitions`), @shivaram would you agree? As for the new Scala API, since it has broader implications it might be something to target at the 2.2 release? If so, that would be better served in a different PR. I don't mind taking a shot at that - I'm not super familiar with that code and from a quick scan it seems to be non-trivial (handling different RDD subtypes and so on), so a few pointers would be appreciated, @cloud-fan
[GitHub] spark pull request #16679: [SPARK-19272][SQL] Remove the param `viewOriginal...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16679
[GitHub] spark issue #16679: [SPARK-19272][SQL] Remove the param `viewOriginalText` f...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16679 thanks, merging to master!
[GitHub] spark issue #16679: [SPARK-19272][SQL] Remove the param `viewOriginalText` f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16679 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71900/ Test PASSed.
[GitHub] spark issue #16679: [SPARK-19272][SQL] Remove the param `viewOriginalText` f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16679 Merged build finished. Test PASSed.
[GitHub] spark issue #16269: [SPARK-19080][SQL] simplify data source analysis
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16269 **[Test build #71908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71908/testReport)** for PR 16269 at commit [`4b68c16`](https://github.com/apache/spark/commit/4b68c168b0e16071b91c93fc7f2be8fabda46fbe).
[GitHub] spark issue #16679: [SPARK-19272][SQL] Remove the param `viewOriginalText` f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16679 **[Test build #71900 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71900/testReport)** for PR 16679 at commit [`b5a48da`](https://github.com/apache/spark/commit/b5a48daab41f8462843a062475413400482d1213). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16638: [SPARK-19115] [SQL] Supporting Create External Ta...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16638#discussion_r97471913 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -81,8 +81,8 @@ statement rowFormat? createFileFormat? locationSpec? (TBLPROPERTIES tablePropertyList)? (AS? query)? #createHiveTable -| CREATE TABLE (IF NOT EXISTS)? target=tableIdentifier -LIKE source=tableIdentifier #createTableLike +| CREATE EXTERNAL? TABLE (IF NOT EXISTS)? target=tableIdentifier --- End diff -- ok, then let's simplify the logic: if a `location` is specified, we create an external table internally; else, we create a managed table.
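The simplified rule proposed here reduces to a single branch. A minimal sketch (function and value names are illustrative, not Spark's):

```python
def table_type_for_create_like(location=None):
    # Sketch of the proposed rule for CREATE TABLE ... LIKE:
    # the new table is EXTERNAL exactly when a LOCATION clause is given,
    # and MANAGED otherwise.
    return "EXTERNAL" if location is not None else "MANAGED"

print(table_type_for_create_like("/warehouse/custom/path"))  # EXTERNAL
print(table_type_for_create_like())                          # MANAGED
```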
[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15880 **[Test build #71907 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71907/testReport)** for PR 15880 at commit [`32e4f52`](https://github.com/apache/spark/commit/32e4f52a7673d1d1f573b9d83177c093327d).