[GitHub] spark issue #18652: [SPARK-21497][SQL] Pull non-deterministic equi join keys...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18652 The order is different from the original one that is evaluated in the join conditions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #18581: [SPARK-21289][SQL][ML] Supports custom line separ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18581#discussion_r134932083

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFileLinesReader.scala ---
@@ -32,7 +32,9 @@ import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
 * in that file.
 */
class HadoopFileLinesReader(
-    file: PartitionedFile, conf: Configuration) extends Iterator[Text] with Closeable {
+    file: PartitionedFile,
+    lineSeparator: Option[String],
--- End diff --

What is the default line separator when none is specified? That is not clear here.
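The question above can be made concrete with a small sketch. The names below are illustrative only, not the actual patch: the idea is that when no separator is supplied, Hadoop's `LineRecordReader` falls back to its built-in handling of `\n`, `\r`, and `\r\n`, while a custom separator would be converted to bytes and passed through.

```scala
object LineSeparatorSketch {
  // Hypothetical helper: None means "use Hadoop's default line-ending
  // detection"; Some(sep) means "split on exactly these bytes".
  def separatorBytes(lineSeparator: Option[String]): Option[Array[Byte]] =
    lineSeparator.map(_.getBytes("UTF-8"))
}
```

Whatever the patch ultimately does, documenting that `None` means "platform/Hadoop default" on the constructor parameter would answer the reviewer's question.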
[GitHub] spark pull request #18962: [SPARK-21714][CORE][YARN] Avoiding re-uploading r...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/18962#discussion_r134932043

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -330,19 +332,21 @@ object SparkSubmit extends CommandLineUtils {
     args.archives = Option(args.archives).map(resolveGlobPaths(_, hadoopConf)).orNull

     // In client mode, download remote files.
+    var localPrimaryResource: String = null
+    var localJars: String = null
+    var localPyFiles: String = null
--- End diff --

@tgravescs, if we also download files/archives to a local path, how would we leverage them? We don't expose the path to the user, so even with the previous code the downloaded files could never be used by the driver. For semantic completeness, we still need to change some code to support this feature, as @vanzin mentioned. I agree with you that the current state of the code is confusing for users (some resources are downloaded while others are not). I think we could fix that in a follow-up PR; what do you think?
[GitHub] spark pull request #19017: [SPARK-21804][SQL] json_tuple returns null values...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19017#discussion_r134931404

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -447,7 +448,18 @@ case class JsonTuple(children: Seq[Expression])
         generator => copyCurrentStructure(generator, parser)
       }

-      row(idx) = UTF8String.fromBytes(output.toByteArray)
+      val jsonValue = UTF8String.fromBytes(output.toByteArray)
+      row(idx) = jsonValue
+      idx = idx + 1
+
+      // SPARK-21804: json_tuple returns null values within repeated columns
+      // except the first one; so that we need to check the remaining fields.
+      while (idx < fieldNames.length) {
+        if (fieldNames(idx) == jsonField) {
+          row(idx) = jsonValue
+        }
+        idx = idx + 1
+      }
--- End diff --

I was also wondering whether we should use a hash table. However, the number of columns is not large, so it might not yield a noticeable benefit.
[GitHub] spark pull request #19017: [SPARK-21804][SQL] json_tuple returns null values...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19017#discussion_r134931120

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -447,7 +448,18 @@ case class JsonTuple(children: Seq[Expression])
         generator => copyCurrentStructure(generator, parser)
       }

-      row(idx) = UTF8String.fromBytes(output.toByteArray)
+      val jsonValue = UTF8String.fromBytes(output.toByteArray)
+      row(idx) = jsonValue
+      idx = idx + 1
+
+      // SPARK-21804: json_tuple returns null values within repeated columns
+      // except the first one; so that we need to check the remaining fields.
+      while (idx < fieldNames.length) {
+        if (fieldNames(idx) == jsonField) {
+          row(idx) = jsonValue
+        }
+        idx = idx + 1
+      }
--- End diff --

```Scala
row(idx) = jsonValue
idx = idx + 1

// SPARK-21804: json_tuple returns null values within repeated columns
// except the first one; so that we need to check the remaining fields.
while (idx < fieldNames.length) {
  if (fieldNames(idx) == jsonField) {
    row(idx) = jsonValue
  }
  idx = idx + 1
}
```

->

```Scala
do {
  row(idx) = jsonValue
  idx = fieldNames.indexOf(jsonField, idx + 1)
} while (idx >= 0)
```
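The suggested `do`/`while` rewrite can be checked in isolation. Below is a minimal, self-contained sketch (the `fillRepeated` helper and its signature are hypothetical, written only to exercise the loop shape): starting from the first matching slot, it copies the parsed value into every later slot whose requested field name matches, using `indexOf` with a start offset instead of a manual scan.

```scala
object JsonTupleSketch {
  // Stand-in for JsonTuple's row-filling step: `firstIdx` is the index of the
  // first occurrence of `jsonField` in `fieldNames`; every later occurrence
  // gets the same parsed value, matching the SPARK-21804 fix.
  def fillRepeated(
      fieldNames: IndexedSeq[String],
      jsonField: String,
      jsonValue: String,
      firstIdx: Int): Array[String] = {
    val row = Array.fill[String](fieldNames.length)(null)
    var idx = firstIdx
    do {
      row(idx) = jsonValue
      idx = fieldNames.indexOf(jsonField, idx + 1) // -1 when no more matches
    } while (idx >= 0)
    row
  }
}
```

Both loop shapes are O(n) over the remaining fields; the `indexOf` form just pushes the comparison into the collections library.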
[GitHub] spark issue #18962: [SPARK-21714][CORE][YARN] Avoiding re-uploading remote r...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18962 Yes @tgravescs, if we download everything to local and then upload to YARN, http/https/ftp should be unrelated here. But in yarn cluster mode, if we specify remote http jars, the YARN client will still fail to handle them, so the issue still exists. I'm going to create a separate JIRA to track it.
[GitHub] spark issue #19018: [SPARK-21801][SPARKR][TEST] unit test randomly fail with...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19018 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81070/ Test PASSed.
[GitHub] spark issue #19018: [SPARK-21801][SPARKR][TEST] unit test randomly fail with...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19018 Merged build finished. Test PASSed.
[GitHub] spark issue #19018: [SPARK-21801][SPARKR][TEST] unit test randomly fail with...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19018 **[Test build #81070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81070/testReport)** for PR 19018 at commit [`05b3cab`](https://github.com/apache/spark/commit/05b3cabaff89396d352ece41d57c7fd9eb2ef917).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18730 Merged build finished. Test FAILed.
[GitHub] spark issue #19028: [MINOR][SQL] The comment of Class ExchangeCoordinator ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19028 Merged build finished. Test PASSed.
[GitHub] spark issue #19028: [MINOR][SQL] The comment of Class ExchangeCoordinator ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19028 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81063/ Test PASSed.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18730 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81064/ Test FAILed.
[GitHub] spark issue #19028: [MINOR][SQL] The comment of Class ExchangeCoordinator ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19028 **[Test build #81063 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81063/testReport)** for PR 19028 at commit [`837536f`](https://github.com/apache/spark/commit/837536fee8427f9b527ace401924f9a703ba38d7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18730 **[Test build #81064 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81064/testReport)** for PR 18730 at commit [`72aef67`](https://github.com/apache/spark/commit/72aef679b498bb042ecb9ffa8df62ed41e1f519d).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18964 @zsxwing I did create a performance test against Spark RPC; the results can be found [here](https://github.com/neoremind/kraps-rpc#4-performance-test). Note that I created the project for study purposes and the code is based on 2.1.0. But as you said, performance does not drop when the client fails to apply the `SO_RCVBUF` and `SO_SNDBUF` values set in `SparkConf`. For example, with 10 concurrent callers, 100k total calls, and everything else at defaults, QPS is around 11k. When I set `SO_RCVBUF` and `SO_SNDBUF` to an extremely small number like 100, performance suffers tremendously. If they are set to a large number like 128k, the results are not affected by whether the client sets the corresponding `SO_RCVBUF` and `SO_SNDBUF` values or not. I admit the change is trivial, but from a user's perspective, if `spark.{module}.io.sendBuffer` and `spark.{module}.io.receiveBuffer` are exposed and can be set, yet only take effect on the server side, that is a bit inconsistent. So I raised this PR to make them work on both the server and the client side, just to keep them consistent.
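The configuration lookup being discussed can be illustrated with a small, hypothetical sketch (the `bufferSizes` helper and the `Map`-based conf are stand-ins, not Spark's actual `TransportConf`): a missing or non-positive value conventionally means "leave the OS default socket buffer size", which is also why a tiny explicit value like 100 bytes hurts throughput so badly.

```scala
object RpcClientBufferSketch {
  // Illustrative only: resolves the send/receive socket buffer sizes for a
  // given module (e.g. "rpc" or "shuffle") from a flat key-value conf.
  // -1 signals "not set; keep the OS default".
  def bufferSizes(conf: Map[String, String], module: String): (Int, Int) = {
    def get(key: String): Int =
      conf.get(s"spark.$module.io.$key").map(_.toInt).getOrElse(-1)
    (get("sendBuffer"), get("receiveBuffer"))
  }
}
```

With this convention, a client bootstrap would only call the socket-option setters when the resolved value is positive, leaving the kernel's autotuning in place otherwise.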
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18730 Merged build finished. Test PASSed.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18730 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81061/ Test PASSed.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18730 **[Test build #81061 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81061/testReport)** for PR 18730 at commit [`bab91db`](https://github.com/apache/spark/commit/bab91db933947b57159b21e5f6506570b6b721cb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19027 LGTM. Btw, I'm just curious why we need tests with `numpy` here.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19027 Will probably take a look through the problem in the near future, including hard dependencies etc. I took a quick look and I think I need more time, but yes, it apparently looks like a valid point.
[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18581 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81060/ Test PASSed.
[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18581 Merged build finished. Test PASSed.
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19031 Merged build finished. Test PASSed.
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19031 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81062/ Test PASSed.
[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18581 **[Test build #81060 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81060/testReport)** for PR 18581 at commit [`47e8d37`](https://github.com/apache/spark/commit/47e8d3761681611a9ee6d50d6c812babd395dace).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19031 **[Test build #81062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81062/testReport)** for PR 19031 at commit [`9438655`](https://github.com/apache/spark/commit/94386550523baf5f98427d3ef0b9f9815cee4c69).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19018: [SPARK-21801][SPARKR][TEST] unit test randomly fail with...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19018 **[Test build #81070 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81070/testReport)** for PR 19018 at commit [`05b3cab`](https://github.com/apache/spark/commit/05b3cabaff89396d352ece41d57c7fd9eb2ef917).
[GitHub] spark issue #19018: [SPARK-21801][SPARKR][TEST] unit test randomly fail with...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19018 jenkins, retest this please
[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18581 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81059/ Test PASSed.
[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18581 Merged build finished. Test PASSed.
[GitHub] spark pull request #19017: [SPARK-21804][SQL] json_tuple returns null values...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19017#discussion_r134925698

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -447,7 +448,18 @@ case class JsonTuple(children: Seq[Expression])
         generator => copyCurrentStructure(generator, parser)
       }

-      row(idx) = UTF8String.fromBytes(output.toByteArray)
+      val jsonValue = UTF8String.fromBytes(output.toByteArray)
+      row(idx) = jsonValue
+      idx = idx + 1
+
+      // SPARK-21804: json_tuple returns null values within repeated columns
+      // except the first one; so that we need to check the remaining fields.
+      while (idx < fieldNames.length) {
+        if (fieldNames(idx) == jsonField) {
+          row(idx) = jsonValue
+        }
+        idx = idx + 1
+      }
--- End diff --

Would you maybe have a suggestion? The current status looks fine.
[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18581 **[Test build #81059 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81059/testReport)** for PR 18581 at commit [`3555e5d`](https://github.com/apache/spark/commit/3555e5dafa85dcee404599c78b17cbb97b1709f0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #19017: [SPARK-21804][SQL] json_tuple returns null values...
Github user jmchung commented on a diff in the pull request: https://github.com/apache/spark/pull/19017#discussion_r134925669

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -447,7 +448,18 @@ case class JsonTuple(children: Seq[Expression])
         generator => copyCurrentStructure(generator, parser)
       }

-      row(idx) = UTF8String.fromBytes(output.toByteArray)
+      val jsonValue = UTF8String.fromBytes(output.toByteArray)
+      row(idx) = jsonValue
+      idx = idx + 1
+
+      // SPARK-21804: json_tuple returns null values within repeated columns
+      // except the first one; so that we need to check the remaining fields.
+      while (idx < fieldNames.length) {
+        if (fieldNames(idx) == jsonField) {
+          row(idx) = jsonValue
+        }
+        idx = idx + 1
+      }
--- End diff --

If I comment out L451-452, the repeated fields still receive the same jsonValue because `fieldNames(idx) == jsonField`; but that first comparison is unnecessary, since `idx >= 0` already means a match. Could you please give me some advice?
[GitHub] spark issue #19018: [SPARK-21801][SPARKR][TEST] unit test randomly fail with...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19018 But I think in general it's better to make tests more predictable like this.
[GitHub] spark issue #19018: [SPARK-21801][SPARKR][TEST] unit test randomly fail with...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19018 the error specifically is: `The input column stridx_87ea3065aeb2 should have at least two distinct values.` I don't think this would be only happening in R - I suppose whenever the string label has only one distinct value, ml's random forest will just give up like this.
[GitHub] spark pull request #19016: [SPARK-21805][SPARKR] Disable R vignettes code on...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19016
[GitHub] spark pull request #18945: Add option to convert nullable int columns to flo...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r134925269
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1762,7 +1762,7 @@ def toPandas(self): else:
--- End diff --
If we use this approach, how about the following to check if the type corrections are needed:
```python
dtype = {}
for field in self.schema:
    pandas_type = _to_corrected_pandas_type(field.dataType)
    if pandas_type is not None and not (field.nullable and pdf[field.name].isnull().any()):
        dtype[field.name] = pandas_type
```
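The nullability guard in the suggestion above exists because of a pandas behavior worth spelling out: an integer column that contains a null has no representation in the classic integer dtypes, so coercing it back to an int dtype would fail or lose the null. A small standalone illustration (plain pandas, independent of Spark):

```python
import numpy as np
import pandas as pd

# A column with no nulls can safely be narrowed to an integer dtype...
clean = pd.Series([1.0, 2.0, 3.0]).astype(np.int32)

# ...but a column containing a null cannot stay integer-typed in
# classic pandas dtypes; pandas keeps it as float64 with NaN instead.
nullable = pd.Series([1.0, None, 3.0])
```

This is why the type-correction dict is only populated for fields that are non-nullable or actually contain no nulls.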
[GitHub] spark issue #19016: [SPARK-21805][SPARKR] Disable R vignettes code on Window...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19016 thanks, merged to master/2.2. will check for nightly build from tonight.
[GitHub] spark pull request #19017: [SPARK-21804][SQL] json_tuple returns null values...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19017#discussion_r134923877
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -447,7 +448,18 @@ case class JsonTuple(children: Seq[Expression])
           generator => copyCurrentStructure(generator, parser)
         }
-        row(idx) = UTF8String.fromBytes(output.toByteArray)
+        val jsonValue = UTF8String.fromBytes(output.toByteArray)
+        row(idx) = jsonValue
+        idx = idx + 1
+
+        // SPARK-21804: json_tuple returns null values within repeated columns
+        // except the first one; so that we need to check the remaining fields.
+        while (idx < fieldNames.length) {
+          if (fieldNames(idx) == jsonField) {
+            row(idx) = jsonValue
+          }
+          idx = idx + 1
+        }
--- End diff --
You still can simplify the codes a lot without functional transformation.
[GitHub] spark pull request #19017: [SPARK-21804][SQL] json_tuple returns null values...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19017#discussion_r134923403
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -447,7 +448,18 @@ case class JsonTuple(children: Seq[Expression])
           generator => copyCurrentStructure(generator, parser)
         }
-        row(idx) = UTF8String.fromBytes(output.toByteArray)
+        val jsonValue = UTF8String.fromBytes(output.toByteArray)
+        row(idx) = jsonValue
+        idx = idx + 1
+
+        // SPARK-21804: json_tuple returns null values within repeated columns
+        // except the first one; so that we need to check the remaining fields.
+        while (idx < fieldNames.length) {
+          if (fieldNames(idx) == jsonField) {
+            row(idx) = jsonValue
+          }
+          idx = idx + 1
+        }
--- End diff --
We have followed @HyukjinKwon's suggestion to avoid functional transformation with a while, since this is a hot path.
[GitHub] spark pull request #19017: [SPARK-21804][SQL] json_tuple returns null values...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19017#discussion_r134923130
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -447,7 +448,18 @@ case class JsonTuple(children: Seq[Expression])
           generator => copyCurrentStructure(generator, parser)
         }
-        row(idx) = UTF8String.fromBytes(output.toByteArray)
+        val jsonValue = UTF8String.fromBytes(output.toByteArray)
+        row(idx) = jsonValue
+        idx = idx + 1
+
+        // SPARK-21804: json_tuple returns null values within repeated columns
+        // except the first one; so that we need to check the remaining fields.
+        while (idx < fieldNames.length) {
+          if (fieldNames(idx) == jsonField) {
+            row(idx) = jsonValue
+          }
+          idx = idx + 1
+        }
--- End diff --
Could you rewrite it in short? More Scala?
[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19032 Merged build finished. Test PASSed.
[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19032 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81067/ Test PASSed.
[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19032 **[Test build #81067 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81067/testReport)** for PR 19032 at commit [`5abbe75`](https://github.com/apache/spark/commit/5abbe75072cf3f172f0b2e448941b94d72268c90).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18730 **[Test build #81069 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81069/testReport)** for PR 18730 at commit [`aeabe1d`](https://github.com/apache/spark/commit/aeabe1d1aacf5abf58d631bc291dd409728b5569).
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19027 I'm ok without the test since this is unlikely to break in the future. We do have tests that depend (optionally) on numpy (and Arrow) - it seems like we should be able to take on dependencies more formally so we could test them properly?
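Optional test dependencies like numpy are commonly handled by probing the import once and skipping the dependent tests when it is absent. A generic unittest sketch of that pattern (the class name is hypothetical; this is not how PySpark's own suite is necessarily structured):

```python
import unittest

# Probe the optional dependency once at module import time.
try:
    import numpy as np
    have_numpy = True
except ImportError:
    have_numpy = False


class VectorUDTTests(unittest.TestCase):
    # skipIf marks the test as skipped (not failed) when numpy is missing,
    # so the suite still passes on machines without the optional package.
    @unittest.skipIf(not have_numpy, "numpy not installed")
    def test_numpy_roundtrip(self):
        arr = np.array([1.0, 2.0])
        self.assertEqual(arr.sum(), 3.0)
```

The suite reports success either way: the test runs when numpy is present and is recorded as skipped otherwise.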
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18730 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81068/ Test FAILed.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18730 Merged build finished. Test FAILed.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18730 **[Test build #81068 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81068/testReport)** for PR 18730 at commit [`4789772`](https://github.com/apache/spark/commit/478977293aadb9383740eabbaee23a43cc64b062).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18730 **[Test build #81068 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81068/testReport)** for PR 18730 at commit [`4789772`](https://github.com/apache/spark/commit/478977293aadb9383740eabbaee23a43cc64b062).
[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19032 **[Test build #81067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81067/testReport)** for PR 19032 at commit [`5abbe75`](https://github.com/apache/spark/commit/5abbe75072cf3f172f0b2e448941b94d72268c90).
[GitHub] spark issue #19032: [SPARK-17321][YARN] Avoid writing shuffle metadata to di...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19032 CC @lishuming please take a look at another approach to fix the bad disk issue. Also ping @tgravescs to view the PR. Thanks a lot.
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19031 LGTM
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/18730 I mocked some local tests for the two different APIs. From the simple test results, we can see that slice does not affect the performance of writing bytes. Test results below:
```
【Test 10 chunks each with 30m for 1 loop】
Time cost with 1 loop for writeFully(): 83 ms
Time cost with 1 loop for writeWithSlice(): 76 ms
【Ending】
【Test 10 chunks each with 100m for 1 loop】
Time cost with 1 loop for writeFully(): 219 ms
Time cost with 1 loop for writeWithSlice(): 213 ms
【Ending】
【Test 10 chunks each with 30m for 10 loop】
Time cost with 10 loop for writeFully(): 982 ms
Time cost with 10 loop for writeWithSlice(): 1000 ms
【Ending】
【Test 10 chunks each with 100m for 10 loop】
Time cost with 10 loop for writeFully(): 3298 ms
Time cost with 10 loop for writeWithSlice(): 3454 ms
【Ending】
【Test 10 chunks each with 30m for 50 loop】
Time cost with 50 loop for writeFully(): 3444 ms
Time cost with 50 loop for writeWithSlice(): 3329 ms
【Ending】
【Test 10 chunks each with 100m for 50 loop】
Time cost with 50 loop for writeFully(): 21913 ms
Time cost with 50 loop for writeWithSlice(): 17574 ms
【Ending】
```
Test code below:
```
test("benchmark testing") {
  // scalastyle:off
  val buffer100 = ByteBuffer.allocate(1024 * 1024 * 100)
  val buffer30 = ByteBuffer.allocate(1024 * 1024 * 30)
  testWithLoop(1, new ChunkedByteBuffer(Array.fill(10)(buffer30)),
    "Test 10 chunks each with 30m for 1 loop")
  testWithLoop(1, new ChunkedByteBuffer(Array.fill(10)(buffer100)),
    "Test 10 chunks each with 100m for 1 loop")
  testWithLoop(10, new ChunkedByteBuffer(Array.fill(10)(buffer30)),
    "Test 10 chunks each with 30m for 10 loop")
  testWithLoop(10, new ChunkedByteBuffer(Array.fill(10)(buffer100)),
    "Test 10 chunks each with 100m for 10 loop")
  testWithLoop(50, new ChunkedByteBuffer(Array.fill(10)(buffer30)),
    "Test 10 chunks each with 30m for 50 loop")
  testWithLoop(50, new ChunkedByteBuffer(Array.fill(10)(buffer100)),
    "Test 10 chunks each with 100m for 50 loop")
}

// scalastyle:off
private def testWithLoop(loopTimes: Int, chunkedByteBuffer: ChunkedByteBuffer, testString: String) {
  System.out.println(s"【$testString】")
  var starTime = System.currentTimeMillis()
  for (i <- 1 to loopTimes) {
    chunkedByteBuffer.writeFully(new ByteArrayWritableChannel(chunkedByteBuffer.size.toInt))
  }
  System.out.println(s"Time cost with $loopTimes loop for writeFully(): ${Utils.getUsedTimeMs(starTime)}")
  starTime = System.currentTimeMillis()
  for (i <- 1 to loopTimes) {
    chunkedByteBuffer.writeWithSlice(new ByteArrayWritableChannel(chunkedByteBuffer.size.toInt))
  }
  System.out.println(s"Time cost with $loopTimes loop for writeWithSlice(): ${Utils.getUsedTimeMs(starTime)}")
  System.out.println("【Ending】")
  System.out.println("")
}
```
[GitHub] spark pull request #19032: [SPARK-17321][YARN] Avoid writing shuffle metadat...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/19032 [SPARK-17321][YARN] Avoid writing shuffle metadata to disk if NM recovery is disabled

## What changes were proposed in this pull request?

In the current code, if NM recovery is not enabled then `YarnShuffleService` will write shuffle metadata to NM local dir-1; if this local dir-1 is on a bad disk, then `YarnShuffleService` will fail to start. To solve this issue, on the Spark side, if NM recovery is not enabled then Spark will not persist data into leveldb. In that case the YARN shuffle service can still serve, but loses the ability to recover (which is fine, because the failure of the NM will kill the containers as well as the applications).

## How was this patch tested?

Tested in a local cluster with NM recovery off and on to see whether the folder is created or not. A MiniCluster UT isn't added because in MiniCluster the NM will always set the port to 0, but NM recovery requires a non-ephemeral port.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jerryshao/apache-spark SPARK-17321

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19032.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19032

commit 5abbe75072cf3f172f0b2e448941b94d72268c90
Author: jerryshao
Date: 2017-08-24T03:28:48Z

Avoid writing shuffle metadata to disk if NM recovery is disabled

Change-Id: Id062d71589f46052706058c151c706dae38b1e6e
[GitHub] spark issue #19017: [SPARK-21804][SQL] json_tuple returns null values within...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19017 **[Test build #81066 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81066/testReport)** for PR 19017 at commit [`ff01e04`](https://github.com/apache/spark/commit/ff01e04a8c9f1f8447c3b536f9288d3f6eaf62be).
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19031 **[Test build #81065 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81065/testReport)** for PR 19031 at commit [`a0854ad`](https://github.com/apache/spark/commit/a0854ad16003020eaa6f0d1f1c08db726b9196e2).
[GitHub] spark issue #19017: [SPARK-21804][SQL] json_tuple returns null values within...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19017 LGTM too.
[GitHub] spark pull request #19022: [Spark-21807][SQL]Override ++ operation in Expres...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19022
[GitHub] spark pull request #18968: [SPARK-21759][SQL] In.checkInputDataTypes should ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18968#discussion_r134920136
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala ---
@@ -274,17 +274,24 @@ object ScalarSubquery {
 case class ListQuery(
     plan: LogicalPlan,
     children: Seq[Expression] = Seq.empty,
-    exprId: ExprId = NamedExpression.newExprId)
+    exprId: ExprId = NamedExpression.newExprId,
+    childOutputs: Seq[Attribute] = Seq.empty)
   extends SubqueryExpression(plan, children, exprId) with Unevaluable {
-  override def dataType: DataType = plan.schema.fields.head.dataType
+  override def dataType: DataType = if (childOutputs.length > 1) {
+    childOutputs.toStructType
+  } else {
+    childOutputs.head.dataType
+  }
+  override lazy val resolved: Boolean = childrenResolved && plan.resolved && childOutputs.nonEmpty
--- End diff --
Before we fill in `childOutputs`, this `ListQuery` cannot be resolved. Otherwise, accessing its `dataType` causes a failure in `In.checkInputDataTypes`.
[GitHub] spark issue #19022: [Spark-21807][SQL]Override ++ operation in ExpressionSet...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19022 Thanks! Merging to master. You can fix this in your future PRs.
[GitHub] spark pull request #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable m...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19031#discussion_r134920085
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -577,9 +577,11 @@ object SQLConf {
     .doc("The maximum lines of a single Java function generated by whole-stage codegen. " +
       "When the generated function exceeds this threshold, " +
       "the whole-stage codegen is deactivated for this subtree of the current query plan. " +
-      "The default value 4000 is the max length of byte code JIT supported " +
-      "for a single function(8000) divided by 2.")
+      "The default value 2667 is the max length of byte code JIT supported " +
+      "for a single function(8000) divided by 2. Use -1 to disable this.")
     .intConf
+    .checkValue(maxLines => maxLines >= -1, "The maximum must not be a negative integer, -1 to " +
+      "always activate whole-stage codegen.")
--- End diff --
ok
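The `checkValue` call in the diff above validates a setting against a predicate when it is assigned, rejecting anything below -1 while letting -1 act as the "disabled" sentinel. A generic sketch of that validated-config pattern (a hypothetical helper for illustration, not Spark's `ConfigBuilder` API):

```python
class IntConf:
    """Minimal stand-in for a config entry with a validation predicate:
    every assignment (including the default) runs through the check."""

    def __init__(self, name, default, check, message):
        self.name, self.check, self.message = name, check, message
        self.value = self._validated(default)

    def _validated(self, v):
        if not self.check(v):
            raise ValueError(f"{self.name}: {self.message}")
        return v

    def set(self, v):
        self.value = self._validated(v)


max_lines = IntConf(
    "spark.sql.codegen.maxLinesPerFunction", 2667,
    check=lambda v: v >= -1,
    message="must be >= -1; -1 disables the limit")

max_lines.set(-1)   # accepted: -1 means "always activate whole-stage codegen"
```

Attempting `max_lines.set(-2)` raises `ValueError`, mirroring how the builder rejects invalid values at definition time rather than deep inside query planning.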
[GitHub] spark pull request #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable m...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19031#discussion_r134920074
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -577,9 +577,11 @@ object SQLConf {
     .doc("The maximum lines of a single Java function generated by whole-stage codegen. " +
       "When the generated function exceeds this threshold, " +
       "the whole-stage codegen is deactivated for this subtree of the current query plan. " +
-      "The default value 4000 is the max length of byte code JIT supported " +
-      "for a single function(8000) divided by 2.")
+      "The default value 2667 is the max length of byte code JIT supported " +
--- End diff --
missed...
[GitHub] spark pull request #19022: [Spark-21807][SQL]Override ++ operation in Expres...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19022#discussion_r134919964
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSetSuite.scala ---
@@ -210,4 +210,13 @@ class ExpressionSetSuite extends SparkFunSuite {
     assert((initialSet - (aLower + 1)).size == 0)
   }
+
+  test("add multiple elements to set") {
+    val initialSet = ExpressionSet(aUpper + 1 :: Nil)
+    val setToAddWithSameExpression = ExpressionSet(aUpper + 1 :: aUpper + 2 :: Nil)
+    val setToAddWithOutSameExpression = ExpressionSet(aUpper + 3 :: aUpper + 4 :: Nil)
--- End diff --
Nit: `WithOut` -> `Without`
[GitHub] spark issue #19022: [Spark-21807][SQL]Override ++ operation in ExpressionSet...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19022 LGTM
[GitHub] spark issue #19017: [SPARK-21804][SQL] json_tuple returns null values within...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19017 LGTM
[GitHub] spark issue #18997: [SPARK-21788][SS]Handle more exceptions when stopping a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18997 Merged build finished. Test PASSed.
[GitHub] spark issue #18997: [SPARK-21788][SS]Handle more exceptions when stopping a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18997 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81058/ Test PASSed.
[GitHub] spark issue #18997: [SPARK-21788][SS]Handle more exceptions when stopping a ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18997 **[Test build #81058 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81058/testReport)** for PR 18997 at commit [`bbb0b0e`](https://github.com/apache/spark/commit/bbb0b0eb5a3517bb6c278588c2a66d4b6da8027f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19017: [SPARK-21804][SQL] json_tuple returns null values within...
Github user jmchung commented on the issue: https://github.com/apache/spark/pull/19017 @viirya PR title fixed, thanks.
[GitHub] spark issue #19017: SPARK-21804: json_tuple returns null values within repea...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19017 Please edit the PR title as `[SPARK-21804][SQL] json_tuple returns ...`.
[GitHub] spark issue #19022: [Spark-21807][SQL]Override ++ operation in ExpressionSet...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19022 Merged build finished. Test PASSed.
[GitHub] spark issue #19022: [Spark-21807][SQL]Override ++ operation in ExpressionSet...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19022 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81057/ Test PASSed.
[GitHub] spark issue #19022: [Spark-21807][SQL]Override ++ operation in ExpressionSet...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19022 **[Test build #81057 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81057/testReport)** for PR 19022 at commit [`0762840`](https://github.com/apache/spark/commit/07628402eeed958c45905974c82b06211f1bc934).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable m...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19031#discussion_r134918924

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -577,9 +577,11 @@ object SQLConf {
       .doc("The maximum lines of a single Java function generated by whole-stage codegen. " +
         "When the generated function exceeds this threshold, " +
         "the whole-stage codegen is deactivated for this subtree of the current query plan. " +
-        "The default value 4000 is the max length of byte code JIT supported " +
-        "for a single function(8000) divided by 2.")
+        "The default value 2667 is the max length of byte code JIT supported " +
--- End diff --

2667?
[GitHub] spark pull request #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable m...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19031#discussion_r13491

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -577,9 +577,11 @@ object SQLConf {
       .doc("The maximum lines of a single Java function generated by whole-stage codegen. " +
         "When the generated function exceeds this threshold, " +
         "the whole-stage codegen is deactivated for this subtree of the current query plan. " +
-        "The default value 4000 is the max length of byte code JIT supported " +
-        "for a single function(8000) divided by 2.")
+        "The default value 2667 is the max length of byte code JIT supported " +
+        "for a single function(8000) divided by 2. Use -1 to disable this.")
       .intConf
+      .checkValue(maxLines => maxLines >= -1, "The maximum must not be a negative integer, -1 to " +
+        "always activate whole-stage codegen.")
--- End diff --

`The maximum must not be a negative integer, except for -1 using to always activate whole-stage codegen.`
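The validation contract being discussed (any non-negative threshold, plus `-1` as a disable sentinel) can be sketched as a plain predicate. `MaxLinesConf` and `validateMaxLines` below are illustrative helpers, not Spark's `ConfigBuilder`/`checkValue` API:

```scala
// Validates a "max lines" setting: -1 disables the check, any value >= 0 is
// a real threshold, anything below -1 is rejected -- the same predicate as
// `maxLines => maxLines >= -1` in the SQLConf diff under review.
object MaxLinesConf {
  def validateMaxLines(maxLines: Int): Int = {
    require(maxLines >= -1,
      "The maximum must not be a negative integer, except for -1, " +
        "which disables the check and always activates whole-stage codegen.")
    maxLines
  }
  def isCheckDisabled(maxLines: Int): Boolean = maxLines == -1
}
```

For example, `validateMaxLines(2667)` and `validateMaxLines(-1)` pass, while `validateMaxLines(-2)` throws; callers then treat `-1` as "no limit".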
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19031 I'd prefer using `-1` to disable the `maxLinesPerFunction` check like this.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18730 **[Test build #81064 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81064/testReport)** for PR 18730 at commit [`72aef67`](https://github.com/apache/spark/commit/72aef679b498bb042ecb9ffa8df62ed41e1f519d).
[GitHub] spark issue #19028: [MINOR][SQL] The comment of Class ExchangeCoordinator ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19028 **[Test build #81063 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81063/testReport)** for PR 19028 at commit [`837536f`](https://github.com/apache/spark/commit/837536fee8427f9b527ace401924f9a703ba38d7).
[GitHub] spark issue #19028: [MINOR][SQL] The comment of Class ExchangeCoordinator ex...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19028 ok to test
[GitHub] spark issue #19008: [SPARK-21756][SQL]Add JSON option to allow unquoted cont...
Github user vinodkc commented on the issue: https://github.com/apache/spark/pull/19008 @rxin, sure, I'll update it.
[GitHub] spark issue #18652: [SPARK-21497][SQL] Pull non-deterministic equi join keys...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18652 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81054/ Test PASSed.
[GitHub] spark issue #18652: [SPARK-21497][SQL] Pull non-deterministic equi join keys...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18652 Merged build finished. Test PASSed.
[GitHub] spark issue #19013: [SPARK-21728][core] Allow SparkSubmit to use Logging.
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19013 LGTM.
[GitHub] spark issue #18652: [SPARK-21497][SQL] Pull non-deterministic equi join keys...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18652 **[Test build #81054 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81054/testReport)** for PR 18652 at commit [`793dac4`](https://github.com/apache/spark/commit/793dac4403926fb9f1421f4bbee59a8e9b82d7e8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #18969: [SPARK-21520][SQL][FOLLOW-UP]fix a special case f...
Github user heary-cao commented on a diff in the pull request: https://github.com/apache/spark/pull/18969#discussion_r134915918

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ---
@@ -24,6 +24,24 @@ import org.apache.spark.sql.catalyst.plans._
 import org.apache.spark.sql.catalyst.plans.logical._

 /**
+ * A pattern that matches any number of project if fields is deterministic
+ * or child is LeafNode of project on top of another relational operator.
+ */
+object ProjectOperation extends PredicateHelper {
+  type ReturnType = (Seq[NamedExpression], LogicalPlan)
+
+  def unapply(plan: LogicalPlan): Option[ReturnType] = plan match {
+    case Project(fields, child) if fields.forall(_.deterministic) =>
+      Some((fields, child))
+
+    case Project(fields, child: LeafNode) =>
--- End diff --

Hi, @gatorsmile. Could you review it again?
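The `ProjectOperation` object in the diff is a Scala extractor: its `unapply` lets planner rules pattern-match and destructure a plan node. A self-contained sketch of the same technique, with toy `Plan`/`Scan`/`Project` classes standing in for Catalyst's `LogicalPlan` hierarchy (the determinism guard is omitted for brevity):

```scala
// Toy plan nodes standing in for Catalyst's LogicalPlan hierarchy.
sealed trait Plan
case class Scan(table: String) extends Plan
case class Project(fields: Seq[String], child: Plan) extends Plan

// An extractor in the style of ProjectOperation: matches a Project node and
// returns its fields and child, so rules can write `case ProjectOp(f, c) =>`.
object ProjectOp {
  def unapply(plan: Plan): Option[(Seq[String], Plan)] = plan match {
    case Project(fields, child) => Some((fields, child))
    case _                      => None
  }
}

object ExtractorDemo extends App {
  val plan: Plan = Project(Seq("a", "c"), Scan("t1"))
  plan match {
    case ProjectOp(fields, child) => println(s"${fields.mkString(",")} over $child")
    case _                        => println("no match")
  }
  // prints: a,c over Scan(t1)
}
```

The real extractor adds guards (all fields deterministic, or the child is a `LeafNode`) so that only safe `Project` nodes are collapsed by the rule.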
[GitHub] spark issue #19021: [SPARK-21603][SQL][FOLLOW-UP] Change the default value o...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19021 @maropu Should be a good idea, especially since the number of lines of code may not be an intuitive knob to set for this purpose.
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19031 **[Test build #81062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81062/testReport)** for PR 19031 at commit [`9438655`](https://github.com/apache/spark/commit/94386550523baf5f98427d3ef0b9f9815cee4c69).
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19031 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81055/ Test FAILed.
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19031 Merged build finished. Test FAILed.
[GitHub] spark issue #19031: [SPARK-21603][SQL][FOLLOW-UP] Use -1 to disable maxLines...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19031 **[Test build #81055 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81055/testReport)** for PR 19031 at commit [`60dc64e`](https://github.com/apache/spark/commit/60dc64e3dc9ad5de5604ea68d3bb5cf7defc4553).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18730 **[Test build #81061 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81061/testReport)** for PR 18730 at commit [`bab91db`](https://github.com/apache/spark/commit/bab91db933947b57159b21e5f6506570b6b721cb).
[GitHub] spark pull request #18730: [SPARK-21527][CORE] Use buffer limit in order to ...
Github user caneGuy commented on a diff in the pull request: https://github.com/apache/spark/pull/18730#discussion_r134912125

--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -63,6 +65,19 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
   }

   /**
+   * Write this buffer to a channel with slice.
+   */
+  def writeWithSlice(channel: WritableByteChannel): Unit = {
+    for (bytes <- getChunks()) {
+      val capacity = bytes.limit()
+      while (bytes.position() < capacity) {
+        bytes.limit(Math.min(capacity, bytes.position + NIO_BUFFER_LIMIT.toInt))
--- End diff --

Good review. I refactored the code.
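The loop in the diff writes a large `ByteBuffer` in bounded slices by repeatedly moving the buffer's limit forward, so no single `write` call exceeds a cap. A standalone sketch of the technique (the 256 KB `NioBufferLimit` is illustrative, not Spark's actual `NIO_BUFFER_LIMIT`):

```scala
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.channels.{Channels, WritableByteChannel}

object ChunkedWrite {
  // Illustrative cap on bytes per write call (not Spark's actual constant).
  val NioBufferLimit: Int = 256 * 1024

  // Write `bytes` to `channel` at most NioBufferLimit bytes per call by
  // sliding the buffer's limit forward; each write advances the position,
  // so the loop terminates when the whole buffer has been written.
  def writeWithSlice(bytes: ByteBuffer, channel: WritableByteChannel): Unit = {
    val capacity = bytes.limit()
    while (bytes.position() < capacity) {
      bytes.limit(math.min(capacity, bytes.position() + NioBufferLimit))
      channel.write(bytes)
    }
  }
}

object ChunkedWriteDemo extends App {
  val out = new ByteArrayOutputStream()
  val channel = Channels.newChannel(out)
  ChunkedWrite.writeWithSlice(ByteBuffer.wrap(new Array[Byte](1000000)), channel)
  println(out.size()) // prints 1000000: all bytes arrive despite the sliced writes
}
```

Capping the slice size sidesteps JVM implementations that fall off a fast path (or allocate large temporary native buffers) when handed one huge buffer in a single `write` call.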
[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18581 **[Test build #81060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81060/testReport)** for PR 18581 at commit [`47e8d37`](https://github.com/apache/spark/commit/47e8d3761681611a9ee6d50d6c812babd395dace).
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/18730 @jiangxb1987 OK, I will try to do some benchmark testing.
[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18581 **[Test build #81059 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81059/testReport)** for PR 18581 at commit [`3555e5d`](https://github.com/apache/spark/commit/3555e5dafa85dcee404599c78b17cbb97b1709f0).
[GitHub] spark issue #19021: [SPARK-21603][SQL][FOLLOW-UP] Change the default value o...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19021 Just for your info, again, I looked into this issue in TPC-DS queries; I added [some code](https://github.com/apache/spark/compare/master...maropu:SPARK-21603-FOLLOWUP-3) to check the actual bytecode size of these queries, and I found that only the gen'd functions in Q17/Q66 had bytecode over `8000`:
```
= TPCDS QUERY BENCHMARK OUTPUT FOR q17 =
17/08/23 14:45:02 WARN CodeGenerator: GeneratedClass.agg_doAggregateWithKeys is too large to do JIT compilation on HotSpot; the size of agg_doAggregateWithKeys is 17665; the limit is 8000

= TPCDS QUERY BENCHMARK OUTPUT FOR q66 =
17/08/23 14:55:39 WARN CodeGenerator: GeneratedClass.agg_doAggregateWithKeys is too large to do JIT compilation on HotSpot; the size of agg_doAggregateWithKeys is 11012; the limit is 8000
17/08/23 14:55:39 WARN CodeGenerator: GeneratedClass.agg_doAggregateWithKeys is too large to do JIT compilation on HotSpot; the size of agg_doAggregateWithKeys is 13420; the limit is 8000
17/08/23 14:55:39 WARN CodeGenerator: GeneratedClass.agg_doAggregateWithKeys is too large to do JIT compilation on HotSpot; the size of agg_doAggregateWithKeys is 16641; the limit is 8000
```
BTW, why don't we check whether the gen'd bytecode size is over `8000` directly, instead of counting code lines, in #18810? cc: @gatorsmile @viirya @kiszk
[GitHub] spark issue #18652: [SPARK-21497][SQL] Pull non-deterministic equi join keys...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18652
```
Join [t1.a = rand(t2.b), t1.c = rand(t2.d)]
  Sort
    Project [t1.a, t1.c]
      TableScan t1
  Sort
    Project [rand(t2.b) as rand(t2.b), rand(t2.d) as rand(t2.d)]
      TableScan t2
```
Aren't `rand(t2.b)` and `rand(t2.d)` already evaluated in `Project`? Why would `Sort` change the evaluation order?
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19027 Merged build finished. Test PASSed.