[GitHub] spark pull request #18971: [SPARK-21764][TESTS] Fix tests failures on Window...

2017-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18971#discussion_r135382352
  
--- Diff: 
core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -824,7 +824,7 @@ class SparkSubmitSuite
 val hadoopConf = new Configuration()
 val tmpDir = Files.createTempDirectory("tmp").toFile
 updateConfWithFakeS3Fs(hadoopConf)
-val sourcePath = s"s3a://${jarFile.getAbsolutePath}"
+val sourcePath = s"s3a://${jarFile.toURI.getPath}"
--- End diff --

**Windows:**

Before:

```
scala> f.getAbsolutePath
res1: String = C:\a\b\c
```

After:

```
scala> f.toURI.getPath
res2: String = /C:/a/b/c
```

**Linux:**

Before:

```
scala> new File("/a/b/c").getAbsolutePath
res0: String = /a/b/c
```

After:

```
scala> new File("/a/b/c").toURI.getPath
res1: String = /a/b/c
```
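
For illustration, a minimal, self-contained Scala sketch (not from the test itself) of why the replacement matters:

```scala
import java.io.File
import java.nio.file.Files

// Temporary file standing in for the test's jar file.
val jarFile: File = Files.createTempFile("example", ".jar").toFile

// Platform-dependent: back-slash separators on Windows (e.g. C:\a\b\c.jar),
// which does not form a valid path after "s3a://".
val platformPath = jarFile.getAbsolutePath

// Platform-independent: always '/'-separated with a leading '/'
// (e.g. /C:/a/b/c.jar on Windows, /tmp/... elsewhere).
val uriPath = jarFile.toURI.getPath

val sourcePath = s"s3a://$uriPath"
```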





[GitHub] spark pull request #18971: [SPARK-21764][TESTS] Fix tests failures on Window...

2017-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18971#discussion_r135382393
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala ---
@@ -203,7 +203,7 @@ class StatisticsSuite extends 
StatisticsCollectionTestBase with TestHiveSingleto
   sql(s"INSERT INTO TABLE $tableName PARTITION (ds='$ds') SELECT * 
FROM src")
 }
 
-sql(s"ALTER TABLE $tableName SET LOCATION '$path'")
+sql(s"ALTER TABLE $tableName SET LOCATION '${path.toURI}'")
--- End diff --

These tests do not look dedicated to testing paths; I have fixed such cases so far.
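
For illustration, a minimal Scala sketch (with a hypothetical directory) of the difference this makes:

```scala
import java.io.File

// Interpolating a raw File into SQL yields back-slashes on Windows,
// while toURI gives a well-formed file: location on any platform.
val path = new File(System.getProperty("java.io.tmpdir"), "tbl")
val rawLocation = s"ALTER TABLE t SET LOCATION '$path'"          // e.g. '...\tbl' on Windows
val uriLocation = s"ALTER TABLE t SET LOCATION '${path.toURI}'"  // e.g. 'file:/C:/.../tbl'
```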





[GitHub] spark pull request #18971: [SPARK-21764][TESTS] Fix tests failures on Window...

2017-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18971#discussion_r135382374
  
--- Diff: 
core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala ---
@@ -221,12 +226,14 @@ class ReplayListenerSuite extends SparkFunSuite with 
BeforeAndAfter with LocalSp
 def didFail: Boolean = countDown.get == 0
 
 @throws[IOException]
-def read: Int = {
+override def read(): Int = {
   if (countDown.get == 0) {
 throw new EOFException("Stream ended prematurely")
   }
   countDown.decrementAndGet()
-  in.read
+  in.read()
 }
+
+override def close(): Unit = in.close()
--- End diff --

`EarlyEOFInputStream` was not being closed.
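
For illustration, a minimal sketch of the pattern (the class name is illustrative, not the test's actual `EarlyEOFInputStream`):

```scala
import java.io.{EOFException, InputStream}

// A wrapping stream that fails after a fixed number of reads and, crucially,
// forwards close() to the underlying stream so the resource is released.
class FailingStream(in: InputStream, failAfter: Int) extends InputStream {
  private var remaining = failAfter
  override def read(): Int = {
    if (remaining == 0) throw new EOFException("Stream ended prematurely")
    remaining -= 1
    in.read()
  }
  override def close(): Unit = in.close()  // the delegation the fix adds
}
```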





[GitHub] spark pull request #18971: [SPARK-21764][TESTS] Fix tests failures on Window...

2017-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18971#discussion_r135382209
  
--- Diff: 
core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala ---
@@ -137,9 +137,10 @@ class RPackageUtilsSuite
 IvyTestUtils.withRepository(main, None, None) { repo =>
   val jar = IvyTestUtils.packJar(new File(new URI(repo)), dep1, Nil,
 useIvyLayout = false, withR = false, None)
-  val jarFile = new JarFile(jar)
-  assert(jarFile.getManifest == null, "jar file should have null 
manifest")
-  assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest 
should return false")
+  Utils.tryWithResource(new JarFile(jar)) { jarFile =>
+assert(jarFile.getManifest == null, "jar file should have null 
manifest")
+assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest 
should return false")
+  }
--- End diff --

This simply closes the `JarFile`, which should be closed.
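
For illustration, a minimal sketch of the loan pattern used here; `withResource` below is a stand-in for Spark's internal `Utils.tryWithResource`, not its actual definition:

```scala
import java.io.Closeable
import java.util.jar.JarFile

// Open a resource, run a function on it, and close it even if the function throws.
def withResource[R <: Closeable, T](create: => R)(f: R => T): T = {
  val resource = create
  try f(resource) finally resource.close()
}

// Usage mirroring the test (hypothetical jar path):
// withResource(new JarFile("/tmp/example.jar")) { jarFile =>
//   jarFile.getManifest == null
// }
```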





[GitHub] spark pull request #18971: [SPARK-21764][TESTS] Fix tests failures on Window...

2017-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18971#discussion_r135382368
  
--- Diff: 
core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala ---
@@ -112,17 +112,19 @@ class ReplayListenerSuite extends SparkFunSuite with 
BeforeAndAfter with LocalSp
 
 // Verify the replay returns the events given the input maybe 
truncated.
 val logData = EventLoggingListener.openEventLog(logFilePath, 
fileSystem)
-val failingStream = new EarlyEOFInputStream(logData, buffered.size - 
10)
-replayer.replay(failingStream, logFilePath.toString, true)
+Utils.tryWithResource(new EarlyEOFInputStream(logData, buffered.size - 
10)) { failingStream =>
--- End diff --

Here `EarlyEOFInputStream` was not being closed.





[GitHub] spark issue #18971: [SPARK-21764][TESTS] Fix tests failures on Windows: reso...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18971
  
**[Test build #81151 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81151/testReport)**
 for PR 18971 at commit 
[`236b986`](https://github.com/apache/spark/commit/236b986bfd5fcedfe390ad3b6b566d53f84dd89c).





[GitHub] spark pull request #19019: [MINOR][DOCS] Minor doc fixes related with doc bu...

2017-08-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19019





[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18581
  
**[Test build #81150 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81150/testReport)**
 for PR 18581 at commit 
[`18b2684`](https://github.com/apache/spark/commit/18b268457fa0124d1c3c484ee02210ee674a9466).





[GitHub] spark issue #19019: [MINOR][DOCS] Minor doc fixes related with doc build and...

2017-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19019
  
Merged to master.





[GitHub] spark pull request #18581: [SPARK-21289][SQL][ML] Supports custom line separ...

2017-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18581#discussion_r135381676
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFileLinesReader.scala
 ---
@@ -32,7 +32,9 @@ import 
org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
  * in that file.
  */
 class HadoopFileLinesReader(
-file: PartitionedFile, conf: Configuration) extends Iterator[Text] 
with Closeable {
+file: PartitionedFile,
+lineSeparator: Option[String],
--- End diff --

Thanks for clarifying it. Here is my investigation:

> When the line delimiter is '\n', any of the following sequences will count as a delimiter: "\n", "\r\n", or "\r"

With this input:

```
a\nb\r\nc\rd
```

Case with `\n`:

```sql
CREATE EXTERNAL TABLE tbl(value STRING)
ROW FORMAT DELIMITED LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION '...';
```

```sql
SELECT value FROM tbl;
```

produced

```
a
b
c
d
```

This looks incorrect. I _guess_ `\n` is not actually being set, and it looks like it falls back to the default behaviour of `LineRecordReader`.


> Accepting a single "\r" is pretty strange, but that's what Hive does so 
we emulate this behavior.

Case with `\r`:

```sql
CREATE EXTERNAL TABLE tbl(value STRING)
ROW FORMAT DELIMITED LINES TERMINATED BY '\r'
STORED AS TEXTFILE LOCATION '...';
```

produced

```
FAILED: SemanticException 2:41 LINES TERMINATED BY only supports newline 
'\n' right now. Error encountered near token ''\r''
...
org.apache.hadoop.hive.ql.parse.SemanticException: 2:41 LINES TERMINATED BY 
only supports newline '\n' right now. Error encountered near token ''\r''
```

This looks related to https://issues.apache.org/jira/browse/HIVE-5999

and these lines:


https://github.com/apache/hive/blob/696be9f52dfc6fb59c24de19726b4460100fc9ba/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L198-L203

if I am not mistaken.

I am curious how this case was tested in the JIRA. If that test used `textinputformat.record.delimiter`, then this is a Hadoop property, which is basically the same thing as what I am doing here.


> Is Hive using Hadoop's LineRecordReader?

In the case above, the input format was 
`org.apache.hadoop.mapred.TextInputFormat`, which uses `LineRecordReader`. 

> How does Hive support it?

It looks like Hive tries to support it via `LINES TERMINATED BY '\r'` (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableCreate/Drop/TruncateTable). I could not find other (formal) ways.
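
For reference, a hedged sketch of that Hadoop-level mechanism (it assumes an existing SparkContext `sc` and a hypothetical input path; this is not the code under review):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Setting textinputformat.record.delimiter makes LineRecordReader split on a
// custom separator instead of its default \n / \r\n / \r handling.
val conf = new Configuration()
conf.set("textinputformat.record.delimiter", "\r")

val records = sc.newAPIHadoopFile(
    "/tmp/input.txt",          // hypothetical path
    classOf[TextInputFormat],
    classOf[LongWritable],
    classOf[Text],
    conf)
  .map(pair => pair._2.toString)
```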






[GitHub] spark pull request #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions ...

2017-08-25 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18966#discussion_r135381627
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -769,16 +769,21 @@ class CodegenContext {
   foldFunctions: Seq[String] => String = _.mkString("", ";\n", ";")): 
String = {
 val blocks = new ArrayBuffer[String]()
 val blockBuilder = new StringBuilder()
+val maxLines = SQLConf.get.maxCodegenLinesPerFunction
--- End diff --

Got it. Depending on the calling context, it may or may not take effect. Should we pass `SQLConf` to this method?
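
For context, a minimal sketch (not Spark's actual implementation) of the line-count-based splitting that `maxLines` controls in `splitExpressions`; the threshold value and names are illustrative:

```scala
import scala.collection.mutable.ArrayBuffer

// Accumulate generated code snippets into blocks, starting a new block once
// a block would exceed maxLines lines.
def splitByLines(snippets: Seq[String], maxLines: Int): Seq[String] = {
  val blocks = ArrayBuffer.empty[String]
  val current = new StringBuilder
  var lines = 0
  for (s <- snippets) {
    val n = s.count(_ == '\n') + 1
    if (lines + n > maxLines && current.nonEmpty) {
      blocks += current.toString
      current.clear()
      lines = 0
    }
    current.append(s).append("\n")
    lines += n
  }
  if (current.nonEmpty) blocks += current.toString
  blocks.toSeq
}
```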





[GitHub] spark issue #18989: [SPARK-21781][SQL] Modify DataSourceScanExec to use conc...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18989
  
**[Test build #81149 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81149/testReport)**
 for PR 18989 at commit 
[`9effea9`](https://github.com/apache/spark/commit/9effea9379313b0aac1f392ca11ce0f678bb1e0c).





[GitHub] spark issue #18989: [SPARK-21781][SQL] Modify DataSourceScanExec to use conc...

2017-08-25 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/18989
  
Jenkins, retest this please





[GitHub] spark pull request #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions ...

2017-08-25 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18966#discussion_r135380779
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -769,16 +769,21 @@ class CodegenContext {
   foldFunctions: Seq[String] => String = _.mkString("", ";\n", ";")): 
String = {
 val blocks = new ArrayBuffer[String]()
 val blockBuilder = new StringBuilder()
+val maxLines = SQLConf.get.maxCodegenLinesPerFunction
--- End diff --

I see. I had another interpretation: that this value may not change performance. Let me check this; I implemented it the same way as the other flags.





[GitHub] spark issue #19043: [SPARK-21831][TEST] Remove `spark.sql.hive.convertMetast...

2017-08-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19043
  
Thank you for reviewing and merging!





[GitHub] spark pull request #19043: [SPARK-21831][TEST] Remove `spark.sql.hive.conver...

2017-08-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19043





[GitHub] spark issue #19043: [SPARK-21831][TEST] Remove `spark.sql.hive.convertMetast...

2017-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19043
  
LGTM





[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/19048
  
> Maybe it would be cleaner if we provide a new API like this - killExecutorsAndNotUpdateTotal?

I think the main thing that bothers me is that adding anything to the API 
is making all this code even more complicated and confusing than it already is.

Having two (3 if you count the YARN allocator) places track all this state 
is bound to lead to these issues. Optimally only the EAM would keep track of 
these things; the CGSB shouldn't really be dealing with executor allocation and 
de-allocation, just with managing the existing executors that connect to it. 
But fixing things like that is probably a much larger change (the words 
"hornets' nest" come to mind).

Barring that, I think that we should make the change that leads to the 
correct behavior without making the internal interface more complicated than it 
needs to be. If changing the semantics of `ExecutorAllocationClient` leads to the code being easier to follow, then that's what we should do. After all,
there is a single implementation of it (the CGSB). (And, digressing back to my 
paragraph above, maybe `ExecutorAllocationClient` shouldn't even exist and we 
should only have the EAM. But back to this PR.)

Or maybe you can reach the same thing through other means. For example, 
maybe if you get rid of the `replace` argument and make `killExecutors` not 
update the CGSB target count, and then force the caller to call 
`requestTotalExecutors` before killing executors, you could achieve the same 
thing. Maybe there are corner cases doing that, but maybe it works?

If none of those work, then we can talk about adding new things.





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/18953
  
Now, it becomes `+432 −98`.





[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue:

https://github.com/apache/spark/pull/19048
  
>> this code in the EAM: Should be changed to account for the current 
number of executors, so that the EAM doesn't tell the CGSB that it wants less 
executors than currently exist. 

Actually, if you look at the `ExecutorAllocationManager` API, this is how `requestTotalExecutors` behaves: `The total number of executors we'd like to have. The cluster manager shouldn't kill any running executor to reach this number, but, if all existing executors were to die, this is the number of executors we'd want to be allocated.` So the EAM is right in setting the number of total executors it needs to 5 because, if all executors were to die, it is up to the cluster manager to spawn 5 executors (not 10).

>>Your solution (the new updateTotalExecutor) looks too much like the 
existing replace parameter, and it's a little confusing if you try to think 
about how to use both. What does it mean to ask for updateTotalExecutor = false 
and replace = false? The latter means you want the executor count to go down, 
while the former means you don't.

I agree with you on this. Maybe it would be cleaner if we provided a new API like `killExecutorsAndNotUpdateTotal`?


 





[GitHub] spark pull request #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions ...

2017-08-25 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18966#discussion_r135378299
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -769,16 +769,21 @@ class CodegenContext {
   foldFunctions: Seq[String] => String = _.mkString("", ";\n", ";")): 
String = {
 val blocks = new ArrayBuffer[String]()
 val blockBuilder = new StringBuilder()
+val maxLines = SQLConf.get.maxCodegenLinesPerFunction
--- End diff --

What do you mean? Could you elaborate this review comment?





[GitHub] spark pull request #18488: [SPARK-21255][SQL][WIP] Fixed NPE when creating e...

2017-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18488#discussion_r135377225
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala
 ---
@@ -118,6 +119,10 @@ object JavaTypeInference {
 val (valueDataType, nullable) = inferDataType(valueType, 
seenTypeSet)
 (MapType(keyDataType, valueDataType, nullable), true)
 
+  case other if other.isEnum =>
+(StructType(Seq(StructField(typeToken.getRawType.getSimpleName,
--- End diff --

Why do we map an enum to a struct type? Shouldn't an enum always have a single field?
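
For reference, a hedged sketch of the two shapes being discussed; the field name and use of `StringType` are illustrative, not the PR's final choice:

```scala
import org.apache.spark.sql.types._

// The diff wraps the enum in a single-field struct...
val asStruct = StructType(Seq(StructField("MyEnum", StringType, nullable = false)))

// ...while the question is whether an enum should just map to a single value.
val asSingleField = StringType
```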





[GitHub] spark issue #18645: [SPARK-14280][BUILD][WIP] Update change-version.sh and p...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18645
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18645: [SPARK-14280][BUILD][WIP] Update change-version.sh and p...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18645
  
**[Test build #81143 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81143/testReport)**
 for PR 18645 at commit 
[`273dbdb`](https://github.com/apache/spark/commit/273dbdb4b9544630141415ef43b31ec522ff0bd8).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18645: [SPARK-14280][BUILD][WIP] Update change-version.sh and p...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18645
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81143/
Test FAILed.





[GitHub] spark issue #19057: SPARK-21843: testNameNote should be (minNumPostShufflePa...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19057
  
Can one of the admins verify this patch?





[GitHub] spark pull request #19057: SPARK-21843: testNameNote should be (minNumPostSh...

2017-08-25 Thread iamhumanbeing
GitHub user iamhumanbeing opened a pull request:

https://github.com/apache/spark/pull/19057

SPARK-21843: testNameNote should be "(minNumPostShufflePartitions: 5)" in ExchangeCoordinatorSuite

## What changes were proposed in this pull request?

testNameNote = "(minNumPostShufflePartitions: 3)" is not correct. It should be "(minNumPostShufflePartitions: " + numPartitions + ")" in ExchangeCoordinatorSuite.
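
For illustration, a minimal sketch of the intended construction (the variable names follow the suite, but this is not the exact code in ExchangeCoordinatorSuite):

```scala
// Build the note from the actual partition count instead of hard-coding it.
val numPartitions = 5
val testNameNote = "(minNumPostShufflePartitions: " + numPartitions + ")"
```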

## How was this patch tested?

unit tests


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/iamhumanbeing/spark testNameNote

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19057.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19057


commit 77fee981d2cc0ee505a1bf853d1fe3fb53346340
Author: Peng Xiao 
Date:   2017-08-24T18:15:00Z

change Set(5, 3) to Seq(5, 3, 5) & Set(2, 3) to Seq(2, 2, 2, 3) in 
ExchangeCoordinatorSuite.scala

commit 5a069486d4dace726f6792f82454975ae10a2190
Author: Peng Xiao 
Date:   2017-08-26T00:42:43Z

SPARK-21843: testNameNote should be  in 
ExchangeCoordinatorSuite

Signed-off-by: Peng Xiao 







[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19056
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81147/
Test PASSed.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19056
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19056
  
**[Test build #81147 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81147/testReport)**
 for PR 19056 at commit 
[`5c61a13`](https://github.com/apache/spark/commit/5c61a13f53f09673705fcc1baa6c084e593c8b00).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18953
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81148/
Test PASSed.





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18953
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18953
  
**[Test build #81148 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81148/testReport)**
 for PR 18953 at commit 
[`6548cf8`](https://github.com/apache/spark/commit/6548cf877cf71eccf7cc6c4e14072b7d478c4e74).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/19048
  
I think I see what you're saying. But I still think it's the fault of the 
EAM.

> But please note that while killing 2 executors the EAM did not reduce its 
target to 3, it is still 5.

And I think the problem here is that the EAM should not be telling the CGSB 
that the target is 5 when 5 is actually the "minimum" the EAM wants, but there 
may be more executors running that haven't timed out yet. Basically, this code 
in the EAM:

```
  if (numExecutorsTarget < oldNumExecutorsTarget) {
client.requestTotalExecutors(numExecutorsTarget, 
localityAwareTasks, hostToLocalTaskCount)
logDebug(s"Lowering target number of executors to 
$numExecutorsTarget (previously " +
  s"$oldNumExecutorsTarget) because not all requested executors are 
actually needed")
  }
```

Should be changed to account for the current number of executors, so that the EAM doesn't tell the CGSB that it wants fewer executors than currently exist. Because even if the EAM may not currently "need" the extra executors, it
hasn't timed them out, so they need to be counted towards the "number of 
executors that I expect to be active".

Your solution (the new `updateTotalExecutor`) looks too much like the 
existing `replace` parameter, and it's a little confusing if you try to think 
about how to use both. What does it mean to ask for `updateTotalExecutor = 
false` and `replace  = false`? The latter means you want the executor count to 
go down, while the former means you don't.

Now if the EAM tells the CGSB the correct number of executors it expects to be active (which means something like `max(executors I need, active executors)`), then the problem should go away, no?
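
For illustration, a self-contained sketch of that adjustment (not the actual EAM code; the names are illustrative):

```scala
// Never report a target lower than the number of executors still alive.
def adjustedTarget(neededExecutors: Int, activeExecutors: Int): Int =
  math.max(neededExecutors, activeExecutors)

// Example from this thread: the EAM needs 5, but 10 executors are still alive,
// so the value sent to the CGSB stays at 10 until some of them time out.
assert(adjustedTarget(5, 10) == 10)
```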





[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/18805
  
Yes, licenses have to be updated, that's the one type of thing that's not 
optional.
But Marcelo is right that the library actually doesn't yet include the 
newer dependency with the right license. We can't pull it in until it pulls in 
1.3.1.





[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue:

https://github.com/apache/spark/pull/18805
  
>> I think this will be OK but we do need to add these two licenses to 
licenses/ (see the convention there) and also add a line for each in LICENSE 
here.

@srowen - Does that need to be done with this PR? What are the next steps 
for this PR?





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19056
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81145/
Test PASSed.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19056
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19056
  
**[Test build #81145 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81145/testReport)**
 for PR 19056 at commit 
[`4036767`](https://github.com/apache/spark/commit/4036767f68770324901ee3edbe01f30fe3bba1b4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue:

https://github.com/apache/spark/pull/19048
  
>> Why? Because of the idle timeout? If that's your point, then the change 
I referenced above should avoid that.

Yes, because of the idle timeout. Note that the `numExecutorsTarget` is 5 and the EAM has 10 executors available, so it is fine to kill 2 of them. That is not the issue.

>> How? The scheduler (a.k.a. CGSB) does not kill executors on its own. It 
has to be told to do so in some way

Because the EAM asks it to kill 2 of them. But please note that while killing 2 executors the EAM did not reduce its target to 3; it is still 5. But since the scheduler keeps its own internal target, it reduces its target from 5 to 3, and the EAM and the scheduler get out of sync.

>> If you can actually provide logs that show what you're trying to say 
that would probably be easier.

Actually, I added a lot of debug logging to find this issue, so the existing logs are probably not going to be of any help to you.





[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/19048
  
If you can actually provide logs that show what you're trying to say that 
would probably be easier.





[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/19048
  
> but the EAM asks the scheduler to kill 2 of them.

Why? Because of the idle timeout? If that's your point, then the change I 
referenced above should avoid that.

> The scheduler decides to kill 2 of them and sets the new target as 3. While the EAM has set the target as 5

How? The scheduler (a.k.a. CGSB) does not kill executors on its own. It has 
to be told to do so in some way.





[GitHub] spark issue #19048: [SPARK-21834] Incorrect executor request in case of dyna...

2017-08-25 Thread sitalkedia
Github user sitalkedia commented on the issue:

https://github.com/apache/spark/pull/19048
  
To be clear, there is no issue on the EAM side. Consider the following situation (a worked version of these numbers is sketched below):

- 10 executors are running; each executor can run at most 4 tasks.
- 20 tasks are running, so the EAM sets its internal target to 5 and also asks the CGSB to set its `requestedTotalExecutors` to 5. However, it cannot kill any executor yet because all of them have at least one running task.
- 2 tasks on 2 executors succeed; the target is still 5 (18/4, rounded up), but the EAM asks the scheduler to kill 2 of them.
- The scheduler decides to kill 2 of them and sets its new target to 3, while the EAM still has its target set to 5.
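
For illustration, the numbers above worked through in a self-contained sketch (assuming target = ceil(running tasks / tasks per executor)):

```scala
// Hypothetical helper reproducing the arithmetic in the scenario above.
def target(runningTasks: Int, tasksPerExecutor: Int): Int =
  (runningTasks + tasksPerExecutor - 1) / tasksPerExecutor

assert(target(20, 4) == 5)  // initial target with 20 running tasks
assert(target(18, 4) == 5)  // after 2 tasks finish, the EAM target is still 5
```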





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19056
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81144/
Test PASSed.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19056
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19056
  
**[Test build #81144 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81144/testReport)**
 for PR 19056 at commit 
[`b833495`](https://github.com/apache/spark/commit/b83349567760dd0d33388d3fc68d8db1b648e1f1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18953
  
**[Test build #81148 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81148/testReport)**
 for PR 18953 at commit 
[`6548cf8`](https://github.com/apache/spark/commit/6548cf877cf71eccf7cc6c4e14072b7d478c4e74).





[GitHub] spark issue #19047: [SPARK-21798]: No config to replace deprecated SPARK_CLA...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19047
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19047: [SPARK-21798]: No config to replace deprecated SPARK_CLA...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19047
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81141/
Test PASSed.





[GitHub] spark issue #19047: [SPARK-21798]: No config to replace deprecated SPARK_CLA...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19047
  
**[Test build #81141 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81141/testReport)**
 for PR 19047 at commit 
[`e421a03`](https://github.com/apache/spark/commit/e421a03acbd410a835cf3117fe6592523dc649b5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18098: [SPARK-16944][Mesos] Improve data locality when launchin...

2017-08-25 Thread gpang
Github user gpang commented on the issue:

https://github.com/apache/spark/pull/18098
  
@mgummelt @lins05 @skonto The tests passed with SparkQA, but something on AppVeyor failed. The AppVeyor output also shows the tests passing, but it timed out on something. Do you know how to re-trigger the AppVeyor test? Thanks!





[GitHub] spark issue #19055: [SPARK-21839][SQL] Support SQL config for ORC compressio...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19055
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81140/
Test PASSed.





[GitHub] spark issue #19055: [SPARK-21839][SQL] Support SQL config for ORC compressio...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19055
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19055: [SPARK-21839][SQL] Support SQL config for ORC compressio...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19055
  
**[Test build #81140 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81140/testReport)**
 for PR 19055 at commit 
[`5998c29`](https://github.com/apache/spark/commit/5998c296407a677b0cc7a810802c9b8dfb171b53).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18953
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18953
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81142/
Test FAILed.





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18953
  
**[Test build #81142 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81142/testReport)**
 for PR 18953 at commit 
[`b9b348d`](https://github.com/apache/spark/commit/b9b348de40bab16fd43d033f7191e6ee868246af).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread joseph-torres
Github user joseph-torres commented on the issue:

https://github.com/apache/spark/pull/19056
  
Addressed all comments.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19056
  
**[Test build #81147 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81147/testReport)**
 for PR 19056 at commit 
[`5c61a13`](https://github.com/apache/spark/commit/5c61a13f53f09673705fcc1baa6c084e593c8b00).





[GitHub] spark issue #19055: [SPARK-21839][SQL] Support SQL config for ORC compressio...

2017-08-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19055
  
Hi, @cloud-fan and @gatorsmile .
Could you review this ORC option PR? This is spun off from #18953 in order 
to reduce the review burden.





[GitHub] spark issue #19055: [SPARK-21839][SQL] Support SQL config for ORC compressio...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19055
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19055: [SPARK-21839][SQL] Support SQL config for ORC compressio...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19055
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81139/
Test PASSed.





[GitHub] spark issue #19055: [SPARK-21839][SQL] Support SQL config for ORC compressio...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19055
  
**[Test build #81139 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81139/testReport)**
 for PR 19055 at commit 
[`f3ccfec`](https://github.com/apache/spark/commit/f3ccfec01851079393521884c3c5df1d0cc92644).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19056
  
Merged build finished. Test FAILed.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19056
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81146/
Test FAILed.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19056
  
**[Test build #81146 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81146/testReport)**
 for PR 19056 at commit 
[`a25534e`](https://github.com/apache/spark/commit/a25534eb2ef7c303ff77dce92aad543ca6c171d7).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-25 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/19056#discussion_r135360845
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala ---
@@ -65,11 +66,12 @@ object PropagateEmptyRelation extends Rule[LogicalPlan] with PredicateHelper {
   case _: RepartitionByExpression => empty(p)
   // An aggregate with non-empty group expression will return one output row per group when the
   // input to the aggregate is not empty. If the input to the aggregate is empty then all groups
-  // will be empty and thus the output will be empty.
+  // will be empty and thus the output will be empty. If we're working on batch data, we can
+  // then treat the aggregate as redundant.
   //
   // If the grouping expressions are empty, however, then the aggregate will always produce a
   // single output row and thus we cannot propagate the EmptyRelation.
-  case Aggregate(ge, _, _) if ge.nonEmpty => empty(p)
+  case Aggregate(ge, _, _) if ge.nonEmpty and !p.isStreaming => empty(p)
--- End diff --

also make sure that this exception is covered by the tests. 
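
For reference, a minimal sketch of what such a test could look like in 
`PropagateEmptyRelationSuite`, assuming the Catalyst test DSL, the suite's 
existing `Optimize` rule executor, and the `isStreaming` flag that the parent 
SPARK-21765 change adds to `LocalRelation` (names and data are illustrative, 
not the PR's actual test):

```
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation

test("empty streaming relation does not collapse a grouped aggregate") {
  // A streaming relation that currently has no rows.
  val streamingRelation = LocalRelation(Seq('a.int), data = Seq.empty, isStreaming = true)
  val query = streamingRelation.groupBy('a)('a).analyze

  val optimized = Optimize.execute(query)

  // The aggregate must survive: rows may still arrive on later triggers, and
  // rewriting it to an empty batch relation would also flip the isStreaming bit.
  comparePlans(optimized, query)
}
```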





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19056
  
**[Test build #81146 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81146/testReport)**
 for PR 19056 at commit 
[`a25534e`](https://github.com/apache/spark/commit/a25534eb2ef7c303ff77dce92aad543ca6c171d7).





[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-25 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/19056#discussion_r135360650
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala ---
@@ -65,11 +66,12 @@ object PropagateEmptyRelation extends Rule[LogicalPlan] with PredicateHelper {
   case _: RepartitionByExpression => empty(p)
   // An aggregate with non-empty group expression will return one output row per group when the
   // input to the aggregate is not empty. If the input to the aggregate is empty then all groups
-  // will be empty and thus the output will be empty.
+  // will be empty and thus the output will be empty. If we're working on batch data, we can
+  // then treat the aggregate as redundant.
   //
   // If the grouping expressions are empty, however, then the aggregate will always produce a
   // single output row and thus we cannot propagate the EmptyRelation.
-  case Aggregate(ge, _, _) if ge.nonEmpty => empty(p)
+  case Aggregate(ge, _, _) if ge.nonEmpty and !p.isStreaming => empty(p)
--- End diff --

Can you add to the docs above why we are avoiding this when it's streaming?





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19056
  
**[Test build #81145 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81145/testReport)**
 for PR 19056 at commit 
[`4036767`](https://github.com/apache/spark/commit/4036767f68770324901ee3edbe01f30fe3bba1b4).





[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-25 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/19056#discussion_r135358693
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
 ---
@@ -63,6 +63,11 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] 
extends Logging {
   /** Defines a sequence of rule batches, to be overridden by the 
implementation. */
   protected def batches: Seq[Batch]
 
+  /** Checks invariants that should hold across rule execution. */
--- End diff --

nit: rule execution*s*





[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-25 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/19056#discussion_r135358635
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala
 ---
@@ -86,6 +91,8 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] 
extends Logging {
 val runTime = System.nanoTime() - startTime
 RuleExecutor.timeMap.addAndGet(rule.ruleName, runTime)
 
+checkInvariants(result, plan, rule)
--- End diff --

Call this only when the plan has changed. So just move this inside the 
condition below.
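
In other words, a sketch of the suggested placement inside 
`RuleExecutor.execute` (only an excerpt; the `fastEquals` guard and the trace 
logging already exist there):

```
if (!result.fastEquals(plan)) {
  // The rule changed the plan, so this is the only case worth verifying.
  checkInvariants(result, plan, rule)
  // ... existing trace logging of the before/after plans ...
}
```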





[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-25 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/19056#discussion_r135358597
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -39,6 +39,15 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
 
   protected def fixedPoint = FixedPoint(SQLConf.get.optimizerMaxIterations)
 
+  override protected def checkInvariants(
+      result: LogicalPlan,
+      original: LogicalPlan,
+      rule: Rule[LogicalPlan]): Unit = {
+    assert(
+      result.isStreaming == original.isStreaming,
+      s"Rule ${rule.ruleName} changed isStreaming from original ${original.isStreaming}")
--- End diff --

Print the original and result plans as well, so that it's easy to debug.
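
A sketch of how the assertion could carry both plans in its message, following 
the names in the diff above (formatting is illustrative):

```
override protected def checkInvariants(
    result: LogicalPlan,
    original: LogicalPlan,
    rule: Rule[LogicalPlan]): Unit = {
  assert(
    result.isStreaming == original.isStreaming,
    s"""Rule ${rule.ruleName} changed isStreaming from ${original.isStreaming} to ${result.isStreaming}.
       |Original plan:
       |$original
       |Result plan:
       |$result""".stripMargin)
}
```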





[GitHub] spark pull request #19032: [SPARK-17321][YARN] Avoid writing shuffle metadat...

2017-08-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/19032#discussion_r135356694
  
--- Diff: 
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
 ---
@@ -170,7 +178,7 @@ protected void serviceInit(Configuration conf) throws Exception {
   List<TransportServerBootstrap> bootstraps = Lists.newArrayList();
   boolean authEnabled = conf.getBoolean(SPARK_AUTHENTICATE_KEY, DEFAULT_SPARK_AUTHENTICATE);
   if (authEnabled) {
-    createSecretManager();
+    createSecretManager(recoveryEnabled);
--- End diff --

I think at this point it would be cleaner to do:

```
secretManager = new ShuffleSecretManager();
if (recoveryEnabled) {
  loadSecretsFromDb();
}
```






[GitHub] spark pull request #19032: [SPARK-17321][YARN] Avoid writing shuffle metadat...

2017-08-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/19032#discussion_r135356403
  
--- Diff: 
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
 ---
@@ -73,6 +75,8 @@
 public class YarnShuffleService extends AuxiliaryService {
   private static final Logger logger = 
LoggerFactory.getLogger(YarnShuffleService.class);
 
+  private static final boolean DEFAULT_NM_RECOVERY_ENABLED = false;
--- End diff --

Isn't this in `YarnConfiguration`?
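
For what it's worth, Hadoop's `YarnConfiguration` does appear to ship both the 
key and its default (constant names recalled from memory; worth double-checking 
against the Hadoop version Spark builds against), so a sketch of reusing them 
would be:

```
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.yarn.conf.YarnConfiguration

// Reuse Hadoop's own key and default instead of redefining DEFAULT_NM_RECOVERY_ENABLED.
def isRecoveryEnabled(conf: Configuration): Boolean =
  conf.getBoolean(
    YarnConfiguration.NM_RECOVERY_ENABLED,
    YarnConfiguration.DEFAULT_NM_RECOVERY_ENABLED)
```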





[GitHub] spark issue #18659: [SPARK-21404][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18659
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18659: [SPARK-21404][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81138/
Test PASSed.





[GitHub] spark issue #18659: [SPARK-21404][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18659
  
**[Test build #81138 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81138/testReport)**
 for PR 18659 at commit 
[`38474d8`](https://github.com/apache/spark/commit/38474d8cf78ecb2ffad7c185bf9d74a2a52c2de7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r135353557
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala
 ---
@@ -58,13 +58,40 @@ package object config {
 
   private[spark] val DRIVER_LABELS =
 ConfigBuilder("spark.mesos.driver.labels")
-  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs.  Key-value " +
+  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs. Key-value " +
 "pairs should be separated by a colon, and commas used to list 
more than one." +
 "Ex. key:value,key2:value2")
   .stringConf
   .createOptional
 
-  private[spark] val DRIVER_FAILOVER_TIMEOUT =
+  private[spark] val SECRET_NAME =
+ConfigBuilder("spark.mesos.driver.secret.name")
+  .doc("A comma-separated list of secret references. Consult the Mesos 
Secret protobuf for " +
+"more information.")
+  .stringConf
+  .createOptional
+
+  private[spark] val SECRET_VALUE =
+ConfigBuilder("spark.mesos.driver.secret.value")
+  .doc("A comma-separated list of secret values.")
--- End diff --

Ditto.





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r135353586
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala
 ---
@@ -58,13 +58,40 @@ package object config {
 
   private[spark] val DRIVER_LABELS =
 ConfigBuilder("spark.mesos.driver.labels")
-  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs.  Key-value " +
+  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs. Key-value " +
 "pairs should be separated by a colon, and commas used to list 
more than one." +
 "Ex. key:value,key2:value2")
   .stringConf
   .createOptional
 
-  private[spark] val DRIVER_FAILOVER_TIMEOUT =
+  private[spark] val SECRET_NAME =
+ConfigBuilder("spark.mesos.driver.secret.name")
+  .doc("A comma-separated list of secret references. Consult the Mesos 
Secret protobuf for " +
+"more information.")
+  .stringConf
+  .createOptional
+
+  private[spark] val SECRET_VALUE =
+ConfigBuilder("spark.mesos.driver.secret.value")
+  .doc("A comma-separated list of secret values.")
+  .stringConf
+  .createOptional
+
+  private[spark] val SECRET_ENVKEY =
+ConfigBuilder("spark.mesos.driver.secret.envkey")
+  .doc("A comma-separated list of the environment variables to contain 
the secrets." +
--- End diff --

Ditto.





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r135353454
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala
 ---
@@ -58,13 +58,40 @@ package object config {
 
   private[spark] val DRIVER_LABELS =
 ConfigBuilder("spark.mesos.driver.labels")
-  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs.  Key-value " +
+  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs. Key-value " +
 "pairs should be separated by a colon, and commas used to list 
more than one." +
 "Ex. key:value,key2:value2")
   .stringConf
   .createOptional
 
-  private[spark] val DRIVER_FAILOVER_TIMEOUT =
+  private[spark] val SECRET_NAME =
+ConfigBuilder("spark.mesos.driver.secret.name")
+  .doc("A comma-separated list of secret references. Consult the Mesos 
Secret protobuf for " +
--- End diff --

If this is a list, it should be created with `.toSequence`, which returns the 
value to you already parsed as a list. And it should probably be called 
`.names`.
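
For illustration, a sketch of what that suggestion could look like with the 
internal config builder (the `.names` key is the reviewer's proposed rename, 
not the PR's current code):

```
private[spark] val SECRET_NAMES =
  ConfigBuilder("spark.mesos.driver.secret.names")
    .doc("A comma-separated list of secret references. Consult the Mesos Secret protobuf " +
      "for more information.")
    .stringConf
    .toSequence          // parsed for you as Seq[String]
    .createOptional
```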





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r135353607
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala
 ---
@@ -58,13 +58,40 @@ package object config {
 
   private[spark] val DRIVER_LABELS =
 ConfigBuilder("spark.mesos.driver.labels")
-  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs.  Key-value " +
+  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs. Key-value " +
 "pairs should be separated by a colon, and commas used to list 
more than one." +
 "Ex. key:value,key2:value2")
   .stringConf
   .createOptional
 
-  private[spark] val DRIVER_FAILOVER_TIMEOUT =
+  private[spark] val SECRET_NAME =
+ConfigBuilder("spark.mesos.driver.secret.name")
+  .doc("A comma-separated list of secret references. Consult the Mesos 
Secret protobuf for " +
+"more information.")
+  .stringConf
+  .createOptional
+
+  private[spark] val SECRET_VALUE =
+ConfigBuilder("spark.mesos.driver.secret.value")
+  .doc("A comma-separated list of secret values.")
+  .stringConf
+  .createOptional
+
+  private[spark] val SECRET_ENVKEY =
+ConfigBuilder("spark.mesos.driver.secret.envkey")
+  .doc("A comma-separated list of the environment variables to contain 
the secrets." +
+"The environment variable will be set on the driver.")
+  .stringConf
+  .createOptional
+
+  private[spark] val SECRET_FILENAME =
+ConfigBuilder("spark.mesos.driver.secret.filename")
+  .doc("A comma-seperated list of file paths secret will be written 
to.  Consult the Mesos " +
--- End diff --

Ditto.





[GitHub] spark issue #19056: [SPARK-21765] Check that optimization doesn't affect isS...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19056
  
**[Test build #81144 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81144/testReport)**
 for PR 19056 at commit 
[`b833495`](https://github.com/apache/spark/commit/b83349567760dd0d33388d3fc68d8db1b648e1f1).





[GitHub] spark pull request #19056: [SPARK-21765] Check that optimization doesn't aff...

2017-08-25 Thread joseph-torres
GitHub user joseph-torres opened a pull request:

https://github.com/apache/spark/pull/19056

[SPARK-21765] Check that optimization doesn't affect isStreaming bit.

## What changes were proposed in this pull request?

Add an assert in logical plan optimization that the isStreaming bit stays 
the same, and fix empty relation rules where that wasn't happening.

## How was this patch tested?

new and existing unit tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/joseph-torres/spark SPARK-21765-followup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19056.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19056


commit b83349567760dd0d33388d3fc68d8db1b648e1f1
Author: Jose Torres 
Date:   2017-08-25T20:48:49Z

Check that optimization doesn't affect isStreaming bit.







[GitHub] spark pull request #19038: [SPARK-21806][MLLIB] BinaryClassificationMetrics ...

2017-08-25 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19038#discussion_r135348933
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetricsSuite.scala
 ---
@@ -111,7 +109,7 @@ class BinaryClassificationMetricsSuite extends 
SparkFunSuite with MLlibTestSpark
 val fpr = Seq(1.0)
 val rocCurve = Seq((0.0, 0.0)) ++ fpr.zip(recalls) ++ Seq((1.0, 1.0))
 val pr = recalls.zip(precisions)
-val prCurve = Seq((0.0, 1.0)) ++ pr
+val prCurve = Seq((0.0, 0.0)) ++ pr
--- End diff --

(This was the only actual test change)
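
For context, a small usage sketch of the API under test (a running 
`SparkContext` named `sc` is assumed; the scores and labels are illustrative, 
not the suite's fixture):

```
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

// (score, label) pairs produced by some classifier.
val scoreAndLabels = sc.parallelize(Seq((0.9, 1.0), (0.6, 0.0), (0.4, 1.0), (0.1, 0.0)))
val metrics = new BinaryClassificationMetrics(scoreAndLabels)

// (recall, precision) points; the curve's first point is what SPARK-21806 adjusts.
metrics.pr().collect().foreach(println)
// (false positive rate, true positive rate) points.
metrics.roc().collect().foreach(println)
```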





[GitHub] spark issue #18645: [SPARK-14280][BUILD][WIP] Update change-version.sh and p...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18645
  
**[Test build #81143 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81143/testReport)**
 for PR 18645 at commit 
[`273dbdb`](https://github.com/apache/spark/commit/273dbdb4b9544630141415ef43b31ec522ff0bd8).





[GitHub] spark issue #19013: [SPARK-21728][core] Allow SparkSubmit to use Logging.

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19013
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19013: [SPARK-21728][core] Allow SparkSubmit to use Logging.

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19013
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81135/
Test PASSed.





[GitHub] spark issue #19013: [SPARK-21728][core] Allow SparkSubmit to use Logging.

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19013
  
**[Test build #81135 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81135/testReport)**
 for PR 19013 at commit 
[`8665f71`](https://github.com/apache/spark/commit/8665f7199cccef0b447bc13128c9152d5169e2a0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #19053: [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Lo...

2017-08-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19053





[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18953
  
**[Test build #81142 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81142/testReport)**
 for PR 18953 at commit 
[`b9b348d`](https://github.com/apache/spark/commit/b9b348de40bab16fd43d033f7191e6ee868246af).





[GitHub] spark issue #19053: [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDT...

2017-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19053
  
Thanks! Merging to master.





[GitHub] spark issue #19053: [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDT...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19053
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81137/
Test PASSed.





[GitHub] spark pull request #18581: [SPARK-21289][SQL][ML] Supports custom line separ...

2017-08-25 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18581#discussion_r135346452
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFileLinesReader.scala
 ---
@@ -32,7 +32,9 @@ import 
org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
  * in that file.
  */
 class HadoopFileLinesReader(
-file: PartitionedFile, conf: Configuration) extends Iterator[Text] 
with Closeable {
+file: PartitionedFile,
+lineSeparator: Option[String],
--- End diff --

Is Hive using Hadoop's `LineRecordReader`? How does Hive support it?

If possible, we always try to behave the same way as Hive.
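
Not an answer on the Hive side, but for reference: Hadoop's `LineRecordReader` 
already honors a custom record delimiter through the 
`textinputformat.record.delimiter` key, so a sketch of reading '|'-separated 
records today looks like this (a `SparkContext` named `sc` and the input path 
are assumptions):

```
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

val hadoopConf = new Configuration()
// Records end at '|' instead of '\n'; LineRecordReader reads this key itself.
hadoopConf.set("textinputformat.record.delimiter", "|")

val records = sc
  .newAPIHadoopFile("/tmp/input.txt", classOf[TextInputFormat],
    classOf[LongWritable], classOf[Text], hadoopConf)
  .map { case (_, line) => line.toString }
```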





[GitHub] spark issue #19053: [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDT...

2017-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19053
  
Merged build finished. Test PASSed.





[GitHub] spark issue #19053: [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDT...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19053
  
**[Test build #81137 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81137/testReport)**
 for PR 19053 at commit 
[`a80c79a`](https://github.com/apache/spark/commit/a80c79a39ac572fa25de79cabdfe28e2b6e95db4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19053: [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDT...

2017-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19053
  
LGTM





[GitHub] spark issue #19047: [SPARK-21798]: No config to replace deprecated SPARK_CLA...

2017-08-25 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/19047
  
LGTM. I'd like to see these daemons start using normal Spark configs like 
the applications do, but that's a separate, larger change...





[GitHub] spark issue #19055: [SPARK-21839][SQL] Support SQL config for ORC compressio...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19055
  
**[Test build #81140 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81140/testReport)**
 for PR 19055 at commit 
[`5998c29`](https://github.com/apache/spark/commit/5998c296407a677b0cc7a810802c9b8dfb171b53).





[GitHub] spark issue #19047: [SPARK-21798]: No config to replace deprecated SPARK_CLA...

2017-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19047
  
**[Test build #81141 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81141/testReport)**
 for PR 19047 at commit 
[`e421a03`](https://github.com/apache/spark/commit/e421a03acbd410a835cf3117fe6592523dc649b5).




