[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14803
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14803
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65812/
Test PASSed.





[GitHub] spark pull request #15205: [SPARK-16240][ML] ML persistence backward compati...

2016-09-22 Thread jkbradley
Github user jkbradley closed the pull request at:

https://github.com/apache/spark/pull/15205





[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14803
  
**[Test build #65812 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65812/consoleFull)**
 for PR 14803 at commit 
[`e21536e`](https://github.com/apache/spark/commit/e21536e7c20253cf2c04f80041592d8b095dbff4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class DeleteFile(file: File) extends ExternalAction `
  * `  trait ExternalAction extends StreamAction `





[GitHub] spark issue #10212: [SPARK-12221] add cpu time to metrics

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/10212
  
**[Test build #65815 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65815/consoleFull)**
 for PR 10212 at commit 
[`f0ef503`](https://github.com/apache/spark/commit/f0ef503f9e732b91d405d1e15dade58e78999052).





[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15041
  
**[Test build #65814 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65814/consoleFull)**
 for PR 15041 at commit 
[`4b29ded`](https://github.com/apache/spark/commit/4b29ded0e678a50c53a38bcac5d0b6906141558e).





[GitHub] spark pull request #10212: [SPARK-12221] add cpu time to metrics

2016-09-22 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10212#discussion_r80184799
  
--- Diff: core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala 
---
@@ -1097,7 +1100,9 @@ private[spark] object JsonProtocolSuite extends 
Assertions {
   |  },
   |  "Task Metrics": {
   |"Executor Deserialize Time": 300,
+  |"Executor Deserialize CPU Time": 0,
--- End diff --

Yeah, I tested it on my test cluster, but this makes sense. I will add 
non-zero CPU times by setting each CPU time to the same value as the given 
wall time.





[GitHub] spark pull request #10212: [SPARK-12221] add cpu time to metrics

2016-09-22 Thread jisookim0513
Github user jisookim0513 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10212#discussion_r80184744
  
--- Diff: 
core/src/test/resources/HistoryServerExpectations/complete_stage_list_json_expectation.json
 ---
@@ -6,6 +6,7 @@
   "numCompleteTasks" : 8,
   "numFailedTasks" : 0,
   "executorRunTime" : 162,
+  "executorCpuTime" : 0,
--- End diff --

Oh no, these are expected outputs. I think the inputs are stored under 
`src/test/resources/spark-events`.





[GitHub] spark issue #15205: [SPARK-16240][ML] ML persistence backward compatibility ...

2016-09-22 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15205
  
Local testing worked, so merging with branch-2.0 now





[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

2016-09-22 Thread jwbear
Github user jwbear commented on the issue:

https://github.com/apache/spark/pull/15102
  
Just curious, looking at this: if you are comparing "sequential" offsets 
across partitions, a rebalance would definitely affect this and, unless 
something has changed, it is probably not a good idea to compare Kafka 
offsets across partitions. You could simply add an id/timestamp in the 
producer and send it with the message rather than using this methodology. Or, 
if you must use offsets, query the broker for the full list and compare what 
you consumed to that list (a small increase in latency between consumption 
and processing).

This is from the Kafka paper, which makes me question your scheme: "...Note 
that our message ids are increasing but not consecutive. To compute the id of 
the next message, we have to add the length of the current message to its 
id." This means that simply comparing which offsets are larger will not 
necessarily yield the most recent message across partitions, and it 
definitely won't hold during a rebalance, when some broker logs will be on 
hold and not consumed.

In my own implementation, offsets are great for message guarantees (e.g. 
delivery/consumption checks), because the broker has a full ordered list, but 
not for cross-partition ordering.
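
To make the point concrete, here is a minimal sketch in plain Python. It does not use any Kafka client; the `Record` shape, the sample values, and the helper names are all hypothetical, and it assumes producers attach a timestamp from reasonably synchronized clocks. It shows that the largest offset across partitions need not belong to the most recent message, while offsets stay meaningful inside one partition.

```python
from collections import namedtuple

# Hypothetical record shape for illustration only; not a Kafka client API.
Record = namedtuple("Record", ["partition", "offset", "timestamp_ms", "value"])

records = [
    Record(partition=0, offset=120, timestamp_ms=1000, value="a"),
    Record(partition=0, offset=121, timestamp_ms=4000, value="d"),
    Record(partition=1, offset=7, timestamp_ms=2000, value="b"),
    Record(partition=1, offset=8, timestamp_ms=5000, value="e"),
    Record(partition=2, offset=4500, timestamp_ms=3000, value="c"),
]


def latest_by_offset(recs):
    # Misleading across partitions: offsets are only ordered within one partition.
    return max(recs, key=lambda r: r.offset)


def latest_by_timestamp(recs):
    # Uses the producer-supplied timestamp (an assumption of this sketch),
    # which is comparable across partitions.
    return max(recs, key=lambda r: r.timestamp_ms)


def latest_per_partition(recs):
    # Offsets remain a valid ordering inside each individual partition.
    latest = {}
    for r in recs:
        if r.partition not in latest or r.offset > latest[r.partition].offset:
            latest[r.partition] = r
    return latest
```

With this sample data, `latest_by_offset(records)` picks the record from partition 2 even though two later messages exist in partitions 0 and 1, whereas `latest_by_timestamp(records)` picks the record with the newest timestamp.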





[GitHub] spark issue #14897: [SPARK-17338][SQL] add global temp view

2016-09-22 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14897
  
We also need to update [the Analyzer rule 
`ResolveRelations`](https://github.com/apache/spark/blob/248922fd4fb7c11a40304431e8cc667a8911a906/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L461-L462).
 Otherwise, the following query will fail: 
```Scala
sql(s"SELECT * from $globalTempDB.src").show()
```





[GitHub] spark pull request #14359: [SPARK-16719][ML] Random Forests should communica...

2016-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14359





[GitHub] spark issue #14359: [SPARK-16719][ML] Random Forests should communicate fewe...

2016-09-22 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/14359
  
Merging with master





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80183259
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -188,6 +199,10 @@ class SessionCatalog(
 
   def setCurrentDatabase(db: String): Unit = {
 val dbName = formatDatabaseName(db)
+if (dbName == globalTempDB) {
--- End diff --

When `globalTempDB` is set to a name that is not all lower case, this 
comparison is not right. Thus, `formatDatabaseName` needs to be applied to 
both sides. 
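
A rough illustration of the problem, written as a Python sketch rather than the actual Scala code. `format_database_name` is a hypothetical stand-in that is assumed to lower-case names when analysis is case-insensitive; the two comparison helpers are also made up for this example.

```python
def format_database_name(name, case_sensitive=False):
    # Hypothetical stand-in for `formatDatabaseName`: assumed to lower-case
    # the name unless case-sensitive analysis is enabled.
    return name if case_sensitive else name.lower()


def is_global_temp_db_one_sided(db, global_temp_db):
    # Only the user-supplied name is normalized, so a configured name
    # containing upper-case characters can never match.
    return format_database_name(db) == global_temp_db


def is_global_temp_db(db, global_temp_db):
    # Normalize both sides before comparing.
    return format_database_name(db) == format_database_name(global_temp_db)
```

If the configured name were `Global_Temp`, the one-sided check would reject even an identical `Global_Temp` argument, while the two-sided check accepts it.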





[GitHub] spark issue #15210: [SPARK-17604][SQL][Streaming] Supprt purging aged file e...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15210
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65811/
Test PASSed.





[GitHub] spark issue #15210: [SPARK-17604][SQL][Streaming] Supprt purging aged file e...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15210
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #14698: [SPARK-17061][SPARK-17093][SQL] `MapObjects` shou...

2016-09-22 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14698#discussion_r80182811
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -136,7 +136,7 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks {
 // some expression is reusing variable names across different 
instances.
 // This behavior is tested in ExpressionEvalHelperSuite.
 val plan = generateProject(
-  GenerateUnsafeProjection.generate(
+  UnsafeProjection.create(
--- End diff --

@lw-lin without this patch's changes to ExpressionEvalHelper.scala, this 
test still passes.





[GitHub] spark issue #15210: [SPARK-17604][SQL][Streaming] Supprt purging aged file e...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15210
  
**[Test build #65811 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65811/consoleFull)**
 for PR 15210 at commit 
[`20a6c4b`](https://github.com/apache/spark/commit/20a6c4b2116c8b41bf675e40c0bb9a5297225051).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15172
  
**[Test build #65813 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65813/consoleFull)**
 for PR 15172 at commit 
[`da8aee6`](https://github.com/apache/spark/commit/da8aee619310cbe3525626bb652dcaf53beed42d).





[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15090
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15090
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65810/
Test PASSed.





[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15090
  
**[Test build #65810 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65810/consoleFull)**
 for PR 15090 at commit 
[`bb19f72`](https://github.com/apache/spark/commit/bb19f72789abc960efb937712512c0716fecd800).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80180856
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -36,6 +36,9 @@ import org.apache.spark.sql.catalyst.util.StringUtils
 
 object SessionCatalog {
   val DEFAULT_DATABASE = "default"
+
+  val GLOBAL_TEMP_DB_CONF_KEY = "spark.sql.database.globalTemp"
--- End diff --

Should we follow `spark.sql.catalogImplementation` and define it 
[here](https://github.com/apache/spark/blob/2cd1bfa4f0c6625b0ab1dbeba2b9586b9a6a9f42/core/src/main/scala/org/apache/spark/internal/config/package.scala#L95-L99)?






[GitHub] spark pull request #15209: replace function type with function isinstance

2016-09-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15209#discussion_r80180841
  
--- Diff: python/pyspark/ml/linalg/__init__.py ---
@@ -101,7 +101,7 @@ def _vector_size(v):
 return len(v)
 elif type(v) in (array.array, list, tuple, xrange):
--- End diff --

If this change is legitimate, we should change this to `isinstance(v, 
(array.array, list, tuple, xrange))`







[GitHub] spark issue #15172: [SPARK-13331] AES support for over-the-wire encryption

2016-09-22 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/15172
  
retest this please





[GitHub] spark pull request #15204: [SPARK-17639][build] Add jce.jar to buildclasspat...

2016-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15204





[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/15204
  
Merging to master.





[GitHub] spark issue #15209: replace function type with function isinstance

2016-09-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15209
  
I think we need a JIRA because `type` and `isinstance` are not exactly the 
same. Also, it would be better if the PR description explained the bug and 
how this PR tries to resolve it.

BTW, it seems you intend to support sub-classes via `isinstance` 
consistently across the API, right?

If so, there are some instances similar to this. Maybe we should check 
those as well.

```
$ grep -r "type(.*) [=|\!]" . | grep .python | grep -v "tests.py"

./python/pyspark/ml/linalg/__init__.py:elif type(v) == np.ndarray:
./python/pyspark/ml/linalg/__init__.py:if type(other) == np.ndarray:
./python/pyspark/ml/linalg/__init__.py:if type(pairs) == dict:
./python/pyspark/ml/param/__init__.py:if type(value) == list:
./python/pyspark/ml/param/__init__.py:elif type(value) == np.unicode_:
./python/pyspark/ml/param/__init__.py:if type(value) == bool:
./python/pyspark/mllib/linalg/__init__.py:elif type(v) == np.ndarray:
./python/pyspark/mllib/linalg/__init__.py:if type(other) == np.ndarray:
./python/pyspark/mllib/linalg/__init__.py:if type(pairs) == dict:
./python/pyspark/mllib/stat/_statistics.py:if type(y) == str:
./python/pyspark/sql/column.py:if type(startPos) != type(length):
./python/pyspark/sql/readwriter.py:if type(path) != list:
./python/pyspark/sql/readwriter.py:if type(path) == list:
./python/pyspark/sql/streaming.py:if type(interval) != str or len(interval.strip()) == 0:
./python/pyspark/sql/streaming.py:if type(path) != str or len(path.strip()) == 0:
./python/pyspark/sql/streaming.py:if not outputMode or type(outputMode) != str or len(outputMode.strip()) == 0:
./python/pyspark/sql/streaming.py:if not queryName or type(queryName) != str or len(queryName.strip()) == 0:
./python/pyspark/sql/streaming.py:if type(processingTime) != str or len(processingTime.strip()) == 0:
./python/pyspark/sql/types.py:return type(self) == type(other)
```
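
For reference, a small self-contained sketch of how the two checks differ. The `MyList` class and both helper functions are hypothetical and only loosely modelled on the `_vector_size` branch under review; `range` is used in place of `xrange` so the snippet runs on Python 3.

```python
import array


class MyList(list):
    """A hypothetical user-defined subclass of list."""


def size_with_type(v):
    # Exact type match: instances of subclasses are rejected.
    if type(v) in (array.array, list, tuple, range):
        return len(v)
    raise TypeError("Cannot treat type %s as a vector" % type(v))


def size_with_isinstance(v):
    # isinstance also accepts instances of subclasses of the listed types.
    if isinstance(v, (array.array, list, tuple, range)):
        return len(v)
    raise TypeError("Cannot treat type %s as a vector" % type(v))
```

The change is not purely mechanical, though: `isinstance(True, int)` is true because `bool` is a subclass of `int`, while `type(True) == int` is false, so each call site in the list above probably needs to be checked individually.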





[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15204
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15204
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65808/
Test PASSed.





[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15204
  
**[Test build #65808 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65808/consoleFull)**
 for PR 15204 at commit 
[`33fed28`](https://github.com/apache/spark/commit/33fed28341d387def52a915c762c3db8f5c01abd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15200: Skip building R vignettes if Spark is not built

2016-09-22 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/15200
  
We could hand-write the result instead of running test/Spark to generate 
the vignettes? That seems like it could be problematic if the output gets out 
of sync, and there is a similar problem if we build the docs without the jar 
and then just skip the vignettes.

Maybe we should add the vignettes to the `-Psparkr` profile in Maven?





[GitHub] spark issue #14818: [SPARK-17157][SPARKR][WIP]: Add multiclass logistic regr...

2016-09-22 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14818
  
`glm` has a `link=logit` parameter? not sure if it maps to this
http://www.statmethods.net/advstats/glm.html






[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15082
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65807/
Test PASSed.





[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15082
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15082
  
**[Test build #65807 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65807/consoleFull)**
 for PR 15082 at commit 
[`c56de6d`](https://github.com/apache/spark/commit/c56de6da72c18b2cd1f65eed956cdee89371b075).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15207: [SPARK-17643] Remove comparable requirement from Offset

2016-09-22 Thread koeninger
Github user koeninger commented on the issue:

https://github.com/apache/spark/pull/15207
  
LGTM.  

You probably already checked this, but FWIW I verified the kafka topic 
deletion test does pass once this is merged:  
https://github.com/koeninger/spark-1/tree/kafka-source-deletion





[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14803
  
**[Test build #65812 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65812/consoleFull)**
 for PR 14803 at commit 
[`e21536e`](https://github.com/apache/spark/commit/e21536e7c20253cf2c04f80041592d8b095dbff4).





[GitHub] spark issue #15208: [SPARK-17641][SQL] Collect_list/Collect_set should not c...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15208
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65806/
Test PASSed.





[GitHub] spark issue #15208: [SPARK-17641][SQL] Collect_list/Collect_set should not c...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15208
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-22 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14803
  
> * What error is printed (if any) if an invalid partition directory is 
> created midstream.

The error is:

[info]   org.apache.spark.sql.streaming.StreamingQueryException: Query query-14 terminated with exception: assertion failed: Conflicting partition column names detected:
[info] 
[info]  Partition column name list #0: partition2
[info]  Partition column name list #1: partition
[info] 
[info] For partitioned table directories, data files should only live in leaf directories.
[info] And directories at the same level should have the same partition column name.
[info] Please check the following directories for unexpected files or inconsistent partition column names:
[info] 
[info]  file:/root/repos/spark-1/target/tmp/streaming.src-c3a9895d-7be1-4ded-9154-7a24026513d7/partition2=bar
[info]  file:/root/repos/spark-1/target/tmp/streaming.src-c3a9895d-7be1-4ded-9154-7a24026513d7/partition=bar
[info]  file:/root/repos/spark-1/target/tmp/streaming.src-c3a9895d-7be1-4ded-9154-7a24026513d7/partition=foo
[info]   at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:211)
[info]   at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:124)
[info]   Cause: java.lang.AssertionError: assertion failed: Conflicting partition column names detected:
[info] 
[info]  Partition column name list #0: partition2
[info]  Partition column name list #1: partition

> * Are we okay if all of the data disappears (that has already been 
> processed) and then new data arrives?

I extended the added test to cover this case. It is okay, if I understand your 
point correctly.
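
For illustration, the check behind the error quoted above can be sketched as follows. This is a minimal, self-contained sketch and not Spark's actual partition-discovery code: the object `PartitionNameCheck` and its helpers are hypothetical names, and it assumes partition directories follow the `column=value` naming seen in the paths above.

```scala
object PartitionNameCheck {
  // Extract the partition column names from a path such as "/base/a=1/b=2".
  // Hypothetical helper: every path segment containing '=' is treated as a
  // partition directory.
  def partitionColumns(path: String): List[String] =
    path.split("/").toList.filter(_.contains("=")).map(_.takeWhile(_ != '='))

  // Distinct column-name lists found across all paths, in first-seen order.
  def distinctColumnLists(paths: Seq[String]): List[List[String]] =
    paths.toList.map(partitionColumns).distinct

  // More than one distinct list means directories at the same level disagree.
  def hasConflict(paths: Seq[String]): Boolean =
    distinctColumnLists(paths).size > 1
}
```

Given directories ending in `partition2=bar`, `partition=bar`, and `partition=foo` under the same base, `hasConflict` returns `true`, because `partition2` and `partition` appear at the same level.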






[GitHub] spark issue #15208: [SPARK-17641][SQL] Collect_list/Collect_set should not c...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15208
  
**[Test build #65806 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65806/consoleFull)**
 for PR 15208 at commit 
[`37c4539`](https://github.com/apache/spark/commit/37c4539978f4e92fef9055dfae292b22392a0bf8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #15195: [SPARK-17632][SQL]make console sink and other sin...

2016-09-22 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/15195#discussion_r80178129
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala 
---
@@ -290,8 +284,8 @@ final class DataStreamWriter[T] private[sql](ds: 
Dataset[T]) {
 df,
 dataSource.createSink(outputMode),
 outputMode,
-useTempCheckpointLocation = useTempCheckpointLocation,
-recoverFromCheckpointLocation = recoverFromCheckpointLocation,
+useTempCheckpointLocation = true,
--- End diff --

AFAIK, it is not suitable to use a temporary checkpoint location for sinks 
other than "console": the temporary directory is deleted after the process 
finishes, so the ability to recover from a checkpoint is lost.

Also, `ConsoleSink` provides no consistency or failure-recovery 
guarantee, so it should not set `recoverFromCheckpointLocation` to `true`.

Correct me if I'm wrong.
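
The rule argued for in this comment can be written down as a small sketch. It is an illustration under stated assumptions and not the code in `DataStreamWriter`: `CheckpointPolicy`, `Policy`, and `forSink` are hypothetical names, and only the two flag names are taken from the diff.

```scala
object CheckpointPolicy {
  // The two flags discussed in the diff, bundled for illustration only.
  final case class Policy(
      useTempCheckpointLocation: Boolean,
      recoverFromCheckpointLocation: Boolean)

  // Hypothetical decision function: per the comment, only the console sink may
  // fall back to a temporary checkpoint directory, and it should not attempt
  // recovery; every other sink needs a durable location and may recover.
  def forSink(sinkName: String): Policy = sinkName match {
    case "console" =>
      Policy(useTempCheckpointLocation = true, recoverFromCheckpointLocation = false)
    case _ =>
      Policy(useTempCheckpointLocation = false, recoverFromCheckpointLocation = true)
  }
}
```

Keeping the decision per sink, rather than hard-coding `true`, is what preserves checkpoint recovery for sinks that write to durable storage.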





[GitHub] spark issue #14644: [SPARK-14082][MESOS] Enable GPU support with Mesos

2016-09-22 Thread tnachen
Github user tnachen commented on the issue:

https://github.com/apache/spark/pull/14644
  
@klueska Just updated the patch and I think it's using the right semantics 
now, where it has a global gpus max just like cores. Can you try it out?





[GitHub] spark issue #15207: [SPARK-17643] Remove comparable requirement from Offset

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15207
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65805/
Test PASSed.





[GitHub] spark issue #15207: [SPARK-17643] Remove comparable requirement from Offset

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15207
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15207: [SPARK-17643] Remove comparable requirement from Offset

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15207
  
**[Test build #65805 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65805/consoleFull)**
 for PR 15207 at commit 
[`76ae1ba`](https://github.com/apache/spark/commit/76ae1ba1d899b02195e6008337d38739a81f6874).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait Offset extends Serializable `





[GitHub] spark issue #15159: [SPARK-17605][SPARK_SUBMIT] Add option spark.usePython a...

2016-09-22 Thread zjffdu
Github user zjffdu commented on the issue:

https://github.com/apache/spark/pull/15159
  
Add @rxin @davies @JoshRosen @shivaram for more feedback. 





[GitHub] spark issue #15205: [SPARK-16240][ML] ML persistence backward compatibility ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15205
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15205: [SPARK-16240][ML] ML persistence backward compatibility ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15205
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65804/
Test PASSed.





[GitHub] spark issue #15205: [SPARK-16240][ML] ML persistence backward compatibility ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15205
  
**[Test build #65804 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65804/consoleFull)**
 for PR 15205 at commit 
[`a3d02ce`](https://github.com/apache/spark/commit/a3d02ce8ce8ccadd59c3df0c9748367a379f1b1d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15210: [SPARK-17604][SQL][Streaming] Supprt purging aged file e...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15210
  
**[Test build #65811 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65811/consoleFull)**
 for PR 15210 at commit 
[`20a6c4b`](https://github.com/apache/spark/commit/20a6c4b2116c8b41bf675e40c0bb9a5297225051).





[GitHub] spark pull request #15210: [SPARK-17604][SQL][Streaming] Supprt purging aged...

2016-09-22 Thread jerryshao
GitHub user jerryshao opened a pull request:

https://github.com/apache/spark/pull/15210

[SPARK-17604][SQL][Streaming] Supprt purging aged file entries in 
FileStreamSourceLog

## What changes were proposed in this pull request?

Currently, with 
[SPARK-15698](https://issues.apache.org/jira/browse/SPARK-15698), the 
FileStreamSource metadata log is compacted periodically (every 10 batches by 
default), which means each compacted batch file contains all file entries 
processed so far. Over time, the compacted batch file grows into a very 
large file.

With [SPARK-17165](https://issues.apache.org/jira/browse/SPARK-17165), 
FileStreamSource no longer tracks aged file entries in memory, but the log 
still keeps the full set of entries. This is unnecessary and makes recovery 
quite time-consuming. This PR therefore proposes adding the ability to purge 
aged file entries from the log as well.
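
A minimal sketch of the purging idea, under stated assumptions: the simplified `FileEntry` (with a `Long` timestamp in milliseconds), the `purgeAged` helper, and the `maxAgeMs` parameter are all hypothetical and do not reflect the actual patch. Entries are dropped when they are older than the newest entry by more than `maxAgeMs`.

```scala
object PurgeSketch {
  // Simplified, hypothetical stand-in for a log entry describing a processed file.
  final case class FileEntry(path: String, timestamp: Long, batchId: Long)

  // Keep only entries whose timestamp is within maxAgeMs of the newest entry.
  // Intended to run when a compacted batch file is written, so that file stops
  // growing without bound.
  def purgeAged(entries: Seq[FileEntry], maxAgeMs: Long): Seq[FileEntry] =
    if (entries.isEmpty) entries
    else {
      val latest = entries.map(_.timestamp).max
      entries.filter(e => e.timestamp >= latest - maxAgeMs)
    }
}
```

Measuring age against the newest entry rather than the wall clock keeps the result deterministic, which may matter when the log is replayed during recovery.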

## How was this patch tested?

Unit test added.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jerryshao/apache-spark SPARK-17604

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15210.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15210


commit 20a6c4b2116c8b41bf675e40c0bb9a5297225051
Author: jerryshao 
Date:   2016-09-23T02:20:12Z

Supprt purging aged file entries in FileStreamSourceLog







[GitHub] spark issue #15209: replace function type with function isinstance

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15209
  
Can one of the admins verify this patch?





[GitHub] spark pull request #15209: replace function type with function isinstance

2016-09-22 Thread frankfqchen
GitHub user frankfqchen opened a pull request:

https://github.com/apache/spark/pull/15209

replace function type with function isinstance

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)


## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)


(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/frankfqchen/spark patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15209.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15209


commit a216d2d26d0a9275f0da5e97a0ceb6fb40ec1a29
Author: frankfqchen 
Date:   2016-09-23T03:07:31Z

replace function type with function isinstance







[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15204
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65802/
Test PASSed.





[GitHub] spark pull request #15089: [SPARK-15621] [SQL] Support spilling for Python U...

2016-09-22 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/15089#discussion_r80174300
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/RowQueue.scala ---
@@ -0,0 +1,278 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.python
+
+import java.io._
+
+import com.google.common.io.Closeables
+
+import org.apache.spark.SparkException
+import org.apache.spark.memory.{MemoryConsumer, TaskMemoryManager}
+import org.apache.spark.sql.catalyst.expressions.UnsafeRow
+import org.apache.spark.unsafe.Platform
+import org.apache.spark.unsafe.memory.MemoryBlock
+
+/**
+ * A RowQueue is an FIFO queue for UnsafeRow.
+ */
+private[python] trait RowQueue {
+  /**
+   * Add a row to the end of it, returns true iff the row has added into 
it.
+   */
+  def add(row: UnsafeRow): Boolean
+
+  /**
+   * Retrieve and remove the first row, returns null if it's empty.
+   *
+   * It can only be called after add is called.
+   */
+  def remove(): UnsafeRow
+
+  /**
+   * Cleanup all the resources.
+   */
+  def close(): Unit
+}
+
+/**
+ * A RowQueue that is based on in-memory page. UnsafeRows are appended 
into it until it's full.
+ * Another thread could read from it at the same time (behind the writer).
+ *
+ * The format of UnsafeRow in page:
+ * [4 bytes to hold length of record (N)] [N bytes to hold record] [...]
+ */
+private[python] abstract class InMemoryRowQueue(val page: MemoryBlock, 
numFields: Int)
+  extends RowQueue {
+  private val base: AnyRef = page.getBaseObject
+  private val endOfPage: Long = page.getBaseOffset + page.size
+  // the first location where a new row would be written
+  private var writeOffset = page.getBaseOffset
+  // points to the start of the next row to read
+  private var readOffset = page.getBaseOffset
+  private val resultRow = new UnsafeRow(numFields)
+
+  def add(row: UnsafeRow): Boolean = {
+val size = row.getSizeInBytes
+if (writeOffset + 4 + size > endOfPage) {
+  // if there is not enough space in this page to hold the new record
+  if (writeOffset + 4 <= endOfPage) {
+// if there's extra space at the end of the page, store a special 
"end-of-page" length (-1)
+Platform.putInt(base, writeOffset, -1)
+  }
+  false
+} else {
+  Platform.putInt(base, writeOffset, size)
+  Platform.copyMemory(row.getBaseObject, row.getBaseOffset, base, 
writeOffset + 4, size)
+  writeOffset += 4 + size
+  true
+}
+  }
+
+  def remove(): UnsafeRow = {
+if (readOffset + 4 > endOfPage || Platform.getInt(base, readOffset) < 
0) {
+  null
+} else {
+  val size = Platform.getInt(base, readOffset)
+  resultRow.pointTo(base, readOffset + 4, size)
+  readOffset += 4 + size
+  resultRow
+}
+  }
+}
+
+/**
+ * A RowQueue that is backed by a file on disk. This queue will stop 
accepting new rows once any
+ * reader has begun reading from the queue.
+ */
+private[python] case class DiskRowQueue(file: File, fields: Int) extends 
RowQueue {
+  private var fout = new FileOutputStream(file.toString)
+  private var out = new DataOutputStream(new BufferedOutputStream(fout))
+  private var unreadBytes = 0L
+
+  private var fin: FileInputStream = _
+  private var in: DataInputStream = _
+  private val resultRow = new UnsafeRow(fields)
+
+  def add(row: UnsafeRow): Boolean = synchronized {
+if (out == null) {
+  // Another thread is reading, stop writing this one
+  return false
+}
+out.writeInt(row.getSizeInBytes)
+out.write(row.getBytes)
+unreadBytes += 4 + row.getSizeInBytes
+true
+  }
+
+  def remove(): UnsafeRow = synchronized {
+if (out != null) {
+  

[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15204
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15204
  
**[Test build #65802 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65802/consoleFull)**
 for PR 15204 at commit 
[`d0136f5`](https://github.com/apache/spark/commit/d0136f585b10a7b9583a1e79417120b2d3219db2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15090
  
**[Test build #65810 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65810/consoleFull)**
 for PR 15090 at commit 
[`bb19f72`](https://github.com/apache/spark/commit/bb19f72789abc960efb937712512c0716fecd800).





[GitHub] spark issue #15206: [SPARK-17640][SQL]Avoid using -1 as the default batchId ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15206
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65803/
Test PASSed.





[GitHub] spark issue #15206: [SPARK-17640][SQL]Avoid using -1 as the default batchId ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15206
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15206: [SPARK-17640][SQL]Avoid using -1 as the default batchId ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15206
  
**[Test build #65803 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65803/consoleFull)** for PR 15206 at commit [`f506a43`](https://github.com/apache/spark/commit/f506a43b5401844e568708cee6a354c4212e3dea).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class FileEntry(path: String, timestamp: Timestamp, batchId: Long) extends Serializable`





[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

2016-09-22 Thread koeninger
Github user koeninger commented on the issue:

https://github.com/apache/spark/pull/15102
  
> I agree that if/when we add that ability to add existing partitions 
midstream we'd probably need to add two offsets into the SQL offset for new 
partitions.

It's not just existing partitions.  If you have a low-value high-volume 
stream (which is the kind of situation where you'd want auto offset reset 
latest to begin with), you may not even want your first batch to have however 
many messages got in between creation and subscription rebalance.  I dunno, I 
just don't want to assume too much.

> I'd also support JSON here, but I would not mandate it (i.e. try json 
parsing and fall back to comma separation). It's not ambiguous, supports 
consistent usage, and doesn't penalize the simple use cases.

Cool, seems reasonable.





[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15041
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15041
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65809/
Test FAILed.





[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15041
  
**[Test build #65809 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65809/consoleFull)** for PR 15041 at commit [`09e0740`](https://github.com/apache/spark/commit/09e0740bead1a4b2fd888abdbbdfdd404dff1ead).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15199: [SPARK-17635][SQL] Remove hardcode "agg_plan" in HashAgg...

2016-09-22 Thread yucai
Github user yucai commented on the issue:

https://github.com/apache/spark/pull/15199
  
Thanks all, it should be fixed in master only, my mistake.





[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15041
  
**[Test build #65809 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65809/consoleFull)** for PR 15041 at commit [`09e0740`](https://github.com/apache/spark/commit/09e0740bead1a4b2fd888abdbbdfdd404dff1ead).





[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15204
  
**[Test build #65808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65808/consoleFull)** for PR 15204 at commit [`33fed28`](https://github.com/apache/spark/commit/33fed28341d387def52a915c762c3db8f5c01abd).





[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

2016-09-22 Thread marmbrus
Github user marmbrus commented on the issue:

https://github.com/apache/spark/pull/15102
  
> "I want to be able to add a topicpartition mid stream, but I don't want 
to start it from the beginning."

I see, I was thinking only of new topics that appear that match your 
pattern.  I agree that if/when we add that ability to add existing partitions 
midstream we'd probably need to add two offsets into the SQL offset for new 
partitions.

> I think consistency in using json for any non-scalar values is worth 2 
extra characters per topic and 4 at the ends.

I'd also support JSON here, but I would not mandate it (i.e. try json 
parsing and fall back to comma separation).  It's not ambiguous, supports 
consistent usage, and doesn't penalize the simple use cases.
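
To make the "try JSON, fall back to comma separation" idea concrete, here is a minimal sketch in plain Python. The function name `parse_topics` is hypothetical and is not part of the PR or of any Spark API; it only shows how the two spellings could coexist without ambiguity, assuming topic names never start with `[`.

```python
import json


def parse_topics(value):
    """Parse a topic list given as a JSON array or as a comma-separated string.

    Hypothetical helper for illustration only; not the option parser in the PR.
    """
    text = value.strip()
    try:
        parsed = json.loads(text)
    except ValueError:
        # Not valid JSON at all, e.g. "topic1,topic2".
        parsed = None
    # Accept the JSON form only when it is an array of strings.
    if isinstance(parsed, list) and all(isinstance(t, str) for t in parsed):
        return parsed
    # Fallback: the simple comma-separated form.
    return [t.strip() for t in text.split(",") if t.strip()]
```

With this shape, `"topic1,topic2"` and `'["topic1", "topic2"]'` give the same list, so the simple case stays easy to type while the structured form remains available.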





[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15082
  
**[Test build #65807 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65807/consoleFull)** for PR 15082 at commit [`c56de6d`](https://github.com/apache/spark/commit/c56de6da72c18b2cd1f65eed956cdee89371b075).





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14659
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65799/
Test PASSed.





[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS and YA...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14659
  
**[Test build #65799 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65799/consoleFull)** for PR 14659 at commit [`47de8a2`](https://github.com/apache/spark/commit/47de8a2a9e1640e0ea942d1a689150d7b7a66c10).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #15174: [SPARK-17502] [17609] [SQL] [Backport] [2.0] Fix ...

2016-09-22 Thread gatorsmile
Github user gatorsmile closed the pull request at:

https://github.com/apache/spark/pull/15174





[GitHub] spark issue #15174: [SPARK-17502] [17609] [SQL] [Backport] [2.0] Fix Multipl...

2016-09-22 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15174
  
Let me close it. Thanks!





[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

2016-09-22 Thread koeninger
Github user koeninger commented on the issue:

https://github.com/apache/spark/pull/15102
  
@tdas I think as long as marmbrus' PR to remove comparable from the 
interface works for sane variations of subscription changes it's the best way 
to go.  I'm honestly fine with someone getting what they deserve if they delete 
and recreate a topic in the space of a single batch or while a stream is down.

@marmbrus 
> Why do you care when it acquired it? 

This isn't so much a temporal thing, as a let the consumer do its job 
thing.  This sort of configuration should ideally be handled by 
auto.offset.reset, and we shouldn't bake in too much second guessing about it.  
There are plenty of use cases for "I want to be able to add a topicpartition mid 
stream, but I don't want to start it from the beginning."

> Are you proposing users have to type

I'm saying that you guys proposed json as a workaround for the 
string->string thing.  Given that, yeah, I think consistency in using json for 
any non-scalar values is worth 2 extra characters per topic and 4 at the ends.





[GitHub] spark issue #15206: [SPARK-17640][SQL]Avoid using -1 as the default batchId ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15206
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15206: [SPARK-17640][SQL]Avoid using -1 as the default batchId ...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15206
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65801/
Test PASSed.





[GitHub] spark issue #15174: [SPARK-17502] [17609] [SQL] [Backport] [2.0] Fix Multipl...

2016-09-22 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/15174
  
thanks, merging to 2.0!





[GitHub] spark issue #15206: [SPARK-17640][SQL]Avoid using -1 as the default batchId ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15206
  
**[Test build #65801 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65801/consoleFull)** for PR 15206 at commit [`d9177e5`](https://github.com/apache/spark/commit/d9177e5bd9cd89f70e4f5080587311d58a3a12f8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class FileEntry(path: String, timestamp: Timestamp, batchId: Long) extends Serializable`





[GitHub] spark issue #15208: [SPARK-17641][SQL] Collect_list/Collect_set should not c...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15208
  
**[Test build #65806 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65806/consoleFull)** for PR 15208 at commit [`37c4539`](https://github.com/apache/spark/commit/37c4539978f4e92fef9055dfae292b22392a0bf8).





[GitHub] spark issue #15208: [SPARK-17641][SQL] Collect_list/Collect_set should not c...

2016-09-22 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/15208
  
cc @mengxr 





[GitHub] spark pull request #15208: [SPARK-17641][SQL] Collect_list/Collect_set shoul...

2016-09-22 Thread hvanhovell
GitHub user hvanhovell opened a pull request:

https://github.com/apache/spark/pull/15208

[SPARK-17641][SQL] Collect_list/Collect_set should not collect null values.

## What changes were proposed in this pull request?
We added native versions of `collect_set` and `collect_list` in Spark 2.0. 
These currently also (try to) collect null values, which differs from the 
original Hive implementation. This PR fixes this by adding a null check to the 
`Collect.update` method.
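
As a rough illustration of the intended behaviour (not the actual Scala `Collect.update` code), the sketch below models a collect-style aggregate in plain Python. The class `CollectList` and the helper `collect_list` are hypothetical names used only for this example; the point is that the null check lives in the per-row update step.

```python
class CollectList:
    """Toy aggregate buffer that mimics collect_list semantics."""

    def __init__(self):
        self.buffer = []

    def update(self, value):
        # The null check: skip missing values instead of buffering them.
        if value is not None:
            self.buffer.append(value)

    def result(self):
        return list(self.buffer)


def collect_list(values):
    """Run the toy aggregate over an iterable of row values."""
    agg = CollectList()
    for value in values:
        agg.update(value)
    return agg.result()
```

Under this sketch a group containing only nulls produces an empty list rather than a list of nulls.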

## How was this patch tested?
Added a regression test to `DataFrameAggregateSuite`.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hvanhovell/spark SPARK-17641

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15208.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15208


commit 37c4539978f4e92fef9055dfae292b22392a0bf8
Author: Herman van Hovell 
Date:   2016-09-23T01:45:38Z

Do not collect null values.







[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

2016-09-22 Thread marmbrus
Github user marmbrus commented on the issue:

https://github.com/apache/spark/pull/15102
  
Comparable requirement removed in #15207.

> I think in the absence of prior information about the position in a 
topicpartition, you start a new batch on topic B starting from wherever the 
consumer's position was at the time it acquired the subscription, which might 
not be 0. I.e. you call position() before seekToEnd().

Why do you care when it acquired it?  If it appeared in between the 
last batch and now, don't you want to consume all of the available data from 
it?  Otherwise the answer is going to depend on the specifics of when you see 
the topic, which seems counter to the model of Structured Streaming.

> I think the main thing that would be confusing is to specify topics in 
one way (custom-delimited string) for one configuration, and in another way 
(structured json) for another configuration.

Are you proposing users have to type `"[\"topic1\", \"topic2\"]"` (or pull 
in a json library) instead of `"topic1,topic2"`?  Seems we could pretty 
seamlessly add support for JSON in the future, while still making the common 
case easy to type.





[GitHub] spark issue #15207: [SPARK-17643] Remove comparable requirement from Offset

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15207
  
**[Test build #65805 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65805/consoleFull)** for PR 15207 at commit [`76ae1ba`](https://github.com/apache/spark/commit/76ae1ba1d899b02195e6008337d38739a81f6874).





[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14971
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65796/
Test PASSed.





[GitHub] spark pull request #15207: [SPARK-17643] Remove comparable requirement from ...

2016-09-22 Thread marmbrus
GitHub user marmbrus opened a pull request:

https://github.com/apache/spark/pull/15207

[SPARK-17643] Remove comparable requirement from Offset

For some sources, it is difficult to provide a global ordering based only 
on the data in the offset.  Since we don't use comparison for correctness, let's 
remove it.
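
One way to see why a global ordering can be hard to provide: if an offset is a set of per-partition positions, two offsets may simply be incomparable. The sketch below is plain Python with a hypothetical `compare_offsets` helper and offsets modeled as dicts from partition name to position; both are assumptions for illustration, not how `Offset` is defined in Spark.

```python
def compare_offsets(left, right):
    """Compare two per-partition offsets.

    Returns -1, 0, or 1 when every partition agrees on the direction,
    and None when the two offsets cannot be ordered.
    """
    # Different partition sets (e.g. a topic was added or deleted): no ordering.
    if set(left) != set(right):
        return None
    not_after = all(left[p] <= right[p] for p in left)
    not_before = all(left[p] >= right[p] for p in left)
    if not_after and not_before:
        return 0
    if not_after:
        return -1
    if not_before:
        return 1
    # Some partitions moved forward while others moved backward.
    return None
```

For example, `{"t-0": 5, "t-1": 3}` and `{"t-0": 4, "t-1": 6}` have no consistent order, so a `compareTo` contract could not be satisfied for that pair.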

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/marmbrus/spark removeComparable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15207.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15207


commit 76ae1ba1d899b02195e6008337d38739a81f6874
Author: Michael Armbrust 
Date:   2016-09-23T01:34:38Z

[SPARK-17643] Remove comparable requirement from Offset







[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14971
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14971
  
**[Test build #65796 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65796/consoleFull)** for PR 14971 at commit [`4c89d92`](https://github.com/apache/spark/commit/4c89d92ab65d7f4f061e32aa22780fd6e4b7c798).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

2016-09-22 Thread tdas
Github user tdas commented on the issue:

https://github.com/apache/spark/pull/15102
  
@koeninger
I did some independent brainstorming with @zsxwing on topic deletion, and 
yeah, I agree with you that attempting to account for deleted topics in the 
KafkaSourceOffset such that compareTo is satisfied is more 
complicated than just eliminating compareTo. That said, there are still a few 
corner cases, such as the same topic being deleted and recreated. I am not familiar 
with how often this can happen (let us know your thoughts). But the general 
idea we can implement is that we attach a unique id to the topic in the 
KafkaSourceOffset. Whenever a new topic is detected (while running or across 
query restarts), generate a unique id so that it is considered a new topic. 
Here are the options:

**Option 1: When getOffset detects a new topic, if the topic existed in 
previous offset, create a new (topic, unique id)**
- Pro: Simple
- Con: Cannot detect if a topic gets deleted+recreated between triggers 
(possibly, across query restarts)

**Option 2: Use RebalanceListener to know when a topic has been deleted**
- Pro: Handles topic deletion+recreation between triggers while the query is 
active
- Con: Misses deletion+recreation during query restarts
- Con: Listener is called on a different thread, so race conditions are possible

**Option 3: Use the creation time / cZxid of topic info stored in ZK to 
disambiguate**
- Pro: Zookeeper maintains uniqueness across any component restarts
- Con: Requires depending on full Kafka + ZK
- Con: Requires knowing the exact ZK path where topics are saved, but this 
can be tested and made sure that it never fails when we upgrade Kafka

I feel that we should just keep it simple for now, and go for Option 1. 
What do you think?
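
A minimal sketch of one reading of Option 1, in plain Python. The function `next_topic_ids` and the dict-based offset are hypothetical and only illustrate the idea: a topic that is absent from the previous offset gets a fresh unique id, while a topic that is still present keeps its id.

```python
import uuid


def next_topic_ids(previous_ids, current_topics):
    """Carry topic ids forward and mint new ones for newly seen topics.

    previous_ids: dict mapping topic name -> unique id from the previous offset.
    current_topics: iterable of topic names visible at this trigger.
    """
    ids = {}
    for topic in current_topics:
        if topic in previous_ids:
            # Still present since the last trigger: treated as the same topic.
            ids[topic] = previous_ids[topic]
        else:
            # Not in the previous offset: treated as a brand new topic.
            ids[topic] = uuid.uuid4().hex
    return ids
```

The sketch also shows the stated limitation: if a topic is deleted and recreated between two calls, it appears in both `previous_ids` and `current_topics`, so it silently keeps its old id.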










[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14971
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65795/
Test PASSed.





[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14971
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14971
  
**[Test build #65795 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65795/consoleFull)**
 for PR 14971 at commit 
[`f4c0ebb`](https://github.com/apache/spark/commit/f4c0ebb0901216ea09eaf3f77e4fdcd431b15d37).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15204: [SPARK-17639][build] Add jce.jar to buildclasspath when ...

2016-09-22 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/15204
  
If tests pass I'll merge this to unblock #15172.





[GitHub] spark issue #15205: [SPARK-16240][ML] ML persistence backward compatibility ...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15205
  
**[Test build #65804 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65804/consoleFull)**
 for PR 15205 at commit 
[`a3d02ce`](https://github.com/apache/spark/commit/a3d02ce8ce8ccadd59c3df0c9748367a379f1b1d).





[GitHub] spark issue #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to register Jav...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/9766
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65797/
Test PASSed.





[GitHub] spark issue #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to register Jav...

2016-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/9766
  
Merged build finished. Test PASSed.





[GitHub] spark issue #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to register Jav...

2016-09-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/9766
  
**[Test build #65797 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65797/consoleFull)**
 for PR 9766 at commit 
[`dc31d78`](https://github.com/apache/spark/commit/dc31d78381e325e2b9af406bd1701594941866c9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.




