[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/40/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3930/
Test PASSed.


---




[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21450
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91698/
Test FAILed.


---




[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21450
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21450
  
**[Test build #91698 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91698/testReport)**
 for PR 21450 at commit 
[`b03e3de`](https://github.com/apache/spark/commit/b03e3de9b326a8cf9061125e0f22bde2a12bf30f).
 * This patch **fails Java style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21535
  
retest this please.


---




[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21450
  
**[Test build #91698 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91698/testReport)**
 for PR 21450 at commit 
[`b03e3de`](https://github.com/apache/spark/commit/b03e3de9b326a8cf9061125e0f22bde2a12bf30f).


---




[GitHub] spark issue #21462: [SPARK-24428][K8S] Fix unused code

2018-06-12 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/21462
  
@foxish gentle ping.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91692/
Test FAILed.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21535
  
**[Test build #91692 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91692/testReport)**
 for PR 21535 at commit 
[`b8c7238`](https://github.com/apache/spark/commit/b8c7238aec9d6d79b8528eb3f47c0de7a48d23e8).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait CodegenInterpretedTest extends QueryTest with SharedSQLContext `
  * `class DataFrameSuite extends CodegenInterpretedTest `
  * `class DatasetSuite extends CodegenInterpretedTest `


---




[GitHub] spark pull request #21366: [SPARK-24248][K8S] Use level triggering and state...

2018-06-12 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/21366#discussion_r194664576
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsPollingSnapshotSource.scala
 ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler.cluster.k8s
+
+import java.util.concurrent.{Future, ScheduledExecutorService, TimeUnit}
+
+import io.fabric8.kubernetes.client.KubernetesClient
+import scala.collection.JavaConverters._
+
+import org.apache.spark.SparkConf
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.util.ThreadUtils
+
+private[spark] class ExecutorPodsPollingSnapshotSource(
+conf: SparkConf,
+kubernetesClient: KubernetesClient,
+snapshotsStore: ExecutorPodsSnapshotsStore,
+pollingExecutor: ScheduledExecutorService) {
+
+  private val pollingInterval = conf.get(KUBERNETES_EXECUTOR_API_POLLING_INTERVAL)
+
+  private var pollingFuture: Future[_] = _
+
+  def start(applicationId: String): Unit = {
+require(pollingFuture == null, "Cannot start polling more than once.")
+pollingFuture = pollingExecutor.scheduleWithFixedDelay(
+  new PollRunnable(applicationId), pollingInterval, pollingInterval, TimeUnit.MILLISECONDS)
+  }
+
+  def stop(): Unit = {
+if (pollingFuture != null) {
+  pollingFuture.cancel(true)
+  pollingFuture = null
+}
+ThreadUtils.shutdown(pollingExecutor)
--- End diff --

There are a number of such calls. Are we sure they will be executed in every 
scenario, for example when an exception is thrown?
Are the stop calls bound to some shutdown hook? Is this covered by RxJava? 
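
One way to make the cleanup unconditional is to wrap the polling lifetime in `try`/`finally`. The following is only a minimal sketch, assuming a simplified stand-in for the class in the diff: `PollingSource` and `withPolling` are hypothetical names, and `shutdownNow()` is swapped in for `ThreadUtils.shutdown` so the example has no Spark dependency. It does not show how the PR itself wires up `stop()`.

```scala
import java.util.concurrent.{ScheduledExecutorService, ScheduledFuture, TimeUnit}

// Hypothetical, simplified stand-in for the polling source in the diff.
final class PollingSource(executor: ScheduledExecutorService, intervalMs: Long) {
  private var future: ScheduledFuture[_] = null

  def start(poll: Runnable): Unit = {
    require(future == null, "Cannot start polling more than once.")
    future = executor.scheduleWithFixedDelay(poll, intervalMs, intervalMs, TimeUnit.MILLISECONDS)
  }

  def stop(): Unit = {
    if (future != null) {
      future.cancel(true)
      future = null
    }
    // Assumption: plain executor shutdown instead of Spark's ThreadUtils.shutdown.
    executor.shutdownNow()
  }
}

object PollingSource {
  // Runs `body` and always stops the source afterwards, even if `body` throws.
  def withPolling[A](source: PollingSource, poll: Runnable)(body: => A): A = {
    source.start(poll)
    try body finally source.stop()
  }
}
```

If the owner of the source cannot scope it like this, registering `stop()` in a JVM shutdown hook (for example via `sys.addShutdownHook`) would be another option, at the cost of only running at process exit.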



---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-06-12 Thread ssonker
Github user ssonker commented on the issue:

https://github.com/apache/spark/pull/21505
  
@viirya Done.
cc: @cloud-fan 


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21537
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3929/
Test PASSed.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21537
  
**[Test build #91697 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91697/testReport)**
 for PR 21537 at commit 
[`89d0252`](https://github.com/apache/spark/commit/89d025225b557689389d16c207be8a25f5e82fa5).


---




[GitHub] spark pull request #21520: [SPARK-24505][SQL] Forbidding string interpolatio...

2018-06-12 Thread viirya
Github user viirya closed the pull request at:

https://github.com/apache/spark/pull/21520


---




[GitHub] spark issue #21520: [SPARK-24505][SQL] Forbidding string interpolation in Co...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21520
  
As I will incrementally split this into smaller PRs, I will close this 
first.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21537
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21537
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21537
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/39/
Test PASSed.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21537
  
cc @cloud-fan @hvanhovell @kiszk @mgaido91 


---




[GitHub] spark pull request #21537: [SPARK-24505][SQL] Convert strings in codegen to ...

2018-06-12 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/21537

[SPARK-24505][SQL] Convert strings in codegen to blocks: Cast and 
BoundAttribute

## What changes were proposed in this pull request?

This is split from #21520. It includes the changes to `BoundAttribute` and 
`Cast`.
This patch also adds a few convenience APIs:

```scala
CodeGenerator.freshName(name: String, dt: DataType): VariableValue
CodeGenerator.freshName(name: String, javaClass: Class[_]): VariableValue
CodeGenerator.isNullFreshName(name: String): VariableValue

JavaCode.className(javaClass: Class[_]): InlineBlock
JavaCode.javaType(dataType: DataType): InlineBlock
JavaCode.boxedType(dataType: DataType): InlineBlock
```


## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-24505-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21537.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21537


commit 89d025225b557689389d16c207be8a25f5e82fa5
Author: Liang-Chi Hsieh 
Date:   2018-06-12T08:40:20Z

Convert strings in codegen to blocks.




---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21505
  
cc @cloud-fan  


---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21505
  
The PR title is too long and truncated. Can you shorten it?


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21501
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21501
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91693/
Test PASSed.


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21501
  
**[Test build #91693 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91693/testReport)**
 for PR 21501 at commit 
[`bbd167b`](https://github.com/apache/spark/commit/bbd167b79a073f6eca67b57012d936d692f7d7c8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-06-12 Thread ssonker
Github user ssonker commented on the issue:

https://github.com/apache/spark/pull/21505
  
@kiszk @viirya Do you have more review comments that need to be 
incorporated? If not, can you please get this merged?


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-06-12 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20636
  
When I check the callers of `BufferHolder.grow()`, some of them call 
`ByteArrayMethods.roundNumberOfBytesToNearestWord()` and others do not 
(i.e. they implicitly ensure word alignment).

Would it be better to call 
`ByteArrayMethods.roundNumberOfBytesToNearestWord()` inside `BufferHolder.grow()` 
instead of in each caller, to guarantee word alignment?
Then, we can check whether `UnsafeRow.getSizeInBytes()` is a multiple of 8 
in `BufferHolderSparkSubmitSuite`.
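
For reference, rounding a byte count up to a whole 8-byte word can be sketched as below. `WordAlign.roundUpToWord` is a hypothetical standalone helper written for illustration; it is meant to mirror the idea behind `ByteArrayMethods.roundNumberOfBytesToNearestWord()`, not to reproduce Spark's code.

```scala
object WordAlign {
  // Rounds a byte count up to the next multiple of 8 (one 64-bit word).
  // Values that are already word-aligned are returned unchanged.
  def roundUpToWord(numBytes: Int): Int = {
    val remainder = numBytes & 0x07 // numBytes modulo 8 for non-negative input
    if (remainder == 0) numBytes else numBytes + (8 - remainder)
  }
}
```

Applying this once inside `grow()` would make the "multiple of 8" check in the suite hold regardless of which caller requested the extra space.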


---




[GitHub] spark issue #21534: [SPARK-24526][build] Spaces in the build dir causes fail...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21534
  
**[Test build #91696 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91696/testReport)**
 for PR 21534 at commit 
[`bb12f3e`](https://github.com/apache/spark/commit/bb12f3e2ad74f9d4c89e1c7adab4d306fa87b101).


---




[GitHub] spark issue #21534: [SPARK-24526][build] Spaces in the build dir causes fail...

2018-06-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21534
  
ok to test


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21536
  
**[Test build #91695 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91695/testReport)**
 for PR 21536 at commit 
[`2ea2181`](https://github.com/apache/spark/commit/2ea2181697038dbd2109f2daeb347d98724b93af).


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/38/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3928/
Test PASSed.


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21536
  
Hm .. ? retest this please


---




[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread jainaks
Github user jainaks commented on the issue:

https://github.com/apache/spark/pull/21320
  
Hi @mallman,
I found another major issue after applying this fix.
Schema:
a: struct (nullable = true)
 ||-- b: struct (nullable = true)
 |||-- c1: string (nullable = true)
 |||-- c2: string (nullable = true)
 |||-- c3: string (nullable = true)
 |||-- c4: string (nullable = true)
 |||-- c5: boolean (nullable = true)
id: struct (nullable = true)
 ||-- i1: struct (nullable = true)
 |||-- i2: string (nullable = true)
timestamp: bigint
**Query:**
select a.b.c3 as c3,
first(a.b.c3) over (partition by id.i1.i2 order by timestamp 
rows between current row and unbounded following) as first_c3
from temp;
The column "first_c3" gets the value of column "c2".
It works well if I just turn the parquetSchemaPrunning flag to false.
It may sound odd at first glance, and it does to me too, but this is what I 
am getting.
PS: I am running all my tests using PR #16578.
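
As a sanity check on what the query should return: with a frame of `rows between current row and unbounded following`, the first row of every frame is the current row itself, so `first_c3` is expected to equal `c3` on each row. The sketch below emulates that window in plain Scala, without Spark. `Rec` and `c3WithFirstC3` are hypothetical names that model only the fields the query touches.

```scala
object WindowCheck {
  // Hypothetical flattened view of the reported schema (a.b.c2, a.b.c3, id.i1.i2, timestamp).
  final case class Rec(i2: String, timestamp: Long, c2: String, c3: String)

  // Emulates: first(c3) over (partition by i2 order by timestamp
  //                           rows between current row and unbounded following)
  // and returns (c3, first_c3) for every input row.
  def c3WithFirstC3(rows: Seq[Rec]): Seq[(String, String)] =
    rows.groupBy(_.i2).values.toSeq.flatMap { partition =>
      val ordered = partition.sortBy(_.timestamp)
      ordered.indices.map { i =>
        val frame = ordered.drop(i) // current row up to the end of the partition
        (ordered(i).c3, frame.head.c3)
      }
    }
}
```

Any row where the two values differ, as in the report above where `first_c3` carries the value of `c2`, points at the wrong field being read rather than at the window semantics.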


---




[GitHub] spark issue #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF should assi...

2018-06-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21427
  
2.3.1 wouldn't have this behaviour change, and we marked this as 
experimental. So, on the other hand, that probably gives more time to make 
clear that this is discouraged in production and that there might be some 
behaviour changes. Actually, it hasn't been around for long compared to the 
other APIs we have ...

> it turns runnable code into failure, and the old behavior is kind of 
self-consistent(by-position match). it's not like turning failures into 
runnable or fix a correctness bug.

It still sounds like we are treating this API as an old, stable API. It 
doesn't replace the self-consistent way completely. This PR partially fixes 
its behaviour so that it makes more sense, causing some changes in corner 
cases which are quite unlikely and make no sense (IMHO).

We should be relatively less conservative with new and experimental APIs, so 
that we can make them stable and coherent as soon as possible, before we 
remove the experimental note ..

The only special reason I see is that it's not a correctness bug but it 
changes the existing behaviour (which I actually don't completely agree with, 
but I get what you mean at least). But then what can we do for experimental 
APIs specifically .. ?




---




[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-06-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21370#discussion_r194641579
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3209,6 +3222,19 @@ class Dataset[T] private[sql](
 }
   }
 
+  private[sql] def getRowsToPython(
+  _numRows: Int,
+  truncate: Int,
+  vertical: Boolean): Array[Any] = {
+EvaluatePython.registerPicklers()
+val numRows = _numRows.max(0).min(Int.MaxValue - 1)
+val rows = getRows(numRows, truncate, vertical).map(_.toArray).toArray
+val toJava: (Any) => Any = EvaluatePython.toJava(_, ArrayType(ArrayType(StringType)))
+val iter: Iterator[Array[Byte]] = new SerDeUtil.AutoBatchedPickler(
+  rows.iterator.map(toJava))
+PythonRDD.serveIterator(iter, "serve-GetRows")
--- End diff --

I think we return `Array[Any]` for `PythonRDD.serveIterator` too.


https://github.com/apache/spark/blob/628c7b517969c4a7ccb26ea67ab3dd61266073ca/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L400

Did I maybe miss something?


---




[GitHub] spark issue #21496: docs: fix typo

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21496
  
**[Test build #91694 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91694/testReport)**
 for PR 21496 at commit 
[`fea9616`](https://github.com/apache/spark/commit/fea9616fb35e3fcf886073767da040aef3a408e0).


---




[GitHub] spark issue #21496: docs: fix typo

2018-06-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21496
  
retest this please


---




[GitHub] spark issue #21276: [SPARK-24216][SQL] Spark TypedAggregateExpression uses g...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21276
  
I think this fix is nice to have. cc @cloud-fan 


---




[GitHub] spark pull request #20313: [SPARK-22974][ML] Attach attributes to output col...

2018-06-12 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20313#discussion_r194636521
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
@@ -264,7 +265,9 @@ class CountVectorizerModel(
 
   Vectors.sparse(dictBr.value.size, effectiveCounts)
 }
-dataset.withColumn($(outputCol), vectorizer(col($(inputCol
+val attrs = vocabulary.map(_ => new NumericAttribute).asInstanceOf[Array[Attribute]]
--- End diff --

Sorry for the late reply. Though I agree that these attributes don't provide 
much info, I'm wondering if we can generate them lazily. At this point, I 
think we don't know whether a following transformer will need them or not.
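
The lazy-generation idea can be sketched with a Scala `lazy val`. This is an illustration only: `NumericAttr` and `VocabularyModel` are hypothetical stand-ins, not Spark's `NumericAttribute` or `CountVectorizerModel`, and whether laziness is practical depends on where the column metadata has to be attached.

```scala
// Hypothetical stand-in for an ML attribute; not Spark's Attribute class.
final case class NumericAttr(index: Int)

final class VocabularyModel(vocabulary: Array[String]) {
  // Counts how often the attribute array is built, for demonstration only.
  var builds: Int = 0

  // Built on first access and cached afterwards; never built if nobody asks.
  lazy val attrs: Array[NumericAttr] = {
    builds += 1
    vocabulary.indices.map(i => NumericAttr(i)).toArray
  }
}
```

With a large vocabulary this avoids allocating one attribute object per term unless a downstream consumer actually reads them.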


---




[GitHub] spark issue #20313: [SPARK-22974][ML] Attach attributes to output column of ...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20313
  
cc @dbtsai too.


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21501
  
**[Test build #91693 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91693/testReport)**
 for PR 21501 at commit 
[`bbd167b`](https://github.com/apache/spark/commit/bbd167b79a073f6eca67b57012d936d692f7d7c8).


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3927/
Test PASSed.


---




[GitHub] spark issue #21498: [SPARK-24410][SQL][Core] Optimization for Union outputPa...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21498
  
@mgaido91 WDYT? Does the benchmark make sense to you?


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21501
  
retest this please.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21535
  
**[Test build #91692 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91692/testReport)**
 for PR 21535 at commit 
[`b8c7238`](https://github.com/apache/spark/commit/b8c7238aec9d6d79b8528eb3f47c0de7a48d23e8).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/37/
Test PASSed.


---




[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21319
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91685/
Test FAILed.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21535
  
retest this please.


---




[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21319
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19528
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91690/
Test FAILed.


---




[GitHub] spark issue #21496: docs: fix typo

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21496
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91688/
Test FAILed.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19528
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21535
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91687/
Test FAILed.


---




[GitHub] spark issue #21496: docs: fix typo

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21496
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21496: docs: fix typo

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21496
  
**[Test build #91688 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91688/testReport)** for PR 21496 at commit [`fea9616`](https://github.com/apache/spark/commit/fea9616fb35e3fcf886073767da040aef3a408e0).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21501
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91689/
Test FAILed.


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21501
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21320
  
@mallman Sorry for the delay. Super busy during the Spark summit. Will 
continue the code review in the next few days. 


---




[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21535
  
**[Test build #91687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91687/testReport)** for PR 21535 at commit [`b8c7238`](https://github.com/apache/spark/commit/b8c7238aec9d6d79b8528eb3f47c0de7a48d23e8).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait CodegenInterpretedTest extends QueryTest with SharedSQLContext `
  * `class DataFrameSuite extends CodegenInterpretedTest `
  * `class DatasetSuite extends CodegenInterpretedTest `


---




[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21319
  
**[Test build #91685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91685/testReport)** for PR 21319 at commit [`91fdedc`](https://github.com/apache/spark/commit/91fdedc4d91a7abde5f6b64dbfcf354b67d89a48).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21501
  
**[Test build #91689 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91689/testReport)** for PR 21501 at commit [`bbd167b`](https://github.com/apache/spark/commit/bbd167b79a073f6eca67b57012d936d692f7d7c8).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21536
  
**[Test build #91691 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91691/testReport)** for PR 21536 at commit [`2ea2181`](https://github.com/apache/spark/commit/2ea2181697038dbd2109f2daeb347d98724b93af).
 * This patch **fails Java style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91691/
Test FAILed.


---




[GitHub] spark pull request #21510: [SPARK-24490][WebUI] Use WebUI.addStaticHandler i...

2018-06-12 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/21510#discussion_r194632125
  
--- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
@@ -88,41 +90,41 @@ private[spark] abstract class WebUI(
     handlers += renderHandler
   }
 
-  /** Attach a handler to this UI. */
+  /** Attaches a handler to this UI. */
   def attachHandler(handler: ServletContextHandler) {
     handlers += handler
     serverInfo.foreach(_.addHandler(handler))
   }
 
-  /** Detach a handler from this UI. */
+  /** Detaches a handler from this UI. */
   def detachHandler(handler: ServletContextHandler) {
     handlers -= handler
     serverInfo.foreach(_.removeHandler(handler))
   }
 
   /**
-   * Add a handler for static content.
+   * Adds a handler for static content.
    *
    * @param resourceBase Root of where to find resources to serve.
    * @param path Path in UI where to mount the resources.
    */
-  def addStaticHandler(resourceBase: String, path: String): Unit = {
+  def addStaticHandler(resourceBase: String, path: String = "/static"): Unit = {
     attachHandler(JettyUtils.createStaticHandler(resourceBase, path))
   }
 
   /**
-   * Remove a static content handler.
+   * Removes a static content handler.
    *
    * @param path Path in UI to unmount.
    */
   def removeStaticHandler(path: String): Unit = {
--- End diff --

OK...since @vanzin requested I'm gonna make all the other changes while at 
it :)



---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/36/
Test PASSed.


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21536
  
**[Test build #91691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91691/testReport)** for PR 21536 at commit [`2ea2181`](https://github.com/apache/spark/commit/2ea2181697038dbd2109f2daeb347d98724b93af).


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21536
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3926/
Test PASSed.


---




[GitHub] spark pull request #21536: [MINOR][CORE][TEST] Remove unnecessary sort in Un...

2018-06-12 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request:

https://github.com/apache/spark/pull/21536

[MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInMemorySorterSuite

## What changes were proposed in this pull request?

We don't require a specific ordering of the input data, so the sort action is 
unnecessary and misleading.

## How was this patch tested?

Existing test suite.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiangxb1987/spark sorterSuite

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21536.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21536


commit 2ea2181697038dbd2109f2daeb347d98724b93af
Author: Xingbo Jiang 
Date:   2018-06-12T06:50:42Z

remove unnecessary sort




---




[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21357
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21357
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91686/
Test FAILed.


---




[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21357
  
**[Test build #91686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91686/testReport)** for PR 21357 at commit [`8ad2a3f`](https://github.com/apache/spark/commit/8ad2a3f8112662a865ee1dbaf7c5269197c3ee4f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21276: [SPARK-24216][SQL] Spark TypedAggregateExpression...

2018-06-12 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21276#discussion_r194630048
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2715,6 +2716,62 @@ private[spark] object Utils extends Logging {
     HashCodes.fromBytes(secretBytes).toString()
   }
 
+  /**
+   * Safer than Class obj's getSimpleName which may throw Malformed class name error in scala.
+   * This method mimicks scalatest's getSimpleNameOfAnObjectsClass.
+   */
+  def getSimpleName(cls: Class[_]): String = {
+    try {
+      return cls.getSimpleName
+    } catch {
+      case err: InternalError => return stripDollars(stripPackages(cls.getName))
+    }
+  }
+
+  /**
+   * Remove the packages from full qualified class name
+   */
+  private def stripPackages(fullyQualifiedName: String): String = {
+    fullyQualifiedName.split("\\.").takeRight(1)(0)
+  }
+
+  /**
+   * Remove trailing dollar signs from qualified class name,
+   * and return the trailing part after the last dollar sign in the middle
+   */
+  private def stripDollars(s: String): String = {
+    val lastDollarIndex = s.lastIndexOf('$')
+    if (lastDollarIndex < s.length - 1) {
+      // The last char is not a dollar sign
+      if (lastDollarIndex == -1 || !s.contains("$iw")) {
+        // The name does not have dollar sign or is not an intepreter
+        // generated class, so we should return the full string
+        s
+      } else {
+        // The class name is intepreter generated,
+        // return the part after the last dollar sign
+        // This is the same behavior as getClass.getSimpleName
+        s.substring(lastDollarIndex + 1)
+      }
+    }
+    else {
--- End diff --

style:
```scala
if (...) {
} else {
}
```
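
For reference, the name-stripping logic in the diff above can be sketched in Python. This is only an illustrative port, not the code under review: `strip_packages` and `strip_dollars` are hypothetical names mirroring `stripPackages` and `stripDollars`, and because the quoted diff is cut off at the final `else {`, the trailing-dollar branch below is an assumption based on the doc comment ("Remove trailing dollar signs").

```python
def strip_packages(fully_qualified_name):
    """Drop the package prefix, keeping the part after the last dot."""
    return fully_qualified_name.split(".")[-1]


def strip_dollars(name):
    """Mirror the branches of stripDollars that are visible in the diff."""
    last_dollar = name.rfind("$")
    if last_dollar < len(name) - 1:
        # The last char is not a dollar sign.
        if last_dollar == -1 or "$iw" not in name:
            # No dollar sign, or not an interpreter-generated class.
            return name
        else:
            # Interpreter-generated: keep the part after the last dollar sign.
            return name[last_dollar + 1:]
    else:
        # Assumption: the elided branch strips trailing dollar signs and retries.
        stripped = name.rstrip("$")
        return strip_dollars(stripped) if stripped else name


print(strip_dollars(strip_packages("org.apache.spark.util.Utils$")))  # Utils
print(strip_dollars(strip_packages("$line3.$read$$iw$$iw$Foo")))      # Foo
```

The sample class names are made up for illustration; a name such as `Outer$Inner` that has no `$iw` marker is returned unchanged, as in the first inner branch of the diff.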


---




[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-06-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21370#discussion_r194629747
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3209,6 +3222,19 @@ class Dataset[T] private[sql](
     }
   }
 
+  private[sql] def getRowsToPython(
+      _numRows: Int,
+      truncate: Int,
+      vertical: Boolean): Array[Any] = {
+    EvaluatePython.registerPicklers()
+    val numRows = _numRows.max(0).min(Int.MaxValue - 1)
+    val rows = getRows(numRows, truncate, vertical).map(_.toArray).toArray
+    val toJava: (Any) => Any = EvaluatePython.toJava(_, ArrayType(ArrayType(StringType)))
+    val iter: Iterator[Array[Byte]] = new SerDeUtil.AutoBatchedPickler(
+      rows.iterator.map(toJava))
+    PythonRDD.serveIterator(iter, "serve-GetRows")
--- End diff --

`PythonRDD.serveIterator(iter, "serve-GetRows")` returns `Int`, but the 
return type of `getRowsToPython` is `Array[Any]`. How does it work? cc 
@xuanyuanking @HyukjinKwon 


---




[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19528
  
**[Test build #91690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91690/consoleFull)** for PR 19528 at commit [`76ad8c5`](https://github.com/apache/spark/commit/76ad8c5e62a7233c16399043716139b52ee1c97d).


---




[GitHub] spark issue #20640: [SPARK-19755][Mesos] Blacklist is always active for Meso...

2018-06-12 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20640
  
@IgorBerman any thoughts on this comment? 
https://github.com/apache/spark/pull/20640#discussion_r191272487


---




[GitHub] spark pull request #21515: [SPARK-24372][build] Add scripts to help with pre...

2018-06-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21515#discussion_r194625766
  
--- Diff: dev/create-release/vote.tmpl ---
@@ -0,0 +1,64 @@
+Please vote on releasing the following candidate as Apache Spark version {version}.
+
+The vote is open until {deadline} and passes if a majority of at least 3 +1 PMC votes are cast.
--- End diff --

nit: personally I find `3 PMC +1 votes` more clear 


---




[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...

2018-06-12 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/19528
  
Jenkins test this please


---




[GitHub] spark pull request #21515: [SPARK-24372][build] Add scripts to help with pre...

2018-06-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21515#discussion_r194625981
  
--- Diff: dev/.rat-excludes ---
@@ -106,3 +106,4 @@ spark-warehouse
 structured-streaming/*
 kafka-source-initial-offset-version-2.1.0.bin
 kafka-source-initial-offset-future-version.bin
+vote.tmpl
--- End diff --

even if rat doesn't check, isn't vote.tmpl packaged into the source release 
this way?


---




[GitHub] spark pull request #21515: [SPARK-24372][build] Add scripts to help with pre...

2018-06-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21515#discussion_r194626614
  
--- Diff: dev/create-release/spark-rm/Dockerfile ---
@@ -0,0 +1,89 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Image for building Spark releases. Based on Ubuntu 16.04.
+#
+# Includes:
+# * Java 8
+# * Ivy
+# * Python/PyPandoc (2.7.12/3.5.2)
+# * R-base/R-base-dev (3.3.2+)
+# * Ruby 2.3 build utilities
+
+FROM ubuntu:16.04
+
+# These arguments are just for reuse and not really meant to be customized.
+ARG APT_INSTALL="apt-get install --no-install-recommends -y"
+
+# Install extra needed repos and refresh.
+# - CRAN repo
+# - Ruby repo (for doc generation)
+RUN echo 'deb http://cran.cnr.Berkeley.edu/bin/linux/ubuntu xenial/' >> /etc/apt/sources.list && \
+  gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9 && \
+  gpg -a --export E084DAB9 | apt-key add - && \
+  apt-get clean && \
+  rm -rf /var/lib/apt/lists/* && \
+  apt-get clean && \
+  apt-get update && \
+  $APT_INSTALL software-properties-common && \
+  apt-add-repository -y ppa:brightbox/ruby-ng && \
+  apt-get update
+
+# Install openjdk 8.
+RUN $APT_INSTALL openjdk-8-jdk && \
+  update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
+
+# Install build / source control tools
+RUN $APT_INSTALL curl wget git maven ivy subversion make gcc libffi-dev \
+pandoc pandoc-citeproc libssl-dev libcurl4-openssl-dev libxml2-dev && \
+  ln -s -T /usr/share/java/ivy.jar /usr/share/ant/lib/ivy.jar && \
+  curl -sL https://deb.nodesource.com/setup_4.x | bash && \
+  $APT_INSTALL nodejs
+
+# Install needed python packages. Use pip for installing packages (for consistency).
+ARG BASE_PIP_PKGS="setuptools wheel virtualenv"
+ARG PIP_PKGS="pyopenssl pypandoc numpy pygments sphinx"
+
+RUN $APT_INSTALL libpython2.7-dev libpython3-dev python-pip python3-pip && \
+  pip install $BASE_PIP_PKGS && \
+  pip install $PIP_PKGS && \
+  cd && \
+  virtualenv -p python3 p35 && \
+  . p35/bin/activate && \
+  pip install $BASE_PIP_PKGS && \
+  pip install $PIP_PKGS
+
+# Install R packages and dependencies used when building.
+# R depends on pandoc*, libssl (which are installed above).
+RUN $APT_INSTALL r-base r-base-dev && \
+  $APT_INSTALL texlive-latex-base texlive texlive-fonts-extra texinfo qpdf && \
+  Rscript -e "install.packages(c('curl', 'xml2', 'httr', 'devtools', 'testthat', 'knitr', 'rmarkdown', 'roxygen2', 'e1071', 'survival'), repos='http://cran.us.r-project.org/')" && \
+  Rscript -e "devtools::install_github('jimhester/lintr')"
+
+# Install tools needed to build the documentation.
+RUN $APT_INSTALL ruby2.3 ruby2.3-dev && \
+  gem install jekyll --no-rdoc --no-ri && \
+  gem install jekyll-redirect-from && \
+  gem install pygments.rb
+
+WORKDIR /opt/spark-rm/output
+
+ARG UID
+RUN useradd -m -s /bin/bash -p spark-rm -u $UID spark-rm
--- End diff --

does that mean the do-release script is run as the user "spark-rm"?
I thought it's generally best practice to gpg sign as yourself?


---




[GitHub] spark pull request #21515: [SPARK-24372][build] Add scripts to help with pre...

2018-06-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21515#discussion_r194626379
  
--- Diff: dev/.rat-excludes ---
@@ -106,3 +106,4 @@ spark-warehouse
 structured-streaming/*
 kafka-source-initial-offset-version-2.1.0.bin
 kafka-source-initial-offset-future-version.bin
+vote.tmpl
--- End diff --

for example, `.git` is removed from release here 
https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh#L157



---




[GitHub] spark pull request #20260: [SPARK-23039][SQL] Finish TODO work in alter tabl...

2018-06-12 Thread xubo245
Github user xubo245 closed the pull request at:

https://github.com/apache/spark/pull/20260


---




[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

2018-06-12 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21533
  
cc @jiangxb1987 @jerryshao 


---




[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21533
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91682/
Test PASSed.


---




[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

2018-06-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21533
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...

2018-06-12 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21501#discussion_r194626092
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                       typeConverter=TypeConverters.toListString)
     caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                           "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
+    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
+                   "is true", typeConverter=TypeConverters.toString)

Also, don't forget to mention that the default value is the JVM default locale.
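
To illustrate why a `locale` param only matters for case-insensitive matching, here is a minimal pure-Python sketch. It does not use the PySpark API: `lower_for_locale` and `remove_stop_words` are hypothetical helpers, the `en_US` default is an assumption standing in for a platform default, and the Turkish special case stands in for what locale-aware lowercasing such as the JVM's `String.toLowerCase(Locale)` does.

```python
def lower_for_locale(word, locale):
    """Lowercase `word`, special-casing the Turkish dotted/dotless I."""
    if locale.split("_")[0] == "tr":
        # In Turkish, "I" lowercases to dotless "ı" and "İ" to "i".
        word = word.replace("İ", "i").replace("I", "ı")
    return word.lower()


def remove_stop_words(tokens, stop_words, case_sensitive=False, locale="en_US"):
    """Filter out stop words; `locale` is ignored when case_sensitive is True."""
    if case_sensitive:
        stop = set(stop_words)
        return [t for t in tokens if t not in stop]
    # Case-insensitive: compare locale-lowercased forms on both sides.
    stop = {lower_for_locale(w, locale) for w in stop_words}
    return [t for t in tokens if lower_for_locale(t, locale) not in stop]


tokens = ["ILIK", "çay"]
print(remove_stop_words(tokens, ["ılık"], locale="tr_TR"))  # ['çay']
print(remove_stop_words(tokens, ["ılık"], locale="en_US"))  # ['ILIK', 'çay']
```

With the Turkish locale, `ILIK` lowercases to `ılık` and is removed; with `en_US` it lowercases to `ilik`, no longer matches the stop word, and is kept.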


---




[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21533
  
**[Test build #91682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91682/testReport)** for PR 21533 at commit [`f922fd8`](https://github.com/apache/spark/commit/f922fd8c995164cada4a8b72e92c369a827def16).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...

2018-06-12 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21501#discussion_r194625679
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                       typeConverter=TypeConverters.toListString)
     caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                           "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
+    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
+                   "is true", typeConverter=TypeConverters.toString)

I'm not sure users are familiar with the locale values that are available 
here.


---




[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...

2018-06-12 Thread dongjinleekr
Github user dongjinleekr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21501#discussion_r194623958
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                       typeConverter=TypeConverters.toListString)
     caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                           "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
+    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
+                   "is true", typeConverter=TypeConverters.toString)

Thank you for the comment but... is that necessary?


---




[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover

2018-06-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21501
  
**[Test build #91689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91689/testReport)** for PR 21501 at commit [`bbd167b`](https://github.com/apache/spark/commit/bbd167b79a073f6eca67b57012d936d692f7d7c8).


---



