[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/40/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3930/ Test PASSed.
[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21450 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91698/ Test FAILed.
[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21450 Merged build finished. Test FAILed.
[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21450 **[Test build #91698 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91698/testReport)** for PR 21450 at commit [`b03e3de`](https://github.com/apache/spark/commit/b03e3de9b326a8cf9061125e0f22bde2a12bf30f). * This patch **fails Java style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21535 retest this please.
[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21450 **[Test build #91698 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91698/testReport)** for PR 21450 at commit [`b03e3de`](https://github.com/apache/spark/commit/b03e3de9b326a8cf9061125e0f22bde2a12bf30f).
[GitHub] spark issue #21462: [SPARK-24428][K8S] Fix unused code
Github user skonto commented on the issue: https://github.com/apache/spark/pull/21462 @foxish gentle ping.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91692/ Test FAILed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Merged build finished. Test FAILed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21535 **[Test build #91692 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91692/testReport)** for PR 21535 at commit [`b8c7238`](https://github.com/apache/spark/commit/b8c7238aec9d6d79b8528eb3f47c0de7a48d23e8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait CodegenInterpretedTest extends QueryTest with SharedSQLContext ` * `class DataFrameSuite extends CodegenInterpretedTest ` * `class DatasetSuite extends CodegenInterpretedTest `
[GitHub] spark pull request #21366: [SPARK-24248][K8S] Use level triggering and state...
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21366#discussion_r194664576

    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsPollingSnapshotSource.scala ---
    @@ -0,0 +1,65 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.scheduler.cluster.k8s
    +
    +import java.util.concurrent.{Future, ScheduledExecutorService, TimeUnit}
    +
    +import io.fabric8.kubernetes.client.KubernetesClient
    +import scala.collection.JavaConverters._
    +
    +import org.apache.spark.SparkConf
    +import org.apache.spark.deploy.k8s.Config._
    +import org.apache.spark.deploy.k8s.Constants._
    +import org.apache.spark.util.ThreadUtils
    +
    +private[spark] class ExecutorPodsPollingSnapshotSource(
    +    conf: SparkConf,
    +    kubernetesClient: KubernetesClient,
    +    snapshotsStore: ExecutorPodsSnapshotsStore,
    +    pollingExecutor: ScheduledExecutorService) {
    +
    +  private val pollingInterval = conf.get(KUBERNETES_EXECUTOR_API_POLLING_INTERVAL)
    +
    +  private var pollingFuture: Future[_] = _
    +
    +  def start(applicationId: String): Unit = {
    +    require(pollingFuture == null, "Cannot start polling more than once.")
    +    pollingFuture = pollingExecutor.scheduleWithFixedDelay(
    +      new PollRunnable(applicationId), pollingInterval, pollingInterval, TimeUnit.MILLISECONDS)
    +  }
    +
    +  def stop(): Unit = {
    +    if (pollingFuture != null) {
    +      pollingFuture.cancel(true)
    +      pollingFuture = null
    +    }
    +    ThreadUtils.shutdown(pollingExecutor)
    --- End diff --

There are a number of such calls. Are we sure they will be executed in every scenario, for example when an exception is thrown? Are the stop calls bound to some shutdown hook? Is this covered by RX-java?
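One way to make the question above concrete is a stand-alone sketch in which `stop()` is idempotent and is always invoked from a `finally` block. Everything here is hypothetical: `PollingSource`, `withSource`, and `StopOnFailureDemo` are illustration names, the executor is shut down with the plain JDK `shutdownNow()` rather than Spark's `ThreadUtils.shutdown`, and nothing in the thread says the PR wires its cleanup this way.

```scala
import java.util.concurrent.{Executors, Future, ScheduledExecutorService, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for the polling source in the diff; not Spark code.
class PollingSource(executor: ScheduledExecutorService, intervalMs: Long) {
  val polls = new AtomicInteger(0)
  private var pollingFuture: Future[_] = null

  def start(): Unit = {
    require(pollingFuture == null, "Cannot start polling more than once.")
    pollingFuture = executor.scheduleWithFixedDelay(
      new Runnable { override def run(): Unit = polls.incrementAndGet() },
      intervalMs, intervalMs, TimeUnit.MILLISECONDS)
  }

  // Safe to call repeatedly: a second call finds no future left to cancel,
  // and shutting down an already terminated executor is a no-op.
  def stop(): Unit = {
    if (pollingFuture != null) {
      pollingFuture.cancel(true)
      pollingFuture = null
    }
    executor.shutdownNow()
  }
}

object StopOnFailureDemo {
  // Starts the source, runs `body`, and stops the source even if `body` throws.
  def withSource[T](source: PollingSource)(body: => T): T = {
    source.start()
    try body finally source.stop()
  }

  def main(args: Array[String]): Unit = {
    val executor = Executors.newSingleThreadScheduledExecutor()
    val source = new PollingSource(executor, intervalMs = 10L)
    try {
      withSource(source) { throw new IllegalStateException("simulated failure") }
    } catch {
      case _: IllegalStateException => () // the failure is expected in this demo
    }
    println(s"executor shut down: ${executor.isShutdown}")
  }
}
```

A `finally` block covers exceptions on the calling thread only. Cleanup on JVM exit would additionally need something like `sys.addShutdownHook(source.stop())`, which is why an idempotent `stop()` matters: it may then run twice.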
[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...
Github user ssonker commented on the issue: https://github.com/apache/spark/pull/21505 @viirya Done. cc: @cloud-fan
[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21537 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3929/ Test PASSed.
[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21537 **[Test build #91697 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91697/testReport)** for PR 21537 at commit [`89d0252`](https://github.com/apache/spark/commit/89d025225b557689389d16c207be8a25f5e82fa5).
[GitHub] spark pull request #21520: [SPARK-24505][SQL] Forbidding string interpolatio...
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/21520
[GitHub] spark issue #21520: [SPARK-24505][SQL] Forbidding string interpolation in Co...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21520 Since I will split this incrementally into smaller PRs, I am closing this one first.
[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21537 Merged build finished. Test PASSed.
[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21537 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/39/ Test PASSed.
[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21537 cc @cloud-fan @hvanhovell @kiszk @mgaido91
[GitHub] spark pull request #21537: [SPARK-24505][SQL] Convert strings in codegen to ...
GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/21537

[SPARK-24505][SQL] Convert strings in codegen to blocks: Cast and BoundAttribute

## What changes were proposed in this pull request?

This is split from #21520. It includes the changes to `BoundAttribute` and `Cast`.

This patch also adds a few convenience APIs:

```scala
CodeGenerator.freshName(name: String, dt: DataType): VariableValue
CodeGenerator.freshName(name: String, javaClass: Class[_]): VariableValue
CodeGenerator.isNullFreshName(name: String): VariableValue
JavaCode.className(javaClass: Class[_]): InlineBlock
JavaCode.javaType(dataType: DataType): InlineBlock
JavaCode.boxedType(dataType: DataType): InlineBlock
```

## How was this patch tested?

Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-24505-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21537.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21537

commit 89d025225b557689389d16c207be8a25f5e82fa5
Author: Liang-Chi Hsieh
Date: 2018-06-12T08:40:20Z

    Convert strings in codegen to blocks.
[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21505 cc @cloud-fan
[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21505 The PR title is too long and truncated. Can you shorten it?
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21501 Merged build finished. Test PASSed.
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21501 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91693/ Test PASSed.
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21501 **[Test build #91693 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91693/testReport)** for PR 21501 at commit [`bbd167b`](https://github.com/apache/spark/commit/bbd167b79a073f6eca67b57012d936d692f7d7c8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...
Github user ssonker commented on the issue: https://github.com/apache/spark/pull/21505 @kiszk @viirya Do you have more review comments that need to be incorporated? If not, can you please get this merged?
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20636 When I checked the callers of `BufferHolder.grow()`, I found that some of them call `ByteArrayMethods.roundNumberOfBytesToNearestWord()` while others do not (i.e. they implicitly assume the size is already word-aligned). Would it be better to call `ByteArrayMethods.roundNumberOfBytesToNearestWord()` inside `BufferHolder.grow()`, instead of in each caller, to guarantee word alignment? Then we could check whether `UnsafeRow.getSizeInBytes()` is a multiple of 8 in `BufferHolderSparkSubmitSuite`.
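The proposal above, rounding inside the growing method rather than in every caller, can be sketched without any Spark classes. This is only an illustration under stated assumptions: `roundUpToWord` is a local helper written for this sketch (not a call into `ByteArrayMethods`), and `SimpleBufferHolder` with its `reserve` method is a hypothetical class that does not mirror the real `BufferHolder` API.

```scala
object WordAlign {
  // Rounds a non-negative byte count up to the next multiple of 8.
  def roundUpToWord(numBytes: Int): Int = {
    val remainder = numBytes & 0x07
    if (remainder == 0) numBytes else numBytes + (8 - remainder)
  }

  // Hypothetical holder: the alignment happens once, inside reserve(),
  // so callers cannot forget it.
  final class SimpleBufferHolder(initialSize: Int) {
    private var buffer = new Array[Byte](roundUpToWord(initialSize))
    private var used = 0

    def totalSize: Int = used

    def reserve(neededSize: Int): Unit = {
      val aligned = roundUpToWord(neededSize)
      val required = used + aligned
      if (required > buffer.length) {
        // Grow to at least double the required size, kept word-aligned.
        buffer = java.util.Arrays.copyOf(buffer, roundUpToWord(required * 2))
      }
      used = required
    }
  }

  def main(args: Array[String]): Unit = {
    val holder = new SimpleBufferHolder(16)
    holder.reserve(5)  // counted as 8 bytes
    holder.reserve(11) // counted as 16 bytes
    println(holder.totalSize)     // prints 24
    println(holder.totalSize % 8) // prints 0
  }
}
```

With the rounding centralized, a test only needs to assert that the total size is a multiple of 8, which is the kind of check suggested for `UnsafeRow.getSizeInBytes()` in the comment.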
[GitHub] spark issue #21534: [SPARK-24526][build] Spaces in the build dir causes fail...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21534 **[Test build #91696 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91696/testReport)** for PR 21534 at commit [`bb12f3e`](https://github.com/apache/spark/commit/bb12f3e2ad74f9d4c89e1c7adab4d306fa87b101).
[GitHub] spark issue #21534: [SPARK-24526][build] Spaces in the build dir causes fail...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21534 ok to test
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21536 **[Test build #91695 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91695/testReport)** for PR 21536 at commit [`2ea2181`](https://github.com/apache/spark/commit/2ea2181697038dbd2109f2daeb347d98724b93af).
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Merged build finished. Test PASSed.
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/38/ Test PASSed.
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3928/ Test PASSed.
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Merged build finished. Test PASSed.
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21536 Hm .. ? retest this please
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320

Hi @mallman, I found another major issue after applying this fix.

Schema:

    a: struct (nullable = true)
     |    |-- b: struct (nullable = true)
     |    |    |-- c1: string (nullable = true)
     |    |    |-- c2: string (nullable = true)
     |    |    |-- c3: string (nullable = true)
     |    |    |-- c4: string (nullable = true)
     |    |    |-- c5: boolean (nullable = true)
    id: struct (nullable = true)
     |    |-- i1: struct (nullable = true)
     |    |    |-- i2: string (nullable = true)
    timestamp: bigint

**Query:**

    select a.b.c3 as c3,
           first(a.b.c3) over (partition by id.i1.i2 order by timestamp
                               rows between current row and unbounded following) as first_c3
    from temp;

The column "first_c3" gets the value of column "c2". It works correctly if I just set the parquet schema pruning flag to false. This may sound odd at first, and it does to me too, but it is what I am seeing.

PS: I am running all my tests using PR #16578.
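For reference, the result the query above should produce can be worked out without Spark. The sketch below is a plain-Scala simulation under stated assumptions: `Row`, `firstC3`, and the sample values are invented for illustration, the schema is flattened to the four columns the query touches, and ties in `timestamp` and null handling are ignored. Because the frame starts at the current row, `first(...)` over it is the current row's own value, so `first_c3` should always equal `c3` and never `c2`.

```scala
object FirstOverFrame {
  // Flattened, hypothetical view of the columns used by the query.
  final case class Row(i2: String, timestamp: Long, c2: String, c3: String)

  // For every row, evaluates FIRST(c3) over the rows that share its i2,
  // ordered by timestamp, from the current row to the end of the partition.
  def firstC3(rows: Seq[Row]): Seq[(Row, String)] =
    rows.groupBy(_.i2).values.toSeq.flatMap { partition =>
      val ordered = partition.sortBy(_.timestamp)
      ordered.indices.map { i =>
        val frame = ordered.drop(i) // current row .. unbounded following
        (ordered(i), frame.head.c3)
      }
    }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      Row("k1", 1L, "c2-a", "c3-a"),
      Row("k1", 2L, "c2-b", "c3-b"),
      Row("k2", 1L, "c2-c", "c3-c"))
    firstC3(rows).foreach { case (row, first) =>
      println(s"c3=${row.c3} first_c3=$first")
    }
  }
}
```

A check of this shape (every `first_c3` equal to the row's `c3`) could serve as the assertion in a regression test for the reported behaviour, run once with pruning enabled and once with it disabled.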
[GitHub] spark issue #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF should assi...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21427

2.3.1 wouldn't have this behaviour change, and we marked this API as experimental. So, on the other hand, that probably gives us more time to make it clear that it is discouraged in production and that there might be some behaviour changes. Actually, it hasn't been around for long compared with the other APIs we have ...

> it turns runnable code into failure, and the old behavior is kind of self-consistent(by-position match). it's not like turning failures into runnable or fix a correctness bug.

It still sounds like we are treating this API as an old, stable API. It doesn't completely replace the self-consistent way. This PR partially fixes the behaviour so that it makes more sense, at the cost of behaviour changes in some corner cases that are quite unlikely and make no sense (IMHO). We should be relatively less conservative with new and experimental APIs, so that we can make them stable and coherent as soon as possible, before we remove the experimental note ..

The only special reason I see is that it's not a correctness bug but it changes the existing behaviour (which I actually don't completely agree with, but at least I get what you mean). But then what can we do for experimental APIs specifically .. ?
[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21370#discussion_r194641579

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -3209,6 +3222,19 @@ class Dataset[T] private[sql](
         }
       }

    +  private[sql] def getRowsToPython(
    +      _numRows: Int,
    +      truncate: Int,
    +      vertical: Boolean): Array[Any] = {
    +    EvaluatePython.registerPicklers()
    +    val numRows = _numRows.max(0).min(Int.MaxValue - 1)
    +    val rows = getRows(numRows, truncate, vertical).map(_.toArray).toArray
    +    val toJava: (Any) => Any = EvaluatePython.toJava(_, ArrayType(ArrayType(StringType)))
    +    val iter: Iterator[Array[Byte]] = new SerDeUtil.AutoBatchedPickler(
    +      rows.iterator.map(toJava))
    +    PythonRDD.serveIterator(iter, "serve-GetRows")
    --- End diff --

I think we return `Array[Any]` for `PythonRDD.serveIterator` too.

https://github.com/apache/spark/blob/628c7b517969c4a7ccb26ea67ab3dd61266073ca/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L400

Did I maybe miss something?
[GitHub] spark issue #21496: docs: fix typo
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21496 **[Test build #91694 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91694/testReport)** for PR 21496 at commit [`fea9616`](https://github.com/apache/spark/commit/fea9616fb35e3fcf886073767da040aef3a408e0).
[GitHub] spark issue #21496: docs: fix typo
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21496 retest this please
[GitHub] spark issue #21276: [SPARK-24216][SQL] Spark TypedAggregateExpression uses g...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21276 I think this fix is nice to have. cc @cloud-fan
[GitHub] spark pull request #20313: [SPARK-22974][ML] Attach attributes to output col...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20313#discussion_r194636521

    --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala ---
    @@ -264,7 +265,9 @@ class CountVectorizerModel(
           Vectors.sparse(dictBr.value.size, effectiveCounts)
         }
    -    dataset.withColumn($(outputCol), vectorizer(col($(inputCol
    +    val attrs = vocabulary.map(_ => new NumericAttribute).asInstanceOf[Array[Attribute]]
    --- End diff --

Sorry for the late reply. Although I agree that these attributes don't provide much information, I'm wondering whether we can generate them lazily. At this point, I don't think we know whether a following transformer will need them or not.
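The "generate it lazily" idea can be illustrated with Scala's `lazy val`, which defers the allocation until the first access and then caches the result. This is a sketch under assumptions: `VocabularyModel` and `attributeNames` are hypothetical stand-ins, the attributes are plain strings rather than ML `Attribute` objects, and the thread does not settle whether column metadata in Spark can actually be deferred this way.

```scala
object LazyAttributesSketch {
  // Hypothetical model; attributeNames stands in for the per-term attributes.
  final class VocabularyModel(val vocabulary: Array[String]) {
    private var builds = 0

    // Number of times the attribute array has been constructed.
    def buildCount: Int = builds

    // Built on first access only, then reused on later accesses.
    lazy val attributeNames: Array[String] = {
      builds += 1
      vocabulary.indices.map(i => s"term_$i").toArray
    }
  }

  def main(args: Array[String]): Unit = {
    val model = new VocabularyModel(Array("spark", "sql", "ml"))
    println(model.buildCount) // prints 0: nothing built yet
    model.attributeNames
    model.attributeNames
    println(model.buildCount) // prints 1: built once, reused afterwards
  }
}
```

For a large vocabulary this avoids allocating one object per term when no downstream stage ever reads the attributes.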
[GitHub] spark issue #20313: [SPARK-22974][ML] Attach attributes to output column of ...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20313 cc @dbtsai too.
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21501 **[Test build #91693 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91693/testReport)** for PR 21501 at commit [`bbd167b`](https://github.com/apache/spark/commit/bbd167b79a073f6eca67b57012d936d692f7d7c8).
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Merged build finished. Test PASSed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3927/ Test PASSed.
[GitHub] spark issue #21498: [SPARK-24410][SQL][Core] Optimization for Union outputPa...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21498 @mgaido91 WDYT? Does the benchmark make sense to you?
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21501 retest this please.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Merged build finished. Test PASSed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21535 **[Test build #91692 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91692/testReport)** for PR 21535 at commit [`b8c7238`](https://github.com/apache/spark/commit/b8c7238aec9d6d79b8528eb3f47c0de7a48d23e8).
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/37/ Test PASSed.
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91685/ Test FAILed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21535 retest this please.
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Merged build finished. Test FAILed.
[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19528 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91690/ Test FAILed.
[GitHub] spark issue #21496: docs: fix typo
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21496 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91688/ Test FAILed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Merged build finished. Test FAILed.
[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19528 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91687/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21496: docs: fix typo
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21496 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21496: docs: fix typo
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21496 **[Test build #91688 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91688/testReport)** for PR 21496 at commit [`fea9616`](https://github.com/apache/spark/commit/fea9616fb35e3fcf886073767da040aef3a408e0). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21501 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91689/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21501 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 @mallman Sorry for the delay. Super busy during the Spark summit. Will continue the code review in the next few days. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21535 **[Test build #91687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91687/testReport)** for PR 21535 at commit [`b8c7238`](https://github.com/apache/spark/commit/b8c7238aec9d6d79b8528eb3f47c0de7a48d23e8). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait CodegenInterpretedTest extends QueryTest with SharedSQLContext ` * `class DataFrameSuite extends CodegenInterpretedTest ` * `class DatasetSuite extends CodegenInterpretedTest ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21319 **[Test build #91685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91685/testReport)** for PR 21319 at commit [`91fdedc`](https://github.com/apache/spark/commit/91fdedc4d91a7abde5f6b64dbfcf354b67d89a48). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21501 **[Test build #91689 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91689/testReport)** for PR 21501 at commit [`bbd167b`](https://github.com/apache/spark/commit/bbd167b79a073f6eca67b57012d936d692f7d7c8). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21536 **[Test build #91691 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91691/testReport)** for PR 21536 at commit [`2ea2181`](https://github.com/apache/spark/commit/2ea2181697038dbd2109f2daeb347d98724b93af). * This patch **fails Java style tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91691/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21510: [SPARK-24490][WebUI] Use WebUI.addStaticHandler i...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21510#discussion_r194632125

    --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
    @@ -88,41 +90,41 @@ private[spark] abstract class WebUI(
         handlers += renderHandler
       }

    -  /** Attach a handler to this UI. */
    +  /** Attaches a handler to this UI. */
       def attachHandler(handler: ServletContextHandler) {
         handlers += handler
         serverInfo.foreach(_.addHandler(handler))
       }

    -  /** Detach a handler from this UI. */
    +  /** Detaches a handler from this UI. */
       def detachHandler(handler: ServletContextHandler) {
         handlers -= handler
         serverInfo.foreach(_.removeHandler(handler))
       }

       /**
    -   * Add a handler for static content.
    +   * Adds a handler for static content.
        *
        * @param resourceBase Root of where to find resources to serve.
        * @param path Path in UI where to mount the resources.
        */
    -  def addStaticHandler(resourceBase: String, path: String): Unit = {
    +  def addStaticHandler(resourceBase: String, path: String = "/static"): Unit = {
         attachHandler(JettyUtils.createStaticHandler(resourceBase, path))
       }

       /**
    -   * Remove a static content handler.
    +   * Removes a static content handler.
        *
        * @param path Path in UI to unmount.
        */
       def removeStaticHandler(path: String): Unit = {
    --- End diff --

    OK...since @vanzin requested I'm gonna make all the other changes while at it :)
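The functional change in this hunk is the default argument on `addStaticHandler`, which lets callers that mount at `/static` omit the second parameter. A minimal sketch of that effect follows. `StaticMounts` is a hypothetical stand-in for `WebUI` that only records mounts; it does not touch Jetty and is not code from the PR.

```scala
import scala.collection.mutable

// Hypothetical stand-in for WebUI: records path -> resourceBase instead of
// attaching real Jetty handlers.
class StaticMounts {
  private val mounts = mutable.LinkedHashMap.empty[String, String]

  // Same signature shape as the patched method: `path` defaults to "/static".
  def addStaticHandler(resourceBase: String, path: String = "/static"): Unit = {
    mounts(path) = resourceBase
  }

  def removeStaticHandler(path: String): Unit = {
    mounts.remove(path)
  }

  def mounted: Map[String, String] = mounts.toMap
}
```

With this shape, `ui.addStaticHandler("some/resource/dir")` and `ui.addStaticHandler("some/resource/dir", "/static")` are equivalent, while a call such as `ui.addStaticHandler("docs", "/docs")` still overrides the default.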
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/36/ Test PASSed.
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Merged build finished. Test PASSed.
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21536 **[Test build #91691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91691/testReport)** for PR 21536 at commit [`2ea2181`](https://github.com/apache/spark/commit/2ea2181697038dbd2109f2daeb347d98724b93af).
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Merged build finished. Test PASSed.
[GitHub] spark issue #21536: [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21536 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3926/ Test PASSed.
[GitHub] spark pull request #21536: [MINOR][CORE][TEST] Remove unnecessary sort in Un...
GitHub user jiangxb1987 opened a pull request:

    https://github.com/apache/spark/pull/21536

    [MINOR][CORE][TEST] Remove unnecessary sort in UnsafeInMemorySorterSuite

    ## What changes were proposed in this pull request?

    We don't require a specific ordering of the input data, so the sort action is unnecessary and misleading.

    ## How was this patch tested?

    Existing test suite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark sorterSuite

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21536.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21536

commit 2ea2181697038dbd2109f2daeb347d98724b93af
Author: Xingbo Jiang
Date: 2018-06-12T06:50:42Z

    remove unnecessary sort
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21357 Merged build finished. Test FAILed.
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21357 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91686/ Test FAILed.
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21357 **[Test build #91686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91686/testReport)** for PR 21357 at commit [`8ad2a3f`](https://github.com/apache/spark/commit/8ad2a3f8112662a865ee1dbaf7c5269197c3ee4f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #21276: [SPARK-24216][SQL] Spark TypedAggregateExpression...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21276#discussion_r194630048

    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2715,6 +2716,62 @@ private[spark] object Utils extends Logging {
         HashCodes.fromBytes(secretBytes).toString()
       }

    +  /**
    +   * Safer than Class obj's getSimpleName which may throw Malformed class name error in scala.
    +   * This method mimicks scalatest's getSimpleNameOfAnObjectsClass.
    +   */
    +  def getSimpleName(cls: Class[_]): String = {
    +    try {
    +      return cls.getSimpleName
    +    } catch {
    +      case err: InternalError => return stripDollars(stripPackages(cls.getName))
    +    }
    +  }
    +
    +  /**
    +   * Remove the packages from full qualified class name
    +   */
    +  private def stripPackages(fullyQualifiedName: String): String = {
    +    fullyQualifiedName.split("\\.").takeRight(1)(0)
    +  }
    +
    +  /**
    +   * Remove trailing dollar signs from qualified class name,
    +   * and return the trailing part after the last dollar sign in the middle
    +   */
    +  private def stripDollars(s: String): String = {
    +    val lastDollarIndex = s.lastIndexOf('$')
    +    if (lastDollarIndex < s.length - 1) {
    +      // The last char is not a dollar sign
    +      if (lastDollarIndex == -1 || !s.contains("$iw")) {
    +        // The name does not have dollar sign or is not an intepreter
    +        // generated class, so we should return the full string
    +        s
    +      } else {
    +        // The class name is intepreter generated,
    +        // return the part after the last dollar sign
    +        // This is the same behavior as getClass.getSimpleName
    +        s.substring(lastDollarIndex + 1)
    +      }
    +    }
    +    else {
    --- End diff --

    style:
    ```scala
    if (...) {
    } else {
    }
    ```
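For reference, here is the quoted helper with the requested `} else {` brace style applied. This is a standalone sketch, not the patch itself. `SimpleNameSketch` is a hypothetical wrapper object, the explicit `return`s are dropped, and the body of the final `else` branch is not visible in the quoted diff, so the version here (strip trailing `$` characters and retry) is an assumption.

```scala
// Hypothetical wrapper object; the real helpers live in org.apache.spark.util.Utils.
object SimpleNameSketch {

  /** Falls back to manual parsing when Class.getSimpleName throws InternalError. */
  def getSimpleName(cls: Class[_]): String = {
    try {
      cls.getSimpleName
    } catch {
      case _: InternalError => stripDollars(stripPackages(cls.getName))
    }
  }

  /** Drops the package prefix from a fully qualified class name. */
  def stripPackages(fullyQualifiedName: String): String = {
    fullyQualifiedName.split("\\.").takeRight(1)(0)
  }

  def stripDollars(s: String): String = {
    val lastDollarIndex = s.lastIndexOf('$')
    if (lastDollarIndex < s.length - 1) {
      // The last char is not a dollar sign.
      if (lastDollarIndex == -1 || !s.contains("$iw")) {
        // No dollar sign, or not an interpreter-generated class: keep the full string.
        s
      } else {
        // Interpreter-generated class: keep the part after the last dollar sign.
        s.substring(lastDollarIndex + 1)
      }
    } else {
      // ASSUMPTION: this branch is cut off in the quoted diff. Here we drop the
      // trailing dollar signs and apply the same rules to what remains.
      val trimmed = s.reverse.dropWhile(_ == '$').reverse
      if (trimmed.isEmpty) s else stripDollars(trimmed)
    }
  }
}
```

The only point the review comment makes is the placement of `else` on the same line as the closing brace; the branch logic above is otherwise copied from the visible part of the diff.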
[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21370#discussion_r194629747

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -3209,6 +3222,19 @@ class Dataset[T] private[sql](
         }
       }

    +  private[sql] def getRowsToPython(
    +      _numRows: Int,
    +      truncate: Int,
    +      vertical: Boolean): Array[Any] = {
    +    EvaluatePython.registerPicklers()
    +    val numRows = _numRows.max(0).min(Int.MaxValue - 1)
    +    val rows = getRows(numRows, truncate, vertical).map(_.toArray).toArray
    +    val toJava: (Any) => Any = EvaluatePython.toJava(_, ArrayType(ArrayType(StringType)))
    +    val iter: Iterator[Array[Byte]] = new SerDeUtil.AutoBatchedPickler(
    +      rows.iterator.map(toJava))
    +    PythonRDD.serveIterator(iter, "serve-GetRows")
    --- End diff --

    `PythonRDD.serveIterator(iter, "serve-GetRows")` returns `Int`, but the return type of `getRowsToPython` is `Array[Any]`. How does it work? cc @xuanyuanking @HyukjinKwon
[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19528 **[Test build #91690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91690/consoleFull)** for PR 19528 at commit [`76ad8c5`](https://github.com/apache/spark/commit/76ad8c5e62a7233c16399043716139b52ee1c97d).
[GitHub] spark issue #20640: [SPARK-19755][Mesos] Blacklist is always active for Meso...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20640 @IgorBerman any thought on this comment? https://github.com/apache/spark/pull/20640#discussion_r191272487
[GitHub] spark pull request #21515: [SPARK-24372][build] Add scripts to help with pre...
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21515#discussion_r194625766

    --- Diff: dev/create-release/vote.tmpl ---
    @@ -0,0 +1,64 @@
    +Please vote on releasing the following candidate as Apache Spark version {version}.
    +
    +The vote is open until {deadline} and passes if a majority of at least 3 +1 PMC votes are cast.
    --- End diff --

    nit: personally I find `3 PMC +1 votes` more clear
[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19528 Jenkins test this please
[GitHub] spark pull request #21515: [SPARK-24372][build] Add scripts to help with pre...
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21515#discussion_r194625981

    --- Diff: dev/.rat-excludes ---
    @@ -106,3 +106,4 @@ spark-warehouse
     structured-streaming/*
     kafka-source-initial-offset-version-2.1.0.bin
     kafka-source-initial-offset-future-version.bin
    +vote.tmpl
    --- End diff --

    even if rat doesn't check, isn't vote.tmpl packaged into the source release this way?
[GitHub] spark pull request #21515: [SPARK-24372][build] Add scripts to help with pre...
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21515#discussion_r194626614

    --- Diff: dev/create-release/spark-rm/Dockerfile ---
    @@ -0,0 +1,89 @@
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements. See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License. You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +# Image for building Spark releases. Based on Ubuntu 16.04.
    +#
    +# Includes:
    +# * Java 8
    +# * Ivy
    +# * Python/PyPandoc (2.7.12/3.5.2)
    +# * R-base/R-base-dev (3.3.2+)
    +# * Ruby 2.3 build utilities
    +
    +FROM ubuntu:16.04
    +
    +# These arguments are just for reuse and not really meant to be customized.
    +ARG APT_INSTALL="apt-get install --no-install-recommends -y"
    +
    +# Install extra needed repos and refresh.
    +# - CRAN repo
    +# - Ruby repo (for doc generation)
    +RUN echo 'deb http://cran.cnr.Berkeley.edu/bin/linux/ubuntu xenial/' >> /etc/apt/sources.list && \
    +  gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9 && \
    +  gpg -a --export E084DAB9 | apt-key add - && \
    +  apt-get clean && \
    +  rm -rf /var/lib/apt/lists/* && \
    +  apt-get clean && \
    +  apt-get update && \
    +  $APT_INSTALL software-properties-common && \
    +  apt-add-repository -y ppa:brightbox/ruby-ng && \
    +  apt-get update
    +
    +# Install openjdk 8.
    +RUN $APT_INSTALL openjdk-8-jdk && \
    +  update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
    +
    +# Install build / source control tools
    +RUN $APT_INSTALL curl wget git maven ivy subversion make gcc libffi-dev \
    +    pandoc pandoc-citeproc libssl-dev libcurl4-openssl-dev libxml2-dev && \
    +  ln -s -T /usr/share/java/ivy.jar /usr/share/ant/lib/ivy.jar && \
    +  curl -sL https://deb.nodesource.com/setup_4.x | bash && \
    +  $APT_INSTALL nodejs
    +
    +# Install needed python packages. Use pip for installing packages (for consistency).
    +ARG BASE_PIP_PKGS="setuptools wheel virtualenv"
    +ARG PIP_PKGS="pyopenssl pypandoc numpy pygments sphinx"
    +
    +RUN $APT_INSTALL libpython2.7-dev libpython3-dev python-pip python3-pip && \
    +  pip install $BASE_PIP_PKGS && \
    +  pip install $PIP_PKGS && \
    +  cd && \
    +  virtualenv -p python3 p35 && \
    +  . p35/bin/activate && \
    +  pip install $BASE_PIP_PKGS && \
    +  pip install $PIP_PKGS
    +
    +# Install R packages and dependencies used when building.
    +# R depends on pandoc*, libssl (which are installed above).
    +RUN $APT_INSTALL r-base r-base-dev && \
    +  $APT_INSTALL texlive-latex-base texlive texlive-fonts-extra texinfo qpdf && \
    +  Rscript -e "install.packages(c('curl', 'xml2', 'httr', 'devtools', 'testthat', 'knitr', 'rmarkdown', 'roxygen2', 'e1071', 'survival'), repos='http://cran.us.r-project.org/')" && \
    +  Rscript -e "devtools::install_github('jimhester/lintr')"
    +
    +# Install tools needed to build the documentation.
    +RUN $APT_INSTALL ruby2.3 ruby2.3-dev && \
    +  gem install jekyll --no-rdoc --no-ri && \
    +  gem install jekyll-redirect-from && \
    +  gem install pygments.rb
    +
    +WORKDIR /opt/spark-rm/output
    +
    +ARG UID
    +RUN useradd -m -s /bin/bash -p spark-rm -u $UID spark-rm
    --- End diff --

    does that mean the do-release script is run as the user "spark-rm"? I thought it's generally best practice to gpg sign as yourself?
[GitHub] spark pull request #21515: [SPARK-24372][build] Add scripts to help with pre...
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21515#discussion_r194626379

    --- Diff: dev/.rat-excludes ---
    @@ -106,3 +106,4 @@ spark-warehouse
     structured-streaming/*
     kafka-source-initial-offset-version-2.1.0.bin
     kafka-source-initial-offset-future-version.bin
    +vote.tmpl
    --- End diff --

    for example, `.git` is removed from release here https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh#L157
[GitHub] spark pull request #20260: [SPARK-23039][SQL] Finish TODO work in alter tabl...
Github user xubo245 closed the pull request at: https://github.com/apache/spark/pull/20260
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21533 cc @jiangxb1987 @jerryshao
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21533 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91682/ Test PASSed.
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21533 Merged build finished. Test PASSed.
[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21501#discussion_r194626092

    --- Diff: python/pyspark/ml/feature.py ---
    @@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           typeConverter=TypeConverters.toListString)
         caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
    +    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
    +                   "is true", typeConverter=TypeConverters.toString)
    --- End diff --

    Also, don't forget to mention that the default value is the JVM default locale.
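Why the locale (and its JVM default, `Locale.getDefault`) matters for a case-insensitive comparison can be shown with plain JVM string lowering. This is an illustrative sketch only: `LocaleLoweringSketch`, `lowered`, and `isStopWord` are hypothetical names, and nothing here is claimed to be how `StopWordsRemover` is implemented.

```scala
import java.util.Locale

// Hypothetical helper showing locale-sensitive lowercasing on the JVM.
object LocaleLoweringSketch {

  /** Lowercases `word` under an explicit locale instead of the JVM default. */
  def lowered(word: String, locale: Locale): String = word.toLowerCase(locale)

  /** Case-insensitive membership test that lowercases both sides with `locale`. */
  def isStopWord(token: String, stopWords: Set[String], locale: Locale): Boolean = {
    stopWords.map(_.toLowerCase(locale)).contains(token.toLowerCase(locale))
  }
}
```

Under `Locale.ROOT`, `lowered("TITLE", Locale.ROOT)` yields `"title"`. Under Turkish (`Locale.forLanguageTag("tr")`), the capital `I` lowercases to the dotless `ı` (U+0131), giving `"tıtle"`, so `isStopWord("IN", Set("in"), ...)` is true under `Locale.ROOT` and false under the Turkish locale. That difference is why documenting both the accepted values and the default is useful to users.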
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21533 **[Test build #91682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91682/testReport)** for PR 21533 at commit [`f922fd8`](https://github.com/apache/spark/commit/f922fd8c995164cada4a8b72e92c369a827def16). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21501#discussion_r194625679

    --- Diff: python/pyspark/ml/feature.py ---
    @@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           typeConverter=TypeConverters.toListString)
         caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
    +    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
    +                   "is true", typeConverter=TypeConverters.toString)
    --- End diff --

    I'm not sure whether users are familiar with the available locale setting values here.
[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...
Github user dongjinleekr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21501#discussion_r194623958

    --- Diff: python/pyspark/ml/feature.py ---
    @@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           typeConverter=TypeConverters.toListString)
         caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
    +    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
    +                   "is true", typeConverter=TypeConverters.toString)
    --- End diff --

    Thank you for the comment but... is that necessary?
[GitHub] spark issue #21501: [SPARK-15064][ML] Locale support in StopWordsRemover
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21501 **[Test build #91689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91689/testReport)** for PR 21501 at commit [`bbd167b`](https://github.com/apache/spark/commit/bbd167b79a073f6eca67b57012d936d692f7d7c8).