[GitHub] spark issue #16297: [SPARK-18888] partitionBy in DataStreamWriter in Python ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16297 **[Test build #70205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70205/testReport)** for PR 16297 at commit [`e10627f`](https://github.com/apache/spark/commit/e10627f740c9ec48c65332e6c27d9981b19233de). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70206/testReport)** for PR 16272 at commit [`480cf22`](https://github.com/apache/spark/commit/480cf229ccfef32d0b08bd7b323769545d8eb670).
[GitHub] spark issue #16297: [SPARK-18888] partitionBy in DataStreamWriter in Python ...
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16297 cc @tdas
[GitHub] spark pull request #16297: [SPARK-18888] partitionBy in DataStreamWriter in ...
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/16297 [SPARK-18888] partitionBy in DataStreamWriter in Python throws `_to_seq` not defined ## What changes were proposed in this pull request? `_to_seq` wasn't imported. ## How was this patch tested? Added partitionBy to the existing write path unit test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/brkyvz/spark SPARK-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16297.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16297 commit cedaafdff23ad99e4be06077c5b5cc3bee6ebf07 Author: Burak Yavuz Date: 2016-12-15T00:18:27Z test possible fix commit aa20fd0da513794d3563d3d8a6b566e50fae2d57 Author: Burak Yavuz Date: 2016-12-15T20:02:29Z Merge branch 'master' of github.com:apache/spark commit 856d9ecd535a6b24e1709ecf09d923945137ca5d Author: Burak Yavuz Date: 2016-12-15T20:13:26Z fix partitionBy
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/16281 I agree with @srowen, forking should be the last resort.
[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70204 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70204/testReport)** for PR 16272 at commit [`5bc8e70`](https://github.com/apache/spark/commit/5bc8e707fd0eeb98ec2f1a9c8bb455b7e624044d).
[GitHub] spark pull request #16272: [SPARK-18850][SS]Make StreamExecution serializabl...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16272#discussion_r92690628 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryStatusAndProgressSuite.scala --- @@ -137,12 +146,13 @@ object StreamingQueryStatusAndProgressSuite { name = "myName", timestamp = "2016-12-05T20:54:20.827Z", batchId = 2L, -durationMs = Map("total" -> 0L).mapValues(long2Long).asJava, -eventTime = Map( - "max" -> "2016-12-05T20:54:20.827Z", - "min" -> "2016-12-05T20:54:20.827Z", - "avg" -> "2016-12-05T20:54:20.827Z", - "watermark" -> "2016-12-05T20:54:20.827Z").asJava, +durationMs = Collections.singletonMap("total", 0L), --- End diff -- I changed these two maps to pure Java maps, as the map created by `asJava` is not serializable.
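A quick way to confirm that a "pure Java" map survives Java serialization is to push it through an `ObjectOutputStream`. The following is only a sketch in plain Java, not the test added in the PR; the class name `SerializableMapCheck` and the helper `isJavaSerializable` are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark.
final class SerializableMapCheck {

    /** Returns true if Java serialization accepts the whole object graph. */
    static boolean isJavaSerializable(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            // Some object in the graph does not implement java.io.Serializable.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same shape as the maps in the diff above.
        Map<String, Long> durationMs = Collections.singletonMap("total", 0L);
        Map<String, String> eventTime = new HashMap<>();
        eventTime.put("max", "2016-12-05T20:54:20.827Z");
        eventTime.put("watermark", "2016-12-05T20:54:20.827Z");

        System.out.println(isJavaSerializable(durationMs));
        System.out.println(isJavaSerializable(eventTime));
        // A bare Object is not Serializable, so this check fails.
        System.out.println(isJavaSerializable(new Object()));
    }
}
```

The same helper could be pointed at any wrapper map to see whether it is the wrapper or one of its values that breaks serialization.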
[GitHub] spark issue #16287: [SPARK-18868][FLAKY-TEST] Deflake StreamingQueryListener...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16287 **[Test build #70203 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70203/testReport)** for PR 16287 at commit [`d4bbb45`](https://github.com/apache/spark/commit/d4bbb459ffd48edddfff2967fcc67aefe0b35ce0).
[GitHub] spark pull request #16289: [SPARK-18870] Disallowed Distinct Aggregations on...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16289
[GitHub] spark pull request #16272: [SPARK-18850][SS]Make StreamExecution serializabl...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16272#discussion_r92687106 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala --- @@ -439,6 +440,37 @@ class StreamingQuerySuite extends StreamTest with BeforeAndAfter with Logging { } } + test("progress classes should be Serializable") { --- End diff -- This should be in the StreamingQueryStatusAndProgressSuite.
[GitHub] spark pull request #16272: [SPARK-18850][SS]Make StreamExecution serializabl...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16272#discussion_r92687025 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -300,6 +302,48 @@ class StreamSuite extends StreamTest { q.stop() } } + + test("StreamingQuery should be Serializable but cannot be used in executors") { --- End diff -- Why is this in StreamSuite? It would be better in StreamingQuerySuite, as it's related to StreamingQuery.
[GitHub] spark issue #16287: [SPARK-18868][FLAKY-TEST] Deflake StreamingQueryListener...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16287 **[Test build #70202 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70202/testReport)** for PR 16287 at commit [`32d1ca8`](https://github.com/apache/spark/commit/32d1ca8a7ed3c2e548b6c7fdceeddaa754c0b477).
[GitHub] spark issue #16251: [SPARK-18826][SS]Add 'latestFirst' option to FileStreamS...
Github user tdas commented on the issue: https://github.com/apache/spark/pull/16251 LGTM, pending tests.
[GitHub] spark issue #16289: [SPARK-18870] Disallowed Distinct Aggregations on Stream...
Github user tdas commented on the issue: https://github.com/apache/spark/pull/16289 Merging to 2.1 and master
[GitHub] spark issue #16289: [SPARK-18870] Disallowed Distinct Aggregations on Stream...
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/16289 LGTM
[GitHub] spark issue #16291: [SPARK-18838][WIP] Use separate executor service for eac...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/16291 Hmm... I took a quick look and I'm not sure I understand exactly what's going on. It seems you're wrapping each listener with a `ListenerEvenProcessor` (note the typo), and each processor has its own thread pool for processing events. If that's the case, that sounds wrong. Each listener should process events serially; otherwise you risk getting into funny situations like a task end event being processed before the task start event for the same task. I think this would benefit from a proper explanation of the changes being proposed, instead of a bunch of code. How will the listener bus work with the changes? Where will we have thread pools, where will we have message queues? Will each listener get its own dedicated thread, or will there be a limit? What kind of controls will there be to avoid memory pressure? Would it be worth it to allow certain listeners to have dedicated threads while others share the same one like the current approach? That kind of thing. Can you write something up and post it to the bug?
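To illustrate the ordering concern, here is a minimal sketch of one way to give each listener its own thread while still delivering events to that listener serially: a single-threaded executor per listener. This is not Spark's listener bus and not the design proposed in the PR; `SerialListenerBus` and its methods are hypothetical names, and the sketch assumes listeners are registered before any event is posted and that `post` is called from one thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch, not Spark's LiveListenerBus.
final class SerialListenerBus {
    private final List<Consumer<String>> listeners = new ArrayList<>();
    // One single-threaded executor per listener: events for a given listener
    // run one at a time, in the order they were posted.
    private final List<ExecutorService> executors = new ArrayList<>();

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
        executors.add(Executors.newSingleThreadExecutor());
    }

    void post(String event) {
        for (int i = 0; i < listeners.size(); i++) {
            Consumer<String> listener = listeners.get(i);
            executors.get(i).execute(() -> listener.accept(event));
        }
    }

    /** Stops accepting events and waits for queued ones to be delivered. */
    void shutdownAndWait() {
        try {
            for (ExecutorService e : executors) {
                e.shutdown();
            }
            for (ExecutorService e : executors) {
                e.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
    }

    /** Posts a start/end pair and returns the order in which the listener saw them. */
    static List<String> demo() {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        SerialListenerBus bus = new SerialListenerBus();
        bus.addListener(seen::add);
        bus.post("taskStart(1)");
        bus.post("taskEnd(1)");
        bus.shutdownAndWait();
        return seen;
    }
}
```

Note that `Executors.newSingleThreadExecutor()` queues work in an unbounded queue, so this sketch does nothing about the memory-pressure question raised above; a slow listener would simply accumulate events.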
[GitHub] spark issue #16285: [SPARK-18867] [SQL] Throw cause if IsolatedClientLoad ca...
Github user jojochuang commented on the issue: https://github.com/apache/spark/pull/16285 Good point @rxin. That seems possible. https://docs.oracle.com/javase/7/docs/api/java/lang/reflect/InvocationTargetException.html#getCause()
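For context, the linked Javadoc states that `InvocationTargetException.getCause()` returns the thrown target exception, which may be null. A minimal sketch of unwrapping with a null guard follows; it is not the code in the PR, and `UnwrapCause`, `failing`, and `invokeAndUnwrap` are hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical illustration, unrelated to IsolatedClientLoader's actual code.
final class UnwrapCause {

    static void failing() {
        throw new IllegalStateException("boom");
    }

    /** Invokes a static no-arg method and returns what it threw, or null on success. */
    static Throwable invokeAndUnwrap(Method method) {
        try {
            method.invoke(null);
            return null;
        } catch (InvocationTargetException e) {
            // getCause() may be null, so fall back to the wrapper itself.
            Throwable cause = e.getCause();
            return cause != null ? cause : e;
        } catch (IllegalAccessException e) {
            return e;
        }
    }

    static Throwable demo() {
        try {
            return invokeAndUnwrap(UnwrapCause.class.getDeclaredMethod("failing"));
        } catch (NoSuchMethodException e) {
            throw new AssertionError(e);
        }
    }
}
```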
[GitHub] spark issue #16142: [SPARK-18716][CORE] Restrict the disk usage of spark eve...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/16142 I'm not such a big fan of this feature, but mostly I'm not a big fan of the current implementation. For the feature, it feels like it's trying to make the SHS more like a "log management system" than a history server. There were already people who were concerned about this when the existing cleaner functionality was added. But I'll entertain the thought, even though you can probably get pretty close to this by using time-based deletion with a shorter max age, coupled with log compression (both features that already exist). For the implementation, you cannot delete things just based on size. You need to account for time too; you have to delete older logs first, otherwise you risk deleting the logs for just finished applications instead of a large log that's been sitting there for months. It's also a reactive change; you're already using more space than you want to, and the history server will just bring that down, eventually. Your change will also bombard the NameNode with requests on every scan, to get the size of each log. At this point I'm not so convinced of the usefulness of the feature, and implementing it correctly will be a larger change than you have here.
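The "delete older logs first" point can be sketched as a selection step that sorts by modification time and removes the oldest entries until the total fits under the cap. This is only an illustration of that ordering, not the history server's cleaner; `LogRetention`, `LogInfo`, and `selectForDeletion` are hypothetical names, and the sizes are assumed to be already known rather than fetched from the NameNode on every scan.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the Spark History Server implementation.
final class LogRetention {

    static final class LogInfo {
        final String name;
        final long lastModifiedMs;
        final long sizeBytes;

        LogInfo(String name, long lastModifiedMs, long sizeBytes) {
            this.name = name;
            this.lastModifiedMs = lastModifiedMs;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Picks logs to delete, oldest first, until the remaining total is at most maxBytes. */
    static List<String> selectForDeletion(List<LogInfo> logs, long maxBytes) {
        List<LogInfo> byAge = new ArrayList<>(logs);
        byAge.sort(Comparator.comparingLong((LogInfo l) -> l.lastModifiedMs));

        long total = 0;
        for (LogInfo l : logs) {
            total += l.sizeBytes;
        }

        List<String> toDelete = new ArrayList<>();
        for (LogInfo l : byAge) {
            if (total <= maxBytes) {
                break;
            }
            toDelete.add(l.name);
            total -= l.sizeBytes;
        }
        return toDelete;
    }
}
```

With logs of 100, 10, and 50 bytes (oldest to newest) and a cap of 80 bytes, only the oldest log is selected, so the log of the most recently finished application is kept.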
[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70201/testReport)** for PR 16272 at commit [`8993018`](https://github.com/apache/spark/commit/8993018547e4b43ec064cd2acda5c10932dc4616).
[GitHub] spark issue #16228: [WIP] [SPARK-17076] [SQL] Cardinality estimation for joi...
Github user Tagar commented on the issue: https://github.com/apache/spark/pull/16228 @wzhfy, I think overestimating cardinality could be as bad as underestimating. For example, the optimizer could prematurely switch to SortMergeJoin when it could have used a broadcast hash join. But I agree, this PR is a great improvement over the current cardinality estimates.
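A toy model of why an overestimate hurts: if the planner broadcasts only when the estimated size is under a threshold, inflating the row count flips the choice. The sketch below is a deliberate simplification and not Spark's planner code; `JoinChoice` and `chooseJoin` are hypothetical names, and the 10 MB constant mirrors the default of `spark.sql.autoBroadcastJoinThreshold`.

```java
// Hypothetical, simplified model of size-based join selection.
final class JoinChoice {
    // Mirrors the default of spark.sql.autoBroadcastJoinThreshold (10 MB).
    static final long BROADCAST_THRESHOLD_BYTES = 10L * 1024 * 1024;

    static String chooseJoin(long estimatedRows, long avgRowBytes) {
        long estimatedBytes = estimatedRows * avgRowBytes;
        return estimatedBytes <= BROADCAST_THRESHOLD_BYTES
            ? "BroadcastHashJoin"
            : "SortMergeJoin";
    }

    public static void main(String[] args) {
        // True cardinality: 100,000 rows of about 50 bytes, roughly 5 MB.
        System.out.println(chooseJoin(100_000L, 50L));
        // A 10x overestimate pushes the same relation over the threshold.
        System.out.println(chooseJoin(1_000_000L, 50L));
    }
}
```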
[GitHub] spark issue #16251: [SPARK-18826][SS]Add 'latestFirst' option to FileStreamS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16251 **[Test build #70200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70200/testReport)** for PR 16251 at commit [`2847738`](https://github.com/apache/spark/commit/2847738d9e57e8b16003c2f2520572f5e76dbf2f).
[GitHub] spark issue #16290: [SPARK-18817] [SPARKR] [SQL] Set default warehouse dir t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16290 **[Test build #70199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70199/testReport)** for PR 16290 at commit [`1d0d1d2`](https://github.com/apache/spark/commit/1d0d1d219f392721e9be73e21752100db0ce065f).
[GitHub] spark issue #16290: [SPARK-18817] [SPARKR] [SQL] Set default warehouse dir t...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16290 retest this please
[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70198/testReport)** for PR 16272 at commit [`4469b0e`](https://github.com/apache/spark/commit/4469b0e335ed913cf66e490cfcf5f23932a07af1). * This patch **fails to build**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class AlterTableChangeColumnCommand(` * `case class EventTimeStats(var max: Long, var min: Long, var sum: Long, var count: Long) ` * `class EventTimeStatsAccum(protected var currentStats: EventTimeStats = EventTimeStats.zero)`
[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16272 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70198/ Test FAILed.
[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16272 Merged build finished. Test FAILed.
[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70198/testReport)** for PR 16272 at commit [`4469b0e`](https://github.com/apache/spark/commit/4469b0e335ed913cf66e490cfcf5f23932a07af1).
[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70197/consoleFull)** for PR 16272 at commit [`e6c897e`](https://github.com/apache/spark/commit/e6c897eefcb8368a31b345bab68943492fb63d86).
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92669686 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllReceiversResource.scala --- @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.status.api.v1.streaming + +import java.util.Date +import javax.ws.rs.{GET, Produces} +import javax.ws.rs.core.MediaType + +import org.apache.spark.status.api.v1.streaming.AllReceiversResource._ +import org.apache.spark.streaming.ui.StreamingJobProgressListener + +@Produces(Array(MediaType.APPLICATION_JSON)) +private[v1] class AllReceiversResource(listener: StreamingJobProgressListener) { + + @GET + def receiversList(): Seq[ReceiverInfo] = { +receiverInfoList(listener).sortBy(_.streamId) + } +} + +private[v1] object AllReceiversResource { + + def receiverInfoList(listener: StreamingJobProgressListener): Seq[ReceiverInfo] = { +listener.synchronized { + listener.receivedRecordRateWithBatchTime.map { case (streamId, eventRates) => + +val receiverInfo = listener.receiverInfo(streamId) +val streamName = receiverInfo.map(_.name). 
--- End diff -- nit: `.` goes on the next line
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92668634 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllBatchesResource.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.status.api.v1.streaming + +import java.util.{ArrayList => JArrayList, Arrays => JArrays, Date, List => JList} +import javax.ws.rs.{GET, Produces, QueryParam} +import javax.ws.rs.core.MediaType + +import org.apache.spark.status.api.v1.streaming.AllBatchesResource._ +import org.apache.spark.streaming.ui.StreamingJobProgressListener + +@Produces(Array(MediaType.APPLICATION_JSON)) +private[v1] class AllBatchesResource(listener: StreamingJobProgressListener) { + + @GET + def batchesList(@QueryParam("status") statusParams: JList[BatchStatus]): Seq[BatchInfo] = { +batchInfoList(listener, statusParams).sortBy(- _.batchId) + } +} + +private[v1] object AllBatchesResource { + + def batchInfoList( + listener: StreamingJobProgressListener, + statusParams: JList[BatchStatus] = new JArrayList[BatchStatus]() + ): Seq[BatchInfo] = { + +listener.synchronized { + val statuses = +if (statusParams.isEmpty) JArrays.asList(BatchStatus.values(): _*) --- End diff -- nit: style
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92670326 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/StreamingStatisticsResource.scala --- @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.status.api.v1.streaming + +import java.util.Date +import javax.ws.rs.{GET, Produces} +import javax.ws.rs.core.MediaType + +import org.apache.spark.streaming.ui.StreamingJobProgressListener + +@Produces(Array(MediaType.APPLICATION_JSON)) +private[v1] class StreamingStatisticsResource( +listener: StreamingJobProgressListener) { --- End diff -- nit: fits in previous line
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92668159
--- Diff: project/MimaExcludes.scala ---
@@ -116,7 +116,10 @@ object MimaExcludes {
       ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.startOffset"),
       ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.endOffset"),
       ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.this"),
-      ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.query")
+      ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.query"),
+
+      // [SPARK-18537] Add a REST api to spark streaming
+      ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.streaming.scheduler.StreamingListener.onStreamingStarted")
--- End diff --

There's an "Exclude rules for 2.2.x" section now; this should probably go there instead.
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92669331
--- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllOutputOperationsResource.scala ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.status.api.v1.streaming
+
+import java.util.Date
+import javax.ws.rs.{GET, PathParam, Produces}
+import javax.ws.rs.core.MediaType
+
+import org.apache.spark.status.api.v1.NotFoundException
+import org.apache.spark.status.api.v1.streaming.AllOutputOperationsResource._
+import org.apache.spark.streaming.Time
+import org.apache.spark.streaming.ui.StreamingJobProgressListener
+
+@Produces(Array(MediaType.APPLICATION_JSON))
+private[v1] class AllOutputOperationsResource(listener: StreamingJobProgressListener) {
+
+  @GET
+  def operationsList(@PathParam("batchId") batchId: Long): Seq[OutputOperationInfo] = {
+    outputOperationInfoList(listener, batchId).sortBy(_.outputOpId)
+  }
+}
+
+private[v1] object AllOutputOperationsResource {
+
+  def outputOperationInfoList(
+      listener: StreamingJobProgressListener, batchId: Long): Seq[OutputOperationInfo] = {
--- End diff --

nit: style. when wrapping, break each argument into its own line.
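A minimal sketch of the wrapping style the review asks for, with each argument on its own line. The `StreamingJobProgressListener` trait and the `OutputOperationInfo` case class here are hypothetical stand-ins so the snippet compiles on its own, and the empty method body is a placeholder rather than the PR's implementation. The four-space parameter indentation follows the common Spark Scala convention and is an assumption as far as this thread goes.

```scala
object WrappingStyleSketch {
  // Hypothetical stand-ins for the real Spark types referenced in the diff.
  trait StreamingJobProgressListener
  case class OutputOperationInfo(outputOpId: Int)

  // When the parameter list does not fit on one line, every argument gets
  // its own line instead of sharing a single wrapped line.
  def outputOperationInfoList(
      listener: StreamingJobProgressListener,
      batchId: Long): Seq[OutputOperationInfo] = {
    // Placeholder body; the real method builds the list from the listener.
    Seq.empty
  }
}
```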
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92669743
--- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllReceiversResource.scala ---
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.status.api.v1.streaming
+
+import java.util.Date
+import javax.ws.rs.{GET, Produces}
+import javax.ws.rs.core.MediaType
+
+import org.apache.spark.status.api.v1.streaming.AllReceiversResource._
+import org.apache.spark.streaming.ui.StreamingJobProgressListener
+
+@Produces(Array(MediaType.APPLICATION_JSON))
+private[v1] class AllReceiversResource(listener: StreamingJobProgressListener) {
+
+  @GET
+  def receiversList(): Seq[ReceiverInfo] = {
+    receiverInfoList(listener).sortBy(_.streamId)
+  }
+}
+
+private[v1] object AllReceiversResource {
+
+  def receiverInfoList(listener: StreamingJobProgressListener): Seq[ReceiverInfo] = {
+    listener.synchronized {
+      listener.receivedRecordRateWithBatchTime.map { case (streamId, eventRates) =>
+
+        val receiverInfo = listener.receiverInfo(streamId)
+        val streamName = receiverInfo.map(_.name).
+          orElse(listener.streamName(streamId)).getOrElse(s"Stream-$streamId")
+        val avgEventRate =
+          if (eventRates.isEmpty) None
--- End diff --

nit: style
[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #70196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70196/consoleFull)** for PR 15505 at commit [`38ecc91`](https://github.com/apache/spark/commit/38ecc91595f9f47986c2c472149eb30c36cb8dc6).
[GitHub] spark pull request #15717: [SPARK-17910][SQL] Allow users to update the comm...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15717
[GitHub] spark pull request #14079: [SPARK-8425][CORE] Application Level Blacklisting
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14079
[GitHub] spark issue #16276: [SPARK-18855][CORE] Add RDD flatten function
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16276 Do we really need to add this? For the time we spent we can work on more impactful things ...
[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15717 Merging in master.
[GitHub] spark issue #16291: [SPARK-18838][WIP] Use separate executor service for eac...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16291 This was initially introduced by @kayousterhout
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16281

@nsync you raised an excellent question on test coverage. The kind of bugs we have seen in the past weren't really integration bugs, but bugs in parquet-mr. Technically it should be the job of parquet-mr to verify correctness and catch performance regressions. If we were to introduce a much broader set of regression tests in Spark, then to me it makes even more sense to just move the Parquet code into Spark and fix the issues found there.

Also, I have spent some time understanding the Parquet codec, and I have to say it is pretty powerful and complicated and, as a result, fairly difficult to implement correctly. The Dremel format optimizes for sparse nested data, but is much more difficult to get right than a simpler dense format.

FWIW, the ideal scenario I can think of is to have parquet-mr publish bug-fix versions that don't include new features. That would make update auditing easier and updates lower risk. E.g. parquet-mr 2 adds new features, and 1.x releases are just bug fixes.
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 @dongjoon-hyun I just checked the code changes in `1.2.1.spark2` compared with the official Hive 1.2.1: https://github.com/JoshRosen/hive/commits/release-1.2.1-spark2 Very small changes, right?
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16281 Btw issues are not just performance, but often correctness as well. As the default format, a bug in Parquet is much worse than a bug in say ORC.
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16281

We haven't really added much to Hive though, and as a matter of fact the dependency on Hive is decreasing. Parquet is a much more manageable piece of code to fork.

In the past we have seen fairly critical bugs with almost every upgrade. Coupled with the fact that Parquet cannot always make releases fast enough (yes, it happened in the past: we asked for releases but didn't get them) or test them properly, it has always been very risky to just upgrade a major version of Parquet.

In addition, we already have a forked Parquet reader in Spark that is vectorized (and different from the one in parquet-mr).
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281

Yep. Spark Thrift Server is different, but it's not actively maintained. For example, the default database feature was only recently added. By `Spark Hive` I mean this one:

```
1.2.1.spark2
```
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16281

Has anyone even asked for a new 1.8.x build from Parquet and been told it won't happen? You don't stop consuming non-fix changes by forking. You do that by staying on a maintenance branch, if that branch is maintained of course. I'd be shocked if there were important bugs affecting a major ASF project and no way to get a maintenance release of a recent branch, especially when we have experts here with influence.

What does this block? Why the urgency?
[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16030 Can you also update the title? And the description has a mistake: the logical layer trusts the data schema to infer the type of the overlapped partition columns, while the physical layer trusts the partition schema, which is inferred from the path string.
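The mismatch described in this comment can be pictured with a toy model. This is only a sketch of the idea, not Spark's code: `ColType`, `inferFromPath`, `logicalType` and `physicalType` are hypothetical names, and the digit-only inference rule is a simplification chosen for illustration.

```scala
object PartitionTypeSketch {
  // Toy column types; hypothetical, not Spark's DataType hierarchy.
  sealed trait ColType
  case object IntType extends ColType
  case object StringType extends ColType

  // Infer a (name, type) pair from a path segment such as "part=1".
  // Assumes the segment contains '='; anything else fails with a MatchError.
  def inferFromPath(segment: String): (String, ColType) = {
    val Array(name, raw) = segment.split("=", 2)
    val tpe = if (raw.nonEmpty && raw.forall(_.isDigit)) IntType else StringType
    (name, tpe)
  }

  // "Logical layer" in this toy model: for a column that also appears in the
  // data schema, the data schema's type wins over the path-inferred type.
  def logicalType(dataSchema: Map[String, ColType], segment: String): ColType = {
    val (name, inferred) = inferFromPath(segment)
    dataSchema.getOrElse(name, inferred)
  }

  // "Physical layer" in this toy model: only the path-inferred type is used.
  def physicalType(segment: String): ColType = inferFromPath(segment)._2
}
```

With a data schema that declares `part` as a string and a path segment `part=1`, the two functions disagree on the column type, which is the kind of inconsistency the comment points at.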
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 We are adding major code changes in Spark Thrift Server? What is the Spark Hive?
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 I think we are not adding new features to Parquet. The fixes must be small. To avoid the cost and risk, we need to reject all major fixes in our special build. At the same time, we also need to ask the Parquet community to resolve the bugs in the newer releases.
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281 Yep. At the beginning, it starts like that. But, please look at Spark Hive or Spark Thrift Server. I don't think we are maintaining that well or visibly.
[GitHub] spark pull request #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16296#discussion_r92655270
--- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -69,16 +69,19 @@ statement
     | ALTER DATABASE identifier SET DBPROPERTIES tablePropertyList    #setDatabaseProperties
     | DROP DATABASE (IF EXISTS)? identifier (RESTRICT | CASCADE)?     #dropDatabase
     | createTableHeader ('(' colTypeList ')')? tableProvider
-        (OPTIONS tablePropertyList)?
+        (OPTIONS options=tablePropertyList)?
         (PARTITIONED BY partitionColumnNames=identifierList)?
-        bucketSpec? (AS? query)?                                       #createTableUsing
+        bucketSpec?
+        (TBLPROPERTIES properties=tablePropertyList)?
+        (COMMENT comment=STRING)?
--- End diff --

Do we need to keep the same order? For example, moving `(COMMENT comment=STRING)?` before `(PARTITIONED BY partitionColumnNames=identifierList)?`?
[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16296 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70195/ Test FAILed.
[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16296 **[Test build #70195 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70195/testReport)** for PR 16296 at commit [`234d935`](https://github.com/apache/spark/commit/234d93510d9592a950adf61ce4517a29b4048aa7).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class DetermineHiveSerde(conf: SQLConf) extends Rule[LogicalPlan] `
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281 I agree, but from a long-term perspective, the risk and cost of forking could be the worst.
[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16296 Merged build finished. Test FAILed.
[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16296 **[Test build #70195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70195/testReport)** for PR 16296 at commit [`234d935`](https://github.com/apache/spark/commit/234d93510d9592a950adf61ce4517a29b4048aa7).
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 @dongjoon-hyun What kind of questions/requests should we ask on the dev mailing list? IMO, the risk and cost are small if we make a special build ourselves. We can get the bug fixes very quickly. Maybe @rxin @rdblue @liancheng can give their input here?
[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16296 cc @yhuai
[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14079 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70194/ Test PASSed.
[GitHub] spark pull request #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/16296

[SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for data source and hive serde tables

## What changes were proposed in this pull request?

Today we have different syntaxes for creating data source tables and hive serde tables. We should unify them so that users are not confused, and as a step toward making hive a data source.

TODO: add more description, add tests.

## How was this patch tested?

TODO.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark create-table

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16296.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16296

commit 234d93510d9592a950adf61ce4517a29b4048aa7
Author: Wenchen Fan
Date: 2016-12-15T16:57:22Z

    unify CREATE TABLE syntax for data source and hive serde tables
[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14079 Merged build finished. Test PASSed.
[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14079 **[Test build #70194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70194/testReport)** for PR 14079 at commit [`f249b00`](https://github.com/apache/spark/commit/f249b00e3eb64bf35ab836fa4b89eb961a9511a5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class EventTimeStats(var max: Long, var min: Long, var sum: Long, var count: Long) `
  * `class EventTimeStatsAccum(protected var currentStats: EventTimeStats = EventTimeStats.zero)`
[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15915 **[Test build #3500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3500/testReport)** for PR 15915 at commit [`22b81cf`](https://github.com/apache/spark/commit/22b81cf4d7205488e22907d27511c71cf9e242dc).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281 Actually, this PR targets Apache Spark 2.2, whose RC1 is expected in late March. We have plenty of time to discuss. Why don't we discuss this on the dev mailing list?
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 Basically, the idea is to make our own special build of Parquet 1.8.1 with the needed fixes. Upgrading to a newer version like Parquet 1.9.0 is risky. Parquet 1.9.0 was only released this October, and bugs might not be exposed until more users/clients start using it. The same applies to our own Spark community: I personally would not suggest that any enterprise customer use Spark 2.0.0, even though it resolved many bugs in Spark 1.6+. To evaluate whether to upgrade to Parquet 1.9.0, the biggest effort is the performance evaluation. We need our own standard performance workload benchmarks (TPC-DS is not enough) to test whether the upgrade introduces any major performance degradation.
[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16057 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70193/ Test PASSed.
[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16057 Merged build finished. Test PASSed.
[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16057 **[Test build #70193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70193/testReport)** for PR 16057 at commit [`b065079`](https://github.com/apache/spark/commit/b065079706f2de2a23b055e644ae46402ef97a8d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16281 I'd much rather lobby to release 1.8.2 and help with the legwork than do all that legwork and more to maintain a fork. It's still not clear to me that upgrading to 1.9.0 is not a solution?
[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15018 Just add commits here.
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 The problem is that the Parquet community will not create a 1.8.2+ branch for us. Upgrading to newer versions such as 1.9 or 2.0 is always risky. Historically, we have hit bugs and performance degradation whenever we tried to upgrade the Parquet version. We need more time and effort to decide whether to upgrade to the newer Parquet version.
[GitHub] spark issue #16282: [DO_NOT_MERGE]Try to fix kafka
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16282 **[Test build #3501 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3501/testReport)** for PR 16282 at commit [`c4e6962`](https://github.com/apache/spark/commit/c4e6962dbf22c2ec7658f95fd1be069628860855).
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16281 @gatorsmile @rdblue also works directly on Parquet. I am not seeing "unfixable" Parquet problems here. You're just pointing at problems that can and should be fixed, preferably in one place. Forking is not at all normal as a response to this. This, at least, must block on us figuring out how to manage forks. The Hive fork is still not really technically OK.
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 @srowen Even if we fork our own version, it does not mean we will give up upgrading to the newer version. We would just add a few fixes. This is very normal for mission-critical systems. When customers on mainframes hit a bug, they do not upgrade to the newer major version. Normally, they get a special build with the needed fixes. The Parquet community will not do this for us, but we can do it ourselves, especially since we have the Parquet experts @liancheng @rdblue
[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018

Found another input that triggers non-polynomial time with the code in this PR. I'm again borrowing from scikit-learn. I think this is the case they found that led them to rewrite their implementation.

```
val y = ((0 until length) ++ (-(length - 1) until length) ++ (-(length - 1) to 0)).toArray.map(_.toDouble)
val x = (1 to y.length).toArray.map(_.toDouble)
```

| Input Length | Time (ns) |
| --: | --: |
| 40 | 2059 |
| 80 | 4604 |
| 160 | 1974269 |
| 320 | 3246603433 |

I'm now working on implementing what's described in the Best papers. This should give O(n), even in the worst case. Should I close this and open a new PR with the new algorithm, or just add it here and you can squash when you merge?
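For readers following the complexity discussion, here is a minimal sketch of a stack-based pool-adjacent-violators pass that runs in amortized linear time on input already sorted by x. It is not the code from this PR and not necessarily the algorithm from the Best papers; the object name `PavaSketch`, the helper names, and the unit-weight assumption are all hypothetical. `worstCaseInput` reproduces the `y` array from the comment above.

```scala
object PavaSketch {
  // The adversarial labels from the comment above, parameterised by `length`.
  def worstCaseInput(length: Int): Array[Double] =
    ((0 until length) ++ (-(length - 1) until length) ++ (-(length - 1) to 0))
      .toArray.map(_.toDouble)

  // Non-decreasing isotonic fit with unit weights (an assumption of this sketch).
  // Each point is pushed once and each merge removes one block, so the total
  // work is O(n) no matter how the violations are arranged.
  def isotonicFit(y: Array[Double]): Array[Double] = {
    val n = y.length
    val sums = new Array[Double](n) // sum of labels per block
    val counts = new Array[Int](n)  // number of points per block
    var top = -1                    // index of the last block on the stack
    var i = 0
    while (i < n) {
      top += 1
      sums(top) = y(i)
      counts(top) = 1
      // Merge backwards while the previous block's mean exceeds this block's mean.
      // Means are compared by cross-multiplication to avoid a division.
      while (top > 0 && sums(top - 1) * counts(top) > sums(top) * counts(top - 1)) {
        sums(top - 1) += sums(top)
        counts(top - 1) += counts(top)
        top -= 1
      }
      i += 1
    }
    // Expand the blocks back into one fitted value per input point.
    val out = new Array[Double](n)
    var pos = 0
    var b = 0
    while (b <= top) {
      val mean = sums(b) / counts(b)
      var k = 0
      while (k < counts(b)) { out(pos) = mean; pos += 1; k += 1 }
      b += 1
    }
    out
  }
}
```

Calling `PavaSketch.isotonicFit(PavaSketch.worstCaseInput(320))` exercises the same shape of input as the last row of the table; timings from this sketch are not comparable with the numbers reported there.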
[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16134 The staging directory and files will not be removed when users hit an abnormal termination of the JVM. In addition, if the JVM does not stop, these temporary files could still consume a lot of space. Thus, I think we need to backport it. However, I am not sure whether we should backport it to all the previous versions (2.1, 2.0 and 1.6). @rxin Could you please make a decision?
[GitHub] spark issue #1980: [SPARK-2750] support https in spark web ui
Github user pritpalm commented on the issue: https://github.com/apache/spark/pull/1980 I want to enable https on the Spark UI. I added the following config to spark-defaults.config, but when we access the Spark UI via https::/:8080 or https://:443 or https://:8480, it's not able to connect.
spark.ui.https.enabled true
spark.ssl.keyPassword abcd
spark.ssl.keyStore rtmqa-clientid.jks
spark.ssl.keyStorePassword changeit
[GitHub] spark issue #5664: [SPARK-2750][WEB UI]Add Https support for Web UI
Github user pritpalm commented on the issue: https://github.com/apache/spark/pull/5664 I want to enable https on the Spark UI. I added the following config to spark-defaults.config, but when we access the Spark UI via https::/:8080 or https://:443 or https://:8480, it's not able to connect.
spark.ui.https.enabled true
spark.ssl.keyPassword abcd
spark.ssl.keyStore rtmqa-clientid.jks
spark.ssl.keyStorePassword changeit
[GitHub] spark issue #10238: [SPARK-2750][WEB UI] Add https support to the Web UI
Github user pritpalm commented on the issue: https://github.com/apache/spark/pull/10238 I want to enable https on the Spark UI. I added the following config to spark-defaults.config, but when we access the Spark UI via https::/:8080 or https://:443 or https://:8480, it's not able to connect.
spark.ui.https.enabled true
spark.ssl.keyPassword abcd
spark.ssl.keyStore rtmqa-clientid.jks
spark.ssl.keyStorePassword changeit
[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16281

My two cents:
- Do we have a Parquet-specific test suite **with sufficient coverage** to run and back us up that this upgrade won't cause any regressions? I think simply moving up the version of the jar files is risky. This practice of doing (a sort of) integration testing will give our user community confidence that they can count on Spark to exercise due diligence when it changes the versions of any third-party modules Spark runs on. Yes, the activity comes with a cost. We can always define how much we can and want to test.
- On the topic of forking, it is a judgment call. It's a balance between having full control over the dependent third-party modules but deviating from their origin, versus doing little work on our end but risking contamination. In a world of interdependence and interconnection, my opinion leans towards "good fences make good neighbours." That comes back to my first point: we need good test coverage to gauge the impact of an upgrade on Spark.
[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15915 **[Test build #3500 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3500/testReport)** for PR 15915 at commit [`22b81cf`](https://github.com/apache/spark/commit/22b81cf4d7205488e22907d27511c71cf9e242dc).
[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15915 Merged build finished. Test FAILed.
[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15915 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70191/ Test FAILed.
[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15915 **[Test build #70191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70191/testReport)** for PR 15915 at commit [`22b81cf`](https://github.com/apache/spark/commit/22b81cf4d7205488e22907d27511c71cf9e242dc). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16271: [SPARK-18845][GraphX] PageRank has incorrect initializat...
Github user aray commented on the issue: https://github.com/apache/spark/pull/16271 Yes, the improvement comes from the sum of magnitudes of the initial values being closer to the (known) sum of the solution. Fiddling with resetProb controls a completely different thing. The current implementation has no advantage (excluding finding the incorrect solution to a star graph one iteration faster).
[GitHub] spark pull request #16271: [SPARK-18845][GraphX] PageRank has incorrect init...
Github user aray commented on a diff in the pull request: https://github.com/apache/spark/pull/16271#discussion_r92621591

    --- Diff: graphx/src/test/scala/org/apache/spark/graphx/lib/PageRankSuite.scala ---
    @@ -70,10 +70,10 @@ class PageRankSuite extends SparkFunSuite with LocalSparkContext {
           val resetProb = 0.15
           val errorTol = 1.0e-5
    -      val staticRanks1 = starGraph.staticPageRank(numIter = 1, resetProb).vertices
    -      val staticRanks2 = starGraph.staticPageRank(numIter = 2, resetProb).vertices.cache()
    +      val staticRanks1 = starGraph.staticPageRank(numIter = 2, resetProb).vertices
    --- End diff --

Not really more robust, since it has a sink and thus is still wrong pending SPARK-18847. But it is needed with the change, to fully propagate the change in rank of source vertices in the first iteration, as explained above.
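The point about needing a second iteration can be illustrated without Spark. The sketch below is a plain-Scala model, not GraphX code: it assumes a star graph in which every leaf has a single edge to vertex 0, and it assumes the unnormalized update `resetProb + (1 - resetProb) * (sum of incoming rank / out-degree)`. The object name, the `init` parameter, and the two initial values compared in the usage note (`1.0` and `resetProb`) are assumptions made for illustration.

```scala
object StarPageRankSketch {
  // Static PageRank on an assumed star graph: leaves 1..n-1 each point at centre 0.
  // `init` is the rank every vertex starts with (hypothetical parameter).
  def staticRanks(n: Int, numIter: Int, resetProb: Double, init: Double): Array[Double] = {
    var ranks = Array.fill(n)(init)
    for (_ <- 0 until numIter) {
      // Every leaf has out-degree 1, so it sends its whole rank to the centre.
      val fromLeaves = ranks.drop(1).sum
      // Leaves receive no messages, so they fall back to resetProb.
      val next = Array.fill(n)(resetProb)
      // The centre is a sink: it receives from all leaves and sends nothing.
      next(0) = resetProb + (1.0 - resetProb) * fromLeaves
      ranks = next
    }
    ranks
  }
}
```

Under these assumptions, starting from `init = resetProb` the centre's rank is the same after one and after two iterations, while starting from `init = 1.0` the leaves only reach `resetProb` during the first iteration, so the centre settles one iteration later. That matches the reasoning in the comment for comparing against `numIter = 2`.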
[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70190/ Test PASSed.
[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Merged build finished. Test PASSed.
[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16030 **[Test build #70190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70190/testReport)** for PR 16030 at commit [`8c7d3b8`](https://github.com/apache/spark/commit/8c7d3b85544c5c1624c1e598036212231c8b7a35). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14079 **[Test build #70194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70194/testReport)** for PR 14079 at commit [`f249b00`](https://github.com/apache/spark/commit/f249b00e3eb64bf35ab836fa4b89eb961a9511a5).
[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16057 **[Test build #70193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70193/testReport)** for PR 16057 at commit [`b065079`](https://github.com/apache/spark/commit/b065079706f2de2a23b055e644ae46402ef97a8d).
[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16057 **[Test build #70192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70192/testReport)** for PR 16057 at commit [`9aee0d1`](https://github.com/apache/spark/commit/9aee0d1fe784c1be5db699b559e5311ce6353bd7). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16057 Merged build finished. Test FAILed.
[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16057 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70192/ Test FAILed.
[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16057 **[Test build #70192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70192/testReport)** for PR 16057 at commit [`9aee0d1`](https://github.com/apache/spark/commit/9aee0d1fe784c1be5db699b559e5311ce6353bd7).
[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/15717 @gatorsmile I've updated the PR description, thanks!
[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/16030 @cloud-fan okay, I updated the desc.
[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70186/ Test PASSed.
[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Merged build finished. Test PASSed.
[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16030 **[Test build #70186 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70186/testReport)** for PR 16030 at commit [`0601ccd`](https://github.com/apache/spark/commit/0601ccddc3a6aef8607fc1dc974677110b85aab8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.