[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/18230 @vanzin [Xing Shi (saturday_s)](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=saturday_s), thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/18230 @jerryshao

> "reload" here meanings retrieving back SparkConf from checkpoint file and using this retrieved SparkConf to create SparkContext when restarting streaming application.

That explanation is wrong, but your understanding of what the `propertiesToReload` list does is right. After restarting from a checkpoint, the properties in `SparkConf` will be the same as in the previous application. But something like `spark.yarn.app.id` will be stale and useless in a restarted app. So after retrieving the `SparkConf` back from the checkpoint, we want to "reload" the fresh values from the system properties instead of using the old ones from the checkpoint.

@vanzin

> So if you start the second streaming application without providing principal / keytab, Client.scala will not overwrite the credential file path, but still the AM will start the credential updater, because the file location is in the configuration read from the checkpoint.

That's probably right, but it is not this case. I do submit the principal & keytab at restart, and the AM does renew the token using the principal successfully. I noticed that the `SparkConf` instances used by `AMCredentialRenewer` and by `CredentialUpdater` seem NOT to be THE SAME. The credential renewer thread launched by the AM works correctly, but the credential updater in the executor backend - which uses the configuration provided by the driver - gets confused and fails at its job. So fixing only the AM code doesn't help much.

FYI, the log of `AMCredentialRenewer` looks like this:

```
17/06/07 15:11:14 INFO security.AMCredentialRenewer: Scheduling login from keytab in 96952 millis.
...
17/06/07 15:12:51 INFO security.AMCredentialRenewer: Attempting to login to KDC using principal: xxx@XXX.LOCAL
17/06/07 15:12:51 INFO security.AMCredentialRenewer: Successfully logged into KDC.
...
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Writing out delegation tokens to hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea-b46b-4bd4-8e98-0e38928fd58c-1496816091985-1.tmp
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Delegation Tokens written out successfully. Renaming file to hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea-b46b-4bd4-8e98-0e38928fd58c-1496816091985-1
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Delegation token file rename complete.
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Scheduling login from keytab in 110925 millis.
...
```

It renews the token successfully and saves it to application_1496384469444_0036's staging dir. But the `CredentialUpdater` (started by `YarnSparkHadoopUtil`) complains:

```
17/06/07 15:11:24 INFO executor.CoarseGrainedExecutorBackend: Will periodically update credentials from: hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035/credentials-19a7c11e-8c93-478c-ab0a-cdbfae5b2ae5
...
17/06/07 15:12:24 WARN yarn.YarnSparkHadoopUtil: Error while attempting to list files from application staging dir
java.io.FileNotFoundException: File hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035 does not exist.
...
```

... which says that the credentials file doesn't exist in application_1496384469444_0035's dir.
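The stale-property problem described above amounts to a key-refresh step applied after deserializing the checkpointed configuration. A rough, self-contained sketch of that idea, with plain `Map`s standing in for `Checkpoint`/`SparkConf` and hypothetical helper names (this is not Spark's actual code):

```scala
// Hypothetical sketch: after restoring a checkpointed configuration,
// overwrite per-application keys with fresh values from the current
// environment instead of keeping the stale checkpointed ones.
object ReloadSketch {
  // Keys whose checkpointed values are stale in a restarted application.
  val propertiesToReload: Seq[String] = Seq(
    "spark.yarn.app.id",
    "spark.yarn.credentials.file")

  def restore(checkpointed: Map[String, String],
              current: Map[String, String]): Map[String, String] =
    propertiesToReload.foldLeft(checkpointed) { (conf, key) =>
      current.get(key) match {
        case Some(fresh) => conf.updated(key, fresh) // take the fresh value
        case None        => conf - key               // drop the stale one
      }
    }
}
```

Under this sketch, a restarted app would see the new `spark.yarn.app.id` (and hence the new staging directory) rather than the previous application's, which is exactly the mismatch the `CredentialUpdater` log above exhibits.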
[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/18230 No, I don't mean to insist on my opinion. I'm just curious to know the reason for the change (as it looks like another point fix).
[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/18230 @jerryshao Sorry for the delay. Currently, Spark checkpoints ALL the configurations, no matter whether they are "internal" or not. So we have to reload the ones that should be updated at launch. Not checkpointing the "internal" ones might be a good idea, but that is more like a new feature than a bug fix. This PR attempts to fix the bug with the least change of code, so it can easily be merged into any maintenance branch. I don't mind adding more options to the exclude-list, but you'll have to do some extra work to cherry-pick a subset for branches before 2.1, since `spark.yarn.credentials.renewalTime` and `spark.yarn.credentials.updateTime` don't exist in the old branches.
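The alternative discussed here (not checkpointing "internal" properties at all) boils down to filtering the configuration against an exclude-list before the checkpoint is written. A minimal sketch under those assumptions; the object and method names are made up for illustration, and, as noted in the thread, the two `*Time` keys only exist in newer branches:

```scala
// Hypothetical sketch: drop per-application ("internal") keys before
// writing the checkpoint, so nothing stale needs reloading at restart.
object CheckpointSketch {
  // Illustrative exclude-list; `renewalTime`/`updateTime` exist only in 2.1+.
  val excluded: Set[String] = Set(
    "spark.yarn.app.id",
    "spark.yarn.credentials.file",
    "spark.yarn.credentials.renewalTime",
    "spark.yarn.credentials.updateTime")

  def confToCheckpoint(conf: Map[String, String]): Map[String, String] =
    conf.filterNot { case (key, _) => excluded.contains(key) }
}
```

The trade-off mentioned above is visible in the sketch: filtering at write time is cleaner, but changing what gets checkpointed behaves more like a new feature than a minimal bug fix for maintenance branches.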
[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/18230 @jerryshao I've taken a look at `spark.yarn.credentials.renewalTime` and `spark.yarn.credentials.updateTime`, but I don't think there is any necessity to exclude them. A change in these properties means that either the delegation token's configuration or Spark itself has been changed; in both cases, restarting from a checkpoint will not work. Could you describe a concrete scenario involving a change of `spark.yarn.credentials.renewalTime` or `spark.yarn.credentials.updateTime`? Or do you just mean to make it more robust?
[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/18230 > I guess "spark.yarn.credentials.renewalTime" and "spark.yarn.credentials.updateTime" should also be excluded. Thank you for pointing that out. I'll check & fix them.
[GitHub] spark pull request #18230: [SPARK-21008] [STREAMING] Not to read `spark.yarn...
GitHub user saturday-shi opened a pull request: https://github.com/apache/spark/pull/18230 [SPARK-21008] [STREAMING] Not to read `spark.yarn.credentials.file` from checkpoint. ## What changes were proposed in this pull request? Reload the `spark.yarn.credentials.file` property when restarting a streaming application from checkpoint. ## How was this patch tested? Manually tested with 1.6.3 and 2.1.1. I didn't test this with master because of some compile problems, but I think the result will be the same. ## Notice This should be merged into maintenance branches too. jira: [SPARK-21008](https://issues.apache.org/jira/browse/SPARK-21008) You can merge this pull request into a Git repository by running: $ git pull https://github.com/saturday-shi/spark SPARK-21008 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18230.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18230 commit 13d40d7533ac73d6fb82706a5ef1d19c9272c0e4 Author: saturday_s Date: 2017-06-07T09:09:05Z Not to read `spark.yarn.credentials.file` from checkpoint.
[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/16253 @vanzin My JIRA account is [Xing Shi (saturday_s)](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=saturday_s).
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user saturday-shi commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r93560749 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/api.scala --- @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.status.api.v1.streaming + +import java.util.Date + +import org.apache.spark.streaming.ui.StreamingJobProgressListener._ + +class StreamingStatistics private[spark]( + val startTime: Date, --- End diff -- Done.
[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/16253 Lots of thanks!
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user saturday-shi commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92730696 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllReceiversResource.scala --- @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.status.api.v1.streaming + +import java.util.Date +import javax.ws.rs.{GET, Produces} +import javax.ws.rs.core.MediaType + +import org.apache.spark.status.api.v1.streaming.AllReceiversResource._ +import org.apache.spark.streaming.ui.StreamingJobProgressListener + +@Produces(Array(MediaType.APPLICATION_JSON)) +private[v1] class AllReceiversResource(listener: StreamingJobProgressListener) { + + @GET + def receiversList(): Seq[ReceiverInfo] = { +receiverInfoList(listener).sortBy(_.streamId) + } +} + +private[v1] object AllReceiversResource { + + def receiverInfoList(listener: StreamingJobProgressListener): Seq[ReceiverInfo] = { +listener.synchronized { + listener.receivedRecordRateWithBatchTime.map { case (streamId, eventRates) => + +val receiverInfo = listener.receiverInfo(streamId) +val streamName = receiverInfo.map(_.name). 
--- End diff -- I used the same style as the similar code in [StreamingPage.scala](https://github.com/apache/spark/blob/e115cdad29ae90c7d0b7da6d2a2e90047dc87985/streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala#L435-L436). Should I fix StreamingPage too if this style doesn't look good?
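For readers following the style debate above: the chained form under discussion is plain `Option` combinator usage (`map` / `orElse` / `getOrElse`). A self-contained illustration with hypothetical stand-in types, not the PR's actual code:

```scala
object OptionChainSketch {
  // Hypothetical stand-in for the listener's receiver record.
  final case class ReceiverInfo(name: String)

  // Same combinator chain as the code under review: take the receiver's
  // name, fall back to the stream name, and finally to a synthesized
  // "Stream-<id>" label.
  def displayName(receiverInfo: Option[ReceiverInfo],
                  streamName: Option[String],
                  streamId: Int): String =
    receiverInfo.map(_.name)
      .orElse(streamName)
      .getOrElse(s"Stream-$streamId")
}
```

The style question is purely where the line break falls (trailing dot vs. leading dot); the semantics are identical either way.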
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user saturday-shi commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92727999 --- Diff: project/MimaExcludes.scala --- @@ -116,7 +116,10 @@ object MimaExcludes { ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.startOffset"), ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.endOffset"), ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.this"), - ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.query") + ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.query"), + + // [SPARK-18537] Add a REST api to spark streaming + ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.streaming.scheduler.StreamingListener.onStreamingStarted") --- End diff -- Thank you for pointing this out. Actually, this PR hardly has any chance of being merged into 2.1. I will fix it along with the other style issues you've pointed out.
[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/16253 @ajbozarth @vanzin Can either of you retest this, please?
[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/16253 > Looks like a compile-time check for the listener API. I think that's right. I had confused it with the test suites in Scala, but this one is for Java.
[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/16253 @vanzin Thank you for your suggestions. Actually, `ApiStreamingRootResource` would be a better name. BTW, do you (or anybody else) know the purpose of [JavaStreamingListenerAPISuite.java](https://github.com/apache/spark/blob/master/streaming/src/test/java/org/apache/spark/streaming/JavaStreamingListenerAPISuite.java)? While adding some tests, I found that it seems never to be used by any code, so I left it alone. Should we delete it, or change it anyway?
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user saturday-shi commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92073600 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala --- @@ -66,6 +67,8 @@ class StreamingContext private[streaming] ( _batchDur: Duration ) extends Logging { + private var startTime = -1L --- End diff -- You're right. I left it global just in case of further use.
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
Github user saturday-shi commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92073117 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllReceiversResource.scala --- @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.status.api.v1.streaming + +import java.util.Date +import javax.ws.rs.{GET, Produces} +import javax.ws.rs.core.MediaType + +import org.apache.spark.status.api.v1.streaming.AllReceiversResource._ +import org.apache.spark.streaming.ui.StreamingJobProgressListener + +@Produces(Array(MediaType.APPLICATION_JSON)) +private[v1] class AllReceiversResource(listener: StreamingJobProgressListener) { + + @GET + def receiversList(): Seq[ReceiverInfo] = { +receiverInfoList(listener).sortBy(_.streamId) + } +} + +private[v1] object AllReceiversResource { + + def receiverInfoList(listener: StreamingJobProgressListener): Seq[ReceiverInfo] = { +listener.synchronized { + listener.receivedRecordRateWithBatchTime.map { case (streamId, eventRates) => + +val receiverInfo = listener.receiverInfo(streamId) +val streamName = receiverInfo.map(_.name). 
+ orElse(listener.streamName(streamId)).getOrElse(s"Stream-$streamId") +val avgEventRate = + if (eventRates.isEmpty) None + else Some(eventRates.map(_._2).sum / eventRates.size) + +val (errorTime, errorMessage, error) = receiverInfo match { + case None => (None, None, None) + case Some(info) => +val someTime = { + if (info.lastErrorTime >= 0) Some(new Date(info.lastErrorTime)) + else None --- End diff -- Uh... I think I misunderstood the comment. You mean that something like `if (...) Some(...) else None` should be on one line?
[GitHub] spark issue #15904: [SPARK-18470][STREAMING][WIP] Provide Spark Streaming Mo...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/15904 @uncleGen Exactly. I will try my best to complete this.
[GitHub] spark issue #15904: [SPARK-18470][STREAMING][WIP] Provide Spark Streaming Mo...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/15904 @uncleGen It seems you don't have much time to go on with this, so I opened a new PR (#16253) that inherits all the functionality from the old one; the only change is merging it into the current api v1.
[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/16253 @vanzin Could you take a look at this please?
[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...
GitHub user saturday-shi opened a pull request: https://github.com/apache/spark/pull/16253 [SPARK-18537][Web UI] Add a REST api to serve spark streaming information ## What changes were proposed in this pull request? This PR is an inheritance from #16000, and is a completion of #15904. **Description** > 1. implement a package(org.apache.spark.streaming.status.api.v1) that serve the same purpose as org.apache.spark.status.api.v1 > 1. register the api path through StreamingPage > 1. retrive the streaming informateion through StreamingJobProgressListener > > this api should cover exceptly the same amount of information as you can get from the web interface > the implementation is base on the current REST implementation of spark-core > and will be available for running applications only > > https://issues.apache.org/jira/browse/SPARK-18537 ## How was this patch tested? Local test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/saturday-shi/spark SPARK-18537 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16253.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16253 commit 680a59aab2f8b3624f9f39fbc78bb5cbd7ec3bac Author: Chan Chor Pang Date: 2016-10-26T05:36:44Z compile ok, try to test commit 04f9e9c914b58c14c98845b5529472333b348590 Author: Chan Chor Pang Date: 2016-10-26T07:39:40Z add path /streamingapi commit da20ce37b88770e2cb997ed48976ba2542305e6b Author: Chan Chor Pang Date: 2016-10-28T05:42:40Z need attach to some where commit 3468d40336091201108169b168bd612c0f5fcf77 Author: Chan Chor Pang Date: 2016-11-02T02:10:03Z no writer yet commit 525ae5fe25a47ed3884c2bf15143dcb932581f48 Author: Chan Chor Pang Date: 2016-11-02T05:13:48Z not work, may be the data need to be in Iterator form commit f854767cb5b3bb4a303d3418b426b73d02599c25 Author: Chan Chor Pang Date: 
2016-11-08T06:05:24Z remove unuse file commit 34c8b3b90a9b045e3d6b6ac86d270edff9ed24a3 Author: Chan Chor Pang Date: 2016-11-02T06:18:17Z package name didnt change in the copy process commit 170b18904f041dfeb271f54fdb408ad2f575a2ee Author: Chan Chor Pang Date: 2016-11-07T04:35:43Z try to get the real info commit 2f51c59a37f994c6bee2dd65d1517b32e7d9776d Author: saturday_s Date: 2016-11-14T09:51:02Z Refactor to fit scalastyle. commit 76324b7c6f8849bef7d45363d925fd95efbbedcf Author: saturday_s Date: 2016-11-16T04:40:24Z Try to get startTime. commit 68d734f07b43b44127ae5f698db39d671aaa59c1 Author: saturday_s Date: 2016-11-16T04:53:41Z Change api path prefix. commit ccfe0f5f28db73bb300d43c32d40d6e0e596c77c Author: saturday_s Date: 2016-11-16T09:13:59Z Implement statistics api. commit 2d1e88440902c5212f43746f5c0b7f282b7a6243 Author: saturday_s Date: 2016-11-17T02:59:58Z Implement receivers api. commit 0d9f6b9667ef774f5a8c868a453e3d68b66a6702 Author: saturday_s Date: 2016-11-17T04:46:13Z Fix last-error-info format. commit 8088fa5bad4c0e15bb14abfc0ee7475ba4ad138b Author: saturday_s Date: 2016-11-17T05:08:30Z Implement one-receiver api. commit f1da6b1f2856b761696ae9d767836af6417e4f43 Author: saturday_s Date: 2016-11-17T05:21:39Z Fix access level issue of `ErrorWrapper`. commit 4d8138191f1529137e4c1e858998bd78477ca739 Author: saturday_s Date: 2016-11-18T01:30:30Z Synchronize to listener when getting info from it. commit 17cb832cedb4b2cfeff5e501a9f71378b3402cee Author: saturday_s Date: 2016-11-18T05:30:15Z Implement batch(es) api. commit 137e8fb7de34b39b218939b371062e225adc958e Author: saturday_s Date: 2016-11-18T06:55:42Z Remove details of outputOpsInfo from batchInfo. commit 08f33522251ff20b14af15952ac918cbcfada551 Author: saturday_s Date: 2016-11-18T08:35:55Z Implement outputOpsInfo api. commit 477e71de47bbde642a9222729c73b7dd52318529 Author: saturday_s Date: 2016-11-18T09:37:04Z Try another approach to get outputOpsInfo. 
commit 7ddac2929343ad60f733166d25ed485fa3976cc0 Author: saturday_s Date: 2016-11-21T02:03:55Z Try another more approach to get outputOpsInfo. commit e0fe970fa64fc87de277a7f63f39423608cfef52 Author: saturday_s Date: 2016-11-21T02:41:25Z Continue trying to get outputOpsInfo(jobIds). commit 35963312dcf722b98cd3b0dabff97d398ccd020c Author: saturday_s Date: 2016-11-21T04:14:38Z Fix outputOpsInfo and jobIds issue. commit 9760492cb826c7552c453e3c55a1098455eaa0bc Author: saturday_s Date: 2016-11-21T04:35:55Z Fix syntax error. commit 65b39078d54408d8ac1ee608a21e49a978e7415d Author: saturday_s Date: 2016-11-21T05:23:31Z Consolidate the param check lo
[GitHub] spark issue #16000: [SPARK-18537][Web UI]Add a REST api to spark streaming
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/16000 @vanzin Hello, I'm a collaborator on this PR. Actually I am interested in your plan, but we don't want to make the changes here, because that is not the purpose of this PR. I think I can open a new PR and implement the changes there. @uncleGen I reviewed your code and found that there are a lot of things to improve. I prefer to use the existing work in this PR to avoid duplicated effort. I will open a new PR later, but if you already have a plan, please let me know. Maybe I can work on it with you.
[GitHub] spark issue #15022: [SPARK-17465] [Spark Core] Inappropriate memory manageme...
Github user saturday-shi commented on the issue: https://github.com/apache/spark/pull/15022 Thank you @JoshRosen for your reply! > Actually, I spot one more step to make this really robust: I think we also need to call `releasePendingUnrollMemoryForThisTask` at the end of Task in order to be absolutely sure this memory will be released during error cases. That's right. I will make the change later. And it would be really helpful if you have a good idea for adding a test to prevent regressions; I'm just worried that I don't have an easy way to check whether the problem has recurred.
[GitHub] spark pull request #15022: [SPARK-17465] [Spark Core] Inappropriate memory m...
GitHub user saturday-shi opened a pull request: https://github.com/apache/spark/pull/15022 [SPARK-17465] [Spark Core] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak ## What changes were proposed in this pull request? The expression like `if (memoryMap(taskAttemptId) == 0) memoryMap.remove(taskAttemptId)` in the methods `releaseUnrollMemoryForThisTask` and `releasePendingUnrollMemoryForThisTask` should be executed after the memory-release operation, whether `memoryToRelease` is > 0 or not. If a task's memory has already been set to 0 when `releaseUnrollMemoryForThisTask` or `releasePendingUnrollMemoryForThisTask` is called, the key in the memory map corresponding to that task will never be removed from the hash map. See the details in [SPARK-17465](https://issues.apache.org/jira/browse/SPARK-17465). You can merge this pull request into a Git repository by running: $ git pull https://github.com/saturday-shi/spark SPARK-17465 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15022.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15022 commit 720f2ceb35361f387fc55301fb44b560a289d8ca Author: Xing SHI Date: 2016-09-09T07:58:24Z Correct the inappropriate memory management operation in releaseUnrollMemoryForThisTask and releasePendingUnrollMemoryForThisTask method.
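The fix described above can be illustrated with a simplified, hypothetical model of the bookkeeping (this is not the real `MemoryStore`): the zero-entry cleanup has to run on every call, not only inside the `memoryToRelease > 0` branch, or a task's key can linger in the map forever:

```scala
// Hypothetical, simplified model of per-task unroll-memory bookkeeping.
object MemoryStoreSketch {
  import scala.collection.mutable

  // taskAttemptId -> bytes of unroll memory held; missing keys read as 0.
  val memoryMap: mutable.Map[Long, Long] =
    mutable.Map[Long, Long]().withDefaultValue(0L)

  def releaseUnrollMemoryForThisTask(taskAttemptId: Long, memory: Long): Unit =
    synchronized {
      val memoryToRelease = math.min(memory, memoryMap(taskAttemptId))
      if (memoryToRelease > 0) {
        memoryMap(taskAttemptId) -= memoryToRelease
      }
      // Cleanup OUTSIDE the `> 0` branch: even if the entry already holds 0,
      // remove the key so the map does not leak task ids.
      if (memoryMap(taskAttemptId) == 0) {
        memoryMap.remove(taskAttemptId)
      }
    }
}
```

If the `remove` were nested inside the `memoryToRelease > 0` branch (the pre-fix shape), a call arriving after the entry had already reached 0 would skip the cleanup entirely, which is the leak SPARK-17465 describes.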