[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...

2017-06-19 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/18230
  
@vanzin [Xing Shi 
(saturday_s)](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=saturday_s),
 thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...

2017-06-18 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/18230
  
@jerryshao 
> "Reload" here means retrieving the SparkConf back from the checkpoint file 
and using this retrieved SparkConf to create the SparkContext when restarting 
the streaming application.

That explanation is wrong, but your understanding of what the 
`propertiesToReload` list does is right.

After restarting from a checkpoint, the properties in `SparkConf` will be the 
same as in the previous application. But properties like `spark.yarn.app.id` 
will be stale and useless in the restarted app. So after retrieving the 
`SparkConf` back from the checkpoint, we want to "reload" the fresh values 
from the system properties instead of using the old ones from the checkpoint.
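
To make the reload step concrete, here is a self-contained sketch of the 
mechanism, using a plain `Map` in place of `SparkConf` and an illustrative 
property list - this is not Spark's actual `Checkpoint` code:

```scala
// Sketch: restore conf pairs from a checkpoint, then overwrite the
// per-application properties with fresh values from the current JVM's
// system properties, so a restarted app doesn't see stale values.
object CheckpointReloadSketch {
  // Properties whose checkpointed values are stale after a restart
  // (illustrative subset).
  val propertiesToReload: List[String] =
    List("spark.yarn.app.id", "spark.yarn.credentials.file")

  def createConf(checkpointed: Map[String, String]): Map[String, String] =
    propertiesToReload.foldLeft(checkpointed) { (conf, prop) =>
      // Only overwrite when a fresh value actually exists.
      sys.props.get(prop).fold(conf)(fresh => conf.updated(prop, fresh))
    }
}
```

Here the checkpointed value of a listed property is shadowed by whatever the 
new submission placed in the system properties, while properties not in the 
list keep their checkpointed values.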

@vanzin 
> So if you start the second streaming application without providing 
principal / keytab, Client.scala will not overwrite the credential file path, 
but still the AM will start the credential updater, because the file location 
is in the configuration read from the checkpoint.

That's probably right, but it's not the case here. I did submit the principal 
& keytab when restarting, and the AM did renew the token using the principal 
successfully.

I noticed that the `SparkConf` used by `AMCredentialRenewer` and the one used 
by `CredentialUpdater` seem to be NOT THE SAME. The credential renewer thread 
launched by the AM works correctly, but the credential updater in the executor 
backend - which uses the configs provided by the driver - gets confused and 
fails at its job. So fixing only the AM code doesn't make much sense.

FYI, the log of `AMCredentialRenewer` looks like this:
```
17/06/07 15:11:14 INFO security.AMCredentialRenewer: Scheduling login from 
keytab in 96952 millis.
...
17/06/07 15:12:51 INFO security.AMCredentialRenewer: Attempting to login to 
KDC using principal: xxx@XXX.LOCAL
17/06/07 15:12:51 INFO security.AMCredentialRenewer: Successfully logged 
into KDC.
...
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Writing out delegation 
tokens to 
hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea-b46b-4bd4-8e98-0e38928fd58c-1496816091985-1.tmp
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Delegation Tokens 
written out successfully. Renaming file to 
hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea-b46b-4bd4-8e98-0e38928fd58c-1496816091985-1
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Delegation token file 
rename complete.
17/06/07 15:12:53 INFO security.AMCredentialRenewer: Scheduling login from 
keytab in 110925 millis.
...
```
It renews the token successfully and saves it to 
application_1496384469444_0036's dir.
But the `CredentialUpdater` (started by `YarnSparkHadoopUtil`) complains 
about this:
```
17/06/07 15:11:24 INFO executor.CoarseGrainedExecutorBackend: Will 
periodically update credentials from: 
hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035/credentials-19a7c11e-8c93-478c-ab0a-cdbfae5b2ae5
...
17/06/07 15:12:24 WARN yarn.YarnSparkHadoopUtil: Error while attempting to 
list files from application staging dir
java.io.FileNotFoundException: File 
hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035 does 
not exist.
...
```
... which says that the credentials file doesn't exist in 
application_1496384469444_0035's dir.
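
The two log excerpts reduce to a path mismatch: the updater polls the staging 
directory taken from the checkpointed conf, while the new AM writes under the 
new application id. A minimal sketch, with paths taken from the logs above 
(the `stagingDir` helper is illustrative, not Spark code):

```scala
import java.net.URI
import java.nio.file.Paths

// Sketch of the mismatch shown in the logs above. The updater polls the
// staging dir from the checkpointed conf (old application id), while the
// restarted AM writes its renewed tokens under the new application id.
object CredentialPathSketch {
  // Parent directory of a credentials file, i.e. the app's staging dir.
  def stagingDir(credentialsFile: String): String =
    Paths.get(new URI(credentialsFile).getPath).getParent.toString

  // Stale path restored from the checkpoint (previous application).
  val polledByUpdater = stagingDir(
    "hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0035/credentials-19a7c11e")
  // Where the restarted AM actually writes (new application).
  val writtenByAM = stagingDir(
    "hdfs://nameservice1/user/xxx/.sparkStaging/application_1496384469444_0036/credentials-044b83ea")
}
```

`polledByUpdater` and `writtenByAM` end up as two different `.sparkStaging` 
directories, which is why the updater's listing fails with 
`FileNotFoundException` while the renewer reports success.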





[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...

2017-06-11 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/18230
  
No, I don't mean to insist on my opinion. I'm just curious about the 
reason for the change (as it looks like another point fix).





[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...

2017-06-11 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/18230
  
@jerryshao Sorry for the delay.
Currently, Spark checkpoints ALL the configurations, no matter whether they 
are "internal" or not. So we have to reload the ones that should be updated 
at launch. Not checkpointing the "internal" ones might be a good idea, but 
that is more of a new feature than a bug fix.
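
The alternative mentioned above - not writing "internal" properties into the 
checkpoint at all - would be a filter at checkpoint-write time rather than a 
reload at restore time. A hypothetical sketch, assuming an `isInternal` 
predicate that Spark's checkpoint code does not actually have:

```scala
// Hypothetical sketch: drop "internal", per-application properties when
// writing the checkpoint, instead of reloading them at restore time.
// The isInternal predicate is invented here for illustration only.
object CheckpointWriteSketch {
  def isInternal(key: String): Boolean =
    key == "spark.yarn.app.id" || key.startsWith("spark.yarn.credentials.")

  // Keep only the user-facing properties when writing the checkpoint.
  def pairsToCheckpoint(conf: Map[String, String]): Map[String, String] =
    conf.filterNot { case (key, _) => isInternal(key) }
}
```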

This PR attempts to fix the bug with the least change of code, so it can be 
easily merged into any maintenance branch. I don't mind adding more options 
to the exclude list, but you'll have to do extra work to cherry-pick a 
subset for branches before 2.1, as `spark.yarn.credentials.renewalTime` and 
`spark.yarn.credentials.updateTime` don't exist in the old branches.





[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...

2017-06-08 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/18230
  
@jerryshao I've taken a look at `spark.yarn.credentials.renewalTime` and 
`spark.yarn.credentials.updateTime`, but I don't think it is necessary to 
exclude them. A change in these properties means that either the delegation 
token's configuration or Spark itself has changed. In both cases, restarting 
from a checkpoint will not work anyway.

Could you describe some concrete scenarios in which 
`spark.yarn.credentials.renewalTime` and `spark.yarn.credentials.updateTime` 
would change? Or do you just mean to make the code more robust?





[GitHub] spark issue #18230: [SPARK-19688] [STREAMING] Not to read `spark.yarn.creden...

2017-06-08 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/18230
  
> I guess "spark.yarn.credentials.renewalTime" and 
"spark.yarn.credentials.updateTime" should also be excluded.

Thank you for pointing that out. I'll check & fix them.





[GitHub] spark pull request #18230: [SPARK-21008] [STREAMING] Not to read `spark.yarn...

2017-06-07 Thread saturday-shi
GitHub user saturday-shi opened a pull request:

https://github.com/apache/spark/pull/18230

[SPARK-21008] [STREAMING] Not to read `spark.yarn.credentials.file` from 
checkpoint.

## What changes were proposed in this pull request?

Reload the `spark.yarn.credentials.file` property when restarting a 
streaming application from checkpoint.

## How was this patch tested?

Manually tested with 1.6.3 and 2.1.1.
I didn't test this against master because of some compile problems, but I 
expect the same result.

## Notice

This should be merged into maintenance branches too.

jira: [SPARK-21008](https://issues.apache.org/jira/browse/SPARK-21008)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/saturday-shi/spark SPARK-21008

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18230.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18230


commit 13d40d7533ac73d6fb82706a5ef1d19c9272c0e4
Author: saturday_s 
Date:   2017-06-07T09:09:05Z

Not to read `spark.yarn.credentials.file` from checkpoint.







[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-22 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16253
  
@vanzin My JIRA account is [Xing Shi 
(saturday_s)](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=saturday_s).





[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-21 Thread saturday-shi
Github user saturday-shi commented on a diff in the pull request:

https://github.com/apache/spark/pull/16253#discussion_r93560749
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/api.scala ---
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.status.api.v1.streaming
+
+import java.util.Date
+
+import org.apache.spark.streaming.ui.StreamingJobProgressListener._
+
+class StreamingStatistics private[spark](
+  val startTime: Date,
--- End diff --

Done.





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-20 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16253
  
Many thanks!





[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-15 Thread saturday-shi
Github user saturday-shi commented on a diff in the pull request:

https://github.com/apache/spark/pull/16253#discussion_r92730696
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllReceiversResource.scala
 ---
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.status.api.v1.streaming
+
+import java.util.Date
+import javax.ws.rs.{GET, Produces}
+import javax.ws.rs.core.MediaType
+
+import org.apache.spark.status.api.v1.streaming.AllReceiversResource._
+import org.apache.spark.streaming.ui.StreamingJobProgressListener
+
+@Produces(Array(MediaType.APPLICATION_JSON))
+private[v1] class AllReceiversResource(listener: 
StreamingJobProgressListener) {
+
+  @GET
+  def receiversList(): Seq[ReceiverInfo] = {
+receiverInfoList(listener).sortBy(_.streamId)
+  }
+}
+
+private[v1] object AllReceiversResource {
+
+  def receiverInfoList(listener: StreamingJobProgressListener): 
Seq[ReceiverInfo] = {
+listener.synchronized {
+  listener.receivedRecordRateWithBatchTime.map { case (streamId, 
eventRates) =>
+
+val receiverInfo = listener.receiverInfo(streamId)
+val streamName = receiverInfo.map(_.name).
--- End diff --

I used the same style as the similar code in 
[StreamingPage.scala](https://github.com/apache/spark/blob/e115cdad29ae90c7d0b7da6d2a2e90047dc87985/streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala#L435-L436).
 Should I fix StreamingPage too, if this style doesn't look good?





[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-15 Thread saturday-shi
Github user saturday-shi commented on a diff in the pull request:

https://github.com/apache/spark/pull/16253#discussion_r92727999
  
--- Diff: project/MimaExcludes.scala ---
@@ -116,7 +116,10 @@ object MimaExcludes {
   
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.startOffset"),
   
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.endOffset"),
   
ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.sql.streaming.StreamingQueryException.this"),
-  
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.query")
+  
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.query"),
+
+  // [SPARK-18537] Add a REST api to spark streaming
+  
ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.streaming.scheduler.StreamingListener.onStreamingStarted")
--- End diff --

Thank you for the notice. Actually, this PR will hardly have any chance to 
be merged into 2.1. I will fix it along with the other style issues you've 
pointed out.





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16253
  
@ajbozarth @vanzin 
Can either of you retest this, please?





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16253
  
> Looks like a compile-time check for the listener API.

I think that's right. I had confused it with the Scala test suites, but 
this one is for Java.





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16253
  
@vanzin 
Thank you for your suggestions. Indeed, `ApiStreamingRootResource` would 
be a better name.

BTW, do you (or anybody else) know the purpose of 
[JavaStreamingListenerAPISuite.java](https://github.com/apache/spark/blob/master/streaming/src/test/java/org/apache/spark/streaming/JavaStreamingListenerAPISuite.java)?
 While adding some tests, I found that it never seems to be used by any code, 
so I just left it alone. Should we delete it, or change it anyway?





[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-12 Thread saturday-shi
Github user saturday-shi commented on a diff in the pull request:

https://github.com/apache/spark/pull/16253#discussion_r92073600
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala ---
@@ -66,6 +67,8 @@ class StreamingContext private[streaming] (
 _batchDur: Duration
   ) extends Logging {
 
+  private var startTime = -1L
--- End diff --

You're right. I left it at the class level just in case of further use.





[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-12 Thread saturday-shi
Github user saturday-shi commented on a diff in the pull request:

https://github.com/apache/spark/pull/16253#discussion_r92073117
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllReceiversResource.scala
 ---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.status.api.v1.streaming
+
+import java.util.Date
+import javax.ws.rs.{GET, Produces}
+import javax.ws.rs.core.MediaType
+
+import org.apache.spark.status.api.v1.streaming.AllReceiversResource._
+import org.apache.spark.streaming.ui.StreamingJobProgressListener
+
+@Produces(Array(MediaType.APPLICATION_JSON))
+private[v1] class AllReceiversResource(listener: 
StreamingJobProgressListener) {
+
+  @GET
+  def receiversList(): Seq[ReceiverInfo] = {
+receiverInfoList(listener).sortBy(_.streamId)
+  }
+}
+
+private[v1] object AllReceiversResource {
+
+  def receiverInfoList(listener: StreamingJobProgressListener): 
Seq[ReceiverInfo] = {
+listener.synchronized {
+  listener.receivedRecordRateWithBatchTime.map { case (streamId, 
eventRates) =>
+
+val receiverInfo = listener.receiverInfo(streamId)
+val streamName = receiverInfo.map(_.name).
+  
orElse(listener.streamName(streamId)).getOrElse(s"Stream-$streamId")
+val avgEventRate =
+  if (eventRates.isEmpty) None
+  else Some(eventRates.map(_._2).sum / eventRates.size)
+
+val (errorTime, errorMessage, error) = receiverInfo match {
+  case None => (None, None, None)
+  case Some(info) =>
+val someTime = {
+  if (info.lastErrorTime >= 0) Some(new 
Date(info.lastErrorTime))
+  else None
--- End diff --

Uh... I think I misunderstood the comment. You mean that something like 
`if (...) Some(...) else None` should be on a single line?





[GitHub] spark issue #15904: [SPARK-18470][STREAMING][WIP] Provide Spark Streaming Mo...

2016-12-12 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/15904
  
@uncleGen 
Exactly. I will try my best to complete this.





[GitHub] spark issue #15904: [SPARK-18470][STREAMING][WIP] Provide Spark Streaming Mo...

2016-12-12 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/15904
  
@uncleGen 
It seems you don't have much time to continue with this, so I opened a new PR 
(#16253) that inherits all the functionality from the old one; the only 
change is that it is merged into the current api v1.





[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-12 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16253
  
@vanzin 
Could you take a look at this please?





[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-12 Thread saturday-shi
GitHub user saturday-shi opened a pull request:

https://github.com/apache/spark/pull/16253

[SPARK-18537][Web UI] Add a REST api to serve spark streaming information

## What changes were proposed in this pull request?

This PR is an inheritance from #16000, and is a completion of #15904.

**Description**

> 1. implement a package (org.apache.spark.streaming.status.api.v1) that 
serves the same purpose as org.apache.spark.status.api.v1
> 1. register the api path through StreamingPage
> 1. retrieve the streaming information through StreamingJobProgressListener
> 
> this api should cover exactly the same amount of information as you can 
get from the web interface
> the implementation is based on the current REST implementation of 
spark-core
> and will be available for running applications only
> 
> https://issues.apache.org/jira/browse/SPARK-18537

## How was this patch tested?

Local test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/saturday-shi/spark SPARK-18537

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16253.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16253


commit 680a59aab2f8b3624f9f39fbc78bb5cbd7ec3bac
Author: Chan Chor Pang 
Date:   2016-10-26T05:36:44Z

compile ok, try to test

commit 04f9e9c914b58c14c98845b5529472333b348590
Author: Chan Chor Pang 
Date:   2016-10-26T07:39:40Z

add path /streamingapi

commit da20ce37b88770e2cb997ed48976ba2542305e6b
Author: Chan Chor Pang 
Date:   2016-10-28T05:42:40Z

need attach to some where

commit 3468d40336091201108169b168bd612c0f5fcf77
Author: Chan Chor Pang 
Date:   2016-11-02T02:10:03Z

no writer yet

commit 525ae5fe25a47ed3884c2bf15143dcb932581f48
Author: Chan Chor Pang 
Date:   2016-11-02T05:13:48Z

not work, may be the data need to be in Iterator form

commit f854767cb5b3bb4a303d3418b426b73d02599c25
Author: Chan Chor Pang 
Date:   2016-11-08T06:05:24Z

remove unuse file

commit 34c8b3b90a9b045e3d6b6ac86d270edff9ed24a3
Author: Chan Chor Pang 
Date:   2016-11-02T06:18:17Z

package name didnt change in the copy process

commit 170b18904f041dfeb271f54fdb408ad2f575a2ee
Author: Chan Chor Pang 
Date:   2016-11-07T04:35:43Z

try to get the real info

commit 2f51c59a37f994c6bee2dd65d1517b32e7d9776d
Author: saturday_s 
Date:   2016-11-14T09:51:02Z

Refactor to fit scalastyle.

commit 76324b7c6f8849bef7d45363d925fd95efbbedcf
Author: saturday_s 
Date:   2016-11-16T04:40:24Z

Try to get startTime.

commit 68d734f07b43b44127ae5f698db39d671aaa59c1
Author: saturday_s 
Date:   2016-11-16T04:53:41Z

Change api path prefix.

commit ccfe0f5f28db73bb300d43c32d40d6e0e596c77c
Author: saturday_s 
Date:   2016-11-16T09:13:59Z

Implement statistics api.

commit 2d1e88440902c5212f43746f5c0b7f282b7a6243
Author: saturday_s 
Date:   2016-11-17T02:59:58Z

Implement receivers api.

commit 0d9f6b9667ef774f5a8c868a453e3d68b66a6702
Author: saturday_s 
Date:   2016-11-17T04:46:13Z

Fix last-error-info format.

commit 8088fa5bad4c0e15bb14abfc0ee7475ba4ad138b
Author: saturday_s 
Date:   2016-11-17T05:08:30Z

Implement one-receiver api.

commit f1da6b1f2856b761696ae9d767836af6417e4f43
Author: saturday_s 
Date:   2016-11-17T05:21:39Z

Fix access level issue of `ErrorWrapper`.

commit 4d8138191f1529137e4c1e858998bd78477ca739
Author: saturday_s 
Date:   2016-11-18T01:30:30Z

Synchronize to listener when getting info from it.

commit 17cb832cedb4b2cfeff5e501a9f71378b3402cee
Author: saturday_s 
Date:   2016-11-18T05:30:15Z

Implement batch(es) api.

commit 137e8fb7de34b39b218939b371062e225adc958e
Author: saturday_s 
Date:   2016-11-18T06:55:42Z

Remove details of outputOpsInfo from batchInfo.

commit 08f33522251ff20b14af15952ac918cbcfada551
Author: saturday_s 
Date:   2016-11-18T08:35:55Z

Implement outputOpsInfo api.

commit 477e71de47bbde642a9222729c73b7dd52318529
Author: saturday_s 
Date:   2016-11-18T09:37:04Z

Try another approach to get outputOpsInfo.

commit 7ddac2929343ad60f733166d25ed485fa3976cc0
Author: saturday_s 
Date:   2016-11-21T02:03:55Z

Try another more approach to get outputOpsInfo.

commit e0fe970fa64fc87de277a7f63f39423608cfef52
Author: saturday_s 
Date:   2016-11-21T02:41:25Z

Continue trying to get outputOpsInfo(jobIds).

commit 35963312dcf722b98cd3b0dabff97d398ccd020c
Author: saturday_s 
Date:   2016-11-21T04:14:38Z

Fix outputOpsInfo and jobIds issue.

commit 9760492cb826c7552c453e3c55a1098455eaa0bc
Author: saturday_s 
Date:   2016-11-21T04:35:55Z

Fix syntax error.

commit 65b39078d54408d8ac1ee608a21e49a978e7415d
Author: saturday_s 
Date:   2016-11-21T05:23:31Z

Consolidate the param check lo

[GitHub] spark issue #16000: [SPARK-18537][Web UI]Add a REST api to spark streaming

2016-12-08 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16000
  
@vanzin 
Hello, I'm a collaborator on this PR. I am actually interested in your 
plan, but we don't want to make those changes here because that is not the 
purpose of this PR. I think I can open a new PR and implement the changes there.

@uncleGen 
I reviewed your code and found that there are a lot of things to improve. I 
prefer to reuse the existing work in this PR to avoid duplication. I will 
open a new PR later, but if you already have a plan, please let me know. Maybe I 
can work on it with you.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15022: [SPARK-17465] [Spark Core] Inappropriate memory manageme...

2016-09-10 Thread saturday-shi
Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/15022
  
Thanks @JoshRosen for your reply!

> Actually, I spot one more step to make this really robust: I think we 
also need to call `releasePendingUnrollMemoryForThisTask` at the end of Task in 
order to be absolutely sure this memory will be released during error cases.

That's right. I will make the change later.
It would be really helpful if you have a good idea for adding a test to avoid 
regressions. I'm just worried that I don't have an easy way to check whether the 
problem has relapsed or not.





[GitHub] spark pull request #15022: [SPARK-17465] [Spark Core] Inappropriate memory m...

2016-09-09 Thread saturday-shi
GitHub user saturday-shi opened a pull request:

https://github.com/apache/spark/pull/15022

[SPARK-17465] [Spark Core] Inappropriate memory management in 
`org.apache.spark.storage.MemoryStore` may lead to memory leak

## What changes were proposed in this pull request?

Expressions like `if (memoryMap(taskAttemptId) == 0) 
memoryMap.remove(taskAttemptId)` in the methods `releaseUnrollMemoryForThisTask` 
and `releasePendingUnrollMemoryForThisTask` should be evaluated after the 
memory-release operation, whether `memoryToRelease` is > 0 or not.

If a task's memory has already been set to 0 when 
`releaseUnrollMemoryForThisTask` or `releasePendingUnrollMemoryForThisTask` 
is called, the key in the memory map corresponding to that task will never be 
removed from the hash map.

See the details in 
[SPARK-17465](https://issues.apache.org/jira/browse/SPARK-17465).
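To make the leak pattern concrete, here is a hedged Python sketch of the logic described above — it is not Spark's actual Scala code, and the function and variable names (`release_unroll_memory`, `memory_map`, `task_attempt_id`) are illustrative only. It shows why the cleanup check must run even when the amount released is 0: otherwise an entry that is already at 0 stays in the map forever.

```python
def release_unroll_memory(memory_map, task_attempt_id, amount=None):
    """Release `amount` bytes (or all remaining, if None) of unroll
    memory tracked for a task.

    Sketch of the SPARK-17465 fix: the zero-entry cleanup below runs
    unconditionally, not only when some memory was actually released.
    """
    current = memory_map.get(task_attempt_id, 0)
    to_release = current if amount is None else min(amount, current)
    if to_release > 0:
        memory_map[task_attempt_id] = current - to_release
    # The buggy variant guarded this cleanup behind `to_release > 0`,
    # so a task whose entry was already 0 leaked its key forever.
    if memory_map.get(task_attempt_id) == 0:
        del memory_map[task_attempt_id]


# A task whose tracked memory is already 0 still gets its entry removed.
tracked = {1: 0, 2: 100}
release_unroll_memory(tracked, 1)
print(1 in tracked)   # prints False: the stale key is cleaned up
release_unroll_memory(tracked, 2, 100)
print(tracked)        # prints {}: fully released entries are removed
```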

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/saturday-shi/spark SPARK-17465

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15022.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15022


commit 720f2ceb35361f387fc55301fb44b560a289d8ca
Author: Xing SHI 
Date:   2016-09-09T07:58:24Z

Correct the inappropriate memory management operation in 
releaseUnrollMemoryForThisTask and releasePendingUnrollMemoryForThisTask method.



