[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...

2017-02-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/storm/pull/1605


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...

2016-09-22 Thread srdo
Github user srdo commented on a diff in the pull request:

https://github.com/apache/storm/pull/1605#discussion_r80023848
  
--- Diff: 
external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java
 ---
@@ -330,11 +328,9 @@ public void ack(Object messageId) {
 public void fail(Object messageId) {
 final KafkaSpoutMessageId msgId = (KafkaSpoutMessageId) messageId;
 emitted.remove(msgId);
-if (msgId.numFails() < maxRetries) {
-msgId.incrementNumFails();
-retryService.schedule(msgId);
-} else { // limit to max number of retries
-LOG.debug("Reached maximum number of retries. Message [{}] 
being marked as acked.", msgId);
+msgId.incrementNumFails();
+if (!retryService.schedule(msgId)) {
+LOG.debug("Retry service indicated message should not be 
retried. Message [{}] being marked as acked.", msgId);
--- End diff --

@hmcl I was referring to RetryService being an interface users can 
override, which doesn't mention max retries in the schedule javadoc, and this 
part of the code technically not "knowing about" max retries, because users may 
supply a retry service that doesn't schedule tuples for some other reason. I 
think you're right though, it was probably clearer before.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...

2016-09-22 Thread hmcl
Github user hmcl commented on a diff in the pull request:

https://github.com/apache/storm/pull/1605#discussion_r80004187
  
--- Diff: 
external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java
 ---
@@ -330,11 +328,9 @@ public void ack(Object messageId) {
 public void fail(Object messageId) {
 final KafkaSpoutMessageId msgId = (KafkaSpoutMessageId) messageId;
 emitted.remove(msgId);
-if (msgId.numFails() < maxRetries) {
-msgId.incrementNumFails();
-retryService.schedule(msgId);
-} else { // limit to max number of retries
-LOG.debug("Reached maximum number of retries. Message [{}] 
being marked as acked.", msgId);
+msgId.incrementNumFails();
+if (!retryService.schedule(msgId)) {
+LOG.debug("Retry service indicated message should not be 
retried. Message [{}] being marked as acked.", msgId);
--- End diff --

@srdo I am not sure I completely follow your reasoning. The most meaningful 
piece of information that we want to show to the user here is that if the max 
number of retries has been reached, no more retries happen. If the user reads a 
message saying that the retry service marked it not to be retried, he still 
does not know the cause.

Looking at the code, the `boolean schedule(msgId)` only returns false only 
if the max number of retries has been reached.

Although the retry service log messages hints ant the max retries cap being 
reached, the user will have to have both logs enabled to have both spouts. I 
favor the message as it was before, as it is straight to the point and provides 
good insight on what is happening.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...

2016-09-20 Thread hmcl
Github user hmcl commented on a diff in the pull request:

https://github.com/apache/storm/pull/1605#discussion_r79571602
  
--- Diff: 
external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java
 ---
@@ -330,11 +328,9 @@ public void ack(Object messageId) {
 public void fail(Object messageId) {
 final KafkaSpoutMessageId msgId = (KafkaSpoutMessageId) messageId;
 emitted.remove(msgId);
-if (msgId.numFails() < maxRetries) {
-msgId.incrementNumFails();
-retryService.schedule(msgId);
-} else { // limit to max number of retries
-LOG.debug("Reached maximum number of retries. Message [{}] 
being marked as acked.", msgId);
+msgId.incrementNumFails();
+if (!retryService.schedule(msgId)) {
+LOG.debug("Retry service indicated message should not be 
retried. Message [{}] being marked as acked.", msgId);
--- End diff --

I would leave the message as it was before. "Reached maximum number of 
retries. Message [{}] being marked as acked."

It adds overhead and it is irrelevant to be mentioning the retry service 
here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...

2016-09-20 Thread hmcl
Github user hmcl commented on a diff in the pull request:

https://github.com/apache/storm/pull/1605#discussion_r79572783
  
--- Diff: 
external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutRetryService.java
 ---
@@ -29,14 +29,18 @@
  */
 public interface KafkaSpoutRetryService extends Serializable {
 /**
- * Schedules this {@link KafkaSpoutMessageId} if not yet scheduled, or 
updates retry time if it has already been scheduled.
+ * Schedules this {@link KafkaSpoutMessageId} if not yet scheduled, or
+ * updates retry time if it has already been scheduled. May also 
indicate
--- End diff --

It may also indicate...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...

2016-09-20 Thread hmcl
Github user hmcl commented on a diff in the pull request:

https://github.com/apache/storm/pull/1605#discussion_r79570505
  
--- Diff: 
external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutRetryExponentialBackoff.java
 ---
@@ -144,7 +145,12 @@ public String toString() {
 /**
  * The time stamp of the next retry is scheduled according to the 
exponential backoff formula ( geometric progression):
  * nextRetry = failCount == 1 ? currentTime + initialDelay : 
currentTime + delayPeriod^(failCount-1) where failCount = 1, 2, 3, ...
- * nextRetry = Min(nextRetry, currentTime + maxDelay)
+ * nextRetry = Min(nextRetry, currentTime + maxDelay).
+ * 
+ * While retrying a record, no new records are committed until the 
previous polled records have been acked. This guarantees at once delivery of
--- End diff --

This comment should probably not go here, but rather somewhere in the Spout 
code. (retry service knows nothing about exactly once or at most once semantics)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...

2016-08-02 Thread srdo
GitHub user srdo opened a pull request:

https://github.com/apache/storm/pull/1605

STORM-2014: Put logic around dropping messages into RetryService, rem…

…ove maxRetry setting from new KafkaSpout

https://issues.apache.org/jira/browse/STORM-2014

This PR removes maxRetry from the KafkaSpout and changes the RetryService 
interface slightly so the schedule method can communicate back to the spout 
that the message should be dropped. Retry logic belongs to the RetryService 
interface, and it's nice for users if they can easily plug in their own 
handling of messages that will be dropped (custom logging for example).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srdo/storm STORM-2014

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/1605.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1605


commit 432f932a70b0bd21f06cd6f36386404eab194e0a
Author: Stig Rohde Døssing 
Date:   2016-08-02T17:17:02Z

STORM-2014: Put logic around dropping messages into RetryService, remove 
maxRetry setting from new KafkaSpout




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---