[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...
Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/1605 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...
Github user srdo commented on a diff in the pull request: https://github.com/apache/storm/pull/1605#discussion_r80023848 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java --- @@ -330,11 +328,9 @@ public void ack(Object messageId) { public void fail(Object messageId) { final KafkaSpoutMessageId msgId = (KafkaSpoutMessageId) messageId; emitted.remove(msgId); -if (msgId.numFails() < maxRetries) { -msgId.incrementNumFails(); -retryService.schedule(msgId); -} else { // limit to max number of retries -LOG.debug("Reached maximum number of retries. Message [{}] being marked as acked.", msgId); +msgId.incrementNumFails(); +if (!retryService.schedule(msgId)) { +LOG.debug("Retry service indicated message should not be retried. Message [{}] being marked as acked.", msgId); --- End diff -- @hmcl I was referring to RetryService being an interface users can override, which doesn't mention max retries in the schedule javadoc, and this part of the code technically not "knowing about" max retries, because users may supply a retry service that doesn't schedule tuples for some other reason. I think you're right though, it was probably clearer before. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1605#discussion_r80004187 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java --- @@ -330,11 +328,9 @@ public void ack(Object messageId) { public void fail(Object messageId) { final KafkaSpoutMessageId msgId = (KafkaSpoutMessageId) messageId; emitted.remove(msgId); -if (msgId.numFails() < maxRetries) { -msgId.incrementNumFails(); -retryService.schedule(msgId); -} else { // limit to max number of retries -LOG.debug("Reached maximum number of retries. Message [{}] being marked as acked.", msgId); +msgId.incrementNumFails(); +if (!retryService.schedule(msgId)) { +LOG.debug("Retry service indicated message should not be retried. Message [{}] being marked as acked.", msgId); --- End diff -- @srdo I am not sure I completely follow your reasoning. The most meaningful piece of information that we want to show to the user here is that if the max number of retries has been reached, no more retries happen. If the user reads a message saying that the retry service marked it not to be retried, he still does not know the cause. Looking at the code, the `boolean schedule(msgId)` only returns false only if the max number of retries has been reached. Although the retry service log messages hints ant the max retries cap being reached, the user will have to have both logs enabled to have both spouts. I favor the message as it was before, as it is straight to the point and provides good insight on what is happening. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1605#discussion_r79571602 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java --- @@ -330,11 +328,9 @@ public void ack(Object messageId) { public void fail(Object messageId) { final KafkaSpoutMessageId msgId = (KafkaSpoutMessageId) messageId; emitted.remove(msgId); -if (msgId.numFails() < maxRetries) { -msgId.incrementNumFails(); -retryService.schedule(msgId); -} else { // limit to max number of retries -LOG.debug("Reached maximum number of retries. Message [{}] being marked as acked.", msgId); +msgId.incrementNumFails(); +if (!retryService.schedule(msgId)) { +LOG.debug("Retry service indicated message should not be retried. Message [{}] being marked as acked.", msgId); --- End diff -- I would leave the message as it was before. "Reached maximum number of retries. Message [{}] being marked as acked." It adds overhead and it is irrelevant to be mentioning the retry service here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1605#discussion_r79572783 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutRetryService.java --- @@ -29,14 +29,18 @@ */ public interface KafkaSpoutRetryService extends Serializable { /** - * Schedules this {@link KafkaSpoutMessageId} if not yet scheduled, or updates retry time if it has already been scheduled. + * Schedules this {@link KafkaSpoutMessageId} if not yet scheduled, or + * updates retry time if it has already been scheduled. May also indicate --- End diff -- It may also indicate... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...
Github user hmcl commented on a diff in the pull request: https://github.com/apache/storm/pull/1605#discussion_r79570505 --- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutRetryExponentialBackoff.java --- @@ -144,7 +145,12 @@ public String toString() { /** * The time stamp of the next retry is scheduled according to the exponential backoff formula ( geometric progression): * nextRetry = failCount == 1 ? currentTime + initialDelay : currentTime + delayPeriod^(failCount-1) where failCount = 1, 2, 3, ... - * nextRetry = Min(nextRetry, currentTime + maxDelay) + * nextRetry = Min(nextRetry, currentTime + maxDelay). + * + * While retrying a record, no new records are committed until the previous polled records have been acked. This guarantees at once delivery of --- End diff -- This comment should probably not go here, but rather somewhere in the Spout code. (retry service knows nothing about exactly once or at most once semantics) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request #1605: STORM-2014: Put logic around dropping messages int...
GitHub user srdo opened a pull request: https://github.com/apache/storm/pull/1605 STORM-2014: Put logic around dropping messages into RetryService, rem⦠â¦ove maxRetry setting from new KafkaSpout https://issues.apache.org/jira/browse/STORM-2014 This PR removes maxRetry from the KafkaSpout and changes the RetryService interface slightly so the schedule method can communicate back to the spout that the message should be dropped. Retry logic belongs to the RetryService interface, and it's nice for users if they can easily plug in their own handling of messages that will be dropped (custom logging for example). You can merge this pull request into a Git repository by running: $ git pull https://github.com/srdo/storm STORM-2014 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/1605.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1605 commit 432f932a70b0bd21f06cd6f36386404eab194e0a Author: Stig Rohde Døssing Date: 2016-08-02T17:17:02Z STORM-2014: Put logic around dropping messages into RetryService, remove maxRetry setting from new KafkaSpout --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---