[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625781#comment-14625781
 ] 

Jiangjie Qin edited comment on KAFKA-1835 at 7/14/15 4:06 AM:
--------------------------------------------------------------

[~ewencp] I agree that handling exceptions is something users have to do. But 
telling users they are guaranteed to receive an exception for a valid 
configuration sounds a bit awkward to me. I think it would be better to raise 
an exception only when something has really gone wrong, instead of asking 
users to handle false alarms.

WRT the stale metadata, I agree with you that we should let the user know 
immediately if a metadata refresh fails (actually, from this point of view we 
should try to fetch metadata from the bootstrap servers upon client 
instantiation instead of deferring it, because the bootstrap servers might not 
even be connectable). But we might want to be very careful about failing a 
send if we can still deliver the messages. This looks more like a design 
decision than a bug to me. One argument is that we should let the user know 
immediately if something goes wrong. On the other hand, we want to deliver the 
messages if possible instead of simply dropping them on the floor. So maybe we 
can append the messages but throw an exception saying that the metadata is 
outdated.
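
For example, a caller can already force the eager fetch with partitionsFor(), 
which blocks until metadata is available. A minimal sketch (the broker address 
and topic name are placeholders):
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class EagerMetadataInit {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
        try {
            // Blocks until metadata for the topic is fetched, so unreachable
            // bootstrap servers fail here, at startup, not in the first send().
            producer.partitionsFor("my-topic");
        } catch (RuntimeException e) {
            producer.close();
            throw e;
        }
        // ... later sends already have metadata and will not block on it ...
        producer.close();
    }
}
{code}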

Also, I think it might be worth thinking about what kinds of exceptions we 
want to expose to the user. For instance, if a partition of a topic is 
offline, should we throw an exception in send(), or should we just send 
messages to the other available partitions? If the user were sending keyed 
messages the answer would be obvious, but what if they are sending non-keyed 
messages?
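
To make the non-keyed case concrete, a custom partitioner could route 
non-keyed messages to available partitions only while keeping a stable mapping 
for keyed ones. A rough sketch, assuming the pluggable Partitioner interface 
(the class name and the policy itself are just illustrations):
{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

public class AvailableOnlyPartitioner implements Partitioner {
    private final AtomicInteger counter = new AtomicInteger(0);

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> all = cluster.partitionsForTopic(topic);
        if (keyBytes != null) {
            // Keyed: stable hash over all partitions (even offline ones) so
            // the key-to-partition mapping never changes behind the user's back.
            return (Arrays.hashCode(keyBytes) & 0x7fffffff) % all.size();
        }
        List<PartitionInfo> available = cluster.availablePartitionsForTopic(topic);
        if (available.isEmpty()) {
            // No partition has a leader; fall back and let the send fail later.
            return (counter.getAndIncrement() & 0x7fffffff) % all.size();
        }
        // Non-keyed: round-robin over partitions that currently have a leader.
        int i = (counter.getAndIncrement() & 0x7fffffff) % available.size();
        return available.get(i).partition();
    }

    @Override
    public void configure(Map<String, ?> configs) {}

    @Override
    public void close() {}
}
{code}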

Thanks for the feedback [~stevenz3wu]. I guess in your case you are producing 
messages to a changing topic set. In that case, it is necessary to deal with 
exceptions during produce if the metadata timeout is set to 0. But for people 
who are producing to a single fixed topic, metadata should not be lost after 
the first successful metadata fetch; if it is lost, that indicates a bigger 
problem, such as a partition going offline.
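
In that setup the handling would look roughly like this (a sketch; it assumes 
the current producer's metadata fetch timeout config and uses placeholder 
names):
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.TimeoutException;

public class NonBlockingSend {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Do not block in send() waiting for metadata of a new topic.
        props.put("metadata.fetch.timeout.ms", "0");

        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
        try {
            producer.send(new ProducerRecord<String, String>("new-topic", "hello"));
        } catch (TimeoutException e) {
            // Metadata was not yet available: the caller decides whether to
            // drop the message, buffer it, or retry once metadata arrives.
        } finally {
            producer.close();
        }
    }
}
{code}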


> Kafka new producer needs options to make blocking behavior explicit
> -------------------------------------------------------------------
>
>                 Key: KAFKA-1835
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1835
>             Project: Kafka
>          Issue Type: Improvement
>          Components: clients
>    Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
>            Reporter: Paul Pearcy
>             Fix For: 0.8.3
>
>         Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
> KAFKA-1835.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> The new (0.8.2 standalone) producer will block the first time it attempts to 
> retrieve metadata for a topic. This is not the desired behavior in some use 
> cases where async non-blocking guarantees are required and message loss is 
> acceptable in known cases. Also, most developers will assume an API that 
> returns a future is safe to call in a critical request path. 
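> For example, the surprising case today is roughly the following (a minimal 
> fragment; it assumes a producer built with the usual configs):
> {code:java}
> // send() returns a Future, so it reads as async-safe, but the first send
> // for a topic blocks internally until metadata for that topic is fetched.
> // If the cluster is unreachable, this stalls the calling request thread.
> Future<RecordMetadata> f = producer.send(new ProducerRecord<String, String>("new-topic", "payload"));
> {code}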
> From discussion on the mailing list, the most viable option is to have the 
> following settings:
>  pre.initialize.topics=x,y,z
>  pre.initialize.timeout=x
>  
> This moves potential blocking to the init of the producer and outside of some 
> random request. The potential will still exist for blocking in a corner case 
> where connectivity with Kafka is lost and a topic not included in pre-init 
> has a message sent for the first time. 
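> A sketch of how the proposed settings might be used (note these property 
> names are the proposal, not an existing API, and the values are 
> illustrative):
> {code:java}
> Properties props = new Properties();
> props.put("bootstrap.servers", "broker1:9092"); // hypothetical address
> props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
> props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
> // Proposed: fetch metadata for these topics during construction and fail
> // fast (per one of the options below) if it cannot complete in time.
> props.put("pre.initialize.topics", "x,y,z");
> props.put("pre.initialize.timeout", "5000");
> Producer<String, String> producer = new KafkaProducer<String, String>(props);
> {code}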
> There is the question of what to do when initialization fails. There are a 
> couple of options that I'd like available: 
> - Fail creation of the client 
> - Fail all sends until the metadata is available 
> Open to input on how the above options should be expressed. 
> It is also worth noting that more nuanced solutions exist that could work 
> without the extra settings, but they end up adding extra complications 
> without adding much value at the end of the day. For instance, the producer 
> could accept and queue messages (note: more complicated than I am making it 
> sound, due to storing all accepted messages in pre-partitioned compact 
> binary form), but you're still going to be forced to choose between blocking 
> and dropping messages at some point. 
> I have some test cases that I am going to port over to the Kafka producer 
> integration tests and start from there. My current implementation is in 
> Scala, but porting to Java shouldn't be a big deal (I was using a promise to 
> track init status, but will likely need to make that an atomic boolean). 


