What I meant is that Kafka was designed first and foremost as a
high-throughput system. It achieves that with a couple of techniques,
but mainly by batching events together so that it can benefit from the
lower overhead of sequential writes (as opposed to random access).

Whether you choose to publish synchronously or asynchronously does not
change the fact that Kafka can achieve high throughput via batching.
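
To make the batching point concrete, here is a minimal sketch in plain
Java. It does not use Kafka's producer API at all: BatchingBuffer,
countWrites and the sink callback are hypothetical names made up for this
illustration, and the sink merely stands in for one write to the broker.
The only thing it is meant to show is that the number of writes drops as
the batch size grows, regardless of whether the caller blocks on each
write or not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical illustration only; this is not Kafka's producer code. */
final class BatchingSketch {

    /** Buffers items and hands them to the sink in groups of up to batchSize. */
    static final class BatchingBuffer<T> {
        private final int batchSize;
        private final Consumer<List<T>> sink; // stands in for one write to the broker
        private final List<T> pending = new ArrayList<>();

        BatchingBuffer(int batchSize, Consumer<List<T>> sink) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            this.batchSize = batchSize;
            this.sink = sink;
        }

        /** Queues one item and flushes as soon as a full batch is available. */
        void add(T item) {
            pending.add(item);
            if (pending.size() >= batchSize) {
                flush();
            }
        }

        /** Sends whatever is pending as a single batch; does nothing if empty. */
        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            sink.accept(new ArrayList<>(pending)); // copy so the sink owns its batch
            pending.clear();
        }
    }

    /** Counts how many sink calls it takes to push `events` items through. */
    static int countWrites(int events, int batchSize) {
        int[] writes = {0};
        BatchingBuffer<String> buffer =
                new BatchingBuffer<>(batchSize, batch -> writes[0]++);
        for (int i = 0; i < events; i++) {
            buffer.add("event-" + i);
        }
        buffer.flush(); // push out the final, possibly partial, batch
        return writes[0];
    }

    public static void main(String[] args) {
        System.out.println("batch size 1:   " + countWrites(1000, 1) + " writes");
        System.out.println("batch size 200: " + countWrites(1000, 200) + " writes");
    }
}
```

In the real producer the batch size is a configuration setting rather
than something you code by hand; the sketch only mimics the effect.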

--
Felix



On Mon, Aug 20, 2012 at 10:55 PM, wm <wmartin...@gmail.com> wrote:

> Felix. My regrets for confusing the matter.  Please inform me of a primary
> source for the canonical use case you reference, unless that was scoped to
> the kafka community only. That sort of statement should be clearly
> documented imho.
>
> I am considering the matter closed with respect to this list. I have 3
> publish options each with some degree of autonomy from the calling code's
> designed behavior.
>
> regards
>
>
> On 08/20/2012 02:39 PM, Felix GV wrote:
>
>> I think the difference is merely that async publishing is a non-blocking
>> call, whereas sync publishing is a blocking call, meaning that the code
>> that does a sync publish call could choose to have an alternate behavior
>> if the publish failed, whereas the code that does an async publish would
>> never know whether the publish succeeded or not.
>>
>> But like I said, in both cases, you can configure the batching size at the
>> producer level, and a batching size greater than 1 will provide you with
>> better throughput capabilities... In fact, I think this is the canonical
>> use case Kafka was originally built for.
>>
>> --
>> Felix
>>
>>
>>
>> On Mon, Aug 20, 2012 at 2:24 PM, will martin <wmartin...@gmail.com>
>> wrote:
>>
>>> My understanding is that async is not meant to be an immediate send. As
>>> to batching, I've not delved into the code differences.
>>>
>>> But batching with the sync producer does not seem possible at the higher
>>> Producer level; at least, I've tried and had no success. The default and
>>> string encoders cannot handle lists, although the documentation suggests
>>> they can.
>>>
>>> I'm glad to be wrong on this; but I've had no luck with the serializer
>>> deep in the Scala code tree accepting a composite of any type containing
>>> either Message or String.  I can batch myself, but doubt this is what
>>> any of us think is the design goal?
>>>
>>>
>>>
>>> On Mon, Aug 20, 2012 at 1:06 PM, Felix GV <fe...@mate1inc.com> wrote:
>>>
>>>> This may not be entirely related to what you're talking about, but why
>>>> would an async producer not be able to meet your throughput needs, and a
>>>> sync producer be able to?
>>>>
>>>> Both sync and async producers can be configured to batch more than one
>>>> message together, and that's pretty much the main thing that's required
>>>> to be able to achieve good throughput, AFAIK.
>>>>
>>>> ...?
>>>>
>>>> --
>>>> Felix
>>>>
>>>>
>>>>
>>>> On Mon, Aug 20, 2012 at 12:49 PM, will martin <wmartin...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks Neha. All my data is of 1 type. The serializer in place doesn't
>>>>> seem to handle an array of String.
>>>>>
>>>>> The ProducerData I use is a collection of same types of data wrapped
>>>>> in a single definition, as I read the spec.  Am I to understand that
>>>>> having a producer batch records itself is unsupported?  The async
>>>>> producer can't meet my throughput needs and, as I understand, is
>>>>> targeted at implicit load balancing among different client machines.
>>>>>
>>>>> Additionally, the sync producer can meet my needs, but requires more
>>>>> use of the lower-level design features. For maintenance, it'd be great
>>>>> if I could create a list of Strings, create a
>>>>> ProducerData<String, List<String>> and have this be serialized.
>>>>>
>>>>> It occurs to me that the described serialization may need my
>>>>> attention?
>>>>>
>>>>> Thx
>>>>>
>>>>>
>>>>> On Mon, Aug 20, 2012 at 12:06 PM, Neha Narkhede <neha.narkh...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> The producer takes in a "serializer.class" config that it uses to
>>>>>> serialize data sent by the Producer. A Producer instance is tied to
>>>>>> the type of data it is sending, so you won't be able to send data
>>>>>> belonging to diverse types using the same Producer object.
>>>>>>
>>>>>> Thanks,
>>>>>> Neha
>>>>>>
>>>>>> On Mon, Aug 20, 2012 at 8:02 AM, will martin <wmartin...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> This use case is defined by the following snippet from the Design
>>>>>>> section of the doc pages.
>>>>>>>
>>>>>>> class Producer {
>>>>>>>
>>>>>>> public void send (ProducerData)
>>>>>>>
>>>>>>> public void send (List<ProducerData>)
>>>>>>>
>>>>>>> public void close()
>>>>>>> }
>>>>>>>
>>>>>>> I've tried various composites for the List<ProducerData> argument,
>>>>>>> including strings and Messages. All of these throw serialization
>>>>>>> errors deep in the engine.
>>>>>>>
>>>>>>> Is the list form of send supported in 0.7.1?
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>> mmartin
>>>>>>>
>>>>>>
>
