On 04/14/2015 12:48 PM, Gil Fliker wrote:
Thx for the quick response,

I am not yet familiar with all of Heka's features, specifically
"message_matcher".
That's one of Heka's fundamental concepts; please see
http://hekad.readthedocs.org/en/v0.9.1/index.html,
http://hekad.readthedocs.org/en/v0.9.1/getting_started.html and
http://hekad.readthedocs.org/en/v0.9.1/message_matcher.html.
Let me just add that the reasoning behind the high number of partitions
is to enable parallelism to support the throughput needed.

Can you please point me in the direction of a similar Heka example?
Sorry, there's no existing example that I can point you to at the moment. We're
happy to answer specific questions, to the extent we're able, but every
massively parallel data processing infrastructure is going to be different.
You're going to have to get familiar with the building blocks that Heka
provides and drill down a bit before you'll be able to get a useful response. :)

-r



Thx




Gil Fliker


On Tue, Apr 14, 2015 at 3:22 PM, Rob Miller <[email protected]> wrote:

    Yes, currently a single KafkaInput can only pull from a single Kafka
    partition. You can think of Heka's KafkaInput as analogous to a
    SimpleConsumer (see
    https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example).

    If you want to manage inter-partition coordination, along the lines
    of what is described as a "High Level Consumer"
    (https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example),
    you'd handle that at the filter layer. For instance, you might set
    up a filter plugin with a message_matcher constructed such that it
    catches all of the messages from a single topic, regardless of
    partition, and perform any necessary correlations therein. The
    delivery semantics to this filter would match those described on the
    consumer group example page linked above, i.e. all of the messages
    from a single partition will be received in the correct order, but
    the messages from across partitions would be non-deterministically
    interleaved.
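
    As a rough illustration of those delivery semantics (plain Python, not
    Heka code or any Heka API), the sketch below merges a few per-partition
    message lists into one stream the way a topic-wide filter would see them.
    The function names, the sample messages, and the random scheduling are
    all assumptions made for the example.

```python
import random


def interleave(partitions, seed=None):
    """Merge per-partition message lists into a single stream.

    At each step a random partition that still has messages is chosen and
    its next message is taken. Order inside a partition is preserved;
    order across partitions is arbitrary.
    """
    rng = random.Random(seed)
    cursors = {p: 0 for p in partitions}
    merged = []
    while True:
        pending = [p for p, i in cursors.items() if i < len(partitions[p])]
        if not pending:
            break
        p = rng.choice(pending)
        merged.append((p, partitions[p][cursors[p]]))
        cursors[p] += 1
    return merged


def per_partition_order_kept(merged, partitions):
    """Check that each partition's messages appear in their original order."""
    for p, msgs in partitions.items():
        if [m for q, m in merged if q == p] != msgs:
            return False
    return True


# Hypothetical sample data: three partitions of one topic.
partitions = {0: ["a0", "a1", "a2"], 1: ["b0", "b1"], 2: ["c0"]}
merged = interleave(partitions, seed=1)
```

    Any correlation logic in such a filter can rely on ordering within a
    partition, but has to tolerate any ordering between partitions.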

    If there are so many partitions carrying so much data that a single
    Heka instance can't handle them all, then you might have to have one
    box handling one subset of partitions, another box processing a
    different subset, and each of *those* in turn feeding into a third
    box that performs the next level of correlation.
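
    One way to plan such a split, sketched in plain Python rather than Heka
    configuration: spread the partition ids round-robin over the first-tier
    boxes, so that each box runs one KafkaInput per partition it was given.
    The helper name and the box count of 8 are assumptions for illustration;
    the 1000 partitions come from the original question.

```python
def assign_partitions(num_partitions, num_boxes):
    """Spread partition ids 0..num_partitions-1 round-robin over boxes.

    Returns a list with one entry per box, each entry being the list of
    partition ids that box would consume.
    """
    if num_boxes < 1:
        raise ValueError("need at least one box")
    assignment = [[] for _ in range(num_boxes)]
    for partition in range(num_partitions):
        assignment[partition % num_boxes].append(partition)
    return assignment


# Hypothetical sizing: 1000 partitions over 8 first-tier boxes.
plan = assign_partitions(1000, 8)
```

    Each first-tier box would then forward its results to the box doing the
    next level of correlation.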

    In other words, the building blocks are there, but you have to
    actually use them to put together a more sophisticated system. We're
    unfortunately not yet at the point where there are higher level
    constructs that will automatically distribute load for you.

    Hope this helps!

    -r



    On 04/10/2015 02:21 PM, Gil Fliker wrote:

        Hi,

        We are about to start a PoC using Heka.

        The plan is to pipe messages over Kafka, with Heka instances as the
        endpoints, speaking HTTP to various producers and consumers.

        I saw in the documentation that you have to specify a partition
        number, and only one partition number. Is that right?

        Our Kafka topic setup will be made of around 1000 partitions.

        What is the best way to approach this?


        Thx


        Gil Fliker

        Outbrain Operations Manager

        The above terms reflect a potential business arrangement, are provided
        solely as a basis for further discussion, and are not intended to be and
        do not constitute a legally binding obligation. No legally binding
        obligations will be created, implied, or inferred until an agreement in
        final form is executed in writing by all parties involved.

        This email and any attachments hereto may be confidential or privileged.
        If you received this communication by mistake, please don't forward it
        to anyone else, please erase all copies and attachments, and please let
        me know that it has gone to the wrong person. Thanks.


        _________________________________________________
        Heka mailing list
        [email protected]
        https://mail.mozilla.org/listinfo/heka




