If keys are null, what happens in terms of partitioning? Is the load spread
evenly?
On Mon, Oct 10, 2016 at 7:59 AM, Gwen Shapira wrote:
Kafka itself supports null keys. I'm not sure about the Go client you
use, but Confluent's Go client also supports null keys
(https://github.com/confluentinc/confluent-kafka-go/).
If you decide to generate keys and you want even spread, a random
number generator is probably your best bet.
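For illustration, a minimal sketch of that with the Java producer (the topic
name, broker address, and configs below are placeholders, not from this
thread):

import java.util.Properties;
import java.util.concurrent.ThreadLocalRandom;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
props.put("key.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);

// A random key per record gives an even spread over partitions; a null key
// also works, since the default partitioner spreads keyless records itself.
String randomKey = Long.toString(ThreadLocalRandom.current().nextLong());
producer.send(new ProducerRecord<>("my-topic", randomKey, "some-value"));
producer.close();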
Gwen
Thank you, I'll try the solution.
But with the high-level consumer API, will topics be created automatically?
We are not using ZooKeeper directly.
On 10 October 2016 at 12:34, Sachin Mittal wrote:
You can use topicA.leftJoin(topicB).to(new-topic).
You can then consume messages from that new topic via the second process. Note
that you need to create all three topics in ZooKeeper first.
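A hedged sketch of that pipeline with the Java Kafka Streams API of that era
(String serdes, the placeholder topic names, and the value-merging joiner are
my assumptions):

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

KStreamBuilder builder = new KStreamBuilder();
KStream<String, String> streamA = builder.stream("topic-A");
KTable<String, String> tableB = builder.table("topic-B");

// Left join on the record key; records from topic-A with no match in
// topic-B arrive with a null right-hand value.
streamA.leftJoin(tableB, (a, b) -> a + "," + b)
       .to("new-topic");  // the second process consumes from this topic

new KafkaStreams(builder, new StreamsConfig(props)).start();  // props assumed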
On 10 Oct 2016 5:19 a.m., "Ratha v" wrote:
Hi all;
I have two topics in the broker.
A Kafka producer written elsewhere that I'm using, which uses the Go Kafka
driver, is sending messages where the key is null.
Is this OK, or will this cause issues due to partitioning not happening
correctly?
What would be a good way to generate keys in this case, to ensure even
partitioning?
Hi all;
I have two topics in the broker. Without consuming from one topic, I want
to merge both topics, and will consume messages from the second topic.
This is because I have two processes: one process pushes messages to topic
A, and the second process, once it has finished processing, wants to
It isn't a fatal error. It should be logged as a warning, and then the
stream should be continued with the next message.
Checking for null is 'ok', in the sense that it gets the job done, but
after Java 8's release, we really should be using Optionals.
Hopefully we can break compatibility w/ the
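For illustration, a minimal sketch of the Optional-based approach (the names
here are hypothetical, not from any Kafka API):

import java.util.Optional;
import java.util.function.Function;

// Wrap an arbitrary parser so that a parse failure yields Optional.empty()
// instead of null; the caller logs a warning and moves on to the next message.
static <T> Optional<T> tryParse(byte[] bytes, Function<byte[], T> parser) {
    try {
        return Optional.ofNullable(parser.apply(bytes));
    } catch (RuntimeException e) {
        System.err.println("WARN: skipping unparseable record: " + e.getMessage());
        return Optional.empty();
    }
}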
Ali,
In your scenario, if the serde fails to parse the bytes, should that be treated
as a fatal failure, or is it expected?
In the former case, instead of returning null, I think it is better to
throw a runtime exception in order to let the whole client stop and
report the error; in the latter
Jerry,
Message-loss scenarios are usually due to misconfigured producers and
consumers. Have you read the client configs web docs?
http://kafka.apache.org/documentation#producerconfigs
http://kafka.apache.org/documentation#newconsumerconfigs
If not, I'd suggest reading those first and
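For illustration only, the kind of producer settings those pages document
(the specific values below are my assumptions, not recommendations from the
thread):

import java.util.Properties;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("acks", "all");    // wait for all in-sync replicas before acking
props.put("retries", "3");   // retry transient send failures instead of dropping
props.put("key.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");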
Hey Eno, thanks
Yes, localhost would mean testing the SSD; maybe moving to a remote producer
in the future over 10GbE, but baby steps first.
My record size is pretty flexible; I've tested from 4 KB to 16 MB and the
results are within ±10 MB/s.
The SSD is a Samsung 850, claiming 100k IOPS and >500 MB/s read/write.
Yeah, I realized that, and later read that the Java Kafka consumer has a
single thread, which is why such behavior does not arise there. Maybe I need
to restrict my application to a single thread :( in order to achieve
that. I need to ask Magnus Edenhill, who is the librdkafka expert.
Thanks for
Hi Chris,
A couple of things: it looks like you're primarily testing the SSD if you're
running on localhost, right? The 10GbE shouldn't matter in this case.
The performance will depend a lot on the record sizes you are using. To get a
ballpark number it would help to know the SSD type, would
I'm pretty sure Jun was talking about the Java API in the quoted blog text, not
librdkafka. There is only one thread in the new Java consumer so you wouldn't
see this behavior. I do not think that librdkafka makes any such guarantee to
dispatch unique keys to each thread, but I'm not an expert
Hello,
The last thread available regarding 10GbE is about two years old, with no
obvious recommendations on tuning.
Is there a more detailed tuning guide than the example production config
available on Kafka's main site? Anything other than the list of possible
configs?
I currently have access to
Hi,
It is actually a KTable-KTable join.
I have a stream (K1, A) which is aggregated as (Key, List<A>), hence it
creates a KTable.
I have another stream (K2, B) which is aggregated as (Key, List<B>), hence
it creates another KTable.
Then I have
KTable(Key, List<A>).leftJoin( KTable(Key, List<B>),
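A hedged reconstruction of that join in Java (the list-merging joiner is my
assumption; tableA and tableB stand for the two aggregated KTables described
above):

import java.util.ArrayList;
import java.util.List;
import org.apache.kafka.streams.kstream.KTable;

KTable<String, List<String>> joined =
    tableA.leftJoin(tableB, (aList, bList) -> {
        List<String> merged = new ArrayList<>(aList);
        if (bList != null) {
            merged.addAll(bList);  // right side is null when key is absent in tableB
        }
        return merged;
    });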
Hi,
I've met an issue where the broker's ephemeral node in ZooKeeper was lost
while the broker process still existed. At that point, the controller
considers the broker to be offline. From the zkClient log, I find the session
timed out, but handleStateChanged and handleNewSession (in
Publisher/subscriber systems can be divided into two categories:
1) Topic-based model
2) Content-based model - provides more accurate results than the topic-based
model, since subscribers are interested in the content of the message rather
than subscribing to a topic and getting all the messages.
Kafka
I did that, but I am getting confusing results.
E.g.:
I have created 4 Kafka consumer threads for doing data analytics; these
threads just wait for Kafka messages to be consumed, and
I have provided the key when I produce, which means that all the
messages will go to one single partition, ref "
Hi Sachin,
Some comments inline:
> On 9 Oct 2016, at 08:19, Sachin Mittal wrote:
>
> Hi,
> I needed some light on how joins actually work on a continuous stream of data.
>
> Say I have 2 topics which I need to join (left join).
> Data records in each topic are aggregated like
Then publish with the user ID as the key and all messages for the same key will
be guaranteed to go to the same partition and therefore be in order for
whichever consumer gets that partition.
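For example (the topic name is a placeholder, producer and payload are assumed
to be set up as usual, and 4456 is the user ID from the question below):

// Every record keyed "4456" hashes to the same partition, so a consumer
// reading that partition sees this user's messages in send order.
producer.send(new ProducerRecord<String, String>("user-events", "4456", payload));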
h...@confluent.io
Original message From: Abhit Kalsotra
Date:
What about the order in which messages are received, if I don't mention the
partition?
Let's say I have user ID 4456 and I have to do some analytics at the
Kafka consumer end; if, at my consumer end, messages are not consumed in the
order I sent them, then my analytics will go haywire.
Abhi
On Sun, Oct
Thanks for pointing this out.
I am doing exactly that now and it is working fine.
Sachin
On Sun, Oct 9, 2016 at 12:32 PM, Matthias J. Sax
wrote:
> You must ensure that both streams are co-partitioned (i.e., same number
> of partitions and partitioned by the join key).
You don't even have to do that because the default partitioner will spread the
data you publish to the topic over the available partitions for you. Just try
it out to see. Publish multiple messages to the topic without using keys, and
without specifying a partition, and observe that they are
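A quick way to try that (producer assumed configured as usual; the topic name
is a placeholder):

// With a null key and no explicit partition, the default partitioner
// spreads these records across the topic's partitions.
for (int i = 0; i < 10; i++) {
    producer.send(new ProducerRecord<String, String>("test-topic", null, "message-" + i));
}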
Hi,
I needed some light on how joins actually work on a continuous stream of data.
Say I have 2 topics which I need to join (left join).
Data records in each topic are aggregated like (key, value) <=> (string, list)
Topic 1
key1: [A01, A02, A03, A04 ..]
key2: [A11, A12, A13, A14 ..]
Topic 2
You must ensure that both streams are co-partitioned (i.e., same number
of partitions and partitioned by the join key).
(See the "Note" box:
http://docs.confluent.io/3.0.1/streams/developer-guide.html#joining-streams)
You can enforce co-partitioning by introducing a call to .through()
before doing the join.
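A minimal sketch of that (the intermediate topic name is a placeholder and,
in this version, must be created beforehand with the same partition count as
the other join input; streamB stands for one side of the join):

// Re-route one side through an intermediate topic so both join inputs
// end up partitioned identically by the join key.
KStream<String, String> repartitioned = streamB.through("topic-B-repartitioned");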
Hans,
Thanks for the response. Yeah, you can say I am treating topics like
partitions, because my current logic for producing to a respective topic goes
something like this:
RdKafka::ErrorCode resp = m_kafkaProducer->produce(m_kafkaTopic[whichTopic],