No guarantee,
/svante
On Thu, Jan 13, 2022 at 20:21, Chad Preisler wrote:
>
> Hello,
>
> For ConsumerRecord.timestamp() is the timestamp guaranteed to be
> unique within the topic's partition, or can there be records inside the
> topics partition that have the same timestamp?
>
> Thanks.
> Chad
Just bring a new broker up and give it the id of the lost one. It will sync
itself
/svante
On Fri, Sep 13, 2019 at 13:51, saurav suman wrote:
> Hi,
>
> When the old data is lost and another broker is added to the cluster then
> it is a new fresh broker with no data. You can reassign the partitio
Yes, that sounds likely. If you changed the number of partitions, then the
hashing of the keys will change their destination. You need to either clear
the data (i.e. change retention to something very small and roll the logs)
or recreate the topic.
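To make the mechanism concrete, here is a minimal sketch of what a default partitioner does and why changing the partition count moves keys. Kafka's Java client actually uses murmur2; crc32 stands in here purely for illustration, and the key names are invented:

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # hash(key) mod partition count -- the destination therefore depends
    # on the number of partitions, not just on the key
    return zlib.crc32(key) % num_partitions

keys = [f"user-{i}".encode() for i in range(100)]
before = [partition_for(k, 3) for k in keys]   # topic with 3 partitions
after = [partition_for(k, 4) for k in keys]    # same topic grown to 4
moved = sum(1 for b, a in zip(before, after) if b != a)
print(f"{moved} of {len(keys)} keys changed partition")
```

After the partition count changes, old records stay where they were written while new records for the same key land elsewhere, which is exactly why clearing or recreating the topic is needed.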
/svante
On Fri, May 17, 2019 at 12:32, Nitay Kufert wrote:
> I would
I would stream to InfluxDB and visualize with Grafana. Works great for
dashboards. But I would rethink your line format. It's very convenient to
have tags (or labels) that are key/value pairs on each metric if you ever
want to aggregate over a group of similar metrics.
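As an illustration of the tag idea, a small helper that builds an InfluxDB line-protocol point with key/value tags. The measurement, tag, and field names are made up for the example:

```python
def line_protocol(measurement: str, tags: dict, fields: dict, ts_ns: int) -> str:
    # InfluxDB line protocol: measurement,tag=v,... field=v,... timestamp
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {ts_ns}"

line = line_protocol("request_latency_ms",
                     {"service": "checkout", "host": "web-01"},
                     {"p99": 87.5},
                     1700000000000000000)
print(line)
```

With tags like `service` and `host` in place, aggregating across a group of similar metrics is a simple GROUP BY rather than a rename of every series.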
Svante
Different directories; they cannot share a path. A broker will delete
everything under the log directory that it does not know about.
On Mon, Oct 22, 2018 at 17:47, M. Manna wrote:
> Hello,
>
> We are thinking of rolling out Kafka on Kubernetes deployed on public cloud
> (AWS or GCP, or other). We w
Sounds like a workflow/pipeline thing in Jenkins (or equivalent) to me.
On Wed, Sep 26, 2018 at 17:27, Rickard Cardell wrote:
> Hi
> Is there a way to have a Kafka Connect connector begin in state 'PAUSED'?
> I.e I would like to have the connector set to paused before it can process
> any data f
You are doing something wrong if you need 10k threads to produce 800k
messages per second. It feels like you are a factor of 1000 off. What size
are your messages?
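The arithmetic behind the "factor of 1000" remark, spelled out. The per-thread rate of 100k msg/s is an assumed rough order of magnitude for a batching producer with small messages, not a benchmark:

```python
target_msgs_per_sec = 800_000
threads = 10_000
per_thread = target_msgs_per_sec / threads   # each thread only needs 80 msg/s
print(per_thread)

# A single batching producer can typically push on the order of 10^5 small
# messages per second (assumed figure), so a handful of threads suffices.
assumed_per_thread_rate = 100_000
threads_needed = -(-target_msgs_per_sec // assumed_per_thread_rate)  # ceil div
print(threads_needed)
```

80 messages per second per thread leaves each thread idle almost all the time; the batching inside the producer, not the thread count, is what buys throughput.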
On Thu, Sep 13, 2018, 21:04 Praveen wrote:
> Hi there,
>
> I have a kafka application that uses kafka consumer low-level api to help
> us
tion, but for this specific deployment adding a rack
> is out of question.
> Is there a way to resolve this with 2 racks ?
>
> Regards,
> Sanjay
>
> On 05/08/18, 11:57 PM, "Svante Karlsson" wrote:
>
> >3 racks, Replication Factor = 3, min.insync.replicas=2,
3 racks, Replication Factor = 3, min.insync.replicas=2, acks=all
2018-08-05 20:21 GMT+02:00 Sanjay Awatramani :
> Hi,
>
> I have done some experiments and gone through kafka documentation, which
> makes me conclude that there is a small chance of data loss or availability
> in a rack scenario. Ca
alt1)
If you can store a generation counter in the value of the "latest value"
topic, you could do as follows:
topic latest_value, key [id]
topic full_history, key [id, generation]
On delete, get latest_value.generation_counter and issue deletes on
full_history for key [id, 0..generation_counter].
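A toy in-memory model of the alt1 scheme: `latest_value` keyed by id, `full_history` keyed by (id, generation). The dicts stand in for the two topics and the removals stand in for tombstones; none of this is Kafka API:

```python
latest_value = {}   # id -> (generation, value)
full_history = {}   # (id, generation) -> value

def upsert(key, value):
    # bump the generation counter stored with the latest value
    gen = latest_value.get(key, (-1, None))[0] + 1
    latest_value[key] = (gen, value)
    full_history[(key, gen)] = value

def delete(key):
    # read the generation counter from latest_value, then issue deletes
    # on full_history for key [id, 0..generation_counter]
    gen, _ = latest_value.pop(key)
    for g in range(gen + 1):
        full_history.pop((key, g), None)

upsert("id1", "v0")
upsert("id1", "v1")
delete("id1")
print(latest_value, full_history)
```

The point of the counter is that a delete can enumerate every historical key for the id without scanning the history topic.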
alt2
ing
> message to our infrastructure side, but the webapp is unaware if it allowed
> or not ...
>
>
>
> thank for your reply 😊
>
> Adrien
>
>
> From: Svante Karlsson
> Sent: Saturday, March 10, 2018 19:13:04
> To: users@kafka.apac
You do not want to expose the Kafka instance to your different clients. Put
some API endpoint in between: REST/gRPC or whatever.
2018-03-10 19:01 GMT+01:00 Nick Vasilyev :
> Hard to say without more info, but why not just deploy something like a
> REST api and expose it to your clients, they will se
try https://www.confluent.io/ - that's what they do
/svante
2018-03-02 21:21 GMT+01:00 Matt Stone :
> We are looking for a consultant or contractor that can come onsite to our
> Ogden, Utah location in the US, to help with a Kafka set up and maintenance
> project. What we need is someone with t
It's per broker. Usually you run with 4-6 GB of Java heap. The rest is used
as disk cache, and it's more that 64 GB seems like a sweet spot between
memory cost and performance.
/svante
2018-03-01 18:30 GMT+01:00 Michal Michalski :
> I'm quite sure it's per broker (it's a standard way to provide
> r
Yes, it will store the last value for each key
2018-01-23 18:30 GMT+01:00 Aman Rastogi :
> Hi All,
>
> We have a use case to store stream for infinite time (given we have enough
> storage).
>
> We are planning to solve this by Log Compaction. If each message key is
> unique and Log compaction is
What's your config for min.insync.replicas?
2018-01-17 13:37 GMT+01:00 Sameer Kumar :
> Hi,
>
> I have a cluster of 3 Kafka brokers, and replication factor is 2. This
> means I can tolerate failure of 1 node without data loss.
>
> Recently, one of my node crashed and some of my partitions went off
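For context on why the question matters: with replication factor 2, min.insync.replicas=2 means a single dead broker blocks acks=all writes (availability lost), while min.insync.replicas=1 with acks=1 means a write can be acked by a leader that dies before replicating it (data lost). A sketch of the commonly recommended combination; the values are illustrative, not taken from the thread:

```properties
# broker / topic level (illustrative values)
default.replication.factor=3
min.insync.replicas=2

# producer level
acks=all
```

With RF=3, one broker can fail and the partition stays both writable and durable.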
Even if you bind your socket to the IP of a specific card, when the packet
is about to leave your host it hits the routing table and gets routed
through the interface with the least cost (arbitrary but static, since all
interfaces have the same cost as they are on the same subnet); thus you
will not reach
If you really want all the brokers to die, try changing server.properties:
controlled.shutdown.enable=false
I had a similar problem on dev laptop with a single broker. It refused to
die on system shutdowns (or took a very long time).
2018-01-10 12:57 GMT+01:00 Ted Yu :
> Skip:Can you pastebin the
You are connecting to a single seed node - your Kafka library will then,
under the hood, connect to the partition leaders for each partition you
subscribe or post to.
The load is no different compared to if you gave all nodes as the connect
parameter. However, if your seed node crashes, then your client
Nope, that's the wrong design. It does not scale. You would end up with a
wide and shallow thing: too few messages per partition to make sense. You
want many thousands per partition per second to amortize the
consumer-to-broker round-trip.
On Nov 1, 2017 21:12, "Anshuman Ghosh"
wrote:
> Hello!
>
>
I've implemented the same logic for a C++ client - caching is the only way
to go, since the performance impact of not doing it would be too big. So
bet on caching in all clients.
2017-10-03 18:12 GMT+02:00 Damian Guy :
> If you are using the confluent schema registry then the will be cached by
> th
Short answer - you cannot. The existing data is not reprocessed, since
Kafka itself has no knowledge of how you did your partitioning.
The normal workaround is that you stop producers and consumers, create a
new topic with the desired number of partitions, consume the old topic from
the beginning and w
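The workaround can be modeled in a few lines, with a topic as a list of partitions and a stand-in hash (Kafka's default is murmur2; crc32 is used here only for illustration):

```python
import zlib

def repartition(old_topic, new_partition_count):
    """Replay every record from the old topic into a new topic with a
    different partition count, re-hashing each key on the way."""
    new_topic = [[] for _ in range(new_partition_count)]
    for partition in old_topic:           # per-key order within an old
        for key, value in partition:      # partition is preserved by replay
            dest = zlib.crc32(key) % new_partition_count
            new_topic[dest].append((key, value))
    return new_topic

old = [[(b"a", 1), (b"b", 2)], [(b"c", 3)]]   # old topic: 2 partitions
new = repartition(old, 4)                      # new topic: 4 partitions
print(sum(len(p) for p in new))
```

In production the same loop would be a consumer reading the old topic from offset 0 and a keyed producer writing to the new one.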
I think you are right. The rack awareness is used to spread the partitions
on creation, assignment etc., so get as many racks as your replication
count.
/svante
2017-08-20 13:33 GMT+02:00 Carl Samuelson :
> Hi
>
> I asked this question on SO here:
> https://stackoverflow.com/questions/45778455/k
Well, the purpose of the schema registry is to map a 32-bit id to an Avro
schema, with or without rules on how you may update a schema with a given
name. To decode Avro you need a schema. Either you "know" what's in a given
topic and then you can hardcode it, or you prepend it with something, ie
the
It feels like the wrong use case for Kafka. It's not meant as something you
connect your end users to. Maybe MQTT would be a better fit as the serving
layer to end users, or just poll as you said.
2017-07-31 17:10 GMT+02:00 Thakrar, Jayesh :
> You may want to look at the Kafka REST API instead of ha
I've used Jolokia, which gets JMX metrics without RMI (actually JSON over
HTTP):
https://jolokia.org/
It integrates nicely with Telegraf (and InfluxDB).
2017-07-19 20:47 GMT+02:00 Vijay Prakash <
vijay.prak...@microsoft.com.invalid>:
> Hey,
>
> Is there a way to use JMXMP instead of RMI to access Kafk
else in the community with more experience can recognize
> the symptoms but in the meantime, if you haven't already done so, you
> may want to search for similar issues:
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20text%20~%20%22ZK%20expired%3B%20shut%20dow
You are not supposed to run an even number of ZooKeepers. Fix that first.
On Apr 26, 2017 20:59, "Abhit Kalsotra" wrote:
> Any pointers please
>
>
> Abhi
>
> On Wed, Apr 26, 2017 at 11:03 PM, Abhit Kalsotra
> wrote:
>
> > Hi *
> >
> > My kafka setup
> >
> >
> > **OS: Windows Machine*6 broker
What kind of disk are you using for the RocksDB store? I.e. spinning or SSD?
2016-11-25 12:51 GMT+01:00 Damian Guy :
> Hi Frank,
>
> Is this on a restart of the application?
>
> Thanks,
> Damian
>
> On Fri, 25 Nov 2016 at 11:09 Frank Lyaruu wrote:
>
> > Hi y'all,
> >
> > I have a reasonably simple
Hi, I tried building this today and the problem seems to remain.
/svante
[INFO] Building kafka-connect-hdfs 2.0.0-SNAPSHOT
[INFO]
Downloading:
http://packages.confluent.io/maven/io/confluent/kafka-connect-avro-converter/2.
If you have a Kafka partition that is replicated to 3 nodes, the partition
leader varies (in time), thus making the colocation pointless. You can only
produce and consume to/from the leader.
/svante
2015-11-12 9:00 GMT+01:00 Young, Ben :
> Hi,
>
> Any thoughts on this? Perhaps Kafka is not the best way
1) correlationId is just a number that you get back in your reply. You can
safely set it to anything. If you have some kind of call identification in
your system that you want to trace through logs - this is what you would
use.
2) You can safely use any external offset management you like. Just st
Have you changed
zookeeper.connect=
in server.properties?
A better procedure for replacing ZooKeeper nodes would be to shut down one
and install the new one with the same IP. This can easily be done on a
running cluster.
/svante
2015-04-30 20:08 GMT+02:00 Dillian Murphey :
> I had 3 zookeeper
What's the best way of exporting contents (Avro-encoded) from Hive queries
to Kafka?
Kind of Camus, the other way around.
best regards
svante
Your consumer "might" belong to a consumer group. Just commit offsets to
that consumer group's topic/partition and it will work.
That said - if you want to figure out the consumer groups that exist, you
have to look in ZooKeeper. There is no Kafka API to get or create them. In
the Java client it i
>4. As for recovering broker from disk full, if replication is enabled one
>can just bring it down (the leader of the partition will then migrate to
>other brokers), clear the disk space, and bring it up again; if replication
>is not enabled then you can first move the partitions away from this bro
>Is there a specific reason for the collocation of all partitions of a
topic?
Not all partitions - each partition of a topic is kept in a separate dir
(hopefully not all on the same server).
>This means, the capacity of required volume is to be determined by the
retention size of the topic with l
The shutdown is expected. All data in a partition is kept in a single
directory (=> single disk)
I would move some topics/partitions from a full disk to a disk (on the same
broker) with more space.
If you have very unbalanced topics this might be hard.
You could get a bigger disk and copy the da
Wouldn't it be rather simple to add a retention time on "deleted" items,
i.e. keys with a null value, for topics that are compacted?
The retention time would then be set to some "large" time, to allow all
consumers to understand that a previous k/v is being deleted.
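For what it's worth, Kafka does expose a knob along exactly these lines: tombstones (keys with a null value) in a compacted topic are retained for a configurable window before compaction drops them, so lagging consumers can still observe the delete. A per-topic config sketch; the 24-hour value is just an example:

```properties
cleanup.policy=compact
# how long a delete marker (key with null value) survives compaction
delete.retention.ms=86400000
```

Any consumer that falls more than this window behind may miss the tombstone and keep a stale value, which is the trade-off the retention time is tuning.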
2015-03-02 17:30 GMT+01:00 Ivan Bal
Do you have to separate the snapshot from the "normal" update flow?
I've used a compacting Kafka topic as the source of truth for a Solr
database and fed the topic both with real-time updates and "snapshots" from
a Hive job. This worked very well. The nice point is that there is a
seamless transiti
/a Error
> Path:/brokers/topics/mytopic/partitions/143/state Error:KeeperErrorCode =
> BadVersion for /brokers/topics/mytopic/partitions/143/state
>
> It's probably worthwhile to note that we've disabled unclean leader
> election.
>
>
>
> On Thu, Feb 5, 2015 at 2:01 PM, svant
I believe I've had the same problem on the 0.8.2 rc2. We had an idle test
cluster with unknown health status, and I applied rc3 without checking if
everything was OK before. Since that cluster had been doing nothing for a
couple of days and the retention time was 48 hours, it's reasonable to
assume th
A Kafka broker never pushes data to a consumer. It's the consumer that does
a long fetch, and it provides the offset to read from.
The problem lies in how your consumer handles the, for example, 1000
messages that it just got. If you handle 500 of them and crash without
committing the offsets somewhe
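The failure mode described can be simulated in a few lines. Everything here (the list-as-log, the batch size, the commit rule) is a toy model of at-least-once consumption, not consumer API calls:

```python
messages = list(range(1000))
committed_offset = 0
processed = []

def poll_and_process(batch_size, crash_after=None):
    """Fetch a batch starting at the committed offset, process it, and
    commit only after the whole batch succeeded."""
    global committed_offset
    batch = messages[committed_offset:committed_offset + batch_size]
    for i, msg in enumerate(batch):
        if crash_after is not None and i == crash_after:
            return            # crash: the offset is never committed
        processed.append(msg)
    committed_offset += len(batch)

poll_and_process(1000, crash_after=500)   # dies after handling 500 messages
poll_and_process(1000)                    # restart: re-reads the same batch
duplicates = len(processed) - len(set(processed))
print(duplicates)   # the 500 handled-but-uncommitted messages come back
```

Committing more often shrinks the window of duplicates but never closes it; handling duplicates idempotently (or committing atomically with the side effect) is what actually closes it.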
thanks,
svante
2015-01-21 16:30 GMT+01:00 Joe Stein :
> Sounds like you are bumping into this
> https://issues.apache.org/jira/browse/KAFKA-1367
>
>
We are running an external (as in non-supported) C++ client library against
0.8.2-rc2 and see differences in the Isr vector in the Metadata Response
compared to what ./kafka-topics.sh --describe returns.
We have a triple-replicated topic that is not updated during the test.
kafka-topics.sh
returns
sing
> data over automatically.
>
> Thanks,
>
> Jun
>
> On Tue, Jan 20, 2015 at 1:02 AM, svante karlsson wrote:
>
> > I'm trying to figure out the best way to handle a disk failure in a live
> > environment.
> >
> > The obvious (and naive) solution i
In the wiki there is a statement that a partition must fit on a single
machine. While technically true, isn't it so that a partition must fit on a
single disk on that machine?
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowmanytopicscanIhave
>A partition is basically a director
I'm trying to figure out the best way to handle a disk failure in a live
environment.
The obvious (and naive) solution is to decommission the broker and let
other brokers take over and create new followers. Then replace the disk,
clean the remaining log directories, and add the broker again.
T
wer" also looks strange
> >
> > I can't file it as a bug report as I can't reproduce it but I have a
> > distinct feeling that I can't trust the new mbeans or have to find
> another
> > explanation.
> >
> > regard it as an observation
bug report as I can't reproduce it but I have a
distinct feeling that I can't trust the new mbeans or have to find another
explanation.
regard it as an observation if someone else reports issues.
thanks,
svante
2015-01-16 20:56 GMT+01:00 svante karlsson :
> Jun,
>
> I don
startup. Can you reproduce the issue reliably? Also, is what you saw an
> issue with the mbean itself or graphite?
>
> Thanks,
>
> Jun
>
> On Fri, Jan 16, 2015 at 4:38 AM, svante karlsson wrote:
>
> > I upgrade two small test cluster and I had two small issues but
I upgraded two small test clusters and I had two small issues, but I'm not
clear yet as to whether those were due to us using Ansible to configure and
deploy the cluster.
The first issue could be us doing something bad when distributing the
update (I updated, not reinstalled), but it should be ea
The messages are ordered per partition. There is no order between
partitions.
If you really need total ordering, use one partition.
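A toy model of the guarantee: per-partition order survives any interleaving a consumer might see, but there is no global order across partitions. The partition contents and the sample interleaving are invented:

```python
# three partitions, each with its own internal order
partitions = {0: ["a1", "a2", "a3"], 1: ["b1", "b2"], 2: ["c1"]}

def is_subsequence(needle, haystack):
    # True if the items of needle appear in haystack in the same order
    it = iter(haystack)
    return all(x in it for x in needle)

# one possible consumption order -- the cross-partition interleaving is
# arbitrary, and a different run could produce a different one
consumed = ["b1", "a1", "c1", "a2", "b2", "a3"]

# per-partition order is preserved no matter the interleaving
assert all(is_subsequence(msgs, consumed) for msgs in partitions.values())
print("per-partition order preserved")
```

Keyed producers rely on exactly this: all messages for one key go to one partition, so they stay ordered relative to each other.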
2015-01-08 9:44 GMT+01:00 YuanJia Li :
> Hi all,
> I have a topic with 3 partitions, and each partion has its sequency in
> kafka.
> How to order message between different partio
No, I missed that.
thanks,
svante
2015-01-07 6:44 GMT+01:00 Jun Rao :
> Did you set offsets.storage to kafka in the consumer of mirror maker?
>
> Thanks,
>
> Jun
>
> On Mon, Jan 5, 2015 at 3:49 PM, svante karlsson wrote:
>
> > I'm using 0.82beta a
I'm using 0.8.2 beta and I'm trying to push data with the MirrorMaker tool
from several remote sites to two datacenters. I'm testing this from a node
containing zk, broker and MirrorMaker, and the data is pushed to a "normal"
cluster: 3 zk and 4 brokers with replication.
While the configuration seems
What kind of network do you have? Gigabit? If so, 90 MB/s would make sense.
Also, since you have one partition, what's your raw transfer speed to the
disk? 90 MB/s makes sense here as well...
If I were looking for rapid replica catch-up I'd have at least 2x Gbit and
partitioned topics spread out o
>Yes - see the offsets.topic.num.partitions and
>offsets.topic.replication.factor broker configs.
Joel, that exactly what I was looking for. I'll look into that and the
source for OffsetsMessageFormatter later today!
thanks
svante
>
Thanks,
>
> Jun
>
> On Fri, Dec 12, 2014 at 2:45 AM, svante karlsson wrote:
>
> > Disregard the creation question - we must have done something wrong
> because
> > now our code is working without obvious changes (on another set of
> > brokers).
> >
> &
If I understand KAFKA-1476, it is only a command-line tool that gives
access by using ZkUtils, not an API to Kafka. We're looking for a Kafka
API, so I guess that this functionality is missing.
thanks for the pointer
Svante Karlsson
2014-12-12 19:03 GMT+01:00 Jiangjie Qin :
>
> KA
/stable in any
way or is there a better way of listing the existing group names?
svante
2014-12-11 20:59 GMT+01:00 svante karlsson :
>
> We're using 0.82 beta and a homegrown c++ async library based on boost
> asio that has support for the offset api.
> (apike
We're using 0.8.2 beta and a homegrown C++ async library based on Boost
Asio that has support for the offset API.
(apikeys OffsetCommitRequest = 8, OffsetFetchRequest = 9,
ConsumerMetadataRequest = 10)
If we use a java client and commit an offset then the consumer group shows
up in the response f
I haven't run the sandbox, but check if the Kafka server is started at all:
ps -ef | grep kafka
2014-12-05 14:34 GMT+01:00 Marco :
> Hi,
>
> I've installed the Hortonworks Sandbox and try to get into Kafka.
>
> Unfortunately, even the simple tutorial does not work :(
> http://kafka.apache.org/d
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper
> )
> and make sure the 3 registered hosts are unique?
>
> Thanks,
>
> Jun
>
> On Wed, Dec 3, 2014 at 5:54 AM, svante karlsson wrote:
>
> > I've installed (for ansible scriptin
I found some logs like this before everything started to go wrong
...
[2014-12-02 07:08:11,722] WARN Partition [test3,13] on broker 2: No
checkpointed highwatermark is found for partition [test3,7]
(kafka.cluster.Partition)
[2014-12-02 07:08:11,722] WARN Partition [test3,7] on broker 2: No
checkpo
I've installed (for Ansible scripting testing purposes) 3 VMs, each
containing Kafka & ZooKeeper clustered together.
Ubuntu 14.04
ZooKeepers are 3.4.6 and Kafka 2.11-0.8.2-beta.
The Kafka servers have broker ids 2, 4, 6.
The ZooKeepers seem happy.
The Kafka servers start up and seem happy.
I can
By default, the partition key is used for hashing, and then the message is
placed in the partition that owns the appropriate part of the hashed
keyspace.
If you have three physical partitions and give the partition key "5", it
has nothing to do with physical partition 5 (which does not exist),
similar to physical: partitio
Both variants will work well (if your Kafka cluster can handle the full
volume of the transmitted data for the duration of the TTL on each topic).
I would run the whole thing through Kafka, since you will be "stress
testing" your production flow - consider if you at some later time lost
your destina
Magnus,
Do you have any plans to update the protocol to 0.9? I built a Boost Asio
based version half a year ago, but it only implemented v0.8 and I have not
found time to upgrade it. It is quite a big job to have something equal to
the Java high- and low-level APIs.
/svante
>
>
er Message Broker.*
> 1. We have to handle 30,000 TPS.
> 2. We need to prioritize the requests.
> 3. Request Data should not be lost.
>
>
> Thanks
>
> Regards
> Lavish Goel
>
>
>
> On Mon, Sep 22, 2014 at 4:20 PM, svante karlsson wrote:
>
> &
at case should we move to some other message broker? If
> yes, Can you please tell me the name which is best for this use case and
> can handle large amount of requests?
> Is there any workaround in Kafka? If Yes, Please tell me.
>
> Thanks
>
> Warm Regards
> Lavish Goel
Wrong use case. Kafka is a queue (in the normal case with a TTL (time to
live) on messages). There is no correlation between producers and
consumers. There is no concept of a consumed message. There is no "request"
and no "response".
You can produce messages (in another topic) as a result of your processing
Do you read from the file in the callback from Kafka? I just implemented
C++ bindings, and in one of the tests I did I got the following results:
1000 messages per batch (fairly small messages, ~150 bytes), and then wait
for the network layer to ack the sends (not server acks) before putting
another
No reshuffling will take place. And reading messages and putting them back
in again will not remove the messages from their "old" partition, so the
same message will then exist in more than one partition - eventually to get
aged out of the oldest partition.
If you use partitioning to distribute the load