Timestamps with Kafka REST proxy

2020-05-12 Thread Sachin Nikumbh
Hi all,
Is there a way to include a timestamp with each record when using Kafka's REST 
proxy? The documentation does not show any examples, and when I tried to use a 
"timestamp" field, I got an "unknown field" error in response.
Any help would be greatly appreciated.
Thanks,
Sachin

Re: Kafka metrics

2020-05-12 Thread Malcolm McFarland
These ideas are specific to Samza and ymmv in how they apply to other
processing frameworks, but we use a couple of custom tools to keep tabs on
processing lag:

- one is a produce/consume timestamp comparison tool, which writes a
message-production timestamp out to ZooKeeper on a per-partition basis;
then in our stream processor, Samza, we write out a consumption
timestamp for the same partition to ZooKeeper, and we can use these
differences (with some compensation for partition offset differences) to
see what our processing lag is;
- we also have a custom offset/checkpoint comparison tool which ingests
Samza's checkpoint topic and compares it against the latest offsets on a
per-partition basis to know how far behind each partition is in processing
messages (this also doubles as a checkpoint-properties file generator which
we can use to rebuild the checkpoint topic if it gets too large)

These two tools have been invaluable in helping us monitor our Samza
processing clusters.
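
For illustration, the heart of the second tool boils down to something like
this (a minimal sketch in plain Java; it assumes you have already loaded the
checkpointed offsets from the checkpoint topic, and the config values are
made up):

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class CheckpointLag {
    // Compare checkpointed offsets against the latest log-end offsets to get
    // a per-partition count of not-yet-processed messages.
    public static void printLag(Map<TopicPartition, Long> checkpointed) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            Map<TopicPartition, Long> ends = consumer.endOffsets(checkpointed.keySet());
            ends.forEach((tp, end) ->
                System.out.printf("%s is %d messages behind%n",
                    tp, end - checkpointed.get(tp)));
        }
    }
}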

Cheers,
Malcolm McFarland
Cavulus




On Mon, May 11, 2020 at 5:23 PM Eleanore Jin  wrote:

> Hi community,
>
> I just wonder what is the difference between the consumer lag reported by
> Kafka client and the consumer lag reported by burrow?
>
> Thanks a lot!
> Eleanore
>


Re: Kafka metrics

2020-05-12 Thread Matthias J. Sax
I am not 100% sure what Burrow does, but I would assume that it compares
committed offsets to end offsets (similar to
`bin/kafka-consumer-groups.sh`). This is a "global" view over all
consumers in the group. Compared to the consumer metric, it might report
a higher lag, as it relies on committed offsets.

The consumer lag metric reports a single client view (obviously a
consumer does not know anything about the lag of other consumers in the
group) and it's based on the current fetch offsets the consumer
maintains internally. Thus, the lag might be smaller if the offsets are
not committed yet.
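
As a sketch, that "global" computation looks something like the following
(group id and bootstrap servers are placeholders; this is essentially what
`bin/kafka-consumer-groups.sh --describe` reports):

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class GroupLag {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());
        try (AdminClient admin = AdminClient.create(props);
             KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            // Committed offsets for the whole group (the "global" view).
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets("my-group")
                     .partitionsToOffsetAndMetadata().get();
            // Latest log-end offsets for the same partitions.
            Map<TopicPartition, Long> ends = consumer.endOffsets(committed.keySet());
            committed.forEach((tp, om) ->
                System.out.printf("%s lag=%d%n", tp, ends.get(tp) - om.offset()));
        }
    }
}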


-Matthias

On 5/11/20 5:23 PM, Eleanore Jin wrote:
> Hi community,
> 
> I just wonder what is the difference between the consumer lag reported by
> Kafka client and the consumer lag reported by burrow?
> 
> Thanks a lot!
> Eleanore
> 





Re: Merging multiple streams into one

2020-05-12 Thread Alessandro Tagliapietra
Hi Bill,

thank you for replying.

Yes, keys are all the same type (machine ID string).

Btw, your solution sounds great, but it'll only work if all the 3 streams
have the same number of partitions, right?
Otherwise there's no guarantee that all the data of the same machine (the
topic keys are the machine IDs) ends up in the same streams instance.
Which is instead guaranteed with the intermediate topic?
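
For reference, the intermediate-topic variant I had in mind looks roughly
like this (just a sketch; topic, store, and class names are invented):

// All three mapped streams write to one topic keyed by machine ID, so every
// record for a given machine lands on the same partition and instance.
metricsMapped.to("machine-updates");
eventsMapped.to("machine-updates");
statesMapped.to("machine-updates");

// A single consumer of the intermediate topic runs the keepalive logic.
builder.<String, MachineUpdate>stream("machine-updates")
       .transformValues(KeepaliveTransformer::new, "last-seen-store");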

Thanks!

--
Alessandro Tagliapietra


On Tue, May 12, 2020 at 7:16 AM Bill Bejeck  wrote:

> Hi Alessandro,
>
> For merging the three streams, have you considered the `KStream.merge`
> method?
> If the values are of different types, you'll need to map them into a common
> type first, but I think
> something like this will work:
>
> KStream mappedOne = originalStreamOne.mapValues(...);
> KStream mappedTwo = originalStreamTwo.mapValues(...);
> KStream mappedThree = originalStreamThree.mapValues(...);
>
> KStream mergedStream = mappedOne.merge(mappedTwo).merge(mappedThree);
>
> Just keep in mind there are no ordering guarantees for the records of the
> merged streams.  Also, I made the assumption the keys are
> of the same type, if not, then you'll have to change the `mapValues` call
> to a `map`.
>
> HTH,
> Bill
>
> On Mon, May 11, 2020 at 11:02 PM Alessandro Tagliapietra <
> tagliapietra.alessan...@gmail.com> wrote:
>
> > Hello everyone,
> >
> > we currently use 3 streams (metrics, events, states) and I need to
> > implement a keepalive mechanism so that if the machine doesn't send any
> > data (from a specific list of variables) it'll emit a value that changes
> > the machine state.
> >
> > For example, in machine 1 the list of keepalive variables are "foo" and
> > "bar", to propagate the list of variables I use a configuration topic
> that
> > uses a GlobalKTable so that each stream application can read the
> machine's
> > configuration.
> >
> > Then my idea was to "merge" all the three streams into one so that I can
> > use a ValueTransformer to:
> >  - read the configuration store and ignore messages that don't belong to
> > the configured variables for a machine
> >  - use a local state store to save the "last seen" of a machine
> >  - use the punctuator to emit a "change" in the machine status if the
> "last
> > seen" of a machine is older than some time
> >
> > To "merge' the 3 streams I was thinking to just map them into a single
> > intermediate topic and have the ValueTransformer read from that.
> >
> > Is there a better way? Maybe without using an intermediate topic?
> >
> > Thank you in advance
> >
> > --
> > Alessandro Tagliapietra
> >
>


Re: data structures used by GlobalKTable, KTable

2020-05-12 Thread Matthias J. Sax
By default, RocksDB is used. You can also change it to use an in-memory
store that is basically a HashMap.
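
For example, pinning a global table to the in-memory store is a one-liner
via `Materialized` (a sketch; topic and store names are invented, and it
assumes String serdes are the configured defaults):

import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.Stores;

StreamsBuilder builder = new StreamsBuilder();
// Default would be RocksDB; this swaps in the in-memory implementation.
GlobalKTable<String, String> table = builder.globalTable(
    "config-topic",
    Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as(
        Stores.inMemoryKeyValueStore("config-store")));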


-Matthias

On 5/12/20 10:16 AM, Pushkar Deole wrote:
> Thanks Liam!
> 
> On Tue, May 12, 2020, 15:12 Liam Clarke-Hutchinson <
> liam.cla...@adscale.co.nz> wrote:
> 
>> Hi Pushkar,
>>
>> GlobalKTables and KTables can have whatever data structure you like, if you
>> provide the appropriate deserializers - for example, a Kafka Streams app I
>> maintain stores model data (exported to a topic per entity from Postgres
>> via Kafka Connect's JDBC Source) as a GlobalKTable of Jackson ObjectNodes
>> keyed by entity id.
>>
>> If you're worried about efficiency, just treat KTables/GlobalKTables as a
>> HashMap and you're pretty much there. In terms of efficiency,
>> we're joining model data to about 7 - 10 TB of transactional data a day,
>> and on average, run about 5 - 10 instances of our enrichment app with about
>> 2GB max heap.
>>
>> Kind regards,
>>
>> Liam "Not a part of the Confluent team, but happy to help"
>> Clarke-Hutchinson
>>
>> On Tue, May 12, 2020 at 9:35 PM Pushkar Deole 
>> wrote:
>>
>>> Hello confluent team,
>>>
>>> Could you provide some information on what data structures are used
>>> internally by GlobalKTable and KTables. The application that I am working
>>> on has a requirement to read cached data from GlobalKTable on every
>>> incoming event, so the reads from GlobalKTable need to be efficient.
>>>
>>
> 





Re: Kafka upgrade from 0.10 to 2.3.x

2020-05-12 Thread Matthias J. Sax
I guess you can just stop all servers, update the binaries (and
potentially configs), and afterward restart the servers.

Of course, you might want to stop all applications that connect to the
cluster first.
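
For completeness, the rolling (no-downtime) procedure is the one that needs
extra care; as a sketch from memory (double-check the 2.3 upgrade notes for
your exact versions), you would pin the old versions in server.properties
while the binaries are mixed:

inter.broker.protocol.version=0.10.2
log.message.format.version=0.10.2

and only once every broker runs 2.3 would you bump
inter.broker.protocol.version (and later log.message.format.version) with
one more rolling restart. With full downtime you can skip that dance.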


-Matthias

On 5/11/20 9:50 AM, Praveen wrote:
> Hi folks,
> 
> I'd like to take downtime to upgrade to 2.3.x from 0.10.2.1. But I can't find
> anything in the docs for the 2.3.x upgrade saying that I can take downtime to
> do this. The instructions are for rolling upgrade only.
> 
> Has anyone tried this?
> 
> Praveen
> 





Re: data structures used by GlobalKTable, KTable

2020-05-12 Thread Pushkar Deole
Thanks Liam!

On Tue, May 12, 2020, 15:12 Liam Clarke-Hutchinson <
liam.cla...@adscale.co.nz> wrote:

> Hi Pushkar,
>
> GlobalKTables and KTables can have whatever data structure you like, if you
> provide the appropriate deserializers - for example, a Kafka Streams app I
> maintain stores model data (exported to a topic per entity from Postgres
> via Kafka Connect's JDBC Source) as a GlobalKTable of Jackson ObjectNodes
> keyed by entity id.
>
> If you're worried about efficiency, just treat KTables/GlobalKTables as a
> HashMap and you're pretty much there. In terms of efficiency,
> we're joining model data to about 7 - 10 TB of transactional data a day,
> and on average, run about 5 - 10 instances of our enrichment app with about
> 2GB max heap.
>
> Kind regards,
>
> Liam "Not a part of the Confluent team, but happy to help"
> Clarke-Hutchinson
>
> On Tue, May 12, 2020 at 9:35 PM Pushkar Deole 
> wrote:
>
> > Hello confluent team,
> >
> > Could you provide some information on what data structures are used
> > internally by GlobalKTable and KTables. The application that I am working
> > on has a requirement to read cached data from GlobalKTable on every
> > incoming event, so the reads from GlobalKTable need to be efficient.
> >
>


Re: records with key as string and value as java ArrayList in topic

2020-05-12 Thread Pushkar Deole
Thanks Liam... essentially, it would be an internal topic that we would be 
creating to use as a cache store, accessed through a GlobalKTable, so the 
problem you mentioned above about storing a HashMap may not apply there.

On Tue, May 12, 2020, 21:25 Liam Clarke-Hutchinson <
liam.cla...@adscale.co.nz> wrote:

> Hi Pushkar,
>
> Just wanted to say, as someone with battle scars from ActiveMQ and Camel,
> there are very many good reasons to avoid Java serialization on a messaging
> system. What if you need to tail a topic from the console? What if your
> testers want to access it in their pytests? Etc. And that's not even getting
> into the minefield of Java serialization compatibility between one VM and
> another.
>
> That said, if using another serialization format like Avro or JSON isn't
> feasible, you can implement your own serializer/deserializer.
>
> At its heart, a Kafka producer sends a byte array and a consumer receives
> a byte array. So a custom serialiser to turn a list or map into bytes is
> easily doable using an ObjectOutputStream, and a consumer can use a
> deserialiser built on an ObjectInputStream at the other end.
>
> Kind regards,
>
> Liam Clarke-Hutchinson
>
> On Tue, 12 May 2020, 5:13 pm Pushkar Deole,  wrote:
>
> > And by the way, confluent has provided KafkaAvroSerializer/Deserializer.
> > Can't they be used to do conversion for java types?
> >
> > On Tue, May 12, 2020 at 10:09 AM Pushkar Deole 
> > wrote:
> >
> > > Ok... so jackson json serialization is the way to go for hashmaps as
> > well?
> > >
> > > On Mon, May 11, 2020 at 7:57 PM John Roesler 
> > wrote:
> > >
> > >> Oh, my mistake. I thought this was a different thread :)
> > >>
> > >> You might want to check, but I don’t think there is a kip for a map
> > >> serde. Of course, you’re welcome to start one.
> > >>
> > >> Thanks,
> > >> John
> > >>
> > >> On Mon, May 11, 2020, at 09:14, John Roesler wrote:
> > >> > Hi Pushkar,
> > >> >
> > >> > I don’t think there is. You’re welcome to start one if you think it
> > >> > would be a useful addition.
> > >> >
> > >> > Before worrying about it further, though, you might want to check
> the
> > >> > InMemoryKeyValueStore implementation, since my answer was from
> memory.
> > >> >
> > >> > Thanks,
> > >> > John
> > >> >
> > >> > On Mon, May 11, 2020, at 03:47, Pushkar Deole wrote:
> > >> > > John,
> > >> > > is there KIP in progress for supporting Java HashMap also?
> > >> > >
> > >> > > On Sun, May 10, 2020, 00:47 John Roesler 
> > wrote:
> > >> > >
> > >> > > > Yes, that’s correct. It’s only for serializing the java type
> > >> ‘byte[]’.
> > >> > > >
> > >> > > > On Thu, May 7, 2020, at 10:37, Pushkar Deole wrote:
> > >> > > > > Thanks John... I got to finish the work in few days so need to
> > >> get it
> > >> > > > > quick, so looking for something ready. I will take a look at
> > >> jackson
> > >> > > > json.
> > >> > > > >
> > >> > > > > By the way, what is the byteArrayserializer? As the name
> > >> suggests, it is
> > >> > > > > for byte arrays so won't work for java ArrayList, right?
> > >> > > > >
> > >> > > > > On Thu, May 7, 2020 at 8:44 PM John Roesler <
> > vvcep...@apache.org>
> > >> wrote:
> > >> > > > >
> > >> > > > > > Hi Pushkar,
> > >> > > > > >
> > >> > > > > > If you’re not too concerned about compactness, I think
> Jackson
> > >> json
> > >> > > > > > serialization is the easiest way to serialize complex types.
> > >> > > > > >
> > >> > > > > > There’s also a kip in progress to add a list serde. You
> might
> > >> take a
> > >> > > > look
> > >> > > > > > at that proposal for ideas to write your own.
> > >> > > > > >
> > >> > > > > > Thanks,
> > >> > > > > > John
> > >> > > > > >
> > >> > > > > > On Thu, May 7, 2020, at 08:17, Nicolas Carlot wrote:
> > >> > > > > > > Won't say it's a good idea to use java serialized classes
> > for
> > >> > > > messages,
> > >> > > > > > but
> > >> > > > > > > you should use a byteArraySerializer if you want to do
> such
> > >> things
> > >> > > > > > >
> > >> > > > > > > Le jeu. 7 mai 2020 à 14:32, Pushkar Deole <
> > >> pdeole2...@gmail.com> a
> > >> > > > > > écrit :
> > >> > > > > > >
> > >> > > > > > > > Hi All,
> > >> > > > > > > >
> > >> > > > > > > > I have a requirement to store a record with key as java
> > >> String and
> > >> > > > > > value as
> > >> > > > > > > > java's ArrayList in the kafka topic. Kafka has by
> default
> > >> provided
> > >> > > > a
> > >> > > > > > > > StringSerializer and StringDeserializer, however for
> java
> > >> > > > ArrayList,
> > >> > > > > > how
> > >> > > > > > > > can get serializer. Do I need to write my own? Can
> someone
> > >> share if
> > >> > > > > > someone
> > >> > > > > > > > already has written one?
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > --
> > >> > > > > > > *Nicolas Carlot*
> > >> > > > > > >
> > >> > > > > > > Lead dev
> > >> > > > > > > |  | nicolas.car...@chronopost.fr
> > >> > > > > > >
> > >> > > > > > >

Re: records with key as string and value as java ArrayList in topic

2020-05-12 Thread Liam Clarke-Hutchinson
Hi Pushkar,

Just wanted to say, as someone with battle scars from ActiveMQ and Camel,
there are very many good reasons to avoid Java serialization on a messaging
system. What if you need to tail a topic from the console? What if your
testers want to access it in their pytests? Etc. And that's not even getting
into the minefield of Java serialization compatibility between one VM and
another.

That said, if using another serialization format like Avro or JSON isn't
feasible, you can implement your own serializer/deserializer.

At its heart, a Kafka producer sends a byte array and a consumer receives
a byte array. So a custom serialiser to turn a list or map into bytes is
easily doable using an ObjectOutputStream, and a consumer can use a
deserialiser built on an ObjectInputStream at the other end.
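
A minimal sketch of such a pair (it assumes the payload type implements
java.io.Serializable, and the error handling is deliberately crude):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serializer;

public class JavaObjectSerializer<T extends Serializable> implements Serializer<T> {
    @Override
    public byte[] serialize(String topic, T data) {
        if (data == null) return null;
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data); // producing side: ObjectOutputStream
            out.flush();
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new SerializationException("Failed to serialize for " + topic, e);
        }
    }
}

class JavaObjectDeserializer<T extends Serializable> implements Deserializer<T> {
    @Override
    @SuppressWarnings("unchecked")
    public T deserialize(String topic, byte[] data) {
        if (data == null) return null;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (T) in.readObject(); // consuming side: ObjectInputStream
        } catch (Exception e) {
            throw new SerializationException("Failed to deserialize from " + topic, e);
        }
    }
}

You would then wire it in with, e.g.,
props.put("value.serializer", JavaObjectSerializer.class.getName()).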

Kind regards,

Liam Clarke-Hutchinson

On Tue, 12 May 2020, 5:13 pm Pushkar Deole,  wrote:

> And by the way, confluent has provided KafkaAvroSerializer/Deserializer.
> Can't they be used to do conversion for java types?
>
> On Tue, May 12, 2020 at 10:09 AM Pushkar Deole 
> wrote:
>
> > Ok... so jackson json serialization is the way to go for hashmaps as
> well?
> >
> > On Mon, May 11, 2020 at 7:57 PM John Roesler 
> wrote:
> >
> >> Oh, my mistake. I thought this was a different thread :)
> >>
> >> You might want to check, but I don’t think there is a kip for a map
> >> serde. Of course, you’re welcome to start one.
> >>
> >> Thanks,
> >> John
> >>
> >> On Mon, May 11, 2020, at 09:14, John Roesler wrote:
> >> > Hi Pushkar,
> >> >
> >> > I don’t think there is. You’re welcome to start one if you think it
> >> > would be a useful addition.
> >> >
> >> > Before worrying about it further, though, you might want to check the
> >> > InMemoryKeyValueStore implementation, since my answer was from memory.
> >> >
> >> > Thanks,
> >> > John
> >> >
> >> > On Mon, May 11, 2020, at 03:47, Pushkar Deole wrote:
> >> > > John,
> >> > > is there KIP in progress for supporting Java HashMap also?
> >> > >
> >> > > On Sun, May 10, 2020, 00:47 John Roesler 
> wrote:
> >> > >
> >> > > > Yes, that’s correct. It’s only for serializing the java type
> >> ‘byte[]’.
> >> > > >
> >> > > > On Thu, May 7, 2020, at 10:37, Pushkar Deole wrote:
> >> > > > > Thanks John... I got to finish the work in few days so need to
> >> get it
> >> > > > > quick, so looking for something ready. I will take a look at
> >> jackson
> >> > > > json.
> >> > > > >
> >> > > > > By the way, what is the byteArrayserializer? As the name
> >> suggests, it is
> >> > > > > for byte arrays so won't work for java ArrayList, right?
> >> > > > >
> >> > > > > On Thu, May 7, 2020 at 8:44 PM John Roesler <
> vvcep...@apache.org>
> >> wrote:
> >> > > > >
> >> > > > > > Hi Pushkar,
> >> > > > > >
> >> > > > > > If you’re not too concerned about compactness, I think Jackson
> >> json
> >> > > > > > serialization is the easiest way to serialize complex types.
> >> > > > > >
> >> > > > > > There’s also a kip in progress to add a list serde. You might
> >> take a
> >> > > > look
> >> > > > > > at that proposal for ideas to write your own.
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > > John
> >> > > > > >
> >> > > > > > On Thu, May 7, 2020, at 08:17, Nicolas Carlot wrote:
> >> > > > > > > Won't say it's a good idea to use java serialized classes
> for
> >> > > > messages,
> >> > > > > > but
> >> > > > > > > you should use a byteArraySerializer if you want to do such
> >> things
> >> > > > > > >
> >> > > > > > > Le jeu. 7 mai 2020 à 14:32, Pushkar Deole <
> >> pdeole2...@gmail.com> a
> >> > > > > > écrit :
> >> > > > > > >
> >> > > > > > > > Hi All,
> >> > > > > > > >
> >> > > > > > > > I have a requirement to store a record with key as java
> >> String and
> >> > > > > > value as
> >> > > > > > > > java's ArrayList in the kafka topic. Kafka has by default
> >> provided
> >> > > > a
> >> > > > > > > > StringSerializer and StringDeserializer, however for java
> >> > > > ArrayList,
> >> > > > > > how
> >> > > > > > > > can get serializer. Do I need to write my own? Can someone
> >> share if
> >> > > > > > someone
> >> > > > > > > > already has written one?
> >> > > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > --
> >> > > > > > > *Nicolas Carlot*
> >> > > > > > >
> >> > > > > > > Lead dev
> >> > > > > > > |  | nicolas.car...@chronopost.fr
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > *Please note that as of 20 May, the Chronopost head office
> >> > > > > > > is moving. The new address is: 3 boulevard Romain Rolland,
> >> > > > > > > 75014 Paris*
> >> > > > > > >
> >> > > > > > > | chronopost.fr
> >> > > > > > > Follow us on Facebook <https://fr-fr.facebook.com/chronopost>
> >> > > > > > > and Twitter.
> >> > > > > > > [image: DPD Group]
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>


RE: Offset Management...

2020-05-12 Thread Rajib Deb
Thanks Bill, my apologies, I did not elaborate my use case. 

In my use case, the data from Cassandra is pushed to Kafka, and then we consume 
from Kafka into Snowflake. Once we push the data to Snowflake, we do not want to 
go back to the source (Cassandra) to pull the data again. There are occasions 
where we are asked to pull the data for a certain date and time; I thought 
storing the offset would help with that use case. The other item is our 
validation framework. We need to validate that we are processing all the rows 
that Cassandra is pushing to Kafka, so the validation program needs to look at 
the number of rows in Cassandra for a particular key and see if we have that 
many messages in Kafka and Snowflake for that key.


Thanks
Rajib

-Original Message-
From: Bill Bejeck  
Sent: Tuesday, May 12, 2020 7:41 AM
To: users@kafka.apache.org
Subject: Re: Offset Management...

Hi Rajib,

Generally, it's best to let Kafka handle the offset management.
Under normal circumstances, when you restart a consumer, it will start reading 
records from the last committed offset, there's no need for you to manage that 
process yourself.
If you need to manually commit records vs. using auto-commit, then you can use 
one of the commit API methods, commitSync or commitAsync.

-Bill


On Mon, May 11, 2020 at 9:52 PM Rajib Deb  wrote:

> Hi, I wanted to know if it is a good practice to develop a custom 
> offset management method while consuming from Kafka. I am thinking to 
> develop it as below.
>
>
>   1.  Create a PartitionInfo named tuple as below
>
> PartitionInfo("PartitionInfo", ["header", "custom writer", "offset"])
>
>   2.  Then populate the tuple with the header, writer and last offset 
> details
>   3.  Write the tuple in a file/database once the consumer commits the 
> message
>   4.  Next time when the consumer starts, it checks the last offset and 
> reads from there
>
> Thanks
> Rajib
>
>


RE: EXTERNAL: Re: Separate Kafka partitioning from key compaction

2020-05-12 Thread Young, Ben
I'm not sure that's feasible in this case, but I'll have a look!

Thanks,
Ben

-Original Message-
From: Liam Clarke-Hutchinson 
Sent: 06 May 2020 19:47
To: users@kafka.apache.org
Subject: EXTERNAL: Re: Separate Kafka partitioning from key compaction

Could you deploy a Kafka Streams app that implemented your desired 
partitioning? Obviously this would require a duplication in topics between 
those produced to initially, and those partitioned the way you'd like, but it 
would solve the issue you're having.
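
If the Streams route works for you, the repartitioning step itself can be as
small as a StreamPartitioner that hashes only the key prefix (a sketch built
on Ben's "msg1:1/3" example; topic names are invented, and note it uses
String.hashCode rather than the murmur2 hash Kafka's default partitioner
uses, so producers writing directly with full keys would still diverge):

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.processor.StreamPartitioner;

public class PrefixRepartition {
    public static void build(StreamsBuilder builder) {
        // Send "msg1:1/3", "msg1:2/3", "msg1:3/3" to the partition chosen by
        // "msg1" alone, while compaction still sees the full key.
        StreamPartitioner<String, String> byPrefix = (topic, key, value, numPartitions) ->
            (key.split(":", 2)[0].hashCode() & 0x7fffffff) % numPartitions;
        builder.stream("raw-topic", Consumed.with(Serdes.String(), Serdes.String()))
               .to("repartitioned-topic",
                   Produced.with(Serdes.String(), Serdes.String())
                           .withStreamPartitioner(byPrefix));
    }
}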



On Wed, 6 May 2020, 10:25 pm Young, Ben, 
wrote:

> Hi,
>
> We have a use case where we'd like the partition a key is hashed to,
> to be a subset of the keys that are used for compaction. It would be
> really cool if there was a built in hashing strategy that could help
> us as we're potentially using Kafka from multiple languages and it
> could be hard to standardise.
>
> For instance, large message processing. We have keys like "msg1:1/3",
> "msg1:2/3" and "msg1:3/3". We'd like all messages to be retained by
> compaction, but all of these messages to go to the same partition...
> There's lots of similar use cases where we'd like compaction to keep
> more than we'd use for partitioning.
>
> Obviously we could write our own hashing etc but that's hard when our
> main producers and consumers are in C# and we want to integration with KSQL 
> etc.
>
> My desert island solution would be to have the partition key
> optionally called out by something like braces in Redis
> (https://redislabs.com/blog/redis-clustering-best-practices-with-keys/),
> but the whole key used for log compaction. This wouldn't be backwards 
> compatible so I guess it would have to be a new strategy...
>
> Does anyone else have requirements like this? How have they solved them?
>
> Thanks,
> Ben Young

Re: Offset Management...

2020-05-12 Thread Bill Bejeck
Hi Rajib,

Generally, it's best to let Kafka handle the offset management.
Under normal circumstances, when you restart a consumer, it will start
reading records from the last committed offset, there's no need for you to
manage that process yourself.
If you need to manually commit records vs. using auto-commit, then you can use
one of the commit API methods, commitSync or commitAsync.
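
For example, a manual-commit loop looks like this (a sketch: it assumes
enable.auto.commit=false in props, and process() stands in for your own sink
logic, e.g. the write to Snowflake):

import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(Collections.singletonList("my-topic"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            process(record);
        }
        // Offsets advance only after the whole batch has been handled.
        consumer.commitSync();
    }
}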

-Bill


On Mon, May 11, 2020 at 9:52 PM Rajib Deb  wrote:

> Hi, I wanted to know if it is a good practice to develop a custom offset
> management method while consuming from Kafka. I am thinking to develop it
> as below.
>
>
>   1.  Create a PartitionInfo named tuple as below
>
> PartitionInfo("PartitionInfo", ["header", "custom writer", "offset"])
>
>   2.  Then populate the tuple with the header, writer and last offset
> details
>   3.  Write the tuple in a file/database once the consumer commits the
> message
>   4.  Next time when the consumer starts, it checks the last offset and reads
> from there
>
> Thanks
> Rajib
>
>


Re: Merging multiple streams into one

2020-05-12 Thread Bill Bejeck
Hi Alessandro,

For merging the three streams, have you considered the `KStream.merge`
method?
If the values are of different types, you'll need to map them into a common
type first, but I think
something like this will work:

KStream mappedOne = originalStreamOne.mapValues(...);
KStream mappedTwo = originalStreamTwo.mapValues(...);
KStream mappedThree = originalStreamThree.mapValues(...);

KStream mergedStream = mappedOne.merge(mappedTwo).merge(mappedThree);

Just keep in mind there are no ordering guarantees for the records of the
merged streams.  Also, I made the assumption the keys are
of the same type, if not, then you'll have to change the `mapValues` call
to a `map`.

HTH,
Bill

On Mon, May 11, 2020 at 11:02 PM Alessandro Tagliapietra <
tagliapietra.alessan...@gmail.com> wrote:

> Hello everyone,
>
> we currently use 3 streams (metrics, events, states) and I need to
> implement a keepalive mechanism so that if the machine doesn't send any
> data (from a specific list of variables) it'll emit a value that changes
> the machine state.
>
> For example, in machine 1 the list of keepalive variables are "foo" and
> "bar", to propagate the list of variables I use a configuration topic that
> uses a GlobalKTable so that each stream application can read the machine's
> configuration.
>
> Then my idea was to "merge" all the three streams into one so that I can
> use a ValueTransformer to:
>  - read the configuration store and ignore messages that don't belong to
> the configured variables for a machine
>  - use a local state store to save the "last seen" of a machine
>  - use the punctuator to emit a "change" in the machine status if the "last
> seen" of a machine is older than some time
>
> To "merge' the 3 streams I was thinking to just map them into a single
> intermediate topic and have the ValueTransformer read from that.
>
> Is there a better way? Maybe without using an intermediate topic?
>
> Thank you in advance
>
> --
> Alessandro Tagliapietra
>


Re: data structures used by GlobalKTable, KTable

2020-05-12 Thread Liam Clarke-Hutchinson
Hi Pushkar,

GlobalKTables and KTables can have whatever data structure you like, if you
provide the appropriate deserializers - for example, a Kafka Streams app I
maintain stores model data (exported to a topic per entity from Postgres
via Kafka Connect's JDBC Source) as a GlobalKTable of Jackson ObjectNodes
keyed by entity id.

If you're worried about efficiency, just treat KTables/GlobalKTables as a
HashMap and you're pretty much there. In terms of efficiency,
we're joining model data to about 7 - 10 TB of transactional data a day,
and on average, run about 5 - 10 instances of our enrichment app with about
2GB max heap.
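
In code, that enrichment boils down to a stream-GlobalKTable join, roughly
like this (a sketch, not our actual topology; topic names are invented,
enrich() stands in for your own combiner, and it assumes JSON serdes are
configured as the defaults):

import com.fasterxml.jackson.databind.JsonNode;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;

StreamsBuilder builder = new StreamsBuilder();
// Model data keyed by entity id; lookups hit the local store, much like a
// HashMap.get().
GlobalKTable<String, JsonNode> models = builder.globalTable("entity-models");
KStream<String, JsonNode> events = builder.stream("transactions");
KStream<String, JsonNode> enriched = events.join(
    models,
    (eventKey, event) -> event.get("entityId").asText(), // pick the lookup key
    (event, model) -> enrich(event, model));             // combine the values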

Kind regards,

Liam "Not a part of the Confluent team, but happy to help" Clarke-Hutchinson

On Tue, May 12, 2020 at 9:35 PM Pushkar Deole  wrote:

> Hello confluent team,
>
> Could you provide some information on what data structures are used
> internally by GlobalKTable and KTables. The application that I am working
> on has a requirement to read cached data from GlobalKTable on every
> incoming event, so the reads from GlobalKTable need to be efficient.
>


data structures used by GlobalKTable, KTable

2020-05-12 Thread Pushkar Deole
Hello confluent team,

Could you provide some information on what data structures are used
internally by GlobalKTable and KTables. The application that I am working
on has a requirement to read cached data from GlobalKTable on every
incoming event, so the reads from GlobalKTable need to be efficient.


Re: Custom Connector

2020-05-12 Thread Tom Bentley
Hi Vishnu,

I'm no expert on the Connector ecosystem, but I'm not aware of any source
connector which does that for some arbitrary (i.e. configurable) HTTP
endpoint. I suppose that's due to the difficulty of making it configurable
over the space of all endpoint behaviour (e.g. http methods, request paths,
request parameters, responses, errors etc). If you have no requirements on
the HTTP side it could be that an HTTP proxy is a better option for you (a
web search for "kafka http proxy" yields multiple options, but the
confluent one is most popular). On the other hand, if you have requirements
on the HTTP interaction (e.g. http methods, request paths, request
parameters, responses, errors etc) then writing your own connector is
probably the best option.

Kind regards,

Tom

On Tue, May 12, 2020 at 9:11 AM vishnu murali 
wrote:

> Thanks Tom
>
> I am receiving data from one REST endpoint and want to post that data from
> the endpoint to a topic.
>
>
> Is it possible, or is there any other connector available for that?
>
> On Tue, May 12, 2020, 13:24 Tom Bentley  wrote:
>
> > Hi Vishnu,
> >
> > These are probably a good place to start:
> > 1. https://docs.confluent.io/current/connect/devguide.html
> > 2.
> >
> >
> https://www.confluent.io/blog/create-dynamic-kafka-connect-source-connectors/
> >
> > Cheers,
> >
> > Tom
> >
> >
> > On Tue, May 12, 2020 at 7:34 AM vishnu murali <
> vishnumurali9...@gmail.com>
> > wrote:
> >
> > > Hi Guys,
> > >
> > > I am trying to create a new connector for my own purpose.
> > >
> > > Is there any guide or document which shows how to create my own
> > > connector and use it?
> > >
> >
>


Re: Custom Connector

2020-05-12 Thread vishnu murali
Thanks Tom

I am receiving data from one REST endpoint and want to post that data from
the endpoint to a topic.


Is it possible, or is there any other connector available for that?

On Tue, May 12, 2020, 13:24 Tom Bentley  wrote:

> Hi Vishnu,
>
> These are probably a good place to start:
> 1. https://docs.confluent.io/current/connect/devguide.html
> 2.
>
> https://www.confluent.io/blog/create-dynamic-kafka-connect-source-connectors/
>
> Cheers,
>
> Tom
>
>
> On Tue, May 12, 2020 at 7:34 AM vishnu murali 
> wrote:
>
> > Hi Guys,
> >
> > I am trying to create a new connector for my own purpose.
> >
> > Is there any guide or document which shows how to create my own
> > connector and use it?
> >
>


Re: Custom Connector

2020-05-12 Thread Tom Bentley
Hi Vishnu,

These are probably a good place to start:
1. https://docs.confluent.io/current/connect/devguide.html
2.
https://www.confluent.io/blog/create-dynamic-kafka-connect-source-connectors/

Cheers,

Tom


On Tue, May 12, 2020 at 7:34 AM vishnu murali 
wrote:

> Hi Guys,
>
> I am trying to create a new connector for my own purpose.
>
> Is there any guide or document which shows how to create my own connector
> and use it?
>


Need a connector to listening Rest Endpoint

2020-05-12 Thread vishnu murali
Hi friends,

I have a REST endpoint, and data is arriving at that endpoint
continuously.

I need to send that data to a Kafka topic.

I want to solve the above scenario using a connector, because I don't want
to run another application just to receive data from REST and send it to
Kafka.

Instead of a separate application that runs and pushes the data into the
topic through the REST service, is there any connector available that can
listen to that endpoint and automatically push into a topic?