Re: Request for adding into contributors list

2020-05-01 Thread Jun Wang
Thank you!



From: Matthias J. Sax
Sent: Friday, May 1, 2020 4:29 PM
To: users@kafka.apache.org
Subject: Re: Request for adding into contributors list

Done.

On 5/1/20 11:21 AM, Jun Wang wrote:
> Hi,
>
> Please add my JIRA ID into the contributors list of Apache Kafka.
>
> Here is my JIRA profile:
>
> Username: wj1918
> Full name: Jun Wang
>
> Thanks
>



Re: Request for adding into contributors list

2020-05-01 Thread Matthias J. Sax
Done.

On 5/1/20 11:21 AM, Jun Wang wrote:
> Hi,
> 
> Please add my JIRA ID into the contributors list of Apache Kafka.
> 
> Here is my JIRA profile:
> 
> Username: wj1918
> Full name: Jun Wang
> 
> Thanks
> 





Re: can kafka state stores be used as a application level cache by application to modify it from outside the stream topology?

2020-05-01 Thread Matthias J. Sax
Both store types serve a different purpose.

Regular stores allow you to store state the application computes.
Writing into the changelog is a fault-tolerance mechanism.

Global stores hold "auxiliary" data that is provided from "outside" of the
app. There is no changelog topic, only the input topic (which is used
to re-create the global state).

Local stores are sharded and updates are "sync" as they don't need to be
shared with anybody else.

For global stores, as all instances need to be updated, updates are
async (we don't know when each instance will update its own global
store replica).

>> Say one stream thread updates the topic for global store and starts
>> processing next event wherein the processor tries to read the global store
>> which may not have been synced with the topic?

Correct. There is no guarantee when the update to the global store will
be applied. As said, global stores are not designed to hold data the
application computes.


-Matthias
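[Editorial note: the async update flow described above can be sketched in a few lines of Kafka Streams code. The topic and variable names below are made-up examples for illustration, not from the thread.]

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;

public class GlobalStoreSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // A GlobalKTable replicates the entire "config-topic" to every
        // application instance; updates flow in asynchronously as each
        // instance consumes the input topic.
        GlobalKTable<String, String> global = builder.globalTable(
                "config-topic",
                Consumed.with(Serdes.String(), Serdes.String()));

        // To update all replicas, write to "config-topic" with a plain
        // producer from outside the topology; each instance applies the
        // record whenever it reads it (async, as noted above).
    }
}
```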


On 4/30/20 11:11 PM, Pushkar Deole wrote:
> thanks... will try with GlobalKTable.
> As a side question, I didn't really understand the significance of global
> state store which kind of works in a reverse way to local state store i.e.
> local state store is updated and then saved to changelog topic whereas in
> case of global state store the topic is updated first and then synced to
> global state store. Do these two work in sync i.e. the update to topic and
> global state store ?
> 
> Say one stream thread updates the topic for global store and starts
> processing next event wherein the processor tries to read the global store
> which may not have been synced with the topic?
> 
> On Fri, May 1, 2020 at 3:35 AM Matthias J. Sax  wrote:
> 
>> Yes.
>>
>> A `GlobalKTable` uses a global store internally.
>>
>> You can also use `StreamsBuilder.addGlobalStore()` or
>> `Topology.addGlobalStore()` to add a global store "manually".
>>
>>
>> -Matthias
>>
>>
>> On 4/30/20 7:42 AM, Pushkar Deole wrote:
>>> Thanks Matthias.
>>> Can you elaborate on the replicated caching layer part?
>>> When you say global stores, do you mean GlobalKTable created from a topic
>>> e.g. using StreamsBuilder.globalTable(String topic) method ?
>>>
>>> On Thu, Apr 30, 2020 at 12:44 PM Matthias J. Sax 
>> wrote:
>>>
 It's not possible to modify state store from "outside".

 If you want to build a "replicated caching layer", you could use global
 stores and write into the corresponding topics to update all stores. Of
 course, those updates would be async.


 -Matthias

 On 4/29/20 10:52 PM, Pushkar Deole wrote:
> Hi All,
>
> I am wondering if this is possible: i have been asked to use state
>> stores
> as a general replicated cache among multiple instances of service
 instances
> however the state store is created through streambuilder but is not
> actually modified through stream processor topology however it is to be
> modified from outside the stream topology. So, essentially, the state
 store
> is just to be created from streambuilder and then to be used as an
> application level cache that will get replicated between application
> instances. Is this possible using state stores?
>
> Secondly, if possible, is this a good design approach?
>
> Appreciate your response since I don't know the internals of state
 stores.
>


>>>
>>
>>
> 





Request for adding into contributors list

2020-05-01 Thread Jun Wang
Hi,

Please add my JIRA ID into the contributors list of Apache Kafka.

Here is my JIRA profile:

Username: wj1918
Full name: Jun Wang

Thanks


Re: Connector For MirrorMaker

2020-05-01 Thread vishnu murali
Hi Robin

I am using Apache Kafka; there is a service called kafka-mirror-maker.bat
that takes consumer and producer properties to copy a topic from one
cluster to another.

I want to do that by using a connector.

I wasn't aware of MirrorMaker 2, and I don't know how to download and
configure it with Apache Kafka.

Can you guide me on how to start with the MirrorMaker 2 connector?
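[Editorial note: since Apache Kafka 2.4, MirrorMaker 2 ships with the distribution and runs as a set of Connect-based connectors. A minimal sketch of a config for the two clusters discussed later in this thread (ports 9092 -> 9091; the cluster aliases are made up):]

```properties
# config/connect-mirror-maker.properties (sketch)
clusters = source, dest
source.bootstrap.servers = localhost:9092
dest.bootstrap.servers = localhost:9091

# enable source -> dest replication and choose which topics to mirror
source->dest.enabled = true
source->dest.topics = .*
```

This would be started with `bin/connect-mirror-maker.sh config/connect-mirror-maker.properties` (or the corresponding script under bin\windows on Windows).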

On Fri, May 1, 2020, 19:13 Robin Moffatt  wrote:

> Are you talking about MirrorMaker 2? That runs as a connector.
>
> If not, perhaps you can clarify your question a bit as to what it is you
> are looking for.
>
>
> --
>
> Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
>
>
> On Fri, 1 May 2020 at 13:57, vishnu murali 
> wrote:
>
> > Hi Guys
> >
> > Previously I asked question about the Mirror maker and it is solved now.
> >
> > So Now I need to know is there any connectors available for that same.
> >
> > Like JdbcConnector acts as a source and sink for DB connection is there
> any
> > connector available for performing mirror operations
> >
> >  or
> >
> > does some one having own created connectors for this purpose??
> >
>


Re: Connector For MirrorMaker

2020-05-01 Thread Robin Moffatt
Are you talking about MirrorMaker 2? That runs as a connector.

If not, perhaps you can clarify your question a bit as to what it is you
are looking for.


-- 

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Fri, 1 May 2020 at 13:57, vishnu murali 
wrote:

> Hi Guys
>
> Previously I asked question about the Mirror maker and it is solved now.
>
> So Now I need to know is there any connectors available for that same.
>
> Like JdbcConnector acts as a source and sink for DB connection is there any
> connector available for performing mirror operations
>
>  or
>
> does some one having own created connectors for this purpose??
>


Connector For MirrorMaker

2020-05-01 Thread vishnu murali
Hi Guys

Previously I asked a question about MirrorMaker, and it is solved now.

So now I need to know: are there any connectors available for the same?

Just as JdbcConnector acts as a source and sink for DB connections, is
there any connector available for performing mirror operations,

or

does someone have connectors of their own created for this purpose?


Re: Kafka: Messages disappearing from topics, largestTime=0

2020-05-01 Thread Goran Sliskovic
 
Yes, that's a clean shutdown log, with a few exceptions that are expected
(connected clients get disconnected during shutdown). Adding an fsync after
Kafka shutdown should force the OS to flush buffers to disk. I somehow
suspect there are some problems during unmounting/mounting the disks;
however, I don't know much about that process. We need the startup logs,
because exceptions there are unexpected.

General notes: such things are difficult to trace, so setting up a staging
environment (a test copy of the production system) is a must. Then you can
experiment freely. There is an option in Kafka to force a flush on every
message (but it can have a serious performance impact). I'd test that on
staging; it is not an option you want to use in production, but it might
help diagnose the issue. Link:
https://stackoverflow.com/questions/33970374/need-to-understand-kafka-broker-property-log-flush-interval-messages

You may also want to check for filesystem errors (fsck).
You should not copy live filesystems (with the cp command, for example)
while applications operate; you need at least a crash-consistent copy.
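[Editorial note: the per-message flush option mentioned above is a broker-side setting. A sketch of the relevant server.properties knobs, for diagnostic use only:]

```properties
# server.properties - force frequent fsyncs (diagnostic use only;
# serious throughput impact, not recommended for production)
log.flush.interval.messages=1
# or flush on a time interval instead:
log.flush.interval.ms=1000
```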



On Friday, May 1, 2020, 01:54:49 AM GMT+2, Liam Clarke-Hutchinson 
 wrote:  
 
 So the logs show a healthy shutdown, so we can eliminate that as an issue.
I would look next at the volume management during a rollout based on the
other error messages you had earlier about permission denied etc. It's
possible there's some journalled but not flushed changes in those time
indexes, but at this point we're getting into filesystem internals which
aren't my forte. But if you can temporarily disable the volume switching
and do a test roll out, see if you get the same problems or not, would help
eliminate it or confirm it.

Sorry I can't help further on that.
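[Editorial note: the time indexes being discussed can be inspected directly with Kafka's bundled DumpLogSegments tool; the data directory path and segment name below are illustrative.]

```
# Print the entries of a segment's time index to check for
# largestTime=0 / mismatched timestamps (paths are examples)
bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
  --files /var/lib/kafka/data/my-topic-13/00000000000000000000.timeindex
```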

On Fri, May 1, 2020 at 5:34 AM JP MB  wrote:

> I took a bit because I needed logs of the server shutting down when this
> occurs. Here they are, I can see some errors:
> https://gist.github.com/josebrandao13/e8b82469d3e9ad91fbf38cf139b5a726
>
> Regarding systemd, the closest I could find to TimeoutStopSec was
> DefaultTimeoutStopUSec=1min 30s that looks to be 90seconds. I could not
> find any KillSignal or RestartKillSignal. You can see the output of
> systemctl show --all here:
> https://gist.github.com/josebrandao13/f2dd646fab19b19f127981fce92d78c4
>
> Once again, thanks for the help.
>
> Em qui., 30 de abr. de 2020 às 15:04, Liam Clarke-Hutchinson <
> liam.cla...@adscale.co.nz> escreveu:
>
> > I'd also suggest eyeballing your systemd conf to verify that someone
> hasn't
> > set a very low TimeoutStopSec, or that KillSignal/RestartKillSignal
> haven't
> > been configured to SIGKILL (confusingly named, imo, as the default for
> > KillSignal is SIGTERM).
> >
> > Also, the Kafka broker logs at shutdown look very different if it shut
> > down correctly vs if it didn't. Could you perhaps put them in a Gist and
> > email the link?
> >
> > Just trying to make sure basic assumptions are holding :)
> >
> > On Fri, 1 May 2020, 1:21 am JP MB,  wrote:
> >
> > > Hi,
> > > It's quite a complex script generated with ansible where we use a/b
> > > deployment and honestly, I don't have full knowledge on it I can share
> > the
> > > general guidelines of what is done:
> > >
> > > > - Any old volumes (from previous releases) are removed (named with suffix '-old')
> > > > - Detach the volumes attached to the old host
> > > > - Stop the service on the old host - uses systemctl stop kafka
> > > > - Attempt to create a CNAME volume: this is a volume with the same name that will be attached to the new box. Except for the very first run, this task is used to get the information about the existing volume. (no suffix)
> > > > - A new volume is created as a copy of the CNAME volume (named with suffix '-new')
> > > > - The new volume is attached to the host/VM (named with suffix '-new')
> > > > - The new volume is formatted (except for the very first run, it's already formatted) (named with suffix '-new')
> > > > - The new volume is mounted (named with suffix '-new')
> > > > - Start the service on the new host - uses systemctl start kafka
> > > > - If everything went well stopping/starting services:
> > > >    - The volume on the old host is renamed with suffix '-old'.
> > > >    - The new volume is renamed, stripping the suffix '-new'.
> > >
> > >
> > > I made a new experiment today with some interesting findings. Had 518
> > > messages in a given topic, after a deployment lost 9 due to this
> problem
> > in
> > > partitions 13,15,16 and 17. All the errors I could find in the time
> > > index files before the deployment (left is partition number):
> > >
> > > > 11 -> timestamp mismatch on 685803 - offsets from 685801 to 685805, no message loss here
> > > > 12 -> -1 error, no indexes on the log - base segment was the last offset, so OK
> > > > 13 -> timestamp mismatch error on 823168 - offsets from 323168 to 823172, four messages lost
> > 

Re: Cant Able to start Kafka MirrorMaker

2020-05-01 Thread nitin agarwal
There is a typo in your consumer configuration; it should
be auto.offset.reset.

Thanks,
Nitin
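[Editorial note: for reference, the corrected consumer config with the properly spelled key would be:]

```properties
# consumer.properties
bootstrap.servers=localhost:9092
group.id=test-consumer-group
auto.offset.reset=earliest
```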

On Fri, May 1, 2020 at 12:28 PM vishnu murali 
wrote:

> Hey Guys,
>
> I am trying to move data between one cluster to another cluster
>
>
>
>              Source    Destination
> Zookeeper    2181      2182
> Kafka        9092      9091
>
>
> *ConsumerProperties:*
>
> bootstrap.servers=localhost:9092
> group.id=test-consumer-group
> auto.offset.rest=earliest
>
>
>
> *Producer Properties:*
> bootstrap.servers=localhost:9091
> compression.type=none
>
>
> i am having topic in 9092 as actor which is from MySQL Sakila Schema actor
> table
>
>
> In the 9091 i don't have any topic ,so i try to migrate data from 9092
> ->9091 it is showing like
>
>
> D:\kafka>.\bin\windows\kafka-mirror-maker.bat --consumer.config
> .\config\consumer.properties --producer.config .\config\producer.properties
> --whitelist actor
>
>
> WARNING: The default partition assignment strategy of the mirror maker will
> change from 'range' to 'roundrobin' in an upcoming release (so that better
> load balancing can be achieved). If you prefer to make this switch in
> advance of that release add the following to the corresponding config:
>
> 'partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor'
>
>
> [2020-05-01 12:04:55,024] WARN The configuration 'auto.offset.rest' was
> supplied but isn't a known config.
> (org.apache.kafka.clients.consumer.ConsumerConfig)
>
>
> *it's stucked in that place itself no more operations is performed*
>
>
> and the topic is not copied into 9091
>
>
> i dont know why
>
>
> can anyone able to identify and explain me a way to do it please?
>


Cant Able to start Kafka MirrorMaker

2020-05-01 Thread vishnu murali
Hey Guys,

I am trying to move data between one cluster to another cluster



             Source    Destination
Zookeeper    2181      2182
Kafka        9092      9091



*ConsumerProperties:*

bootstrap.servers=localhost:9092
group.id=test-consumer-group
auto.offset.rest=earliest



*Producer Properties:*
bootstrap.servers=localhost:9091
compression.type=none


I have a topic named actor in 9092, which is from the MySQL Sakila schema's
actor table.


In 9091 I don't have any topics, so when I try to migrate data from 9092
-> 9091 it shows:


D:\kafka>.\bin\windows\kafka-mirror-maker.bat --consumer.config
.\config\consumer.properties --producer.config .\config\producer.properties
--whitelist actor


WARNING: The default partition assignment strategy of the mirror maker will
change from 'range' to 'roundrobin' in an upcoming release (so that better
load balancing can be achieved). If you prefer to make this switch in
advance of that release add the following to the corresponding config:
'partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor'


[2020-05-01 12:04:55,024] WARN The configuration 'auto.offset.rest' was
supplied but isn't a known config.
(org.apache.kafka.clients.consumer.ConsumerConfig)


*It gets stuck at that point; no further operations are performed.*


and the topic is not copied into 9091


I don't know why.


Can anyone identify the issue and explain a way to do it, please?


Re: Kafka: Messages disappearing from topics, largestTime=0

2020-05-01 Thread JP MB
Thank you very much for the help anyway.

Best regards

On Fri, May 1, 2020, 00:54 Liam Clarke-Hutchinson 
wrote:

> So the logs show a healthy shutdown, so we can eliminate that as an issue.
> I would look next at the volume management during a rollout based on the
> other error messages you had earlier about permission denied etc. It's
> possible there's some journalled but not flushed changes in those time
> indexes, but at this point we're getting into filesystem internals which
> aren't my forte. But if you can temporarily disable the volume switching
> and do a test roll out, see if you get the same problems or not, would help
> eliminate it or confirm it.
>
> Sorry I can't help further on that.
>
> On Fri, May 1, 2020 at 5:34 AM JP MB  wrote:
>
> > I took a bit because I needed logs of the server shutting down when this
> > occurs. Here they are, I can see some errors:
> > https://gist.github.com/josebrandao13/e8b82469d3e9ad91fbf38cf139b5a726
> >
> > Regarding systemd, the closest I could find to TimeoutStopSec was
> > DefaultTimeoutStopUSec=1min 30s that looks to be 90seconds. I could not
> > find any KillSignal or RestartKillSignal. You can see the output of
> > systemctl show --all here:
> > https://gist.github.com/josebrandao13/f2dd646fab19b19f127981fce92d78c4
> >
> > Once again, thanks for the help.
> >
> > Em qui., 30 de abr. de 2020 às 15:04, Liam Clarke-Hutchinson <
> > liam.cla...@adscale.co.nz> escreveu:
> >
> > > I'd also suggest eyeballing your systemd conf to verify that someone
> > hasn't
> > > set a very low TimeoutStopSec, or that KillSignal/RestartKillSignal
> > haven't
> > > been configured to SIGKILL (confusingly named, imo, as the default for
> > > KillSignal is SIGTERM).
> > >
> > > Also, the Kafka broker logs at shutdown look very different if it shut
> > > down correctly vs if it didn't. Could you perhaps put them in a Gist
> > > and email the link?
> > >
> > > Just trying to make sure basic assumptions are holding :)
> > >
> > > On Fri, 1 May 2020, 1:21 am JP MB,  wrote:
> > >
> > > > Hi,
> > > > It's quite a complex script generated with ansible where we use a/b
> > > > deployment and honestly, I don't have full knowledge on it I can
> share
> > > the
> > > > general guidelines of what is done:
> > > >
> > > > > - Any old volumes (from previous releases) are removed (named with suffix '-old')
> > > > > - Detach the volumes attached to the old host
> > > > > - Stop the service on the old host - uses systemctl stop kafka
> > > > > - Attempt to create a CNAME volume: this is a volume with the same name that will be attached to the new box. Except for the very first run, this task is used to get the information about the existing volume. (no suffix)
> > > > > - A new volume is created as a copy of the CNAME volume (named with suffix '-new')
> > > > > - The new volume is attached to the host/VM (named with suffix '-new')
> > > > > - The new volume is formatted (except for the very first run, it's already formatted) (named with suffix '-new')
> > > > > - The new volume is mounted (named with suffix '-new')
> > > > > - Start the service on the new host - uses systemctl start kafka
> > > > > - If everything went well stopping/starting services:
> > > > >    - The volume on the old host is renamed with suffix '-old'.
> > > > >    - The new volume is renamed, stripping the suffix '-new'.
> > > >
> > > >
> > > > I made a new experiment today with some interesting findings. Had 518
> > > > messages in a given topic, after a deployment lost 9 due to this
> > problem
> > > in
> > > > partitions 13,15,16 and 17. All the errors I could find in the time
> > > > index files before the deployment (left is partition number):
> > > >
> > > > > 11 -> timestamp mismatch on 685803 - offsets from 685801 to 685805, no message loss here
> > > > > 12 -> -1 error, no indexes on the log - base segment was the last offset, so OK
> > > > > 13 -> timestamp mismatch error on 823168 - offsets from 323168 to 823172, four messages lost
> > > > > 14 -> timestamp mismatch on 619257 - offsets from 619253 to 619258, no message loss here
> > > > > 15 -> timestamp mismatch on 658783 - offsets from 658783 to 658784, one message missing
> > > > > 16 -> timestamp mismatch on 623508 - offsets from 623508 to 623509, one message missing
> > > > > 17 -> timestamp mismatch on 515479 - offsets from 515479 to 515481, two messages missing
> > > >
> > > >
> > > > After the deployment, I took a look and the state was this:
> > > >
> > > > > 11 -> timestamp mismatch error on 685803 -   same state
> > > > > 12 -> -1 error no indexes on the log - same state
> > > > > 13 -> Exception in thread "main" java.io.IOException: Permission denied
> > > > > 14 -> timestamp mismatch error on 619257 - same state
> > > > > 15 -> Exception in thread "main" 

Re: can kafka state stores be used as a application level cache by application to modify it from outside the stream topology?

2020-05-01 Thread Pushkar Deole
thanks... will try with GlobalKTable.
As a side question, I didn't really understand the significance of global
state store which kind of works in a reverse way to local state store i.e.
local state store is updated and then saved to changelog topic whereas in
case of global state store the topic is updated first and then synced to
global state store. Do these two work in sync i.e. the update to topic and
global state store ?

Say one stream thread updates the topic for global store and starts
processing next event wherein the processor tries to read the global store
which may not have been synced with the topic?

On Fri, May 1, 2020 at 3:35 AM Matthias J. Sax  wrote:

> Yes.
>
> A `GlobalKTable` uses a global store internally.
>
> You can also use `StreamsBuilder.addGlobalStore()` or
> `Topology.addGlobalStore()` to add a global store "manually".
>
>
> -Matthias
>
>
> On 4/30/20 7:42 AM, Pushkar Deole wrote:
> > Thanks Matthias.
> > Can you elaborate on the replicated caching layer part?
> > When you say global stores, do you mean GlobalKTable created from a topic
> > e.g. using StreamsBuilder.globalTable(String topic) method ?
> >
> > On Thu, Apr 30, 2020 at 12:44 PM Matthias J. Sax 
> wrote:
> >
> >> It's not possible to modify state store from "outside".
> >>
> >> If you want to build a "replicated caching layer", you could use global
> >> stores and write into the corresponding topics to update all stores. Of
> >> course, those updates would be async.
> >>
> >>
> >> -Matthias
> >>
> >> On 4/29/20 10:52 PM, Pushkar Deole wrote:
> >>> Hi All,
> >>>
> >>> I am wondering if this is possible: i have been asked to use state
> stores
> >>> as a general replicated cache among multiple instances of service
> >> instances
> >>> however the state store is created through streambuilder but is not
> >>> actually modified through stream processor topology however it is to be
> >>> modified from outside the stream topology. So, essentially, the state
> >> store
> >>> is just to be created from streambuilder and then to be used as an
> >>> application level cache that will get replicated between application
> >>> instances. Is this possible using state stores?
> >>>
> >>> Secondly, if possible, is this a good design approach?
> >>>
> >>> Appreciate your response since I don't know the internals of state
> >> stores.
> >>>
> >>
> >>
> >
>
>