Re: Kafka mirror maker help

2018-04-26 Thread Amrit Jangid
You should share related info, such as the source and destination Kafka versions,
a sample config, or the error, if any.

FYI,  Go through
https://kafka.apache.org/documentation/#basic_ops_mirror_maker


Kafka mirror maker help

2018-04-26 Thread saravanan kannappan
Hello,

I'm setting up MirrorMaker for Kafka, however it isn't working.
Can you guide me through the steps, or share a model shell script for running
MirrorMaker?

Your help is highly appreciated.

Thanks
Sara


Re: Monitoring and Alerting

2018-04-26 Thread Arunkumar
Hi All,
Thank you for your time and effort. Everyone has suggested that I look at Prometheus
etc., but I am not looking for a monitoring solution. I need the metrics on which I can do
alerting; we are implementing a custom alerting mechanism. I am using JMX to get all the
metrics but, as many of you suggested, I do not need all of them. I need very few to alert
on, and I need details about those metrics so that I can create alerts on them.
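For context, here is a minimal sketch of polling one such MBean over JMX, which is roughly
what a custom alerting check needs to do. The host, port, and threshold below are placeholder
assumptions, and the broker must have been started with JMX enabled (e.g. via JMX_PORT):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class UnderReplicatedPartitionsCheck {

    public static void main(final String[] args) throws Exception {
        // Placeholder host/port; the broker must expose JMX (e.g. JMX_PORT=9999).
        final JMXServiceURL url =
            new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker1.example.com:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            final MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            final ObjectName name =
                new ObjectName("kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions");
            // The metric is a gauge; its current value is exposed through the "Value" attribute.
            final Number underReplicated = (Number) mbeans.getAttribute(name, "Value");
            if (underReplicated.longValue() > 0) {
                // Hook the custom alerting mechanism in here (placeholder threshold of 0).
                System.err.println("ALERT: under-replicated partitions = " + underReplicated);
            }
        }
    }
}

The same pattern works for the controller MBeans mentioned further down in this thread; only
the ObjectName changes.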
Thanks in advance.
Thanks,
Arunkumar Pichaimuthu, PMP

On Wednesday, April 25, 2018, 11:35:27 PM CDT, Anand, Uttam wrote:

You can use Prometheus for monitoring, Alertmanager for alerting, and Grafana
to view real-time graphs of the metrics scraped by Prometheus.


For gathering the metrics of Kafka and ZooKeeper, you can use the JMX exporter.
Not all metrics will be useful for humans to read and understand, and processing
all of them adds overhead for the monitoring apps, so you can whitelist the
metrics you actually want.


https://www.robustperception.io/monitoring-kafka-with-prometheus


You can also add the JMX exporter through an environment variable in the container.







From: Will Weber 
Sent: Wednesday, April 25, 2018 11:27:40 PM
To: users@kafka.apache.org
Subject: Re: Monitoring and Alerting

"Thirding" the use of prometheus, recommend reading this blog post as well:
https://www.robustperception.io/monitoring-kafka-with-prometheus

On Wed, Apr 25, 2018 at 11:39 PM, Jonathan Bethune <
jonathan.beth...@instaclustr.com> wrote:

> I also recommend Prometheus. Works great with JMX for Kafka or any Java
> service.
>
> Datadog is also fine if you want something simple and don't mind spending money.
> It's easy to integrate with alerting systems and has great visualization.
>
> They have a good blog post about integrating with Kafka and ZooKeeper:
> https://www.datadoghq.com/blog/monitoring-kafka-performance-metrics/
>
> On 26 April 2018 at 08:02, Stanislav Antic 
> wrote:
>
> > I recommend setting up Prometheus, which has a ZooKeeper exporter (a helper
> > program that exposes metrics from outside the software), and you can use the
> > JMX exporter with Kafka.
> >
> > You can find example configs in their repo, which is pretty good, and there
> > are also ready-made Grafana dashboards (https://grafana.com/dashboards):
> > https://github.com/prometheus/jmx_exporter/tree/master/example_configs
> >
> >
> > On Thu, Apr 26, 2018 at 12:05 AM, Arunkumar wrote:
> >
> > > Hi All,
> > > I am working on setting up monitoring and alerting for our production
> > > cluster. As of now we have a cluster of 3 ZooKeeper nodes and 3 Kafka
> > > brokers, which will expand later.
> > > We are planning for basic (important) metrics on which we need to
> > > alert. We are in the process of developing an alerting system for our
> > > cluster. I googled and did not find as much detail as needed.
> > > I see many say the following ones are important, and I understand these
> > > are aggregate metrics across the cluster (I may not be right):
> > > kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
> > > kafka.controller:type=KafkaController,name=OfflinePartitionsCount
> > > kafka.controller:type=KafkaController,name=ActiveControllerCount
> > > But I am looking for broker, topic, etc. metrics to alert on. If anyone can
> > > provide info or point to any docs in this regard, it would be of great
> > > help. Thanks in advance
> > >
> > > Thanks
> > > Arunkumar Pichaimuthu, PMP
> >
> >
> >
> >
> > --
> > Stanislav Antic
> >
>
>
>
> --
>
> *Jonathan Bethune - **Senior Consultant*
>
> JP: +81 70 4069 4357
>

Re: kafka error after upgrading to 1.1.0: “the state store…may have migrated to another instance”

2018-04-26 Thread dizzy0ny
That function doesn't throw the exception; it returns an empty store list.
That function is called by QueryableStateProvider.getStore(), which will throw
InvalidStateStoreException if the list is empty.
The function I provided is iterating over the StreamTasks but is not able to
find the store I'm trying to get. What I don't understand is why it is
successful after a clean start (after I clean the Kafka logs and offset files),
but not if I then simply stop and restart.





 Original message 
From: Guozhang Wang
Date: 4/26/18 1:56 PM (GMT-05:00)
To: users@kafka.apache.org
Subject: Re: kafka error after upgrading to 1.1.0: “the state store…may have migrated to another instance”
Hello,

Thanks for reporting this issue. Do you know which line gets hit and
throws the InvalidStateStoreException, since you listed two places here?

1)  if (!streamThread.isRunningAndNotRebalancing())

2) if (!store.isOpen())

From the description, "the above code is not finding the store for the
topic it is supposed to publish (even though it has to exist given the app
starts and works fine the first time i start it after clearing the logs and
store)," I'm not clear which scenario you are referring to.


Also could you paste the full stack trace of the exception so that I can
look into this issue further?


Guozhang


On Thu, Apr 26, 2018 at 7:29 AM, dizzy0ny  wrote:

> My stream app produces streams by subscribing to changes from our database
> by using confluent connect, does some calculation and then publishes their
> own stream/topic.
>
> When starting the app, i attempt to get each of the stream store the app
> publishes. This code simply tries to get the store using KafkaStreams.store
> method in a try/catch loop (i try for 300 times with s sleep in between
> calls  to give the the stream time in case it is rebalancing or truly
> migrating). This all worked fine for kafka 0.10.2
>
> After upgrading to kafka 1.1.0, the app starts the first time fine.
> However, if i try to restart the app, in cases where the stream consumes
> multiple topics from connect, such streams are always throwing
>  InvalidStateStoreException. This does not happen for streams that
> subscribe to a single connect topic. To fix, i must delete the logs and
> store, then restarting my stream app.
>
> i debugged into the source a bit and found the issue is this call in
> org.apache.kafka.streams.state.internals.StreamThreadStateStoreProvider
> public <T> List<T> stores(final String storeName, final
> QueryableStoreType<T> queryableStoreType) {
>     if (streamThread.state() == StreamThread.State.DEAD) {
>         return Collections.emptyList();
>     }
>     if (!streamThread.isRunningAndNotRebalancing()) {
>         throw new InvalidStateStoreException("the state store, " + storeName
>             + ", may have migrated to another instance.");
>     }
>     final List<T> stores = new ArrayList<>();
>     for (Task streamTask : streamThread.tasks().values()) {
>         final StateStore store = streamTask.getStore(storeName);
>         if (store != null && queryableStoreType.accepts(store)) {
>             if (!store.isOpen()) {
>                 throw new InvalidStateStoreException("the state store, "
>                     + storeName + ", may have migrated to another instance.");
>             }
>             stores.add((T) store);
>         }
>     }
>     return stores;
> }
>
> For streams that consume multiple connect topics and produce a single
> stream/topic, when i restart the app, the above code is not finding the
> store for the topic it is supposed to publish (even though
>  it has to exist given the app starts and works fine the first time i
> start it after clearing the logs and store. What is even more strange
> however, is that despite it not finding a store, it is still receiving
> connect
>  produced topics and producing the calculated stream apparently just fine.
>
> Anyone have any ideas on what might be happening here after the upgrade?




-- 
-- Guozhang


Re: Broker cannot start switch to Java9 - weird file system issue ?

2018-04-26 Thread Ismael Juma
Thanks, I updated it.

On Thu, Apr 26, 2018 at 4:09 AM, Enrico Olivelli 
wrote:

> Here it is !
>
> https://issues.apache.org/jira/browse/KAFKA-6828
>
> Thank you
>
> Enrico
>
> 2018-04-24 20:53 GMT+02:00 Ismael Juma :
>
> > A JIRA ticket would be appreciated. :)
> >
> > Ismael
> >
> > On Sat, Apr 21, 2018 at 12:51 AM, Enrico Olivelli 
> > wrote:
> >
> > > On Sat, 21 Apr 2018 at 06:29, Ismael Juma wrote:
> > >
> > > > Hi Enrico,
> > > >
> > > > It is a real problem because it causes indexes to take a lot more disk
> > > > space upfront. The sparsity is important if people over-partition, for
> > > > example.
> > > >
> > >
> > > Got it.
> > > In production I saw no issue, maybe due to ample disk space availability.
> > >
> > > Should I file a JIRA, or will you?
> > > Enrico
> > >
> > >
> > > > Ismael
> > > >
> > > > On Fri, Apr 20, 2018 at 12:41 PM, Enrico Olivelli <
> eolive...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > On Fri, 20 Apr 2018 at 20:24, Ismael Juma wrote:
> > > > >
> > > > > > Hi Enrico,
> > > > > >
> > > > > > Coincidentally, I saw your message to nio-dev and followed up
> > there.
> > > > > >
> > > > >
> > > > > I think this is not a 'real' problem, you will notice the
> difference
> > > only
> > > > > if you have a lot of empty topics/partitions.
> > > > > In fact when you simply upgrade to jdk9/10 immediately you are
> > charged
> > > > with
> > > > > 10MB of disk space for each partition.
> > > > > In production I did not suffer this change because usually you do
> not
> > > > have
> > > > > empty partitions.
> > > > > In my test environments, where I had thousands of test empty
> > > partitions,
> > > > > disks filled up immediately and without any reason, it took time to
> > > > > understand the real cause. Broker will crash without much 'log' as
> > disk
> > > > is
> > > > > out of space.
> > > > >
> > > > > Maybe it would be useful to add some notice about this potential
> > > problem
> > > > > during the upgrade of the jdk.
> > > > >
> > > > > Hope that helps
> > > > > Enrico
> > > > >
> > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Fri, Apr 20, 2018 at 8:18 AM, Enrico Olivelli <
> > > eolive...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > It is a deliberate change in JDK code
> > > > > > >
> > > > > > > Just for reference see this discussion  on nio-dev list on
> > OpenJDK
> > > > > > >
> > > > http://mail.openjdk.java.net/pipermail/nio-dev/2018-April/
> 005008.html
> > > > > > >
> > > > > > >
> > > > > > > see
> > > > > > > https://bugs.openjdk.java.net/browse/JDK-8168628
> > > > > > >
> > > > > > > Cheers
> > > > > > > Enrico
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > 2018-03-05 14:29 GMT+01:00 Enrico Olivelli <
> eolive...@gmail.com
> > >:
> > > > > > >
> > > > > > > > Workaround:
> > > > > > > > as these brokers are only for test environments I have set
> very
> > > > small
> > > > > > > > values for index file size, which affects pre-allocation
> > > > > > > > segment.index.bytes=65536
> > > > > > > > log.index.size.max.bytes=65536
> > > > > > > >
> > > > > > > > If anyone has some thought it will be very appreciated
> > > > > > > > Cheers
> > > > > > > >
> > > > > > > > Enrico
> > > > > > > >
> > > > > > > >
> > > > > > > > 2018-03-05 13:21 GMT+01:00 Enrico Olivelli <
> > eolive...@gmail.com
> > > >:
> > > > > > > >
> > > > > > > >> The only fact I have found is that with Java8 Kafka is
> > creating
> > > > > > "SPARSE"
> > > > > > > >> files and with Java9 this is not true anymore
> > > > > > > >>
> > > > > > > >> Enrico
> > > > > > > >>
> > > > > > > >> 2018-03-05 12:44 GMT+01:00 Enrico Olivelli <
> > eolive...@gmail.com
> > > >:
> > > > > > > >>
> > > > > > > >>> Hi,
> > > > > > > >>> This is a very strange case. I have a Kafka broker (part of a
> > > > > > > >>> cluster of 3 brokers) which cannot start after upgrading Java
> > > > > > > >>> from Oracle JDK 8 to Oracle JDK 9.0.4.
> > > > > > > >>>
> > > > > > > >>> There are a lot of .index and .timeindex files taking 10MB,
> > > they
> > > > > are
> > > > > > > for
> > > > > > > >>> empty partitions.
> > > > > > > >>>
> > > > > > > >>> Running with Java 9 the server seems to rebuild these files
> > and
> > > > > each
> > > > > > > >>> file takes "really" 10MB.
> > > > > > > >>> The sum of all the files (calculated using du -sh) is 22GB
> > and
> > > > the
> > > > > > > >>> broker crashes during startup, disk becomes full and no log
> > > more
> > > > is
> > > > > > > >>> written. (I can send an extraction of the logs, but the
> tell
> > > only
> > > > > > > about
> > > > > > > >>> 'rebuilding index', the same as on Java 8)
> > > > > > > >>>
> > > > > > > >>> Reverting the same broker to Java 8 and removing the index
> > > files,
> > > > > the
> > > > > > > >>> broker rebuilds such files, each files take 10MB, but the
> > full

Re: kafka error after upgrading to 1.1.0: “the state store…may have migrated to another instance”

2018-04-26 Thread Guozhang Wang
Hello,

Thanks for reporting this issue. Do you know which line gets hit and
throws the InvalidStateStoreException, since you listed two places here?

1)  if (!streamThread.isRunningAndNotRebalancing())

2) if (!store.isOpen())

From the description, "the above code is not finding the store for the
topic it is supposed to publish (even though it has to exist given the app
starts and works fine the first time i start it after clearing the logs and
store)," I'm not clear which scenario you are referring to.


Also could you paste the full stack trace of the exception so that I can
look into this issue further?


Guozhang


On Thu, Apr 26, 2018 at 7:29 AM, dizzy0ny  wrote:

> My stream app produces streams by subscribing to changes from our database
> by using confluent connect, does some calculation and then publishes their
> own stream/topic.
>
> When starting the app, i attempt to get each of the stream store the app
> publishes. This code simply tries to get the store using KafkaStreams.store
> method in a try/catch loop (i try for 300 times with s sleep in between
> calls  to give the the stream time in case it is rebalancing or truly
> migrating). This all worked fine for kafka 0.10.2
>
> After upgrading to kafka 1.1.0, the app starts the first time fine.
> However, if i try to restart the app, in cases where the stream consumes
> multiple topics from connect, such streams are always throwing
>  InvalidStateStoreException. This does not happen for streams that
> subscribe to a single connect topic. To fix, i must delete the logs and
> store, then restarting my stream app.
>
> i debugged into the source a bit and found the issue is this call in
> org.apache.kafka.streams.state.internals.StreamThreadStateStoreProvider
> public <T> List<T> stores(final String storeName, final
> QueryableStoreType<T> queryableStoreType) {
>     if (streamThread.state() == StreamThread.State.DEAD) {
>         return Collections.emptyList();
>     }
>     if (!streamThread.isRunningAndNotRebalancing()) {
>         throw new InvalidStateStoreException("the state store, " + storeName
>             + ", may have migrated to another instance.");
>     }
>     final List<T> stores = new ArrayList<>();
>     for (Task streamTask : streamThread.tasks().values()) {
>         final StateStore store = streamTask.getStore(storeName);
>         if (store != null && queryableStoreType.accepts(store)) {
>             if (!store.isOpen()) {
>                 throw new InvalidStateStoreException("the state store, "
>                     + storeName + ", may have migrated to another instance.");
>             }
>             stores.add((T) store);
>         }
>     }
>     return stores;
> }
>
> For streams that consume multiple connect topics and produce a single
> stream/topic, when i restart the app, the above code is not finding the
> store for the topic it is supposed to publish (even though
>  it has to exist given the app starts and works fine the first time i
> start it after clearing the logs and store. What is even more strange
> however, is that despite it not finding a store, it is still receiving
> connect
>  produced topics and producing the calculated stream apparently just fine.
>
> Anyone have any ideas on what might be happening here after the upgrade?




-- 
-- Guozhang


Re: Re: kafka steams with TimeWindows ,incorrect result

2018-04-26 Thread Guozhang Wang
Using a control message to flush results to downstream (in your case to the
result db) looks good to me as well.
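For illustration, here is a minimal sketch of the punctuate-based flush discussed in this
thread: a Transformer accumulates counts in a state store and only forwards them when a
scheduled punctuation fires. The store name, flush interval, and the CountInfo type are
assumptions based on the code quoted further down; this is a sketch, not the library's API.

import java.util.ArrayList;
import java.util.List;

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.PunctuationType;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.KeyValueStore;

// Assumes a KeyValueStore<String, CountInfo> named "counts-buffer" was registered on the
// topology with builder.addStateStore(...), and CountInfo is the class quoted in the thread below.
public class WindowFlushTransformer implements Transformer<String, CountInfo, KeyValue<String, CountInfo>> {

    private ProcessorContext context;
    private KeyValueStore<String, CountInfo> buffer;

    @Override
    @SuppressWarnings("unchecked")
    public void init(final ProcessorContext context) {
        this.context = context;
        this.buffer = (KeyValueStore<String, CountInfo>) context.getStateStore("counts-buffer");
        // Flush everything accumulated so far once per minute (wall-clock time), then clear it.
        context.schedule(60_000L, PunctuationType.WALL_CLOCK_TIME, timestamp -> {
            final List<String> flushedKeys = new ArrayList<>();
            try (final KeyValueIterator<String, CountInfo> it = buffer.all()) {
                while (it.hasNext()) {
                    final KeyValue<String, CountInfo> entry = it.next();
                    this.context.forward(entry.key, entry.value);
                    flushedKeys.add(entry.key);
                }
            }
            for (final String k : flushedKeys) {
                buffer.delete(k);
            }
        });
    }

    @Override
    public KeyValue<String, CountInfo> transform(final String key, final CountInfo value) {
        // Accumulate instead of emitting a downstream update for every input record.
        final CountInfo current = buffer.get(key);
        buffer.put(key, current == null ? value : new CountInfo(
            current.start + value.start, current.active + value.active, current.fresh + value.fresh));
        return null; // nothing emitted here; the punctuator above does the emitting
    }

    // The deprecated punctuate(long) still exists on the 1.1.x Transformer interface; unused here.
    public KeyValue<String, CountInfo> punctuate(final long timestamp) {
        return null;
    }

    @Override
    public void close() { }
}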

On Thu, Apr 26, 2018 at 10:49 AM, Guozhang Wang  wrote:

> If you're talking about which store to use in your transform function, it
> should be a windowed store.
>
> You can create such a store with the `Stores` factory, and suppose your
> old code has `windowedBy(TimeWindows.of(6))`, then you can do
>
> `
> windows = TimeWindows.of(6);
>
> Stores.windowStoreBuilder(
>     Stores.persistentWindowStore("Counts",
>         windows.maintainMs(),
>         windows.segments,
>         windows.size(),
>         true))
> `
>
>
> Guozhang
>
>
>
> On Thu, Apr 26, 2018 at 4:39 AM, 杰 杨  wrote:
>
>> I return back .
>> Which StateStore could I use for this problem?
>> and another idea .I can send 'flush' message into this topic .
>> when received this message could update results to db.
>> I don't know it's work?
>>
>> 
>> funk...@live.com
>>
>> From: Guozhang Wang
>> Date: 2018-03-12 03:58
>> To: users
>> Subject: Re: Re: kafka steams with TimeWindows ,incorrect result
>> If you want to strictly "only have one output per window", then for now
>> you'd probably implement that logic using a lower-level "transform"
>> function in which you can schedule a punctuate function to send all the
>> results at the end of a window.
>>
>> If you just want to reduce the amount of data to your sink, but your sink
>> can still handle overwritten records of the same key, you can enlarge the
>> cache size via the cache.max.bytes.buffering config.
>>
>> https://kafka.apache.org/documentation/#streamsconfigs
>>
>> On Fri, Mar 9, 2018 at 9:45 PM, 杰 杨  wrote:
>>
>> > thx for your reply!
>> > I see that it is designed to operate on an infinite, unbounded stream of
>> > data.
>> > now I want to process for  unbounded stream but divided by time
>> interval .
>> > so what can I do for doing this ?
>> >
>> > 
>> > funk...@live.com
>> >
>> > From: Guozhang Wang
>> > Date: 2018-03-10 02:50
>> > To: users
>> > Subject: Re: kafka steams with TimeWindows ,incorrect result
>> > Hi Jie,
>> >
>> > This is by design of Kafka Streams, please read this doc for more
>> details
>> > (search for "outputs of the Wordcount application is actually a
>> continuous
>> > stream of updates"):
>> >
>> > https://kafka.apache.org/0110/documentation/streams/quickstart
>> >
>> > Note this semantics applies for both windowed and un-windowed tables.
>> >
>> >
>> > Guozhang
>> >
>> > On Fri, Mar 9, 2018 at 5:36 AM, 杰 杨  wrote:
>> >
>> > > Hi:
>> > > I used TimeWindow for aggregate data in kafka.
>> > >
>> > > this is code snippet ;
>> > >
>> > > view.flatMap(new MultipleKeyValueMapper(client))
>> > >     .groupByKey(Serialized.with(Serdes.String(),
>> > >         Serdes.serdeFrom(new CountInfoSerializer(), new CountInfoDeserializer())))
>> > >     .windowedBy(TimeWindows.of(6)).reduce(new Reducer<CountInfo>() {
>> > >         @Override
>> > >         public CountInfo apply(CountInfo value1, CountInfo value2) {
>> > >             return new CountInfo(value1.start + value2.start,
>> > >                 value1.active + value2.active, value1.fresh + value2.fresh);
>> > >         }
>> > >     }).toStream(new KeyValueMapper<Windowed<String>, CountInfo, String>() {
>> > >         @Override
>> > >         public String apply(Windowed<String> key, CountInfo value) {
>> > >             return key.key();
>> > >         }
>> > >     }).print(Printed.toSysOut());
>> > >
>> > > KafkaStreams streams = new KafkaStreams(builder.build(),
>> > > KStreamReducer.getConf());
>> > > streams.start();
>> > >
>> > > and I test 3 data in kafka .
>> > > and I print key value .
>> > >
>> > >
>> > > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09_hour_
>> > > 21@152060130/152060136], CountInfo{start=12179, active=12179,
>> > > fresh=12179}
>> > > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09@
>> > 152060130/152060136],
>> > > CountInfo{start=12179, active=12179, fresh=12179}
>> > > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09_hour_
>> > > 21@152060130/152060136], CountInfo{start=3, active=3,
>> > > fresh=3}
>> > > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09@
>> > 152060130/152060136],
>> > > CountInfo{start=3, active=3, fresh=3}
>> > > why in one window duration will be print two result but not one
>> result ?
>> > >
>> > > 
>> > > funk...@live.com
>> > >
>> >
>> >
>> >
>> > --
>> > -- Guozhang
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>
>
>
> --
> -- Guozhang
>



-- 
-- Guozhang


Re: Re: kafka steams with TimeWindows ,incorrect result

2018-04-26 Thread Guozhang Wang
If you're talking about which store to use in your transform function, it
should be a windowed store.

You can create such a store with the `Stores` factory, and suppose your old
code has `windowedBy(TimeWindows.of(6))`, then you can do

`
windows = TimeWindows.of(6);

Stores.windowStoreBuilder(
    Stores.persistentWindowStore("Counts",
        windows.maintainMs(),
        windows.segments,
        windows.size(),
        true))
`
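For completeness, a rough sketch of how such a store could be registered on the topology and
handed to a transform. The serdes, store name, and window size below are assumptions, and the
sketch mirrors the snippet above rather than prescribing the only way to do it:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;
import org.apache.kafka.streams.state.WindowStore;

public class CountsStoreWiring {

    public static void wire(final StreamsBuilder builder) {
        final TimeWindows windows = TimeWindows.of(60_000L); // assumed 1-minute windows

        // Build a persistent windowed store, mirroring the snippet above.
        final StoreBuilder<WindowStore<String, Long>> countsStore =
            Stores.windowStoreBuilder(
                Stores.persistentWindowStore("Counts",
                    windows.maintainMs(),   // retention period
                    windows.segments,       // number of segments
                    windows.size(),         // window size
                    true),                  // retain duplicates, as in the snippet above
                Serdes.String(),            // assumed key serde
                Serdes.Long());             // assumed value serde

        // Register it so a transformer can look it up by name:
        builder.addStateStore(countsStore);

        // e.g. stream.transform(MyWindowedTransformer::new, "Counts");
    }
}

The transformer can then fetch the store by name with context.getStateStore("Counts") inside
its init() method.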


Guozhang



On Thu, Apr 26, 2018 at 4:39 AM, 杰 杨  wrote:

> I return back .
> Which StateStore could I use for this problem?
> and another idea .I can send 'flush' message into this topic .
> when received this message could update results to db.
> I don't know it's work?
>
> 
> funk...@live.com
>
> From: Guozhang Wang
> Date: 2018-03-12 03:58
> To: users
> Subject: Re: Re: kafka steams with TimeWindows ,incorrect result
> If you want to strictly "only have one output per window", then for now
> you'd probably implement that logic using a lower-level "transform"
> function in which you can schedule a punctuate function to send all the
> results at the end of a window.
>
> If you just want to reduce the amount of data to your sink, but your sink
> can still handle overwritten records of the same key, you can enlarge the
> cache size via the cache.max.bytes.buffering config.
>
> https://kafka.apache.org/documentation/#streamsconfigs
>
> On Fri, Mar 9, 2018 at 9:45 PM, 杰 杨  wrote:
>
> > thx for your reply!
> > I see that it is designed to operate on an infinite, unbounded stream of
> > data.
> > now I want to process for  unbounded stream but divided by time interval
> .
> > so what can I do for doing this ?
> >
> > 
> > funk...@live.com
> >
> > From: Guozhang Wang
> > Date: 2018-03-10 02:50
> > To: users
> > Subject: Re: kafka steams with TimeWindows ,incorrect result
> > Hi Jie,
> >
> > This is by design of Kafka Streams, please read this doc for more details
> > (search for "outputs of the Wordcount application is actually a
> continuous
> > stream of updates"):
> >
> > https://kafka.apache.org/0110/documentation/streams/quickstart
> >
> > Note this semantics applies for both windowed and un-windowed tables.
> >
> >
> > Guozhang
> >
> > On Fri, Mar 9, 2018 at 5:36 AM, 杰 杨  wrote:
> >
> > > Hi:
> > > I used TimeWindow for aggregate data in kafka.
> > >
> > > this is code snippet ;
> > >
> > > view.flatMap(new MultipleKeyValueMapper(client))
> > >     .groupByKey(Serialized.with(Serdes.String(),
> > >         Serdes.serdeFrom(new CountInfoSerializer(), new CountInfoDeserializer())))
> > >     .windowedBy(TimeWindows.of(6)).reduce(new Reducer<CountInfo>() {
> > >         @Override
> > >         public CountInfo apply(CountInfo value1, CountInfo value2) {
> > >             return new CountInfo(value1.start + value2.start,
> > >                 value1.active + value2.active, value1.fresh + value2.fresh);
> > >         }
> > >     }).toStream(new KeyValueMapper<Windowed<String>, CountInfo, String>() {
> > >         @Override
> > >         public String apply(Windowed<String> key, CountInfo value) {
> > >             return key.key();
> > >         }
> > >     }).print(Printed.toSysOut());
> > >
> > > KafkaStreams streams = new KafkaStreams(builder.build(),
> > > KStreamReducer.getConf());
> > > streams.start();
> > >
> > > and I test 3 data in kafka .
> > > and I print key value .
> > >
> > >
> > > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09_hour_
> > > 21@152060130/152060136], CountInfo{start=12179, active=12179,
> > > fresh=12179}
> > > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09@
> > 152060130/152060136],
> > > CountInfo{start=12179, active=12179, fresh=12179}
> > > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09_hour_
> > > 21@152060130/152060136], CountInfo{start=3, active=3,
> > > fresh=3}
> > > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09@
> > 152060130/152060136],
> > > CountInfo{start=3, active=3, fresh=3}
> > > why in one window duration will be print two result but not one result
> ?
> > >
> > > 
> > > funk...@live.com
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>
>
>
> --
> -- Guozhang
>



-- 
-- Guozhang


Re: Write access required to Confluence wiki

2018-04-26 Thread Piyush Vijay
Ping :)


Piyush Vijay

On Wed, Apr 25, 2018 at 7:45 PM, Piyush Vijay 
wrote:

>
> I want to open a JIRA and a KIP. Can someone please grant me the necessary
> permissions?
> My username is *piyushvijay*.
>
> Thank you
> Piyush Vijay
>
>


kafka error after upgrading to 1.1.0: “the state store…may have migrated to another instance”

2018-04-26 Thread mahendra.b.singh
My stream app produces streams by subscribing to changes from our database
using Confluent Connect, does some calculation, and then publishes its own
stream/topic.

When starting the app, I attempt to get each of the stream stores the app
publishes. This code simply tries to get the store using the KafkaStreams.store
method in a try/catch loop (I try 300 times with a sleep in between calls to
give the stream time in case it is rebalancing or truly migrating). This all
worked fine with Kafka 0.10.2.
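Roughly, the lookup loop looks like this (a simplified sketch; the store name, retry count,
and sleep interval are placeholders for what the app actually uses):

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.errors.InvalidStateStoreException;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public final class StoreLookup {

    // Keep retrying while the store may still be rebalancing, migrating, or restoring.
    public static ReadOnlyKeyValueStore<String, Object> waitForStore(
            final KafkaStreams streams, final String storeName) throws InterruptedException {
        for (int attempt = 0; attempt < 300; attempt++) {
            try {
                return streams.store(storeName, QueryableStoreTypes.<String, Object>keyValueStore());
            } catch (final InvalidStateStoreException e) {
                // Store not available yet; back off and try again.
                Thread.sleep(1000L);
            }
        }
        throw new IllegalStateException("State store " + storeName + " never became available");
    }
}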

After upgrading to Kafka 1.1.0, the app starts fine the first time. However, if
I try to restart the app, in cases where the stream consumes multiple topics
from connect, such streams always throw InvalidStateStoreException. This does
not happen for streams that subscribe to a single connect topic. To fix it, I
must delete the logs and store, then restart my stream app.

I debugged into the source a bit and found that the issue is this call in
org.apache.kafka.streams.state.internals.StreamThreadStateStoreProvider:
    public <T> List<T> stores(final String storeName, final QueryableStoreType<T> queryableStoreType) {
        if (streamThread.state() == StreamThread.State.DEAD) {
            return Collections.emptyList();
        }
        if (!streamThread.isRunningAndNotRebalancing()) {
            throw new InvalidStateStoreException("the state store, " + storeName
                + ", may have migrated to another instance.");
        }
        final List<T> stores = new ArrayList<>();
        for (Task streamTask : streamThread.tasks().values()) {
            final StateStore store = streamTask.getStore(storeName);
            if (store != null && queryableStoreType.accepts(store)) {
                if (!store.isOpen()) {
                    throw new InvalidStateStoreException("the state store, " + storeName
                        + ", may have migrated to another instance.");
                }
                stores.add((T) store);
            }
        }
        return stores;
    }

For streams that consume multiple connect topics and produce a single
stream/topic, when I restart the app, the above code is not finding the store
for the topic it is supposed to publish (even though it has to exist, given that
the app starts and works fine the first time I start it after clearing the logs
and store). What is even more strange, however, is that despite not finding a
store, it is still receiving connect-produced topics and producing the
calculated stream apparently just fine.

Anyone have any ideas on what might be happening here after the upgrade?

kafka error after upgrading to 1.1.0: “the state store…may have migrated to another instance”

2018-04-26 Thread dizzy0ny
My stream app produces streams by subscribing to changes from our database
using Confluent Connect, does some calculation, and then publishes its own
stream/topic.

When starting the app, I attempt to get each of the stream stores the app
publishes. This code simply tries to get the store using the KafkaStreams.store
method in a try/catch loop (I try 300 times with a sleep in between calls to
give the stream time in case it is rebalancing or truly migrating). This all
worked fine with Kafka 0.10.2.

After upgrading to Kafka 1.1.0, the app starts fine the first time. However, if
I try to restart the app, in cases where the stream consumes multiple topics
from connect, such streams always throw InvalidStateStoreException. This does
not happen for streams that subscribe to a single connect topic. To fix it, I
must delete the logs and store, then restart my stream app.

I debugged into the source a bit and found that the issue is this call in
org.apache.kafka.streams.state.internals.StreamThreadStateStoreProvider:
    public <T> List<T> stores(final String storeName, final QueryableStoreType<T> queryableStoreType) {
        if (streamThread.state() == StreamThread.State.DEAD) {
            return Collections.emptyList();
        }
        if (!streamThread.isRunningAndNotRebalancing()) {
            throw new InvalidStateStoreException("the state store, " + storeName
                + ", may have migrated to another instance.");
        }
        final List<T> stores = new ArrayList<>();
        for (Task streamTask : streamThread.tasks().values()) {
            final StateStore store = streamTask.getStore(storeName);
            if (store != null && queryableStoreType.accepts(store)) {
                if (!store.isOpen()) {
                    throw new InvalidStateStoreException("the state store, " + storeName
                        + ", may have migrated to another instance.");
                }
                stores.add((T) store);
            }
        }
        return stores;
    }

For streams that consume multiple connect topics and produce a single
stream/topic, when I restart the app, the above code is not finding the store
for the topic it is supposed to publish (even though it has to exist, given that
the app starts and works fine the first time I start it after clearing the logs
and store). What is even more strange, however, is that despite not finding a
store, it is still receiving connect-produced topics and producing the
calculated stream apparently just fine.

Anyone have any ideas on what might be happening here after the upgrade?

Re: Re: kafka steams with TimeWindows ,incorrect result

2018-04-26 Thread 杰 杨
I'm back.
Which StateStore could I use for this problem?
Another idea: I could send a 'flush' message into this topic, and
when this message is received, the results could be updated to the db.
I don't know whether that would work?


funk...@live.com

From: Guozhang Wang
Date: 2018-03-12 03:58
To: users
Subject: Re: Re: kafka steams with TimeWindows ,incorrect result
If you want to strictly "only have one output per window", then for now
you'd probably implement that logic using a lower-level "transform"
function in which you can schedule a punctuate function to send all the
results at the end of a window.

If you just want to reduce the amount of data to your sink, but your sink
can still handle overwritten records of the same key, you can enlarge the
cache size via the cache.max.bytes.buffering config.

https://kafka.apache.org/documentation/#streamsconfigs

On Fri, Mar 9, 2018 at 9:45 PM, 杰 杨  wrote:

> Thanks for your reply!
> I see that it is designed to operate on an infinite, unbounded stream of
> data.
> Now I want to process an unbounded stream, but divided by time interval.
> What can I do to achieve this?
>
> 
> funk...@live.com
>
> From: Guozhang Wang
> Date: 2018-03-10 02:50
> To: users
> Subject: Re: kafka steams with TimeWindows ,incorrect result
> Hi Jie,
>
> This is by design of Kafka Streams, please read this doc for more details
> (search for "outputs of the Wordcount application is actually a continuous
> stream of updates"):
>
> https://kafka.apache.org/0110/documentation/streams/quickstart
>
> Note this semantics applies for both windowed and un-windowed tables.
>
>
> Guozhang
>
> On Fri, Mar 9, 2018 at 5:36 AM, 杰 杨  wrote:
>
> > Hi:
> > I used TimeWindow for aggregate data in kafka.
> >
> > this is code snippet ;
> >
> > view.flatMap(new MultipleKeyValueMapper(client))
> >     .groupByKey(Serialized.with(Serdes.String(),
> >         Serdes.serdeFrom(new CountInfoSerializer(), new CountInfoDeserializer())))
> >     .windowedBy(TimeWindows.of(6)).reduce(new Reducer<CountInfo>() {
> >         @Override
> >         public CountInfo apply(CountInfo value1, CountInfo value2) {
> >             return new CountInfo(value1.start + value2.start,
> >                 value1.active + value2.active, value1.fresh + value2.fresh);
> >         }
> >     }).toStream(new KeyValueMapper<Windowed<String>, CountInfo, String>() {
> >         @Override
> >         public String apply(Windowed<String> key, CountInfo value) {
> >             return key.key();
> >         }
> >     }).print(Printed.toSysOut());
> >
> > KafkaStreams streams = new KafkaStreams(builder.build(),
> > KStreamReducer.getConf());
> > streams.start();
> >
> > and I test 3 data in kafka .
> > and I print key value .
> >
> >
> > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09_hour_
> > 21@152060130/152060136], CountInfo{start=12179, active=12179,
> > fresh=12179}
> > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09@
> 152060130/152060136],
> > CountInfo{start=12179, active=12179, fresh=12179}
> > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09_hour_
> > 21@152060130/152060136], CountInfo{start=3, active=3,
> > fresh=3}
> > [KTABLE-TOSTREAM-07]: [9_9_2018-03-09@
> 152060130/152060136],
> > CountInfo{start=3, active=3, fresh=3}
> > Why, within one window duration, are two results printed instead of one?
> >
> > 
> > funk...@live.com
> >
>
>
>
> --
> -- Guozhang
>



--
-- Guozhang


Re: Broker cannot start switch to Java9 - weird file system issue ?

2018-04-26 Thread Enrico Olivelli
Here it is !

https://issues.apache.org/jira/browse/KAFKA-6828

Thank you

Enrico

2018-04-24 20:53 GMT+02:00 Ismael Juma :

> A JIRA ticket would be appreciated. :)
>
> Ismael
>
> On Sat, Apr 21, 2018 at 12:51 AM, Enrico Olivelli 
> wrote:
>
> > On Sat, 21 Apr 2018 at 06:29, Ismael Juma wrote:
> >
> > > Hi Enrico,
> > >
> > > It is a real problem because it causes indexes to take a lot more disk
> > > space upfront. The sparsity is important if people over-partition, for
> > > example.
> > >
> >
> > Got it.
> > In production I saw no issue, maybe due to ample disk space availability.
> >
> > Should I file a JIRA, or will you?
> > Enrico
> >
> >
> > > Ismael
> > >
> > > On Fri, Apr 20, 2018 at 12:41 PM, Enrico Olivelli  >
> > > wrote:
> > >
> > > > On Fri, 20 Apr 2018 at 20:24, Ismael Juma wrote:
> > > >
> > > > > Hi Enrico,
> > > > >
> > > > > Coincidentally, I saw your message to nio-dev and followed up
> there.
> > > > >
> > > >
> > > > I think this is not a 'real' problem, you will notice the difference
> > only
> > > > if you have a lot of empty topics/partitions.
> > > > In fact when you simply upgrade to jdk9/10 immediately you are
> charged
> > > with
> > > > 10MB of disk space for each partition.
> > > > In production I did not suffer this change because usually you do not
> > > have
> > > > empty partitions.
> > > > In my test environments, where I had thousands of test empty
> > partitions,
> > > > disks filled up immediately and without any reason, it took time to
> > > > understand the real cause. Broker will crash without much 'log' as
> disk
> > > is
> > > > out of space.
> > > >
> > > > Maybe it would be useful to add some notice about this potential
> > problem
> > > > during the upgrade of the jdk.
> > > >
> > > > Hope that helps
> > > > Enrico
> > > >
> > > >
> > > > > Ismael
> > > > >
> > > > > On Fri, Apr 20, 2018 at 8:18 AM, Enrico Olivelli <
> > eolive...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > It is a deliberate change in JDK code
> > > > > >
> > > > > > Just for reference see this discussion  on nio-dev list on
> OpenJDK
> > > > > >
> > > http://mail.openjdk.java.net/pipermail/nio-dev/2018-April/005008.html
> > > > > >
> > > > > >
> > > > > > see
> > > > > > https://bugs.openjdk.java.net/browse/JDK-8168628
> > > > > >
> > > > > > Cheers
> > > > > > Enrico
> > > > > >
> > > > > >
> > > > > >
> > > > > > 2018-03-05 14:29 GMT+01:00 Enrico Olivelli  >:
> > > > > >
> > > > > > > Workaround:
> > > > > > > as these brokers are only for test environments I have set very
> > > small
> > > > > > > values for index file size, which affects pre-allocation
> > > > > > > segment.index.bytes=65536
> > > > > > > log.index.size.max.bytes=65536
> > > > > > >
> > > > > > > If anyone has some thought it will be very appreciated
> > > > > > > Cheers
> > > > > > >
> > > > > > > Enrico
> > > > > > >
> > > > > > >
> > > > > > > 2018-03-05 13:21 GMT+01:00 Enrico Olivelli <
> eolive...@gmail.com
> > >:
> > > > > > >
> > > > > > >> The only fact I have found is that with Java8 Kafka is
> creating
> > > > > "SPARSE"
> > > > > > >> files and with Java9 this is not true anymore
> > > > > > >>
> > > > > > >> Enrico
> > > > > > >>
> > > > > > >> 2018-03-05 12:44 GMT+01:00 Enrico Olivelli <
> eolive...@gmail.com
> > >:
> > > > > > >>
> > > > > > >>> Hi,
> > > > > > >>> This is a very strange case. I have a Kafka broker (part of a
> > > > > > >>> cluster of 3 brokers) which cannot start after upgrading Java
> > > > > > >>> from Oracle JDK 8 to Oracle JDK 9.0.4.
> > > > > > >>>
> > > > > > >>> There are a lot of .index and .timeindex files taking 10MB,
> > they
> > > > are
> > > > > > for
> > > > > > >>> empty partitions.
> > > > > > >>>
> > > > > > >>> Running with Java 9 the server seems to rebuild these files
> and
> > > > each
> > > > > > >>> file takes "really" 10MB.
> > > > > > >>> The sum of all the files (calculated using du -sh) is 22GB
> and
> > > the
> > > > > > >>> broker crashes during startup, disk becomes full and no log
> > more
> > > is
> > > > > > >>> written. (I can send an extraction of the logs, but the tell
> > only
> > > > > > about
> > > > > > >>> 'rebuilding index', the same as on Java 8)
> > > > > > >>>
> > > > > > >>> Reverting the same broker to Java 8 and removing the index
> > files,
> > > > the
> > > > > > >>> broker rebuilds such files, each files take 10MB, but the
> full
> > > sum
> > > > of
> > > > > > sizes
> > > > > > >>> (calculated using du -sh) is 38 MB !
> > > > > > >>>
> > > > > > >>> I am running this broker on CentOS 7 on an EXT4 FS.
> > > > > > >>>
> > > > > > >>> I have upgraded the broker to latest and greatest Kafka 1.0.0
> > > (from
> > > > > > >>> 0.10.2) without any success.
> > > > > > >>>
> > > > > > >>> All of the other testing clusters on CentOS7 (same SO
> settings)
> > > did
> > > > > not

Re: How do I specify jdbc connection porperites in kafka-jdbc-connector

2018-04-26 Thread Niels Ull Harremoes
I am using Confluent and configuring the connector in distributed mode
using REST.


I am setting the connector parameters as follows - but the connector 
params do not include anything like connection.jdbcproperties, so I 
cannot set oracle.jdbc.timezoneAsRegion=false.


Appending it to the jdbc url doesn't help.


{
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "connection.url": "jdbc:oracle:thin:@oracleserver.local:1521:ors_sid",
    "connection.user": "ors_user",
    "connection.password": "secret",
    "mode": "timestamp",
    "timestamp.column.name": "LASTRELEVANTMODIFICATION",
    "validate.non.null": "false",
    "schemas.enable": "false",
    "topic.prefix": "ors-entries",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "false",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "query": "select ORDERID, REQUESTERID, CREATIONDATE, ORDERSYSTEM, 
PID, SHIPPEDDATE, RESPONDERID, LASTRELEVANTMODIFICATION from ORS_ORDER",

    "poll.interval.ms": "6",
    "fetch.size": "5000",
    "batch.max.rows": "5000",
    "table.poll.interval.ms": "6",
    "transforms": "createKey, insertSourceDetails, deleteCreateDate",
    "transforms.createKey.type": 
"org.apache.kafka.connect.transforms.ValueToKey",

    "transforms.createKey.fields": "ORDERID",
    "transforms.insertSourceDetails.type": 
"org.apache.kafka.connect.transforms.InsertField$Value",

    "transforms.insertSourceDetails.static.field": "source",
    "transforms.insertSourceDetails.static.value": "ors",
    "transforms.deleteCreateDate.type": 
"org.apache.kafka.connect.transforms.ReplaceField$Value",

    "transforms.deleteCreateDate.blacklist": "LASTRELEVANTMODIFICATION"
}


On 25-04-2018 at 18:30, adrien ruffie wrote:

Hi Niels,


To use Kafka Connect, you must define your connector configuration file
under kafka_2.11-1.0.0/config

(2.11-1.0.0 is just an example Kafka version). In the config directory you can create a
connector file like "mysql-connect.properties"

and specify the required parameters in this file. Example of a MySQL connector
config:


name=source-mysql-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
task.max=2
connection.url=jdbc:mysql://localhost:3306/test?user=your_user&password=your_password
table.whitelist=login,students,company
poll.interval.ms=1
incrementing.column.name=id
mode=incrementing
topic.prefix=mysql-jdbc-


Did you try that?

Best regards,

Adrien



From: Niels Ull Harremoes
Sent: Wednesday, 25 April 2018 18:19:54
To: users@kafka.apache.org
Subject: How do I specify jdbc connection porperites in kafka-jdbc-connector

Hi!

I am trying to use the Confluent connector version 4.0.1 to connect to
Oracle and set up a JDBC source.

Apparently I need to set a jdbc connection property,
oracle.jdbc.timezoneAsRegion=false, when connecting to my oracle database.

But I cannot find out where to set it - it cannot be set in the jdbc url
as far as I can tell, and I can apparently only set connection.url,
connection.username and connection.password in the connector config?

Any suggestions?


--
Best regards,
Niels Ull Harremoës






--
Best regards,
Niels Ull Harremoës