Tasks across multiple JVMs

2018-02-28 Thread pravin kumar
I have just implemented the wikifeed example and fed its output to another
processor topology that finds the even numbers. While testing with multiple
consumers in three JVMs, the output topic was revoked and rebalanced across
the three JVMs. I got 10 tasks per sub-topology (the maximum number of partitions).

I have three topics:

wikifeedInputtopic1 - 10 partitions
wikifeedOutputtopic1 - 10 partitions
sumoutputeventopicC1 - 5 partitions

I have tried to spread the tasks across multiple JVMs.

In JVM1, at first I had all of these tasks:
[0_0, 0_1, 1_0, 0_2, 1_1, 0_3, 1_2, 0_4, 1_3, 1_4, 0_5, 1_5, 0_6, 1_6, 0_7,
1_7, 0_8, 1_8, 0_9, 1_9]

Then, when I started the second JVM, I got:
JVM1:
current active tasks: [0_0, 1_0, 0_1, 1_1, 0_2, 1_2, 0_3, 1_3, 0_4, 1_4]
current standby tasks: []
previous active tasks: [0_0, 0_1, 1_0, 0_2, 1_1, 0_3, 1_2, 0_4, 1_3,
1_4, 0_5, 1_5, 0_6, 1_6, 0_7, 1_7, 0_8, 1_8, 0_9, 1_9]
 (org.apache.kafka.streams.processor.internals.StreamThread)

JVM2:
current active tasks: [0_5, 1_5, 0_6, 1_6, 0_7, 1_7, 0_8, 1_8, 0_9, 1_9]
current standby tasks: []
previous active tasks: []

When I started the third JVM:
JVM1:
current active tasks: [0_0, 1_0, 0_1, 1_1, 0_2, 1_2, 1_9]
current standby tasks: []
previous active tasks: [0_0, 1_0, 0_1, 1_1, 0_2, 1_2, 0_3, 1_3, 0_4,
1_4]

JVM2:
current active tasks: [0_5, 1_5, 0_6, 1_6, 0_7, 0_8, 1_8]
current standby tasks: []
previous active tasks: [0_5, 0_6, 1_5, 0_7, 1_6, 0_8, 1_7, 0_9, 1_8,
1_9]

JVM3:

current active tasks: [0_3, 1_3, 0_4, 1_4, 1_7, 0_9]
current standby tasks: []
previous active tasks: []

I have also looked at the state directory while starting the three JVMs,
but the state directories do not match the latest task lists:

First JVM state directory:
[admin@nms-181 WikiFeedLambdaexampleC2]$ ls
0_0  0_1  0_2  0_3  0_4  0_5  0_6  0_7  0_8  0_9  1_0  1_1  1_2  1_3  1_4
1_5  1_6  1_7  1_8  1_9

Second JVM state directory:
[admin@nms-181 WikiFeedLambdaexampleC2]$ ls
0_5  0_6  0_7  0_8  0_9  1_5  1_6  1_7  1_8  1_9

Third JVM state directory:
[admin@nms-181 WikiFeedLambdaexampleC2]$ ls
0_3  0_4  0_9  1_3  1_4  1_7

Doubts:
1. When running in multiple JVMs with multiple consumers, do the tasks also
get spread across the JVMs?

2. Why does only the third state directory contain exactly its latest task
list, while the other state directories still contain stale task directories?
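On doubt 1: Kafka Streams creates one task per (sub-topology, input-partition) pair, named "subTopologyId_partition", and the group rebalance spreads those tasks across every instance that shares the same application.id. The sketch below is illustrative only (it is not Kafka's actual sticky assignor); it just reproduces the task-ID naming seen in the logs above and shows why two sub-topologies over 10 partitions give 20 tasks split roughly 7/7/6 across three JVMs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch, not Kafka code: one task per
// (sub-topology, partition), named "<subTopologyId>_<partition>".
public class TaskSpreadSketch {
    static List<String> taskIds(int subTopologies, int partitions) {
        List<String> ids = new ArrayList<>();
        for (int s = 0; s < subTopologies; s++) {
            for (int p = 0; p < partitions; p++) {
                ids.add(s + "_" + p);
            }
        }
        return ids;
    }

    // Round-robin split, only to illustrate that each instance ends up
    // with roughly total/instances tasks (the real assignor is stickier).
    static List<List<String>> spread(List<String> tasks, int instances) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < instances; i++) out.add(new ArrayList<>());
        for (int i = 0; i < tasks.size(); i++) {
            out.get(i % instances).add(tasks.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tasks = taskIds(2, 10);
        System.out.println(tasks.size()); // 20
        for (List<String> jvm : spread(tasks, 3)) {
            System.out.println(jvm.size() + " " + jvm);
        }
    }
}
```

On doubt 2: tasks that migrate away leave their old state directories behind; the state directory cleaner only removes them after state.cleanup.delay.ms, which is why JVM1 and JVM2 still show directories for tasks they no longer own.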


I have attached my code below:
package kafka.examples.wikifeed;

import org.apache.commons.lang3.SerializationUtils;
import org.apache.kafka.common.serialization.Deserializer;

import java.util.Map;

/**
 * Created by PravinKumar on 29/7/17.
 */
public class CustomDeserializer<T> implements Deserializer<T> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
    }

    @SuppressWarnings("unchecked")
    @Override
    public T deserialize(String topic, byte[] bytes) {
        return (T) SerializationUtils.deserialize(bytes);
    }

    @Override
    public void close() {
    }
}
package kafka.examples.wikifeed;

import org.apache.commons.lang3.SerializationUtils;
import org.apache.kafka.common.serialization.Serializer;

import java.io.Serializable;
import java.util.Map;

/**
 * Created by PravinKumar on 29/7/17.
 */
public class CustomSerializer<T> implements Serializer<T> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
    }

    @Override
    public byte[] serialize(String topic, T data) {
        return SerializationUtils.serialize((Serializable) data);
    }

    @Override
    public void close() {
    }
}
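For reference, SerializationUtils is just plain JDK object serialization under the hood. The standalone round trip below shows the same byte-level contract as the two classes above, without the Kafka or commons-lang dependencies (class and method names here are illustrative, not from the original code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Standalone sketch of the serialize/deserialize contract used by
// CustomSerializer/CustomDeserializer, via JDK object streams.
public class SerdeRoundTrip {
    static byte[] serialize(Serializable obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            return bos.toByteArray();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    @SuppressWarnings("unchecked")
    static <T> T deserialize(byte[] bytes) {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) ois.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String original = "wikifeed-event";
        byte[] bytes = serialize(original);
        String copy = deserialize(bytes);
        System.out.println(original.equals(copy)); // true
    }
}
```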
package kafka.examples.wikifeed;

import java.io.Serializable;

/**
 * Created by PravinKumar on 29/7/17.
 */
public class Wikifeed implements Serializable {

    private String name;
    private boolean isNew;
    private String content;

    public Wikifeed(String name, boolean isNew, String content) {
        this.name = name;
        this.isNew = isNew;
        this.content = content;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public boolean isNew() {
        return isNew;
    }

    public void setNew(boolean aNew) {
        isNew = aNew;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }
}
package kafka.examples.wikifeed;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.LongDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Collections;
import java.util.Properties;
import java.util.Random;
import java.util.stream.IntStream;

/**
 * Created by PravinKumar on 29/7/17.
 */
public class 

which Kafka StateStore could I use ?

2018-02-28 Thread 杰 杨

Hi,
I use Kafka Streams for real-time data analysis and I have run into a problem.
Currently I process each record from Kafka, compute a result, and send it to a
DB, but the DB cannot handle that level of concurrency. So I want the following:
1) When there is no data in Kafka, the state store holds no results.
2) When there are many records in Kafka, the state store accumulates the
computed results, and I send them to the DB in one batch.
Which StateStore can I use to do this?

funk...@live.com
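In Kafka Streams this is typically done with a KeyValueStore plus a scheduled punctuator: each record updates the store, and the punctuator periodically drains it to the DB in one batch. Stripped of the Kafka APIs, the store-then-drain pattern looks like the sketch below (class and method names are illustrative, not Kafka classes; in a real Processor you would back this with context.getStateStore(...) and context.schedule(...)):

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: mimics a KeyValueStore that accumulates per-key
// results, plus a punctuate()-style flush that drains everything to
// the DB in one batch.
public class BatchingSketch {
    private final Map<String, Long> store = new TreeMap<>();

    // Called once per incoming record: aggregate into the "state store".
    void process(String key, long value) {
        store.merge(key, value, Long::sum);
    }

    // Called periodically (like a punctuator): drain and return one batch.
    Map<String, Long> punctuate() {
        Map<String, Long> batch = new TreeMap<>(store);
        store.clear(); // 1) empty store -> no results until new data arrives
        return batch;  // 2) everything computed since the last flush, at once
    }

    public static void main(String[] args) {
        BatchingSketch p = new BatchingSketch();
        p.process("page-a", 1);
        p.process("page-a", 2);
        p.process("page-b", 5);
        System.out.println(p.punctuate()); // {page-a=3, page-b=5}
        System.out.println(p.punctuate()); // {} -- store is empty again
    }
}
```

This satisfies both requirements: when Kafka has no data the store stays empty, and under load the DB sees one batched write per punctuation interval instead of one write per record.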


Re: aggregation tables mirrored in kafka & rocksdb

2018-02-28 Thread Guozhang Wang
Hello Nicu,

For your aggregation application, is it windowed or non-windowed? If it is a
windowed aggregation, then you can specify your window specs so that the
underlying RocksDB state store keeps only the most recent windows,
while your Cassandra keeps the full history of all past windows.

You can, of course, implement your own state store that talks directly to
Cassandra (the StateStore interface allows users to customize their own
storage mechanism, either local or remote), but to optimize latency you may
want some local in-memory caching with write-back, to batch access to
your Cassandra cluster.
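The behaviour described above, where the local store keeps only recent windows while the remote store keeps full history, can be sketched as follows. This is a toy model of windowed retention, not the actual Streams Materialized/Windows API; all names are illustrative:

```java
import java.util.TreeMap;

// Toy model of a windowed store with retention: values land in the
// window containing their timestamp, and windows that fall out of the
// retention horizon are evicted locally (Cassandra would keep them).
public class WindowedRetentionSketch {
    private final long windowSizeMs;
    private final long retentionMs;
    private final TreeMap<Long, Long> windows = new TreeMap<>(); // start -> sum

    WindowedRetentionSketch(long windowSizeMs, long retentionMs) {
        this.windowSizeMs = windowSizeMs;
        this.retentionMs = retentionMs;
    }

    void add(long timestampMs, long value) {
        long start = (timestampMs / windowSizeMs) * windowSizeMs;
        windows.merge(start, value, Long::sum);
        // Evict every window that started before (now - retention).
        long horizon = timestampMs - retentionMs;
        windows.headMap(horizon, false).clear();
    }

    int openWindowCount() {
        return windows.size();
    }

    public static void main(String[] args) {
        // 1-minute windows, 5-minute retention.
        WindowedRetentionSketch store =
            new WindowedRetentionSketch(60_000, 300_000);
        for (long t = 0; t < 1_200_000; t += 60_000) {
            store.add(t, 1);
        }
        // 20 windows' worth of data arrived, but only 6 stay resident:
        // the 5 inside the retention horizon plus the current one.
        System.out.println(store.openWindowCount()); // 6
    }
}
```

The point of the sketch is the storage bound: local state grows with the retention period, not with the full history, which addresses the TB-scale doubling concern for windowed aggregations.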


Guozhang


On Wed, Feb 28, 2018 at 5:45 AM, Marasoiu, Nicu <
nicu.maras...@metrosystems.net> wrote:

> Hi,
> Currently we have an aggregation system (without kafka) where events are
> aggregated into Cassandra tables holding aggregate results.
> We are considering moving to a KafkaStreams solution with exactly-once
> processing but in this case it seems that all the aggregation tables
> (reaching TB) need to be kept also in Kafka as ktables(Rocksdb)+compacted
> topics(Kafka) and the direction of computation would be: events topic -> KS
> aggregation -> aggregated topics -> one way sync to Cassandra using
> connector.
> This poses two problems:
> - the doubling on the total storage required for the system (which mainly
> stores aggregates), from 3 C* replicas to 2-3 K replicas + Rocksdb
> - the time to reconstruct a Rocksdb instance during rollover update can be
> half an hour if the rollover is fast
>
> Is there any way in Kafka Streams (even dropping the exactly once) in
> which we can work just with aggregate tables in Cassandra? For sure there
> is a way working with Kafka Consumer, pulling a batch of messages,
> aggregating, and adding aggregate to Cassandra. Not sure if possible with
> KafkaStreams given the higher level / FP modeling with its own clear
> advantages but this disadvantage.
> Please advise,
> Nicu Marasoiu
> Geschäftsanschrift/Business address: METRO SYSTEMS GmbH, Metro-Straße 12,
> 40235 Düsseldorf, Germany
> Aufsichtsrat/Supervisory Board: Heiko Hutmacher (Vorsitzender/ Chairman)
> Geschäftsführung/Management Board: Dr. Dirk Toepfer (Vorsitzender/CEO),
> Wim van Herwijnen
> Sitz Düsseldorf, Amtsgericht Düsseldorf, HRB 18232/Registered Office
> Düsseldorf, Commercial Register of the Düsseldorf Local Court, HRB 18232
>
>
>


-- 
-- Guozhang


RE: Zookeeper and Kafka JMX metrics

2018-02-28 Thread adrien ruffie
Hi Arunkumar,


Have you already checked whether your MBeans are exposed by ZooKeeper, using
JVisualVM, as in the screenshot attached?


Regards,
Adrien


From: Arunkumar
Sent: Tuesday, 27 February 2018 23:19:33
To: users@kafka.apache.org
Subject: Zookeeper and Kafka JMX metrics

Dear Folks
We plan to implement Kafka and ZooKeeper metrics collection using the Java JMX
API. We were able to successfully implement metrics collection using the MBeans
exposed by Kafka, but when we try to do the same for ZooKeeper we do not find
much API support like we have for Kafka. Can someone help if you have any insight?
Thanks in advance,
Arunkumar Pichaimuthu, PMP
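ZooKeeper does register MBeans (under the org.apache.ZooKeeperService domain when remote JMX is enabled on the ZooKeeper JVM), though far fewer than Kafka. The enumeration side is plain javax.management; the self-contained sketch below lists MBeans of the local JVM's platform MBeanServer. To inspect ZooKeeper remotely you would instead open a JMXConnector to its JMX port and query the org.apache.ZooKeeperService:* pattern; verify the exact MBean names against your ZooKeeper version:

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// List MBeans matching a pattern on the local platform MBeanServer.
// For ZooKeeper, swap the pattern for "org.apache.ZooKeeperService:*"
// and use an MBeanServerConnection obtained via JMXConnectorFactory.
public class JmxListSketch {
    static Set<ObjectName> list(String pattern) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return server.queryNames(new ObjectName(pattern), null);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Set<ObjectName> names = list("java.lang:*");
        for (ObjectName name : names) {
            System.out.println(name);
        }
        System.out.println("found " + names.size() + " MBeans");
    }
}
```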


Re: [VOTE] 1.1.0 RC0

2018-02-28 Thread Damian Guy
Hi Jason,

Ok - thanks. Let me know how you get on.

Cheers,
Damian

On Wed, 28 Feb 2018 at 19:23 Jason Gustafson  wrote:

> Hey Damian,
>
> I think we should consider
> https://issues.apache.org/jira/browse/KAFKA-6593
> for the release. I have a patch available, but still working on validating
> both the bug and the fix.
>
> -Jason
>
> On Wed, Feb 28, 2018 at 9:34 AM, Matthias J. Sax 
> wrote:
>
> > No. Both will be released.
> >
> > -Matthias
> >
> > On 2/28/18 6:32 AM, Marina Popova wrote:
> > > Sorry, maybe a stupid question, but:
> > >  I see that Kafka 1.0.1 RC2 is still not released, but now 1.1.0 RC0 is
> > coming up...
> > > Does it mean 1.0.1 will be abandoned and we should be looking forward
> to
> > 1.1.0 instead?
> > >
> > > thanks!
> > >
> > > ​Sent with ProtonMail Secure Email.​
> > >
> > > ‐‐‐ Original Message ‐‐‐
> > >
> > > On February 26, 2018 6:28 PM, Vahid S Hashemian <
> > vahidhashem...@us.ibm.com> wrote:
> > >
> > >> +1 (non-binding)
> > >>
> > >> Built the source and ran quickstart (including streams) successfully
> on
> > >>
> > >> Ubuntu (with both Java 8 and Java 9).
> > >>
> > >> I understand the Windows platform is not officially supported, but I
> ran
> > >>
> > >> the same on Windows 10, and except for Step 7 (Connect) everything
> else
> > >>
> > >> worked fine.
> > >>
> > >> There are a number of warning and errors (including
> > >>
> > >> java.lang.ClassNotFoundException). Here's the final error message:
> > >>
> > >>> bin\\windows\\connect-standalone.bat config\\connect-standalone.
> > properties
> > >>
> > >> config\\connect-file-source.properties config\\connect-file-sink.
> > properties
> > >>
> > >> ...
> > >>
> > >> \[2018-02-26 14:55:56,529\] ERROR Stopping after connector error
> > >>
> > >> (org.apache.kafka.connect.cli.ConnectStandalone)
> > >>
> > >> java.lang.NoClassDefFoundError:
> > >>
> > >> org/apache/kafka/connect/transforms/util/RegexValidator
> > >>
> > >> at
> > >>
> > >> org.apache.kafka.connect.runtime.SinkConnectorConfig.<
> > clinit>(SinkConnectorConfig.java:46)
> > >>
> > >> at
> > >>
> > >>
> > >> org.apache.kafka.connect.runtime.AbstractHerder.
> > validateConnectorConfig(AbstractHerder.java:263)
> > >>
> > >> at
> > >>
> > >> org.apache.kafka.connect.runtime.standalone.StandaloneHerder.
> > putConnectorConfig(StandaloneHerder.java:164)
> > >>
> > >> at
> > >>
> > >> org.apache.kafka.connect.cli.ConnectStandalone.main(
> > ConnectStandalone.java:107)
> > >>
> > >> Caused by: java.lang.ClassNotFoundException:
> > >>
> > >> org.apache.kafka.connect.transforms.util.RegexValidator
> > >>
> > >> at
> > >>
> > >> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(
> > BuiltinClassLoader.java:582)
> > >>
> > >> at
> > >>
> > >> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.
> > loadClass(ClassLoaders.java:185)
> > >>
> > >> at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:496)
> > >>
> > >> ... 4 more
> > >>
> > >> Thanks for running the release.
> > >>
> > >> --Vahid
> > >>
> > >> From: Damian Guy damian@gmail.com
> > >>
> > >> To: d...@kafka.apache.org, users@kafka.apache.org,
> > >>
> > >> kafka-clie...@googlegroups.com
> > >>
> > >> Date: 02/24/2018 08:16 AM
> > >>
> > >> Subject: \[VOTE\] 1.1.0 RC0
> > >>
> > >> Hello Kafka users, developers and client-developers,
> > >>
> > >> This is the first candidate for release of Apache Kafka 1.1.0.
> > >>
> > >> This is minor version release of Apache Kakfa. It Includes 29 new
> KIPs.
> > >>
> > >> Please see the release plan for more details:
> > >>
> > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.
> > apache.org_confluence_pages_viewpage.action-3FpageId-
> > 3D71764913=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=Q_
> >
> itwloTQj3_xUKl7Nzswo6KE4Nj-kjJc7uSVcviKUc=K9Iz2hWA2pj4QGxW6fleW20K0M7oEe
> > WCbqs5nbbUY0c=M1liORvtcIt7pZ8e5GnLr9a1i6SOUY4bvjHYOrY_zcE=
> > >>
> > >> A few highlights:
> > >>
> > >> -   Significant Controller improvements (much faster and session
> > expiration
> > >>
> > >> edge cases fixed)
> > >>
> > >> -   Data balancing across log directories (JBOD)
> > >> -   More efficient replication when the number of partitions is large
> > >> -   Dynamic Broker Configs
> > >> -   Delegation tokens (KIP-48)
> > >> -   Kafka Streams API improvements (KIP-205 / 210 / 220 / 224 / 239)
> > >>
> > >> Release notes for the 1.1.0 release:
> > >>
> > >> https://urldefense.proofpoint.com/v2/url?u=http-3A__home.
> > apache.org_-7Edamianguy_kafka-2D1.1.0-2Drc0_RELEASE-5FNOTES.
> > html=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=Q_itwloTQj3_xUKl7Nzswo6KE4Nj-
> > kjJc7uSVcviKUc=K9Iz2hWA2pj4QGxW6fleW20K0M7oEe
> > WCbqs5nbbUY0c=H6-O0mkXk2tT_7RlN4W9bJd_lpoOt5ranhTx28WdRnQ=
> > >>
> > >> \*\*\* Please download, test and vote by Wednesday, February 28th,
> > 5pm PT
> > >>
> > >> Kafka's KEYS file containing PGP keys we use to sign the release:
> > >>
> > >> 

Re: [VOTE] 1.1.0 RC0

2018-02-28 Thread Jason Gustafson
Hey Damian,

I think we should consider https://issues.apache.org/jira/browse/KAFKA-6593
for the release. I have a patch available, but still working on validating
both the bug and the fix.

-Jason

On Wed, Feb 28, 2018 at 9:34 AM, Matthias J. Sax 
wrote:

> No. Both will be released.
>
> -Matthias
>
> On 2/28/18 6:32 AM, Marina Popova wrote:
> > Sorry, maybe a stupid question, but:
> >  I see that Kafka 1.0.1 RC2 is still not released, but now 1.1.0 RC0 is
> coming up...
> > Does it mean 1.0.1 will be abandoned and we should be looking forward to
> 1.1.0 instead?
> >
> > thanks!
> >
> > ​Sent with ProtonMail Secure Email.​
> >
> > ‐‐‐ Original Message ‐‐‐
> >

Re: Setting topic's offset from the shell

2018-02-28 Thread Manikumar
We can use the "kafka-consumer-groups.sh --reset-offsets" option to reset
offsets. This is available from Kafka 0.11.0.0 onwards.
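For example (group and topic names below are placeholders; the tool prints the planned offsets without applying them unless --execute is given, and the group must have no running consumers when you apply the reset):

```shell
# Dry run: show what the offsets would be reset to (no --execute).
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-group --topic my-topic \
  --reset-offsets --to-earliest

# Apply the reset.
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-group --topic my-topic \
  --reset-offsets --to-earliest --execute
```

Other scenarios accept --to-latest, --to-offset, --shift-by, or --to-datetime in place of --to-earliest.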


On Wed, Feb 28, 2018 at 2:59 PM, UMESH CHAUDHARY 
wrote:

> You might want to set group.id config in kafka-console-consumer (or in any
> other consumer) to the value which you haven't used before. This will
> replay all available messages in the topic from start if you use
> --from-beginning in console consumer.
>
> On Wed, 28 Feb 2018 at 14:19 Zoran  wrote:
>
> > Hi,
> >
> >
> > If I have a topic that has been fully read by consumers, how to set the
> > offset from the shell to some previous value in order to reread again
> > several messages?
> >
> >
> > Regards.
> >
> >
>


Re: [VOTE] 1.1.0 RC0

2018-02-28 Thread Matthias J. Sax
No. Both will be released.

-Matthias

On 2/28/18 6:32 AM, Marina Popova wrote:
> Sorry, maybe a stupid question, but:
>  I see that Kafka 1.0.1 RC2 is still not released, but now 1.1.0 RC0 is 
> coming up...
> Does it mean 1.0.1 will be abandoned and we should be looking forward to 
> 1.1.0 instead?
> 
> thanks!
> 
> ​Sent with ProtonMail Secure Email.​
> 

Re: MockConsumer class for Python?

2018-02-28 Thread Sam Pegler
Why not just mock out the Kafka client in your tests and have it call a
function which yields a kafka message every call?

```
def consumer():
    for _ in range(99):
        yield KafkaMessage('key', 'value')

mock_consumer = mocker.patch.object(foo, 'consumer', consumer())

```

Is there any specific feature you're after?


MockConsumer class for Python?

2018-02-28 Thread Skip Montanaro
I think I've seen mention of a MockConsumer class, but the
kafka-python package I have available in my Conda setup doesn't seem
to have anything like that. Is this a Java-only thing or is there a
Python MockConsumer class I just haven't yet encountered in the wild?
I see this:

http://www.jesse-anderson.com/2016/11/unit-testing-kafka-consumers/

but that appears to be Java-specific.

Maybe it's not entirely necessary? Do people just unit test through a
development broker setup using fake messages sent to test-specific
topics?

Thx,

Skip Montanaro


Re: [VOTE] 1.1.0 RC0

2018-02-28 Thread Marina Popova
Sorry, maybe a stupid question, but:
 I see that Kafka 1.0.1 RC2 is still not released, but now 1.1.0 RC0 is coming 
up...
Does it mean 1.0.1 will be abandoned and we should be looking forward to 1.1.0 
instead?

thanks!

​Sent with ProtonMail Secure Email.​


aggregation tables mirrored in kafka & rocksdb

2018-02-28 Thread Marasoiu, Nicu
Hi,
Currently we have an aggregation system (without kafka) where events are 
aggregated into Cassandra tables holding aggregate results.
We are considering moving to a KafkaStreams solution with exactly-once 
processing but in this case it seems that all the aggregation tables (reaching 
TB) need to be kept also in Kafka as ktables(Rocksdb)+compacted topics(Kafka) 
and the direction of computation would be: events topic -> KS aggregation -> 
aggregated topics -> one way sync to Cassandra using connector.
This poses two problems:
- the doubling on the total storage required for the system (which mainly 
stores aggregates), from 3 C* replicas to 2-3 K replicas + Rocksdb
- the time to reconstruct a Rocksdb instance during rollover update can be half 
an hour if the rollover is fast

Is there any way in Kafka Streams (even dropping the exactly once) in which we 
can work just with aggregate tables in Cassandra? For sure there is a way 
working with Kafka Consumer, pulling a batch of messages, aggregating, and 
adding aggregate to Cassandra. Not sure if possible with KafkaStreams given the 
higher level / FP modeling with its own clear advantages but this disadvantage.
Please advise,
Nicu Marasoiu
Geschäftsanschrift/Business address: METRO SYSTEMS GmbH, Metro-Straße 12, 40235 
Düsseldorf, Germany
Aufsichtsrat/Supervisory Board: Heiko Hutmacher (Vorsitzender/ Chairman)
Geschäftsführung/Management Board: Dr. Dirk Toepfer (Vorsitzender/CEO), Wim van 
Herwijnen
Sitz Düsseldorf, Amtsgericht Düsseldorf, HRB 18232/Registered Office 
Düsseldorf, Commercial Register of the Düsseldorf Local Court, HRB 18232

Betreffend Mails von *@metrosystems.net
Die in dieser E-Mail enthaltenen Nachrichten und Anhänge sind ausschließlich 
für den bezeichneten Adressaten bestimmt. Sie können rechtlich geschützte, 
vertrauliche Informationen enthalten. Falls Sie nicht der bezeichnete Empfänger 
oder zum Empfang dieser E-Mail nicht berechtigt sind, ist die Verwendung, 
Vervielfältigung oder Weitergabe der Nachrichten und Anhänge untersagt. Falls 
Sie diese E-Mail irrtümlich erhalten haben, informieren Sie bitte unverzüglich 
den Absender und vernichten Sie die E-Mail.

Regarding mails from *@metrosystems.net
This e-mail message and any attachment are intended exclusively for the named 
addressee. They may contain confidential information which may also be 
protected by professional secrecy. Unless you are the named addressee (or 
authorised to receive for the addressee) you may not copy or use this message 
or any attachment or disclose the contents to anyone else. If this e-mail was 
sent to you by mistake please notify the sender immediately and delete this 
e-mail.



Re: kafka streams, docker/k8s and rocksdb - storage performance

2018-02-28 Thread Pegerto Fernandez Torres
Hi Nicu,

I think the best you can do is start with an emptyDir; that gives you
access to a non-layered filesystem, but you need to take into consideration
that you will re-stream the data every time the container restarts, so a
rolling update can be slow.

My suggestion is to check with your platform team (I know them and I am sure
they are happy to help) and understand whether the restoration time for the
data size you plan is acceptable.
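A sketch of the emptyDir suggestion in Kubernetes terms (the pod and image names are illustrative; /tmp/kafka-streams is the Kafka Streams state.dir default):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: streams-app              # illustrative name
spec:
  containers:
    - name: streams
      image: my-streams-app:latest   # illustrative image
      volumeMounts:
        - name: state
          mountPath: /tmp/kafka-streams   # Kafka Streams state.dir default
  volumes:
    - name: state
      emptyDir: {}   # non-layered; wiped on pod restart, so state is
                     # restored from the changelog topics on startup
```

Since the emptyDir is lost with the pod, restoration time from the changelog topics is the number to measure before committing to this layout.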

Regards.

On 28 February 2018 at 08:39, Marasoiu, Nicu  wrote:

> Hi,
> When using kafka streams & stateful transformations (rocksdb) and docker
> and Kubernetes, what are the concerns for storage - I mean, I know that
> writing to disk in Docker into the container is less performant than
> mounting a direct volume. However, in our setup a separate team is
> handling, and theoretically we should be stateless in the microservices
> nodes, .. I am just wondering what the performance penalty would be if the
> writes would be done without any special concern,
> Thank you,
> Nicu


-- 

Pegerto Fernandez

Lead Consultant



Re: Setting topic's offset from the shell

2018-02-28 Thread UMESH CHAUDHARY
You might want to set group.id config in kafka-console-consumer (or in any
other consumer) to the value which you haven't used before. This will
replay all available messages in the topic from start if you use
--from-beginning in console consumer.
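For the record, two shell-level options (the broker address, topic, and group names are placeholders; the offset-reset tool needs Kafka 0.11+ and the group must have no active members while resetting):

```shell
# Option 1: re-read everything under a fresh, previously unused group.id
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic my-topic --from-beginning \
  --consumer-property group.id=fresh-group-1

# Option 2: rewind an existing group's committed offsets (Kafka 0.11+)
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-group --topic my-topic \
  --reset-offsets --to-earliest --execute
```

Option 2 also accepts finer-grained targets such as --to-offset or --to-datetime instead of --to-earliest; run it with --dry-run first to preview the new offsets.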

On Wed, 28 Feb 2018 at 14:19 Zoran  wrote:

> Hi,
>
>
> If I have a topic that has been fully read by consumers, how to set the
> offset from the shell to some previous value in order to reread again
> several messages?
>
>
> Regards.
>
>


Setting topic's offset from the shell

2018-02-28 Thread Zoran

Hi,


If I have a topic that has been fully read by consumers, how do I set the 
offset from the shell to some previous value in order to re-read 
several messages?



Regards.



kafka streams, docker/k8s and rocksdb - storage performance

2018-02-28 Thread Marasoiu, Nicu
Hi,
When using Kafka Streams with stateful transformations (RocksDB) under Docker and 
Kubernetes, what are the concerns for storage? I know that writing to disk inside 
the Docker container is less performant than mounting a direct volume. However, in 
our setup a separate team handles the infrastructure, and theoretically the 
microservice nodes should be stateless. I am just wondering what the performance 
penalty would be if the writes were done without any special concern.
Thank you,
Nicu