Kafka stream transformations

2017-09-26 Thread Roshan
Hi,

I'm looking for information on where the stream transformations are
applied - on the server (broker) or on the client?

Would it be possible for clients to share the topology?

-- 
Warm regards
Roshan


Running Kafka on docker containers

2017-09-26 Thread Anoop Putta (aputta)
Hi,

Is it a good idea to run Kafka as Docker containers in production 
deployments? Do you foresee any blockers with this approach?
Please advise.

-Anoop P


how to use Confluent connector with Apache Kafka

2017-09-26 Thread Marina Popova
Hi,
we have an existing Kafka cluster (0.10) already setup and working in 
production.
I would like to explore using Confluent's Elasticsearch Connector - however, I 
see it comes as part of the Confluent distribution of Kafka (with separate 
confluent scripts, libs, etc.).

Is there an easy way to just use the Confluent Connector with the plain 
(non-Confluent) Apache distribution of Kafka?

thanks!
Marina

Re: In which scenarios would "INVALID_REQUEST" be returned for "Offset Request"

2017-09-26 Thread Vignesh
We see the same error even when we use the librdkafka NuGet package
(https://github.com/edenhill/librdkafka), version 0.11.0:
https://www.nuget.org/packages/librdkafka.redist/

-Vignesh.

On Tue, Sep 26, 2017 at 11:03 AM, Vignesh  wrote:

> I am just sending the request directly using my own client; the protocol
> API version I used is "1": https://kafka.apache.org/
> protocol#The_Messages_Offsets
> Broker version is 0.10.2.0. This broker version supports protocol version
> 1.
>
> Where are the logs related to such errors stored? Also, is this error
> level enabled by default? If not, How can I enable it?
>
> Thanks,
> Vignesh.
>
> On Sun, Sep 24, 2017 at 12:52 PM, James Cheng 
> wrote:
>
>> Your client library might be sending a message that is too old or too new
>> for your broker to understand.
>>
>> What version is your Kafka client library, and what version is your
>> broker?
>>
>> -James
>>
>> Sent from my iPhone
>>
>> > On Sep 22, 2017, at 4:09 PM, Vignesh  wrote:
>> >
>> > Hi,
>> >
>> > In which scenarios would we get "INVALID_REQUEST" for a Version 1
>> "Offset
>> > Request"  (https://kafka.apache.org/protocol#The_Messages_Offsets)  ?
>> >
>> > I searched for INVALID_REQUEST in https://github.com/apache/kafka and
>> below
>> > is the only file that seems related.
>> >
>> > https://github.com/apache/kafka/blob/96ba21e0dfb1a564d534917
>> 9d844f020abf1e08b/clients/src/main/java/org/apache/kafka/
>> common/protocol/Errors.java
>> >
>> > Here, I see that invalid request is returned only on duplicate topic
>> > partition. Is that the only reason?
>> >
>> > The description for the error is broader though.
>> >
>> > "
>> > This most likely occurs because of a request being malformed by the
>> client
>> > library or the message was sent to an incompatible broker. See the
>> broker
>> > logs for more details.
>> >
>> > "
>> >
>> > Thanks,
>> > Vignesh.
>>
>
>


Re: In which scenarios would "INVALID_REQUEST" be returned for "Offset Request"

2017-09-26 Thread Vignesh
I am just sending the request directly using my own client; the protocol API
version I used is "1": https://kafka.apache.org/protocol#The_Messages_Offsets

Broker version is 0.10.2.0. This broker version supports protocol version
1.

Where are the logs related to such errors stored? Also, is this error level
enabled by default? If not, How can I enable it?

Thanks,
Vignesh.

On Sun, Sep 24, 2017 at 12:52 PM, James Cheng  wrote:

> Your client library might be sending a message that is too old or too new
> for your broker to understand.
>
> What version is your Kafka client library, and what version is your broker?
>
> -James
>
> Sent from my iPhone
>
> > On Sep 22, 2017, at 4:09 PM, Vignesh  wrote:
> >
> > Hi,
> >
> > In which scenarios would we get "INVALID_REQUEST" for a Version 1 "Offset
> > Request"  (https://kafka.apache.org/protocol#The_Messages_Offsets)  ?
> >
> > I searched for INVALID_REQUEST in https://github.com/apache/kafka and
> below
> > is the only file that seems related.
> >
> > https://github.com/apache/kafka/blob/96ba21e0dfb1a564d5349179d844f0
> 20abf1e08b/clients/src/main/java/org/apache/kafka/common/
> protocol/Errors.java
> >
> > Here, I see that invalid request is returned only on duplicate topic
> > partition. Is that the only reason?
> >
> > The description for the error is broader though.
> >
> > "
> > This most likely occurs because of a request being malformed by the
> client
> > library or the message was sent to an incompatible broker. See the broker
> > logs for more details.
> >
> > "
> >
> > Thanks,
> > Vignesh.
>


consumer group offset chaos

2017-09-26 Thread Vincent Dautremont
Hi,
I've recently experienced a reset of consumer group offset on a cluster of
3 Kafka nodes (v0.11.0.0).

I use 3 high-level consumers using librdkafka 0.9.4.
They first fetch the consumer group's assigned partition offsets just after
each rebalance, before consuming anything.

Every offset-related action is logged to a file to retrace possible problems.
On those 3 consumers the log end was at an offset near 313 000 000 for all
12 partitions.

Following an unknown "cluster problem", the communication between my 3 clients
and the cluster ended unexpectedly: on one of the offset commits, the error
response tells me
err = *Broker: Not coordinator for group*
and
err = *Broker: Unknown member*

So my 3 clients all reset and retried the consumer group assignment 30
seconds later.
On the 3 clients, the consumer group's assigned partition offsets were then
reported around 9 000 000, even though the first valid message of each
partition was around offset 300 000 000.
The clients therefore reprocessed the whole topic from its beginning,
reprocessing some 13 000 000 messages.
My near-real-time client programs couldn't achieve near real time any
more from there, which is quite critical.
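One defensive measure worth considering, sketched below in plain Python (independent of librdkafka; the partition and offset numbers are illustrative, taken from this thread): after each rebalance, compare the committed offsets against each partition's valid offset range before trusting them, and alert or seek explicitly when a committed offset falls outside it.

```python
def check_committed_offsets(committed, beginning, end):
    """Flag committed offsets that fall outside each partition's valid
    range, as happened here (committed ~9M vs. first valid offset ~300M).

    committed/beginning/end: dicts mapping partition -> offset.
    Returns the set of partitions whose committed offset looks bogus.
    """
    suspect = set()
    for partition, offset in committed.items():
        if offset < beginning[partition] or offset > end[partition]:
            suspect.add(partition)
    return suspect

# With numbers like the ones reported in this thread:
committed = {0: 9_000_000, 1: 313_000_000}
beginning = {0: 300_000_000, 1: 300_000_000}
end = {0: 313_000_000, 1: 313_000_000}
print(check_committed_offsets(committed, beginning, end))  # {0}
```

A client that detects a suspect partition could refuse to start consuming and page an operator instead of silently reprocessing the topic.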

I'd like help to know what happened here, I wonder if there's a bug in
kafka / usage of zookeeper related to that.

we clearly see from the log that there was a problem on zookeeper on node
2, and that the kafka node 2 was removed from the cluster for a few seconds.

log of KafkaNode3-controller.log.2017-09-22-21 https://pastebin.com/ekXNd13G

log of KafkaNode2-server.log.2017-09-22-21 https://pastebin.com/3d05jtNx

log of *ServerNode2-zookeeper.log* https://pastebin.com/yY9vMWDQ
log of *ServerNode3-zookeeper.log* https://pastebin.com/nQRt8dh1

I don't understand what's the source of all this, but it seems related to a
zookeeper problem on server 2 (what caused it ?).

Could all this be linked to https://issues.apache.org/jira/browse/KAFKA-5600
which has been fixed in v0.11.0.1 ?


Thanks for your thoughts.
VIncent.



Re: ACL for hosts

2017-09-26 Thread Bastien Durel
Le mardi 26 septembre 2017 à 16:30 +0200, Bastien Durel a écrit :
> Hello,
> 
> I want to allow any user to consume messages from any host, but
> restrict publishing from only one host (and one user), so I think I
> need ACLs
> 
> I use the default authorizer : 
> authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
> 
> I added the following ACLs to allow anyone to read from anywhere :
> bin/kafka-acls.sh --authorizer-properties
> zookeeper.connect=localhost:2181 --add --consumer --topic test --
> allow-principal 'User:*' --group '*'
> 
> And I've verified I can consume messages from any host (using a small
> python client)
> 
> I then added ACL to permit alice to publish from 127.0.0.1 :
> User:alice has Allow permission for operations: All from hosts:
> 127.0.0.1
> 
> And messages posted from localhost (with another python script) flow
> to any consumer
> 
> But if I add a remote machine ACL :
> bin/kafka-acls.sh --authorizer-properties
> zookeeper.connect=localhost:2181 --add --topic test --allow-principal 
> User:alice --allow-host 10.42.42.3
> Adding ACLs for resource `Topic:test`: 
>   User:alice has Allow permission for operations: All from
> hosts: 10.42.42.3 
> 
> Current ACLs for resource `Topic:test`: 
>   User:* has Allow permission for operations: Describe from
> hosts: *
>   User:* has Allow permission for operations: Read from hosts: *
>   User:alice has Allow permission for operations: All from hosts:
> 10.42.42.3
>   User:alice has Allow permission for operations: All from hosts:
> 127.0.0.1 
> 
> All looks correct, but messages sent from this host don't flow to
> consumer(s).
> I can see them leave on the wire, but I get a response wireshark
> doesn't know how to decode, and the consumers don't get anything.
> 
> Removing the 127.0.0.1 ACL leads to the same result (messages sent to
> (local) wire but not delivered to consumers), but adding it back
> leads
> to the intended behaviour (messages delivered)
> 
> I tried with IP, FQDN, and hostname; I cannot get my messages from
> 10.42.42.3 delivered,
> except if I add an ACL with --allow-host \* ; in that case messages
> from 10.42.42.3 get delivered.
> 
> I use kafka 0.10.2.0
> 
> Do you have any clue ? How to debug this issue ?
> 
> Thanks,
> 
There is a router that masquerades my IP; that was the problem.
Sorry for the noise.
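For anyone hitting the same symptom: host-based ACLs are matched against the source address of the client connection as the broker sees it. The sketch below is a deliberate simplification of that matching logic (not SimpleAclAuthorizer's actual code) and shows why a masquerading router breaks it - the broker sees the router's IP, which matches no per-host ACL.

```python
def is_allowed(acls, principal, host, operation):
    """acls: list of (principal, host, operation) allow entries,
    where 'User:*' / '*' / 'All' act as wildcards. The broker matches
    `host` against the connection's source address - behind NAT that
    is the router's IP, not the client's."""
    for p, h, op in acls:
        if p in (principal, 'User:*') and h in (host, '*') \
                and op in (operation, 'All'):
            return True
    return False

acls = [('User:*', '*', 'Read'),
        ('User:alice', '10.42.42.3', 'All')]
# alice writing from her own IP is allowed...
print(is_allowed(acls, 'User:alice', '10.42.42.3', 'Write'))    # True
# ...but behind a masquerading router the broker sees a different
# source IP, so no ACL matches the write:
print(is_allowed(acls, 'User:alice', '10.42.42.254', 'Write'))  # False
```

This is also why adding `--allow-host \*` made delivery work: the wildcard host matches whatever address the NAT presents.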

Regards,

-- 
Bastien Durel
DATA
Intégration des données de l'entreprise,
Systèmes d'information décisionnels.

bastien.du...@data.fr
tel : +33 (0) 1 57 19 59 28
fax : +33 (0) 1 57 19 59 73
12 avenue Raspail, 94250 GENTILLY France
www.data.fr


Re: Data loss while upgrading confluent 3.0.0 kafka cluster toconfluent 3.2.2

2017-09-26 Thread Yogesh Sangvikar
Hi Team,

Thanks a lot for the suggestion Ismael.

We have tried a rolling upgrade of the Kafka cluster by making the version
changes (CURRENT_KAFKA_VERSION - 0.10.0, CURRENT_MESSAGE_FORMAT_VERSION -
0.10.0, then upgrading to the respective version 0.10.2) in the upgraded
Confluent package 3.2.2, and observed that the in-sync replicas come up
immediately, and that the preferred leaders come up after the version bump,
once in sync.

As per my understanding, the in-sync replicas and leader election happen
quickly because the new data published during the upgrade is written and
synced using the upgraded package libraries (0.10.2).

Also, we observed that some records failed to be produced, with this error:

kafka-rest error response -

{"offsets":[{"partition":null,"offset":null,"error_code":50003,"error":"This
server is not the leader for that
topic-partition."}],"key_schema_id":1542,"value_schema_id":1541}

Exception in log file -
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server
is not the leader for that topic-partition.


To resolve the above error, we have overridden the properties *acks=-1
(default: 1)* and *retries=3 (default: 0)* in the Kafka REST producer config
(kafka-rest.properties), and we now see some duplicate events in the topic.
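The duplicates are the expected trade-off of enabling retries on a pre-idempotence broker (the idempotent producer only arrived in 0.11): if the broker appends a batch but the acknowledgment is lost, the client retries and the batch is appended a second time. A small, purely illustrative simulation of that failure mode:

```python
def produce_with_retries(log, batch, retries, ack_lost_on_first_try):
    """Append `batch` to `log`, retrying when the ack is lost.
    Without an idempotent producer, a lost ack after a successful
    append means the retry writes the batch a second time."""
    for attempt in range(retries + 1):
        log.append(batch)  # broker appends the batch
        ack_received = not (ack_lost_on_first_try and attempt == 0)
        if ack_received:
            return log     # producer sees success and stops
    return log

log = produce_with_retries([], "event-1", retries=3,
                           ack_lost_on_first_try=True)
print(log)  # ['event-1', 'event-1'] - the event was written twice
```

Consumers therefore need to be idempotent (or deduplicate on a key) when retries are enabled on these versions.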


Thanks,
Yogesh

On Thu, Sep 21, 2017 at 7:09 AM, yogesh sangvikar <
yogesh.sangvi...@gmail.com> wrote:

> Thanks Ismael.
> I will try the solution and update all.
>
> Thanks,
> Yogesh
> --
> From: Ismael Juma 
> Sent: ‎20-‎09-‎2017 11:57 PM
> To: Kafka Users 
> Subject: Re: Data loss while upgrading confluent 3.0.0 kafka cluster
> toconfluent 3.2.2
>
> One clarification below:
>
> On Wed, Sep 20, 2017 at 3:50 PM, Ismael Juma  wrote:
>
> > Comments inline.
> >
> > On Wed, Sep 20, 2017 at 11:56 AM, Yogesh Sangvikar <
> > yogesh.sangvi...@gmail.com> wrote:
> >
> >> 2. At which point in the sequence below was the code for the brokers
> >> updated to 0.10.2?
> >>
> >> [Comment: On the kafka servers, we have confluent-3.0.0 and
> >> confluent-3.2.2
> >> packages deployed separately. So, first for protocol and message version
> >> to
> >> 0.10.0 we have updated server.properties file in running confluent-3.0.0
> >> package and restarted the service for the same.
> >
> > And, for protocol and message version to 0.10.2 bumb, we have modified
> >> server.properties file in confluent-3.2.2 & stopped the old package
> >> services and started the kafka services using new one. All restarts are
> >> done rolling fashion and random broker.id sequence (4,3,2,1).]
> >>
> >
> > You have to set version 0.10.0 in the server.properties of the 0.10.2/3.2
> > brokers. This is probably the source of your issue. After all running
> > brokers are version 0.10.2/3.2, then you can switch the version to
> 0.10.2.
> >
>
> The last sentence may be clearer with the following change:
>
> "After all running brokers are version 0.10.2/3.2, then you can switch the
> inter.broker.protocol.version to 0.10.2 in server.properties."
>
> Ismael
>


ACL for hosts

2017-09-26 Thread Bastien Durel
Hello,

I want to allow any user to consume messages from any host, but
restrict publishing from only one host (and one user), so I think I
need ACLs

I use the default authorizer : 
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer

I added the following ACLs to allow anyone to read from anywhere :
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 
--add --consumer --topic test --allow-principal 'User:*' --group '*'

And I've verified I can consume messages from any host (using a small
python client)

I then added ACL to permit alice to publish from 127.0.0.1 :
User:alice has Allow permission for operations: All from hosts: 127.0.0.1

And messages posted from localhost (with another python script) flow
to any consumer

But if I add a remote machine ACL :
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 
--add --topic test --allow-principal User:alice --allow-host 10.42.42.3
Adding ACLs for resource `Topic:test`: 
User:alice has Allow permission for operations: All from hosts: 
10.42.42.3 

Current ACLs for resource `Topic:test`: 
User:* has Allow permission for operations: Describe from hosts: *
User:* has Allow permission for operations: Read from hosts: *
User:alice has Allow permission for operations: All from hosts: 
10.42.42.3
User:alice has Allow permission for operations: All from hosts: 
127.0.0.1 

All looks correct, but messages sent from this host don't flow to
consumer(s).
I can see them leave on the wire, but I get a response wireshark
doesn't know how to decode, and the consumers don't get anything.

Removing the 127.0.0.1 ACL leads to the same result (messages sent to
(local) wire but not delivered to consumers), but adding it back leads
to the intended behaviour (messages delivered)

I tried with IP, FQDN, and hostname; I cannot get my messages from
10.42.42.3 delivered,
except if I add an ACL with --allow-host \* ; in that case messages
from 10.42.42.3 get delivered.

I use kafka 0.10.2.0

Do you have any clue ? How to debug this issue ?

Thanks,

-- 
Bastien Durel
DATA
Intégration des données de l'entreprise,
Systèmes d'information décisionnels.

bastien.du...@data.fr
tel : +33 (0) 1 57 19 59 28
fax : +33 (0) 1 57 19 59 73
12 avenue Raspail, 94250 GENTILLY France
www.data.fr

from kafka import KafkaConsumer

consumer = KafkaConsumer('test', bootstrap_servers='192.168.100.168:9092',
                         security_protocol="SASL_PLAINTEXT",
                         sasl_mechanism="PLAIN",
                         sasl_plain_username="alice",
                         sasl_plain_password="alice-secret")
for msg in consumer:
    print(msg)

from kafka import KafkaProducer
import socket

#producer = KafkaProducer(bootstrap_servers='192.168.100.168:9092')
producer = KafkaProducer(bootstrap_servers='192.168.100.168:9092',
                         security_protocol="SASL_PLAINTEXT",
                         sasl_mechanism="PLAIN",
                         sasl_plain_username="alice",
                         sasl_plain_password="alice-secret")
for _ in range(10):
    producer.send('test', b'some_message_bytes: ' +
                  (str(_) + ' on ' + socket.gethostname()).encode('UTF-8'))


kafka.pcap
Description: application/vnd.tcpdump.pcap


Re: How would Kafka behave in this scenario

2017-09-26 Thread Ismael Juma
Consumers can fetch messages up to the high watermark, which is dependent
on the in sync replicas, but not directly dependent on
`min.insync.replicas` (e.g. if there are 3 in sync replicas, the high
watermark is the min of the log end offset of the 3 replicas, even if min
in sync replicas is 2).
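Ismael's point can be made concrete with a toy computation (the offsets below are illustrative): the high watermark is the minimum log end offset across the current in-sync replicas, and `min.insync.replicas` does not enter into it.

```python
def high_watermark(log_end_offsets):
    """log_end_offsets: log end offset of each in-sync replica.
    Consumers can fetch up to (but not including) this offset."""
    return min(log_end_offsets)

# 3 in-sync replicas; min.insync.replicas could be 1, 2 or 3 and
# the result would be the same:
print(high_watermark([105, 103, 104]))  # 103
```

`min.insync.replicas` only affects whether an acks=all produce is accepted, not how far consumers may read.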

Ismael

On Tue, Sep 26, 2017 at 1:34 PM, Denis Bolshakov 
wrote:

> By default Kafka does not allow dirty reads for clients, so while
> `min.insync.replicas`
> is not satisfied consumers don't see new messages.
>
> On 26 September 2017 at 11:09, Sameer Kumar 
> wrote:
>
> > Thanks Stevo for pointing me out to correct link.
> > In this case, how would exactly once feature of streams would behave
> since
> > they configure producers with acks=all. I think they would fail and would
> > need to be resumed once the broker comes back.
> >
> > -Sameer.
> >
> > On Tue, Sep 26, 2017 at 1:09 PM, Stevo Slavić  wrote:
> >
> > > Hello Sameer,
> > >
> > > Behavior depends on min.insync.replicas configured for the topic.
> > > Find more info in the documentation
> > > https://kafka.apache.org/documentation/#topicconfigs
> > >
> > > Kind regards,
> > > Stevo Slavic.
> > >
> > > On Tue, Sep 26, 2017 at 9:01 AM, Sameer Kumar 
> > > wrote:
> > >
> > > > In case one of the brokers fail,  the broker would get removed from
> the
> > > > respective ISR list of those partitions.
> > > > In case producer has acks=all, how would it behave? would the
> producers
> > > be
> > > > throttled and wait till the broker get backed up.
> > > >
> > > > -Sameer.
> > > >
> > >
> >
>
>
>
> --
> //with Best Regards
> --Denis Bolshakov
> e-mail: bolshakov.de...@gmail.com
>


Re: How would Kafka behave in this scenario

2017-09-26 Thread Denis Bolshakov
By default Kafka does not allow dirty reads for clients, so while
`min.insync.replicas`
is not satisfied consumers don't see new messages.

On 26 September 2017 at 11:09, Sameer Kumar  wrote:

> Thanks Stevo for pointing me out to correct link.
> In this case, how would exactly once feature of streams would behave since
> they configure producers with acks=all. I think they would fail and would
> need to be resumed once the broker comes back.
>
> -Sameer.
>
> On Tue, Sep 26, 2017 at 1:09 PM, Stevo Slavić  wrote:
>
> > Hello Sameer,
> >
> > Behavior depends on min.insync.replicas configured for the topic.
> > Find more info in the documentation
> > https://kafka.apache.org/documentation/#topicconfigs
> >
> > Kind regards,
> > Stevo Slavic.
> >
> > On Tue, Sep 26, 2017 at 9:01 AM, Sameer Kumar 
> > wrote:
> >
> > > In case one of the brokers fail,  the broker would get removed from the
> > > respective ISR list of those partitions.
> > > In case producer has acks=all, how would it behave? would the producers
> > be
> > > throttled and wait till the broker get backed up.
> > >
> > > -Sameer.
> > >
> >
>



-- 
//with Best Regards
--Denis Bolshakov
e-mail: bolshakov.de...@gmail.com


Re: kafka-streams 0.11.0.1 rocksdb bug?

2017-09-26 Thread Damian Guy
It looks like only one of the restoring tasks ever transitions to running,
but it is impossible to tell why from the logs. My guess is there is a bug
in there somewhere.

Interestingly, I only see this log line once:
"2017-09-22 14:08:09 DEBUG StoreChangelogReader:152 - stream-thread
[argyle-streams-fp-StreamThread-21] Starting restoring state stores from
changelog topics []"

But there should be one for all the tasks that are restoring.

Can you please also provide your topology?

Thanks,
Damian

On Mon, 25 Sep 2017 at 19:44 Ara Ebrahimi 
wrote:

> Please find attached the entire log. Hope it helps.
>
> Ara.
>
>
> On Sep 25, 2017, at 7:59 AM, Damian Guy  wrote:
>
> Hi, is that the complete log? It looks like there might be 2 tasks that are
> still restoring:
> 2017-09-22 14:08:09 DEBUG AssignedTasks:90 - stream-thread
> [argyle-streams-fp-StreamThread-6] transitioning stream task 1_18 to
> restoring
> 2017-09-22 14:08:09 DEBUG AssignedTasks:90 - stream-thread
> [argyle-streams-fp-StreamThread-23] transitioning stream task 1_14 to
> restoring
>
> I don't see them transitioning to running.
>
> On Fri, 22 Sep 2017 at 22:21 Ara Ebrahimi 
> wrote:
>
> I enabled TRACE level logging for kafka streams package and all I see is
> things like this:
>
> 2017-09-22 14:15:18 INFO  StreamThread:686 - stream-thread
> [argyle-streams-fp-StreamThread-32] Committed all active tasks [0_30] and
> standby tasks [] in 0ms
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-32] completed partitions []
> 2017-09-22 14:15:18 INFO  StreamThread:686 - stream-thread
> [argyle-streams-fp-StreamThread-18] Committed all active tasks [0_4] and
> standby tasks [] in 0ms
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-18] completed partitions []
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-16] completed partitions []
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-13] completed partitions []
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-14] completed partitions []
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-9] completed partitions []
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-5] completed partitions []
> 2017-09-22 14:15:18 DEBUG StreamTask:259 - task [0_29] Committing
> 2017-09-22 14:15:18 DEBUG RecordCollectorImpl:142 - task [0_29] Flushing
> producer
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-13] completed partitions []
> 2017-09-22 14:15:18 DEBUG StreamTask:259 - task [0_1] Committing
> 2017-09-22 14:15:18 DEBUG RecordCollectorImpl:142 - task [0_1] Flushing
> producer
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-5] completed partitions []
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-9] completed partitions []
> 2017-09-22 14:15:18 INFO  StreamThread:686 - stream-thread
> [argyle-streams-fp-StreamThread-14] Committed all active tasks [0_29] and
> standby tasks [] in 0ms
> 2017-09-22 14:15:18 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-14] completed partitions []
> 2017-09-22 14:15:19 INFO  StreamThread:686 - stream-thread
> [argyle-streams-fp-StreamThread-16] Committed all active tasks [0_1] and
> standby tasks [] in 0ms
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-16] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-6] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-17] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-23] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-2] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-15] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-30] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-22] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> [argyle-streams-fp-StreamThread-27] completed partitions []
> 2017-09-22 14:15:19 DEBUG StoreChangelogReader:194 - stream-thread
> 

Re: Streams dependency on RocksDB - licenses

2017-09-26 Thread Ismael Juma
Hi Stevo,

We are aware. There have been regressions in recent versions which have
prevented us from upgrading. See:

https://github.com/apache/kafka/pull/3519#issuecomment-327992362
https://github.com/apache/kafka/pull/3819

Ismael

On Tue, Sep 26, 2017 at 8:49 AM, Stevo Slavić  wrote:

> There is legal problem for older rocksdb versions, see
> https://issues.apache.org/jira/browse/LEGAL-303?focusedCommentId=16109870;
> page=com.atlassian.jira.plugin.system.issuetabpanels:
> comment-tabpanel#comment-16109870
>
> Dependencies with Facebook BSD+Patents license are not allowed to be
> included in Apache products
> https://www.apache.org/legal/resolved.html#category-x
>
> I see even latest trunk depends on older rocksdb 5.3.6
> https://github.com/apache/kafka/blob/trunk/gradle/dependencies.gradle#L67
>
> Please consider upgrading, maybe as critical for 1.0.0 release if possible.
> Should I create the ticket?
>
> Kind regards,
> Stevo Slavic.
>
> On Thu, Sep 21, 2017 at 5:22 PM, Ismael Juma  wrote:
>
> > This commit says that GPLv2 is an alternative license, so it's all good,
> I
> > believe:
> >
> > https://github.com/facebook/rocksdb/commit/
> d616ebea23fa88cb9c2c8588533526
> > a566d9cfab
> >
> > Ismael
> >
> > On Thu, Sep 21, 2017 at 4:21 PM, Ismael Juma  wrote:
> >
> > > LevelDB is New BSD (and the license in the first commit you linked to
> is
> > > definitely not GPL2):
> > >
> > > https://github.com/google/leveldb/blob/master/LICENSE
> > >
> > > The second commit you referenced, adds GPL2 to the list of licenses in
> > the
> > > pom, but it's not clear why.
> > >
> > > Ismael
> > >
> > > On Thu, Sep 21, 2017 at 3:44 PM, Stevo Slavić 
> wrote:
> > >
> > >> LevelDB GPL2 notice seems to have been added to rocksdb 5.7.1, but
> > likely
> > >> this applies to old versions too.
> > >>
> > >> https://github.com/facebook/rocksdb/commit/4a2e4891fe4c6f66f
> > >> b9e8e2d29b04f46ee702b52#diff-7e1d2c46cd6eacd9a8d864450a128218
> > >> https://github.com/facebook/rocksdb/commit/6e3ee015fb1ce03e4
> > >> 7838e9a3995410ce884c212#diff-1d5a1a614aa158e14292a60d9891f1df
> > >>
> > >> Shouldn't similar notice be added to Apache Kafka too?
> > >>
> > >> On Thu, Sep 21, 2017 at 3:43 PM, Stevo Slavić 
> > wrote:
> > >>
> > >> > Hello Apache Kafka community,
> > >> >
> > >> > Is it on purpose that kafka-streams 0.11.0.1 depends on
> > >> > org.rocksdb:rocksdbjni:5.0.1, and not on newer 5.7.3, because 5.0.1
> > has
> > >> > Apache 2 license while 5.7.3 has also GPL 2.0 parts?
> > >> >
> > >> > Kind regards,
> > >> > Stevo Slavic.
> > >> >
> > >>
> > >
> > >
> >
>


Re: Streams dependency on RocksDB - licenses

2017-09-26 Thread Stevo Slavić
Created https://issues.apache.org/jira/browse/KAFKA-5977 for this issue.

On Tue, Sep 26, 2017 at 9:49 AM, Stevo Slavić  wrote:

> There is legal problem for older rocksdb versions, see
> https://issues.apache.org/jira/browse/LEGAL-303?focusedCommentId=16109870;
> page=com.atlassian.jira.plugin.system.issuetabpanels:
> comment-tabpanel#comment-16109870
>
> Dependencies with Facebook BSD+Patents license are not allowed to be
> included in Apache products https://www.apache.org/legal/
> resolved.html#category-x
>
> I see even latest trunk depends on older rocksdb 5.3.6
> https://github.com/apache/kafka/blob/trunk/gradle/dependencies.gradle#L67
>
> Please consider upgrading, maybe as critical for 1.0.0 release if
> possible. Should I create the ticket?
>
> Kind regards,
> Stevo Slavic.
>
> On Thu, Sep 21, 2017 at 5:22 PM, Ismael Juma  wrote:
>
>> This commit says that GPLv2 is an alternative license, so it's all good, I
>> believe:
>>
>> https://github.com/facebook/rocksdb/commit/d616ebea23fa88cb9
>> c2c8588533526a566d9cfab
>>
>> Ismael
>>
>> On Thu, Sep 21, 2017 at 4:21 PM, Ismael Juma  wrote:
>>
>> > LevelDB is New BSD (and the license in the first commit you linked to is
>> > definitely not GPL2):
>> >
>> > https://github.com/google/leveldb/blob/master/LICENSE
>> >
>> > The second commit you referenced, adds GPL2 to the list of licenses in
>> the
>> > pom, but it's not clear why.
>> >
>> > Ismael
>> >
>> > On Thu, Sep 21, 2017 at 3:44 PM, Stevo Slavić 
>> wrote:
>> >
>> >> LevelDB GPL2 notice seems to have been added to rocksdb 5.7.1, but
>> likely
>> >> this applies to old versions too.
>> >>
>> >> https://github.com/facebook/rocksdb/commit/4a2e4891fe4c6f66f
>> >> b9e8e2d29b04f46ee702b52#diff-7e1d2c46cd6eacd9a8d864450a128218
>> >> https://github.com/facebook/rocksdb/commit/6e3ee015fb1ce03e4
>> >> 7838e9a3995410ce884c212#diff-1d5a1a614aa158e14292a60d9891f1df
>> >>
>> >> Shouldn't similar notice be added to Apache Kafka too?
>> >>
>> >> On Thu, Sep 21, 2017 at 3:43 PM, Stevo Slavić 
>> wrote:
>> >>
>> >> > Hello Apache Kafka community,
>> >> >
>> >> > Is it on purpose that kafka-streams 0.11.0.1 depends on
>> >> > org.rocksdb:rocksdbjni:5.0.1, and not on newer 5.7.3, because 5.0.1
>> has
>> >> > Apache 2 license while 5.7.3 has also GPL 2.0 parts?
>> >> >
>> >> > Kind regards,
>> >> > Stevo Slavic.
>> >> >
>> >>
>> >
>> >
>>
>
>


Re: Error sending msg to Kafka topic

2017-09-26 Thread MAHA ALSAYASNEH

Here is my broker configuration: 

# Server Basics # 

# The id of the broker. This must be set to a unique integer for each broker. 
broker.id=1 
host.name= 
port=9092 


#The maximum size of message that the server can receive 
message.max.bytes=224 


replica.fetch.max.bytes=224 
request.timeout.ms=30 


# Switch to enable topic deletion or not, default value is false 
delete.topic.enable=true 

# Socket Server Settings 
# 

# The address the socket server listens on. It will get the value returned from 
# java.net.InetAddress.getCanonicalHostName() if not configured. 
# FORMAT: 
# listeners = security_protocol://host_name:port 
# EXAMPLE: 
# listeners = PLAINTEXT://your.host.name:9092 
listeners=PLAINTEXT://:9092 

# Hostname and port the broker will advertise to producers and consumers. If 
not set, 
# it uses the value for "listeners" if configured. Otherwise, it will use the 
value 
# returned from java.net.InetAddress.getCanonicalHostName(). 
#advertised.listeners=PLAINTEXT://your.host.name:9092 

# The number of threads handling network requests 
num.network.threads=8 

# The number of threads doing disk I/O 
num.io.threads=8 

# The send buffer (SO_SNDBUF) used by the socket server 
socket.send.buffer.bytes=102400 

# The receive buffer (SO_RCVBUF) used by the socket server 
socket.receive.buffer.bytes=102400 

# The maximum size of a request that the socket server will accept (protection 
against OOM) 
socket.request.max.bytes=104857600 


# Log Basics # 

# A comma-separated list of directories under which to store log files 
log.dirs=/tmp/kafka-logs 

# The default number of log partitions per topic. More partitions allow greater 
# parallelism for consumption, but this will also result in more files across 
# the brokers. 
num.partitions=1 

# The number of threads per data directory to be used for log recovery at 
startup and flushing at shutdown. 
# This value is recommended to be increased for installations with data dirs 
located in RAID array. 
num.recovery.threads.per.data.dir=1 

# Log Flush Policy # 

# Messages are immediately written to the filesystem but by default we only 
fsync() to sync 
# the OS cache lazily. The following configurations control the flush of data 
to disk. 
# There are a few important trade-offs here: 
# 1. Durability: Unflushed data may be lost if you are not using replication. 
# 2. Latency: Very large flush intervals may lead to latency spikes when the 
flush does occur as there will be a lot of data to flush. 
# 3. Throughput: The flush is generally the most expensive operation, and a 
small flush interval may lead to excessive seeks. 
# The settings below allow one to configure the flush policy to flush data 
after a period of time or 
# every N messages (or both). This can be done globally and overridden on a 
per-topic basis. 

# The number of messages to accept before forcing a flush of data to disk 
log.flush.interval.messages=1 

# The maximum amount of time a message can sit in a log before we force a flush 
log.flush.interval.ms=1000 

# Log Retention Policy # 

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion 
log.retention.hours=168 

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes.
#log.retention.bytes=1073741824 

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824 

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=30 

# Zookeeper # 

# Zookeeper connection string (see zookeeper docs for details). 
# This is a comma-separated list of host:port pairs, each corresponding to a zk server. 
# You can also append an optional chroot string to the urls to specify the 
# root directory for all kafka znodes. 
zookeeper.connect=xx:2181,:2181 

# Timeout in ms for connecting to zookeeper 
zookeeper.connection.timeout.ms=6000 


# metrics reporter properties 
kafka.metrics.polling.interval.secs=5 
kafka.metrics.reporters=kafka.metrics.KafkaCSVMetricsReporter 
kafka.csv.metrics.dir=/tmp/kafka_metrics 
# CSV reporting is enabled here (the stock config disables it by default). 
kafka.csv.metrics.reporter.enabled=true 
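For quick sanity checks of a config like the one above, a minimal key=value parser sketch can help (illustrative only; Kafka itself reads these files with Java's `Properties` loader, not this code):

```python
# Minimal sketch of a .properties reader for sanity-checking a broker
# config. Skips blank lines and "#" comments; everything else is treated
# as key=value. Illustrative helper, not part of Kafka.

def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config

sample = """
# The number of threads doing disk I/O
num.io.threads=8
log.segment.bytes=1073741824
log.retention.hours=168
"""

config = parse_properties(sample)
print(config["num.io.threads"])                         # -> 8
print(int(config["log.segment.bytes"]) // (1024 ** 3))  # -> 1 (GiB)
```

Values all come back as strings, so numeric settings need an explicit conversion before any arithmetic, as the segment-size check shows.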



Error sending msg to Kafka topic

2017-09-26 Thread MAHA ALSAYASNEH
Hello, 

I always get this error when I send a large amount of data to the topic: 
"org.apache.kafka.common.errors.TimeoutException: Expiring 35 record(s) for 
words-4 due to 30006 ms has passed since batch creation plus linger time" 

If anybody has faced this problem before, could you please share your solution? 

Thanks 
Maha 
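For context, the condition behind this message can be sketched roughly as follows (a hedged model with illustrative parameter names; the real check lives in the Java producer's internals):

```python
# Hedged sketch of the batch-expiry check behind this error message.
# In the 0.10.x-era producer, a batch waiting in the accumulator is
# expired once roughly request.timeout.ms elapses beyond the batch's
# creation time plus linger.ms. This is a simplified model, not the
# producer's actual code.

def is_batch_expired(now_ms, created_ms, linger_ms, request_timeout_ms):
    """Return True if the batch has waited too long to be sent."""
    waited_ms = now_ms - (created_ms + linger_ms)
    return waited_ms >= request_timeout_ms

# With defaults (request.timeout.ms=30000, linger.ms=0), a batch created
# at t=0 and still unsent at t=30006 is expired -- matching the
# "30006 ms has passed since batch creation plus linger time" message.
print(is_batch_expired(30006, 0, 0, 30000))  # -> True
print(is_batch_expired(15000, 0, 0, 30000))  # -> False
```

Depending on the root cause, common mitigations include raising request.timeout.ms, checking broker and network health, and reducing produce throughput or batch sizes.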


Re: Exactly once error | out of order sequence number

2017-09-26 Thread Sameer Kumar
error trace is attached in the initial mail.

On Tue, Sep 26, 2017 at 1:42 PM, Sameer Kumar 
wrote:

> error trace is attached.
>
> On Tue, Sep 26, 2017 at 1:41 PM, Sameer Kumar 
> wrote:
>
>> I received this error and was wondering what would cause it. I received
>> it only once, and it got fixed for the next run.
>>
>> For every run, I change my state store though.
>>
>> -Sameer.
>>
>
>


Re: Exactly once error | out of order sequence number

2017-09-26 Thread Sameer Kumar
error trace is attached.

On Tue, Sep 26, 2017 at 1:41 PM, Sameer Kumar 
wrote:

> I received this error and was wondering what would cause it. I received
> it only once, and it got fixed for the next run.
>
> For every run, I change my state store though.
>
> -Sameer.
>


Exactly once error | out of order sequence number

2017-09-26 Thread Sameer Kumar
I received this error and was wondering what would cause it. I received
it only once, and it got fixed for the next run.

For every run, I change my state store though.

-Sameer.
2017-09-26 13:27:42 INFO  ClassPathXmlApplicationContext:513 - Refreshing 
org.springframework.context.support.ClassPathXmlApplicationContext@6aa8ceb6: 
startup date [Tue Sep 26 13:27:42 IST 2017]; root of context hierarchy
2017-09-26 13:27:42 INFO  XmlBeanDefinitionReader:316 - Loading XML bean 
definitions from class path resource [application-context.xml]
2017-09-26 13:27:43 INFO  CountableImprClicksWriter2SpecLI:104 - State Store 
Dir: /data/streampoc/
2017-09-26 13:27:43 INFO  CountableImprClicksWriter2SpecLI:105 - Run Counter: 
-d11359
2017-09-26 13:27:44 INFO  Kafka10Base:50 - KafkaStreams processID: 
83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread appId: c-7-d11359
StreamsThread clientId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread threadId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9-StreamThread-1
Active tasks:
Running:
Suspended:
Restoring:
New:
Standby tasks:
Running:
Suspended:
Restoring:
New:

StreamsThread appId: c-7-d11359
StreamsThread clientId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread threadId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9-StreamThread-2
Active tasks:
Running:
Suspended:
Restoring:
New:
Standby tasks:
Running:
Suspended:
Restoring:
New:

StreamsThread appId: c-7-d11359
StreamsThread clientId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread threadId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9-StreamThread-3
Active tasks:
Running:
Suspended:
Restoring:
New:
Standby tasks:
Running:
Suspended:
Restoring:
New:

StreamsThread appId: c-7-d11359
StreamsThread clientId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread threadId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9-StreamThread-4
Active tasks:
Running:
Suspended:
Restoring:
New:
Standby tasks:
Running:
Suspended:
Restoring:
New:

StreamsThread appId: c-7-d11359
StreamsThread clientId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread threadId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9-StreamThread-5
Active tasks:
Running:
Suspended:
Restoring:
New:
Standby tasks:
Running:
Suspended:
Restoring:
New:

StreamsThread appId: c-7-d11359
StreamsThread clientId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread threadId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9-StreamThread-6
Active tasks:
Running:
Suspended:
Restoring:
New:
Standby tasks:
Running:
Suspended:
Restoring:
New:

StreamsThread appId: c-7-d11359
StreamsThread clientId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread threadId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9-StreamThread-7
Active tasks:
Running:
Suspended:
Restoring:
New:
Standby tasks:
Running:
Suspended:
Restoring:
New:

StreamsThread appId: c-7-d11359
StreamsThread clientId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9
StreamsThread threadId: 
c-7-d11359-83a35a1f-bc35-47dc-b87a-93eb5e1599b9-StreamThread-8
Active 
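For background on the error in the subject line: with idempotent producers the broker tracks a sequence number per (producerId, partition) and rejects batches that do not arrive in order. A simplified model of that check (an illustration, not Kafka's actual implementation):

```python
# Simplified model of the idempotent-producer sequence check that yields
# "out of order sequence number" errors. The broker expects the sequence
# for each (producerId, partition) to advance by exactly one per batch;
# anything else is rejected. Illustrative code only.

class PartitionState:
    def __init__(self):
        self.last_seq = -1  # no batch accepted yet

    def append(self, seq):
        expected = self.last_seq + 1
        if seq != expected:
            raise ValueError(
                "OutOfOrderSequenceException: expected %d, got %d" % (expected, seq)
            )
        self.last_seq = seq

state = PartitionState()
state.append(0)
state.append(1)
try:
    state.append(3)  # batch with sequence 2 was lost or reordered
except ValueError as e:
    print(e)  # -> OutOfOrderSequenceException: expected 2, got 3
```

In this model the gap itself is the fault: once a sequence is skipped or duplicated, the broker cannot accept the batch without risking a silent reorder, which is exactly what exactly-once semantics must prevent.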

Re: How would Kafka behave in this scenario

2017-09-26 Thread Sameer Kumar
Thanks Stevo for pointing me to the correct link.
In this case, how would the exactly-once feature of Streams behave, since
it configures producers with acks=all? I think the producers would fail and
would need to be resumed once the broker comes back.

-Sameer.

On Tue, Sep 26, 2017 at 1:09 PM, Stevo Slavić  wrote:

> Hello Sameer,
>
> Behavior depends on min.insync.replicas configured for the topic.
> Find more info in the documentation
> https://kafka.apache.org/documentation/#topicconfigs
>
> Kind regards,
> Stevo Slavic.
>
> On Tue, Sep 26, 2017 at 9:01 AM, Sameer Kumar 
> wrote:
>
> > In case one of the brokers fails, the broker would get removed from the
> > respective ISR list of those partitions.
> > In case the producer has acks=all, how would it behave? Would the producers
> > be throttled and wait till the broker comes back up?
> >
> > -Sameer.
> >
>


Re: Streams dependency on RocksDB - licenses

2017-09-26 Thread Stevo Slavić
There is legal problem for older rocksdb versions, see
https://issues.apache.org/jira/browse/LEGAL-303?focusedCommentId=16109870&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16109870

Dependencies with Facebook BSD+Patents license are not allowed to be
included in Apache products
https://www.apache.org/legal/resolved.html#category-x

I see even latest trunk depends on older rocksdb 5.3.6
https://github.com/apache/kafka/blob/trunk/gradle/dependencies.gradle#L67

Please consider upgrading, maybe as critical for 1.0.0 release if possible.
Should I create the ticket?

Kind regards,
Stevo Slavic.

On Thu, Sep 21, 2017 at 5:22 PM, Ismael Juma  wrote:

> This commit says that GPLv2 is an alternative license, so it's all good, I
> believe:
>
> https://github.com/facebook/rocksdb/commit/d616ebea23fa88cb9c2c8588533526a566d9cfab
>
> Ismael
>
> On Thu, Sep 21, 2017 at 4:21 PM, Ismael Juma  wrote:
>
> > LevelDB is New BSD (and the license in the first commit you linked to is
> > definitely not GPL2):
> >
> > https://github.com/google/leveldb/blob/master/LICENSE
> >
> > The second commit you referenced, adds GPL2 to the list of licenses in
> the
> > pom, but it's not clear why.
> >
> > Ismael
> >
> > On Thu, Sep 21, 2017 at 3:44 PM, Stevo Slavić  wrote:
> >
> >> LevelDB GPL2 notice seems to have been added to rocksdb 5.7.1, but
> likely
> >> this applies to old versions too.
> >>
> >> https://github.com/facebook/rocksdb/commit/4a2e4891fe4c6f66fb9e8e2d29b04f46ee702b52#diff-7e1d2c46cd6eacd9a8d864450a128218
> >> https://github.com/facebook/rocksdb/commit/6e3ee015fb1ce03e47838e9a3995410ce884c212#diff-1d5a1a614aa158e14292a60d9891f1df
> >>
> >> Shouldn't similar notice be added to Apache Kafka too?
> >>
> >> On Thu, Sep 21, 2017 at 3:43 PM, Stevo Slavić 
> wrote:
> >>
> >> > Hello Apache Kafka community,
> >> >
> >> > Is it on purpose that kafka-streams 0.11.0.1 depends on
> >> > org.rocksdb:rocksdbjni:5.0.1, and not on newer 5.7.3, because 5.0.1
> has
> >> > Apache 2 license while 5.7.3 has also GPL 2.0 parts?
> >> >
> >> > Kind regards,
> >> > Stevo Slavic.
> >> >
> >>
> >
> >
>


Re: How would Kafka behave in this scenario

2017-09-26 Thread Stevo Slavić
Hello Sameer,

Behavior depends on min.insync.replicas configured for the topic.
Find more info in the documentation
https://kafka.apache.org/documentation/#topicconfigs

Kind regards,
Stevo Slavic.

On Tue, Sep 26, 2017 at 9:01 AM, Sameer Kumar 
wrote:

> In case one of the brokers fails, the broker would get removed from the
> respective ISR list of those partitions.
> In case the producer has acks=all, how would it behave? Would the producers be
> throttled and wait till the broker comes back up?
>
> -Sameer.
>
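To make the interaction discussed above concrete, here is a small, simplified model of how acks=all interacts with min.insync.replicas as the ISR shrinks (illustrative only; the real behavior involves NotEnoughReplicas errors and producer retries, not this code):

```python
# Simplified model of acks=all behavior as the ISR shrinks. With acks=all
# the broker rejects a produce request (NotEnoughReplicasException) once
# the in-sync replica set is smaller than min.insync.replicas; the
# producer then retries or fails, rather than being throttled. This is
# an illustration of the rule, not Kafka's implementation.

def produce_allowed(isr_size, min_insync_replicas, acks):
    """Decide whether a produce request would be accepted."""
    if acks == "all":
        return isr_size >= min_insync_replicas
    return isr_size >= 1  # acks=0/1 only need a live leader

# Topic with replication factor 3 and min.insync.replicas=2:
print(produce_allowed(3, 2, "all"))  # -> True  (all replicas in sync)
print(produce_allowed(2, 2, "all"))  # -> True  (one broker lost, still ok)
print(produce_allowed(1, 2, "all"))  # -> False (NotEnoughReplicas)
print(produce_allowed(1, 2, "1"))    # -> True  (leader ack only)
```

The trade-off is visible in the last two calls: acks=all with min.insync.replicas=2 sacrifices availability during a double failure to guarantee durability, while acks=1 keeps accepting writes at the risk of losing them.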