Build failed in Jenkins: kafka-trunk-jdk8 #3416

2019-02-25 Thread Apache Jenkins Server
See 


Changes:

[mjsax] MINOR: Fix line break issue in upgrade notes (#6320)

--
[...truncated 2.31 MB...]

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testConstantPayloadGenerator STARTED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testConstantPayloadGenerator PASSED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testSequentialPayloadGenerator STARTED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testSequentialPayloadGenerator PASSED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testNullPayloadGenerator STARTED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testNullPayloadGenerator PASSED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testUniformRandomPayloadGenerator STARTED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testUniformRandomPayloadGenerator PASSED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > testPayloadIterator 
STARTED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > testPayloadIterator 
PASSED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testUniformRandomPayloadGeneratorPaddingBytes STARTED

org.apache.kafka.trogdor.workload.PayloadGeneratorTest > 
testUniformRandomPayloadGeneratorPaddingBytes PASSED

org.apache.kafka.trogdor.workload.ConsumeBenchSpecTest > 
testMaterializeTopicsWithSomePartitions STARTED

org.apache.kafka.trogdor.workload.ConsumeBenchSpecTest > 
testMaterializeTopicsWithSomePartitions PASSED

org.apache.kafka.trogdor.workload.ConsumeBenchSpecTest > 
testMaterializeTopicsWithNoPartitions STARTED

org.apache.kafka.trogdor.workload.ConsumeBenchSpecTest > 
testMaterializeTopicsWithNoPartitions PASSED

org.apache.kafka.trogdor.workload.ConsumeBenchSpecTest > 
testInvalidTopicNameRaisesExceptionInMaterialize STARTED

org.apache.kafka.trogdor.workload.ConsumeBenchSpecTest > 
testInvalidTopicNameRaisesExceptionInMaterialize PASSED

org.apache.kafka.trogdor.workload.HistogramTest > testHistogramPercentiles 
STARTED

org.apache.kafka.trogdor.workload.HistogramTest > testHistogramPercentiles 
PASSED

org.apache.kafka.trogdor.workload.HistogramTest > testHistogramSamples STARTED

org.apache.kafka.trogdor.workload.HistogramTest > testHistogramSamples PASSED

org.apache.kafka.trogdor.workload.HistogramTest > testHistogramAverage STARTED

org.apache.kafka.trogdor.workload.HistogramTest > testHistogramAverage PASSED

org.apache.kafka.trogdor.workload.TopicsSpecTest > testPartitionNumbers STARTED

org.apache.kafka.trogdor.workload.TopicsSpecTest > testPartitionNumbers PASSED

org.apache.kafka.trogdor.workload.TopicsSpecTest > testMaterialize STARTED

org.apache.kafka.trogdor.workload.TopicsSpecTest > testMaterialize PASSED

org.apache.kafka.trogdor.workload.TopicsSpecTest > testPartitionsSpec STARTED

org.apache.kafka.trogdor.workload.TopicsSpecTest > testPartitionsSpec PASSED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > 
testProcessWithFailedExit STARTED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > 
testProcessWithFailedExit PASSED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > 
testProcessNotFound STARTED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > 
testProcessNotFound PASSED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > 
testProcessForceKillTimeout STARTED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > 
testProcessForceKillTimeout PASSED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > testProcessStop 
STARTED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > testProcessStop 
PASSED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > 
testProcessWithNormalExit STARTED

org.apache.kafka.trogdor.workload.ExternalCommandWorkerTest > 
testProcessWithNormalExit PASSED

org.apache.kafka.trogdor.common.WorkerUtilsTest > testCreatesNotExistingTopics 
STARTED

org.apache.kafka.trogdor.common.WorkerUtilsTest > testCreatesNotExistingTopics 
PASSED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 
testCreateZeroTopicsDoesNothing STARTED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 
testCreateZeroTopicsDoesNothing PASSED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 
testCreateNonExistingTopicsWithZeroTopicsDoesNothing STARTED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 
testCreateNonExistingTopicsWithZeroTopicsDoesNothing PASSED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 
testCreateTopicsFailsIfAtLeastOneTopicExists STARTED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 
testCreateTopicsFailsIfAtLeastOneTopicExists PASSED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 
testCreatesOneTopicVerifiesOneTopic STARTED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 
testCreatesOneTopicVerifiesOneTopic PASSED

org.apache.kafka.trogdor.common.WorkerUtilsTest > 

[jira] [Resolved] (KAFKA-7845) Kafka clients do not re-resolve ips when a broker is replaced.

2019-02-25 Thread Jennifer Thompson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jennifer Thompson resolved KAFKA-7845.
--
   Resolution: Fixed
Fix Version/s: 2.1.1

> Kafka clients do not re-resolve ips when a broker is replaced.
> --
>
> Key: KAFKA-7845
> URL: https://issues.apache.org/jira/browse/KAFKA-7845
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.1.0
>Reporter: Jennifer Thompson
>Priority: Major
> Fix For: 2.1.1
>
>
> When one of our Kafka brokers dies and a new one replaces it (via an aws 
> ASG), the clients that publish to Kafka still try to publish to the old 
> brokers.
> We see errors like 
> {code:java}
> 2019-01-18 20:16:16 WARN NetworkClient:721 - [Producer clientId=producer-1] 
> Connection to node 2 (/10.130.98.111:9092) could not be established. Broker 
> may not be available.
> 2019-01-18 20:19:09 WARN Sender:596 - [Producer clientId=producer-1] Got 
> error produce response with correlation id 3414 on topic-partition aa.pga-2, 
> retrying (4 attempts left). Error: NOT_LEADER_FOR_PARTITION
> 2019-01-18 20:19:09 WARN Sender:641 - [Producer clientId=producer-1] Received 
> invalid metadata error in produce request on partition aa.pga-2 due to 
> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is 
> not the leader for that topic-partition.. Going to request metadata update now
> 2019-01-18 20:21:19 WARN NetworkClient:721 - [Producer clientId=producer-1] 
> Connection to node 2 (/10.130.98.111:9092) could not be established. Broker 
> may not be available.
> 2019-01-18 20:21:50 ERROR ProducerBatch:233 - Error executing user-provided 
> callback on message for topic-partition 'aa.test-liz-0'{code}
> and
> {code:java}
> [2019-01-18 20:28:47,732] ERROR WorkerSourceTask{id=rabbit-vpc-2-kafka-1} 
> Failed to flush, timed out while waiting for producer to flush outstanding 27 
> messages (org.apache.kafka.connect.runtime.WorkerSourceTask)
> [2019-01-18 20:28:47,732] ERROR WorkerSourceTask{id=rabbit-vpc-2-kafka-1} 
> Failed to commit offsets 
> (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter)
> {code}
> The ip address referenced is for the broker that died. We have Kafka Manager 
> running as well, and that picks up the new broker.
> We already set
> {code:java}
> networkaddress.cache.ttl=60{code}
> in
> {code:java}
> jre/lib/security/java.security{code}
> Our java version is "Java(TM) SE Runtime Environment (build 1.8.0_192-b12)"
> This started happening after we upgraded to 2.1. When we had Kafka 1.1, brokers 
> could fail over without a problem.
> One thing that might be considered unusual about our deployment is that we 
> reuse the same broker id and EBS volume for the new broker, so that 
> partitions do not have to be reassigned.
> In kafka-connect, the logs look like
> {code}
> [2019-01-28 22:11:02,364] WARN [Consumer clientId=consumer-1, 
> groupId=connect-cluster] Connection to node 3 (/10.130.153.120:9092) could 
> not be established. Broker may not be available. 
> (org.apache.kafka.clients.NetworkClient)
> [2019-01-28 22:11:02,365] INFO [Consumer clientId=consumer-1, 
> groupId=connect-cluster] Error sending fetch request (sessionId=201133590, 
> epoch=INITIAL) to node 3: org.apache.kafka.common.errors.DisconnectException. 
> (org.apache.kafka.clients.FetchSessionHandler)
> {code}
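
For reference, the networkaddress.cache.ttl setting mentioned above can also be
applied programmatically at client start-up instead of editing java.security; a
minimal sketch (illustrative only, not part of the original report):

{code:scala}
import java.security.Security

object DnsCacheTtl {
  def main(args: Array[String]): Unit = {
    // Equivalent to the networkaddress.cache.ttl entry in jre/lib/security/java.security.
    Security.setProperty("networkaddress.cache.ttl", "60")          // cache successful lookups for 60s
    Security.setProperty("networkaddress.cache.negative.ttl", "10") // cache failed lookups for 10s
    // ... construct KafkaProducer / KafkaConsumer instances after this point ...
  }
}
{code}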



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] 2.2.0 RC0

2019-02-25 Thread Matthias J. Sax
@Stephane

Thanks! You are right (I copied the list from an older draft without
double checking).

On the release Wiki page, it's correctly listed as postponed:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827512


@Viktor

Thanks. This will not block the release, but I'll make sure to include
it in the webpage update.



-Matthias

On 2/25/19 5:16 AM, Viktor Somogyi-Vass wrote:
> Hi Matthias,
> 
> I've noticed a minor line break issue in the upgrade docs. I've created a
> small PR for that: https://github.com/apache/kafka/pull/6320
> 
> Best,
> Viktor
> 
> On Sun, Feb 24, 2019 at 10:16 PM Stephane Maarek 
> wrote:
> 
>> Hi Matthias
>>
>> Thanks for this
>> Running through the list of KIPs. I think this is not included in 2.2:
>>
>> - Allow clients to suppress auto-topic-creation
>>
>> Regards
>> Stephane
>>
>> On Sun, Feb 24, 2019 at 1:03 AM Matthias J. Sax 
>> wrote:
>>
>>> Hello Kafka users, developers and client-developers,
>>>
>>> This is the first candidate for the release of Apache Kafka 2.2.0.
>>>
>>> This is a minor release with the following highlights:
>>>
>>>  - Added SSL support for custom principal name
>>>  - Allow SASL connections to periodically re-authenticate
>>>  - Improved consumer group management
>>>- default group.id is `null` instead of empty string
>>>  - Add --under-min-isr option to describe topics command
>>>  - Allow clients to suppress auto-topic-creation
>>>  - API improvement
>>>- Producer: introduce close(Duration)
>>>- AdminClient: introduce close(Duration)
>>>- Kafka Streams: new flatTransform() operator in Streams DSL
>>>- KafkaStreams (and other classes) now implement AutoCloseable to
>>> support try-with-resources
>>>- New Serdes and default method implementations
>>>  - Kafka Streams exposed internal client.id via ThreadMetadata
>>>  - Metric improvements:  All `-min`, `-avg` and `-max` metrics will now
>>> output `NaN` as default value
>>>
>>>
>>> Release notes for the 2.2.0 release:
>>> http://home.apache.org/~mjsax/kafka-2.2.0-rc0/RELEASE_NOTES.html
>>>
>>> *** Please download, test and vote by Friday, March 1, 9am PST.
>>>
>>> Kafka's KEYS file containing PGP keys we use to sign the release:
>>> http://kafka.apache.org/KEYS
>>>
>>> * Release artifacts to be voted upon (source and binary):
>>> http://home.apache.org/~mjsax/kafka-2.2.0-rc0/
>>>
>>> * Maven artifacts to be voted upon:
>>> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>>>
>>> * Javadoc:
>>> http://home.apache.org/~mjsax/kafka-2.2.0-rc0/javadoc/
>>>
>>> * Tag to be voted upon (off 2.2 branch) is the 2.2.0 tag:
>>> https://github.com/apache/kafka/releases/tag/2.2.0-rc0
>>>
>>> * Documentation:
>>> https://kafka.apache.org/22/documentation.html
>>>
>>> * Protocol:
>>> https://kafka.apache.org/22/protocol.html
>>>
>>> * Successful Jenkins builds for the 2.2 branch:
>>> Unit/integration tests: https://builds.apache.org/job/kafka-2.2-jdk8/31/
>>>
>>> * System tests:
>>> https://jenkins.confluent.io/job/system-test-kafka/job/2.2/
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>> -Matthias
>>>
>>>
>>
> 



signature.asc
Description: OpenPGP digital signature


[jira] [Created] (KAFKA-8002) Replica reassignment to new log dir may not complete if future and current replicas segment files have different base offsets

2019-02-25 Thread Anna Povzner (JIRA)
Anna Povzner created KAFKA-8002:
---

 Summary: Replica reassignment to new log dir may not complete if 
future and current replicas segment files have different base offsets
 Key: KAFKA-8002
 URL: https://issues.apache.org/jira/browse/KAFKA-8002
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.1.1
Reporter: Anna Povzner


Once the future replica has fetched up to the log end offset, the intended logic is to 
finish the move (and rename the future dir to the current replica dir, etc.). However, 
the check in Partition.maybeReplaceCurrentWithFutureReplica compares the whole 
LogOffsetMetadata instead of just the log end offset. The resulting behavior is that the 
reassignment will not finish for topic partitions that were cleaned/compacted such that 
the base offset of the last segment differs between the current and future replica. 

The proposed fix is to compare only the log end offsets of the current and future 
replica.
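
A minimal, self-contained sketch of the proposed comparison (the case classes
below are simplified stand-ins for the internal Replica/LogOffsetMetadata types,
not the actual Kafka code):

{code:scala}
object OffsetCheckSketch {
  // Simplified stand-ins for the internal types; illustrative only.
  case class LogOffsetMetadata(messageOffset: Long, segmentBaseOffset: Long, relativePositionInSegment: Int)
  case class Replica(logEndOffset: LogOffsetMetadata)

  // Comparing the full LogOffsetMetadata fails when the base offset of the last
  // segment differs between the current and future replica (e.g. after
  // cleaning/compaction), even though both have caught up. Comparing only the
  // message offsets avoids that false negative.
  def futureReplicaCaughtUp(current: Replica, future: Replica): Boolean =
    current.logEndOffset.messageOffset == future.logEndOffset.messageOffset
}
{code}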



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8001) Fetch from future replica stalls when local replica becomes a leader

2019-02-25 Thread Anna Povzner (JIRA)
Anna Povzner created KAFKA-8001:
---

 Summary: Fetch from future replica stalls when local replica 
becomes a leader
 Key: KAFKA-8001
 URL: https://issues.apache.org/jira/browse/KAFKA-8001
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.1.1
Reporter: Anna Povzner


With KIP-320, a fetch from a follower / future replica returns FENCED_LEADER_EPOCH 
if the current leader epoch in the request is lower than the leader epoch known to 
the leader (or to the local replica, in the case of future replica fetching). When a 
future replica is fetching from the local replica and the local replica becomes the 
leader of the partition, the next fetch from the future replica fails with 
FENCED_LEADER_EPOCH and fetching from the future replica is stopped until the next 
leader change. 

Proposed solution: on a local replica leader change, the future replica should 
"become a follower" again and go through the truncation phase. Or we could 
optimize it and just update the partition state of the future replica to reflect 
the updated current leader epoch. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Statestore restoration & scaling questions - possible KIP as well.

2019-02-25 Thread Guozhang Wang
1) You are right, a thread's restoration phase will not interfere with any
other thread's normal processing collocated within the same JVM / machine
at all. So you may have a Streams instance in which some threads have already
finished restoring and started processing tasks, while other threads are still
restoring.


Guozhang

On Mon, Feb 25, 2019 at 1:53 PM Adam Bellemare 
wrote:

> Hi Guozhang -
>
> Thanks for the replies, and directing me to the existing JIRAs. I think
> that a two-phase rebalance will be quite useful.
>
> 1) For clarity's sake, I should have just asked: When a new thread / node
> is created and tasks are rebalanced, are the state stores on the new
> threads/nodes fully restored during rebalancing, thereby blocking *any and
> all *threads from proceeding with processing until restoration is complete?
> I do not believe that this is the case, and in the case of rebalanced tasks
> only the threads assigned the new tasks will be paused until state store
> restoration is complete.
>
>
> Thanks for your help - I appreciate you taking the time to reply.
>
> Adam
>
>
>
> On Wed, Feb 20, 2019 at 8:38 PM Guozhang Wang  wrote:
>
> > Hello Adam,
> >
> > Sorry for being late replying on this thread, I've put my comments
> inlined
> > below.
> >
> > On Sun, Feb 3, 2019 at 7:34 AM Adam Bellemare 
> > wrote:
> >
> > > Hey Folks
> > >
> > > I have a few questions around the operations of stateful processing
> while
> > > scaling nodes up/down, and a possible KIP in question #4. Most of them
> > have
> > > to do with task processing during rebuilding of state stores after
> > scaling
> > > nodes up.
> > >
> > > Scenario:
> > > Single node/thread, processing 2 topics (10 partitions each):
> > > User event topic (events) - ie: key:userId, value: ProductId
> > > Product topic (entity) - ie: key: ProductId, value: productData
> > >
> > > My topology looks like this:
> > >
> > > KTable productTable = ... //materialize from product topic
> > >
> > > KStream output = userStream
> > > .map(x => (x.value, x.key) ) //Swap the key and value around
> > > .join(productTable, ... ) //Joiner is not relevant here
> > > .to(...)  //Send it to some output topic
> > >
> > >
> > > Here are my questions:
> > > 1) If I scale the processing node count up, partitions will be
> rebalanced
> > > to the new node. Does processing continue as normal on the original
> node,
> > > while the new node's processing is paused as the internal state stores
> > are
> > > rebuilt/reloaded? From my reading of the code (and own experience) I
> > > believe this to be the case, but I am just curious in case I missed
> > > something.
> > >
> > >
> > With 2 topics and 10 partitions each, assuming the default
> PartitionGrouper
> > is used, there should be a total of 20 tasks (10 tasks for map which will
> > send to an internal repartition topic, and 10 tasks for doing the join)
> > created since these two topics are co-partitioned for joins.
> >
> > For example, task-0 would be processing the join from
> > user-topic-partition-0 and product-topic-partition-0, and so on.
> >
> > With a single thread, all of these 20 tasks will be allocated to this
> > thread, which would process them in an iterative manner. Note that since
> > each task has its own state store (e.g. product-state-store-0 for task-0,
> > etc), it means this thread will host all the 10 sets of state stores as
> > well (note for the 10 mapping tasks there's no state stores at all).
> >
> > When you add new threads either within the same node, or on a different
> > node, after rebalance each thread should be processing 10 tasks, and
> hence
> > owning corresponding set of state stores due to rebalance. The new thread
> > will first restore the state stores it gets assigned before start
> > processing.
> >
> >
> > > 2) What happens to the userStream map task? Will the new node be able
> to
> > > process this task while the state store is rebuilding/reloading? My
> > reading
> > > of the code suggests that this map process will be paused on the new
> node
> > > while the state store is rebuilt. The effect of this is that it will
> lead
> > > to a delay in events reaching the original node's partitions, which
> will
> > be
> > > seen as late-arriving events. Am I right in this assessment?
> > >
> > >
> > Currently the thread will NOT start processing any tasks until ALL
> stateful
> > tasks complete restoring (stateless tasks, like the map tasks in your
> > example, never need restoration at all). There's an open JIRA for making
> it
> > customizable but I cannot find it currently.
> >
> >
> > > 3) How does scaling up work with standby state-store replicas? From my
> > > reading of the code, it appears that scaling a node up will result in a
> > > rebalance, with the state assigned to the new node being rebuilt first
> > > (leading to a pause in processing). Following this, the standby replicas
> > are
> > > populated. Am I correct in this reading?
> > >
> > > 

Re: [DISCUSS] KIP-433: Provide client API version to authorizer

2019-02-25 Thread Harsha
Hi Ying,
I think the question is: can we add a module in the core which can take up the 
dynamic config and block certain APIs? This module would be called in each of 
the APIs, like the authorizer is today, to check whether the API is supported 
for the client. 
Instead of throwing AuthorizationException like the authorizer does today, it 
can throw an UnsupportedException.
The benefits are that we keep the authorizer interface as is, add flexibility 
based on dynamic configs without needing to categorize the broker APIs, and it 
is easy to extend with additional options, like turning off certain features 
that might be of interest to service providers.
One drawback: it introduces another check instead of centralizing everything 
around the Authorizer.
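
A rough, self-contained sketch of what such a module could look like (all the
names here -- ApiVersionGate, the exception class, the config map -- are
illustrative assumptions, not existing Kafka code):

class UnsupportedVersionException(msg: String) extends RuntimeException(msg)

// Holds a map of apiKey -> minimum allowed request version, reloadable from a
// broker dynamic config. Each API handler would call check() before processing,
// similar to how the authorizer is consulted today.
class ApiVersionGate(initial: Map[Short, Short]) {
  @volatile private var minVersions: Map[Short, Short] = initial

  def update(newMinVersions: Map[Short, Short]): Unit = minVersions = newMinVersions

  def check(apiKey: Short, requestVersion: Short): Unit =
    minVersions.get(apiKey).foreach { min =>
      if (requestVersion < min)
        throw new UnsupportedVersionException(
          s"API key $apiKey version $requestVersion is below the configured minimum $min")
    }
}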

Thanks,
Harsha

On Mon, Feb 25, 2019, at 2:43 PM, Ying Zheng wrote:
> If you guys don't like the extension of authorizer interface, I will just
> propose a single broker dynamic configuration: client.min.api.version, to
> keep things simple.
> 
> What do you think?
> 
> On Mon, Feb 25, 2019 at 2:23 PM Ying Zheng  wrote:
> 
> > @Viktor Somogyi-Vass, @Harsha, It seems the biggest concern is the
> > backward-compatibility to the existing authorizers. We can put the new
> > method into a new trait / interface:
> > trait AuthorizerEx extends Authorizer {
> >def authorize(session: Session, operation: Operation, resource: Resource,
> > apiVersion: Short): Boolean
> > }
> >
> > When loading an authorizer class, broker will check if the class
> > implemented AuthorizerEx interface. If not, broker will wrapper the
> > Authorizer object with an Adapter class, in which authorizer(...
> > apiVersion) call is translated to the old authorizer() call. So that, both
> > old and new Authorizer is supported and can be treated as AuthorizerEx in
> > the new broker code.
> >
> > As for the broker dynamic configuration approach, I'm not sure how to
> > correctly categorize the 40+ broker APIs into a few categories.
> > For example, describe is used by producer, consumer, and admin. Should it
> > be controlled by producer.min.api.version or consumer.min.api.version?
> > Should producer.min.api.version apply to transaction operations?
> >
> >
> > On Mon, Feb 25, 2019 at 10:33 AM Harsha  wrote:
> >
> >> I think the motivation of the KIP is to configure which API we want to
> >> allow for a broker.
> >> This is challenging for a hosted service where you have customers with
> >> different versions of clients.
> >> It's not just about down conversion but for example transactions, there
> >> is a case where we do not want to allow users to start using transactions
> >> and there is no way to disable to this right now and as specified in the
> >> KIP, having a lock on which client versions we support.
> >> Authorizer's original purpose is to allow policies to be enforced for
> >> each of the Kafka APIs, specifically in the context of security.
> >> Extending this to a general purpose gatekeeper might not be suitable and
> >> as mentioned in the thread every implementation of authorizer needs to
> >> re-implement to provide the same set of functionality.
> >> I think it's better to add an implementation which will use a broker's
> >> dynamic config as mentioned in approach 1.
> >>
> >> Thanks,
> >> Harsha
> >>
> >> On Sat, Feb 23, 2019, at 6:21 AM, Ismael Juma wrote:
> >> > Thanks for the KIP. Have we considered the existing topic config that
> >> makes
> >> > it possible to disallow down conversions? That's the biggest downside in
> >> > allowing older clients.
> >> >
> >> > Ismael
> >> >
> >> > On Fri, Feb 22, 2019, 2:11 PM Ying Zheng 
> >> wrote:
> >> >
> >> > >
> >> > >
> >> >
> >>
> >
>


Re: [DISCUSS] KIP-433: Provide client API version to authorizer

2019-02-25 Thread Ying Zheng
If you guys don't like the extension of authorizer interface, I will just
propose a single broker dynamic configuration: client.min.api.version, to
keep things simple.

What do you think?

On Mon, Feb 25, 2019 at 2:23 PM Ying Zheng  wrote:

> @Viktor Somogyi-Vass, @Harsha, It seems the biggest concern is the
> backward-compatibility to the existing authorizers. We can put the new
> method into a new trait / interface:
> trait AuthorizerEx extends Authorizer {
>def authorize(session: Session, operation: Operation, resource: Resource,
> apiVersion: Short): Boolean
> }
>
> When loading an authorizer class, broker will check if the class
> implemented AuthorizerEx interface. If not, broker will wrapper the
> Authorizer object with an Adapter class, in which authorizer(...
> apiVersion) call is translated to the old authorizer() call. So that, both
> old and new Authorizer is supported and can be treated as AuthorizerEx in
> the new broker code.
>
> As for the broker dynamic configuration approach, I'm not sure how to
> correctly categorize the 40+ broker APIs into a few categories.
> For example, describe is used by producer, consumer, and admin. Should it
> be controlled by producer.min.api.version or consumer.min.api.version?
> Should producer.min.api.version apply to transaction operations?
>
>
> On Mon, Feb 25, 2019 at 10:33 AM Harsha  wrote:
>
>> I think the motivation of the KIP is to configure which API we want to
>> allow for a broker.
>> This is challenging for a hosted service where you have customers with
>> different versions of clients.
>> It's not just about down conversion but for example transactions, there
>> is a case where we do not want to allow users to start using transactions
>> and there is no way to disable to this right now and as specified in the
>> KIP, having a lock on which client versions we support.
>> Authorizer's original purpose is to allow policies to be enforced for
>> each of the Kafka APIs, specifically in the context of security.
>> Extending this to a general purpose gatekeeper might not be suitable and
>> as mentioned in the thread every implementation of authorizer needs to
>> re-implement to provide the same set of functionality.
>> I think it's better to add an implementation which will use a broker's
>> dynamic config as mentioned in approach 1.
>>
>> Thanks,
>> Harsha
>>
>> On Sat, Feb 23, 2019, at 6:21 AM, Ismael Juma wrote:
>> > Thanks for the KIP. Have we considered the existing topic config that
>> makes
>> > it possible to disallow down conversions? That's the biggest downside in
>> > allowing older clients.
>> >
>> > Ismael
>> >
>> > On Fri, Feb 22, 2019, 2:11 PM Ying Zheng 
>> wrote:
>> >
>> > >
>> > >
>> >
>>
>


Re: [DISCUSS] KIP-424: Allow suppression of intermediate events based on wall clock time

2019-02-25 Thread jonathangordon
On 2019/02/21 02:19:27, "Matthias J. Sax"  wrote: 
> thanks for the KIP. Corner case question:
> 
> What happens if an application is stopped and restarted?
> 
>  - Should suppress() flush all records (would be _before_ the time elapsed)?
>  - Or should it preserve buffered records and reload on restart? For
> this case, should the record be flushed on reload (elapsed time is
> unknown) or should we reset the timer to zero?

My opinion is that we should aim for simplicity in the first implementation of 
this feature: flush all the records on shutdown. If there's demand in the 
future for strict adherence on shutdown, we can implement it via extra params 
on the Suppressed API.

> What is unclear to me atm, is the use-case you anticipate. If you assume
> a live run of an applications, event-time and processing-time should be
> fairly identical (at least with regard to data rates). Thus, suppress()
> on event-time should give you about the same behavior as wall-clock
> time? If you disagree, can you elaborate?

Imagine a session window where you aggregate 10K events that usually occur 
within 2-3 seconds of each other (event time). However, they are ingested in 
batches of 1000 or so, spread out over 2-3 minutes (ingest time), and not 
necessarily in order. It's important for us to be able to publish this 
aggregate in real-time as we get new data (every 10 seconds) to keep our time 
to glass low, but our data store is non-updateable so we'd like to limit the 
number of aggregates we publish.

If you imagine a case where all the event batches arrive in reverse order for 
one particular session window, then once the stream time advances past the 
suppression threshold, we could publish an aggregate update for each newly 
received event.

> This leave the case for data reprocessing, for which event-time advances
> much faster than wall-clock time. Is this the target use-case?

No, see above.

> About the implementation: checking wall-clock time is an expensive
> system call, so I am a little worried about run-time overhead. This seems
> not to be just an implementation detail and thus it might be worth
> including in the discussion. The question is how strict the guarantee
> about when records should be flushed needs to be. Assume you set a timer of 1
> second, and you have a data rate of 1000 records per second, with each
> record arriving one ms after the other, each with a different key. To
> flush this data "correctly" we would need to check wall-clock time every
> millisecond... Thoughts?
> 
> (We don't need to dive into all details, but a high level discussion
> about the desired algorithm and guarantees would be good to have IMHO.)

I had never dug into the performance characteristics of currentTimeMillis() 
before:

http://pzemtsov.github.io/2017/07/23/the-slow-currenttimemillis.html

So if we assume the slow Linux average of 640 ns/call, at a 1000 calls/sec 
that's 0.64 ms. Doesn't seem terrible to me but I imagine for some use cases 1M 
calls/sec might be reasonable and now we're up to 0.64s just in system time 
checking. Perhaps we add some logic that calculates the rate of data input and 
if it exceeds some threshold we only check the time every n records? The trick 
there I suppose is for very bursty traffic you could exceed and then wait too 
long to trigger another check. Maybe we store a moving average? Or perhaps this 
is getting too complicated?
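
For what it's worth, here is a tiny sketch of the "only check the time every n
records" idea (the names are made up for illustration; this is not proposed
Streams API):

// Caches the wall-clock reading and refreshes it only every `checkEveryN`
// records, bounding the number of System.currentTimeMillis() calls per record.
class RateLimitedClock(checkEveryN: Int) {
  private var recordsSinceCheck = 0
  private var cachedNowMs = System.currentTimeMillis()

  def approxNowMs(): Long = {
    recordsSinceCheck += 1
    if (recordsSinceCheck >= checkEveryN) {
      recordsSinceCheck = 0
      cachedNowMs = System.currentTimeMillis()
    }
    cachedNowMs
  }
}

With bursty traffic the cached value can lag by up to checkEveryN records, which
is exactly the trade-off mentioned above.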




Re: [DISCUSS] KIP-433: Provide client API version to authorizer

2019-02-25 Thread Ying Zheng
@Viktor Somogyi-Vass, @Harsha, It seems the biggest concern is the
backward-compatibility to the existing authorizers. We can put the new
method into a new trait / interface:
trait AuthorizerEx extends Authorizer {
   def authorize(session: Session, operation: Operation, resource: Resource,
apiVersion: Short): Boolean
}

When loading an authorizer class, the broker will check whether the class
implements the AuthorizerEx interface. If not, the broker will wrap the
Authorizer object with an adapter class, in which the authorize(...,
apiVersion) call is translated to the old authorize() call. That way, both
old and new Authorizers are supported and can be treated as AuthorizerEx in
the new broker code.
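
A self-contained sketch of that adapter (the traits below are simplified
stand-ins for kafka.security.auth.Authorizer and the proposed AuthorizerEx so
the snippet compiles on its own; it is not the actual broker code):

trait Session
trait Operation
trait Resource

trait Authorizer {
  def authorize(session: Session, operation: Operation, resource: Resource): Boolean
}

trait AuthorizerEx extends Authorizer {
  def authorize(session: Session, operation: Operation, resource: Resource,
                apiVersion: Short): Boolean
}

// Wraps a legacy Authorizer so the broker can always call the four-argument
// authorize(); the apiVersion argument is simply ignored for old implementations.
class AuthorizerAdapter(underlying: Authorizer) extends AuthorizerEx {
  override def authorize(session: Session, operation: Operation, resource: Resource): Boolean =
    underlying.authorize(session, operation, resource)

  override def authorize(session: Session, operation: Operation, resource: Resource,
                         apiVersion: Short): Boolean =
    underlying.authorize(session, operation, resource)
}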

As for the broker dynamic configuration approach, I'm not sure how to
correctly categorize the 40+ broker APIs into a few categories.
For example, describe is used by producer, consumer, and admin. Should it
be controlled by producer.min.api.version or consumer.min.api.version?
Should producer.min.api.version apply to transaction operations?


On Mon, Feb 25, 2019 at 10:33 AM Harsha  wrote:

> I think the motivation of the KIP is to configure which API we want to
> allow for a broker.
> This is challenging for a hosted service where you have customers with
> different versions of clients.
> It's not just about down conversion but for example transactions, there is
> a case where we do not want to allow users to start using transactions and
> there is no way to disable to this right now and as specified in the KIP,
> having a lock on which client versions we support.
> Authorizer's original purpose is to allow policies to be enforced for each
> of the Kafka APIs, specifically in the context of security.
> Extending this to a general purpose gatekeeper might not be suitable and
> as mentioned in the thread every implementation of authorizer needs to
> re-implement to provide the same set of functionality.
> I think it's better to add an implementation which will use a broker's
> dynamic config as mentioned in approach 1.
>
> Thanks,
> Harsha
>
> On Sat, Feb 23, 2019, at 6:21 AM, Ismael Juma wrote:
> > Thanks for the KIP. Have we considered the existing topic config that
> makes
> > it possible to disallow down conversions? That's the biggest downside in
> > allowing older clients.
> >
> > Ismael
> >
> > On Fri, Feb 22, 2019, 2:11 PM Ying Zheng  wrote:
> >
> > >
> > >
> >
>


Re: Statestore restoration & scaling questions - possible KIP as well.

2019-02-25 Thread Adam Bellemare
Hi Guozhang -

Thanks for the replies, and directing me to the existing JIRAs. I think
that a two-phase rebalance will be quite useful.

1) For clarity's sake, I should have just asked: When a new thread / node
is created and tasks are rebalanced, are the state stores on the new
threads/nodes fully restored during rebalancing, thereby blocking *any and
all *threads from proceeding with processing until restoration is complete?
I do not believe that this is the case, and in the case of rebalanced tasks
only the threads assigned the new tasks will be paused until state store
restoration is complete.


Thanks for your help - I appreciate you taking the time to reply.

Adam



On Wed, Feb 20, 2019 at 8:38 PM Guozhang Wang  wrote:

> Hello Adam,
>
> Sorry for being late replying on this thread, I've put my comments inlined
> below.
>
> On Sun, Feb 3, 2019 at 7:34 AM Adam Bellemare 
> wrote:
>
> > Hey Folks
> >
> > I have a few questions around the operations of stateful processing while
> > scaling nodes up/down, and a possible KIP in question #4. Most of them
> have
> > to do with task processing during rebuilding of state stores after
> scaling
> > nodes up.
> >
> > Scenario:
> > Single node/thread, processing 2 topics (10 partitions each):
> > User event topic (events) - ie: key:userId, value: ProductId
> > Product topic (entity) - ie: key: ProductId, value: productData
> >
> > My topology looks like this:
> >
> > KTable productTable = ... //materialize from product topic
> >
> > KStream output = userStream
> > .map(x => (x.value, x.key) ) //Swap the key and value around
> > .join(productTable, ... ) //Joiner is not relevant here
> > .to(...)  //Send it to some output topic
> >
> >
> > Here are my questions:
> > 1) If I scale the processing node count up, partitions will be rebalanced
> > to the new node. Does processing continue as normal on the original node,
> > while the new node's processing is paused as the internal state stores
> are
> > rebuilt/reloaded? From my reading of the code (and own experience) I
> > believe this to be the case, but I am just curious in case I missed
> > something.
> >
> >
> With 2 topics and 10 partitions each, assuming the default PartitionGrouper
> is used, there should be a total of 20 tasks (10 tasks for map which will
> send to an internal repartition topic, and 10 tasks for doing the join)
> created since these two topics are co-partitioned for joins.
>
> For example, task-0 would be processing the join from
> user-topic-partition-0 and product-topic-partition-0, and so on.
>
> With a single thread, all of these 20 tasks will be allocated to this
> thread, which would process them in an iterative manner. Note that since
> each task has its own state store (e.g. product-state-store-0 for task-0,
> etc), it means this thread will host all the 10 sets of state stores as
> well (note for the 10 mapping tasks there's no state stores at all).
>
> When you add new threads either within the same node, or on a different
> node, after rebalance each thread should be processing 10 tasks, and hence
> owning corresponding set of state stores due to rebalance. The new thread
> will first restore the state stores it gets assigned before start
> processing.
>
>
> > 2) What happens to the userStream map task? Will the new node be able to
> > process this task while the state store is rebuilding/reloading? My
> reading
> > of the code suggests that this map process will be paused on the new node
> > while the state store is rebuilt. The effect of this is that it will lead
> > to a delay in events reaching the original node's partitions, which will
> be
> > seen as late-arriving events. Am I right in this assessment?
> >
> >
> Currently the thread will NOT start processing any tasks until ALL stateful
> tasks complete restoring (stateless tasks, like the map tasks in your
> example, never need restoration at all). There's an open JIRA for making it
> customizable but I cannot find it currently.
>
>
> > 3) How does scaling up work with standby state-store replicas? From my
> > reading of the code, it appears that scaling a node up will result in a
> > rebalance, with the state assigned to the new node being rebuilt first
> > (leading to a pause in processing). Following this, the standby replicas
> are
> > populated. Am I correct in this reading?
> >
> > Standby tasks are running in parallel with active stream tasks, and they
> simply read from the changelog topic in real time and populate the standby
> store replica; when scaling out, the instances with standby tasks will be
> preferred over those that do not have any standby for the task, and hence
> when restoring only a very small amount of data needs to be restored
> (think: the standby replica of the store is already populated up to offset
> 90 at the rebalance, while the active task is writing to the changelog
> topic with log end offset 100, so you only need to restore 90 - 100 instead
> of 0 - 100).
>
>
> > 4) If my 

I am not able to authenticate kafka using ssl

2019-02-25 Thread naveenstates
I am getting the following "Bad file descriptor" error (full error given below) 
when I try to create an SSL certificate following the instructions in the link 
given below.
http://kafka.apache.org/documentation/#security_ssl

ERROR 
[root@x certificate]# openssl s_client -debug -connect 192.168.x.xxx:9093 
-tls1
socket: Bad file descriptor
connect:errno=9

Could you please help me figure out where I am going wrong? Thanks in advance.


Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-02-25 Thread George Li
Hi Viktor,

Thanks for the response. Good questions! Answers below:

> A few questions regarding the rollback algorithm:
> 1. At step 2 how do you elect the leader?

The step 2 code is in https://github.com/apache/kafka/pull/6296
core/src/main/scala/kafka/controller/KafkaController.scala line#622
rollbackReassignedPartitionLeaderIfRequired(topicPartition,
reassignedPartitionContext). During a "pending" reassignment, e.g. (1,2,3) =>
(4,2,5), normally the leader (in this case broker_id 1) will remain the leader
until all replicas (1,2,3,4,5) are in ISR, and then the leader will be switched
to 4. However, in one scenario, if let's say new replica 4 is already caught up
in ISR and somehow the original leader 1 is down or bounced, 4 could become the
new leader. rollbackReassignedPartitionLeaderIfRequired() will do a leadership
election using PreferredReplicaPartitionLeaderElectionStrategy among brokers in
OAR (Original Assigned Replicas, kept in memory).

> 1.1. Would it be always the original leader?

Not necessarily; if the original preferred leader is down, it can be other
brokers in OAR which are in ISR.

> 1.2. What if some brokers that are in OAR are down?

If some brokers in OAR are down, the topic/partition will have URP (Under
Replicated Partitions). The client deciding to do the reassignment should be
clear about the current state of the cluster: which brokers are down, which are
up, and what the reassignment is trying to accomplish, e.g. reassignment from
down brokers to new brokers(?).

> 2. I still have doubts that we need to do the reassignment backwards during
> rollback. For instance if we decide to cancel the reassignment at step #8
> where replicas in OAR - RAR are offline and start the rollback, then how do
> we make a replica from OAR online again before electing a leader as described
> in step #2 of the rollback algorithm?
> 3. Does the algorithm defend against crashes? Is it able to continue after a
> controller failover?
> 4. I think it would be a good addition if you could add a few example
> scenarios for rollback.

Yes. shouldTriggerReassignCancelOnControllerStartup() is the integration test
to simulate controller failover while cancelling pending reassignments. I will
try to add a controller failover scenario in a ducktape system test.

You do raise a good point here. If the cluster is in a very BAD shape, e.g. none
of the OAR brokers are online but some new broker in RAR is in ISR and is the
current leader, it makes sense not to roll back, to keep that topic/partition
online. However, if none of the brokers in RAR is online, it may make sense to
roll back to OAR and remove it from /admin/reassign_partitions, since the
cluster state is already so bad that the topic/partition is offline anyway no
matter whether we roll back or not.

I will add a check before cancelling/rolling back a topic/partition's pending
reassignment: verify that at least one broker of OAR is in ISR, so that it can
be elected as leader; if not, skip that topic/partition's reassignment
cancellation and raise an exception.
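
A minimal sketch of that check (the names here are illustrative, not the actual
KafkaController code):

object RollbackCheckSketch {
  // OAR = original assigned replicas, ISR = current in-sync replicas (broker ids).
  // Rollback/cancellation is only safe if at least one original replica is in ISR
  // and can therefore be elected leader.
  def canRollbackReassignment(originalReplicas: Set[Int], isr: Set[Int]): Boolean =
    originalReplicas.exists(isr.contains)
}

// Example: OAR = Set(1, 2, 3), ISR = Set(2, 4, 5) => true, broker 2 can lead;
//          OAR = Set(1, 2, 3), ISR = Set(4, 5)    => false, skip cancellation.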
I will list a few more scenarios for rollback.
What additional scenarios for rollback can you and others think of?

Thanks,
George

On Monday, February 25, 2019, 3:53:33 AM PST, Viktor Somogyi-Vass wrote:

Hey George,

Thanks for the prompt response, it makes sense. I'll try to keep your code
changes at the top of my list and help review them. :)

Regarding the incremental reassignment: I don't mind discussing it either as
part of this or in a separate conversation, but I think a separate one could be
better, because both discussions can be long and keeping them separated would
limit the scope and make them more digestible and focused. If the community
decides to discuss it here, then I think I'll put KIP-435 on hold or rejected
and add my ideas here. If the community decides to discuss it in a different
KIP, I think it's a good idea to move the planned future work part into KIP-435
and rework that. Maybe we can co-author it, as I think both works could be
complementary to each other. In any case I'd be absolutely interested in what
others think.

A few questions regarding the rollback algorithm:
1. At step 2 how do you elect the leader?
  1.1. Would it be always the original leader?
  1.2. What if some brokers that are in OAR are down?
2. I still have doubts that we need to do the reassignment backwards during
rollback. For instance, if we decide to cancel the reassignment at step #8 where
replicas in OAR - RAR are offline and start the rollback, then how do we make a
replica from OAR online again before electing a leader as described in step #2
of the rollback algorithm?
3. Does the algorithm defend against crashes? Is it able to continue after a
controller failover?
4. I think it would be a good addition if you could add a few example scenarios
for rollback.
Best,
Viktor

On Fri, Feb 22, 2019 at 7:04 PM George Li  wrote:

 Hi Viktor, 

Thanks for reading and providing feedback on KIP-236.


Re: [DISCUSS] KIP-433: Provide client API version to authorizer

2019-02-25 Thread Harsha
I think the motivation of the KIP is to configure which APIs we want to allow 
for a broker.
This is challenging for a hosted service where you have customers with 
different versions of clients.
It's not just about down-conversion. For example, with transactions there is a 
case where we do not want to allow users to start using them, and there is no 
way to disable this right now; hence, as specified in the KIP, we want a lock on 
which client versions we support.
The Authorizer's original purpose is to allow policies to be enforced for each 
of the Kafka APIs, specifically in the context of security.
Extending this to a general-purpose gatekeeper might not be suitable, and as 
mentioned in the thread, every implementation of the authorizer would need to be 
re-implemented to provide the same set of functionality.
I think it's better to add an implementation which will use a broker's dynamic 
config, as mentioned in approach 1.

Thanks,
Harsha

On Sat, Feb 23, 2019, at 6:21 AM, Ismael Juma wrote:
> Thanks for the KIP. Have we considered the existing topic config that makes
> it possible to disallow down conversions? That's the biggest downside in
> allowing older clients.
> 
> Ismael
> 
> On Fri, Feb 22, 2019, 2:11 PM Ying Zheng  wrote:
> 
> >
> >
>


[jira] [Created] (KAFKA-7999) Flaky Test ExampleConnectIntegrationTest#testProduceConsumeConnector

2019-02-25 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-7999:
--

 Summary: Flaky Test 
ExampleConnectIntegrationTest#testProduceConsumeConnector
 Key: KAFKA-7999
 URL: https://issues.apache.org/jira/browse/KAFKA-7999
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect, unit tests
Affects Versions: 2.2.0
Reporter: Matthias J. Sax
 Fix For: 2.2.1


To get stable nightly builds for `2.2` release, I create tickets for all 
observed test failures.

[https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/30/]
{quote}org.apache.kafka.common.KafkaException: Could not produce message to 
topic=test-topic at 
org.apache.kafka.connect.util.clusters.EmbeddedKafkaCluster.produce(EmbeddedKafkaCluster.java:257)
 at 
org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testProduceConsumeConnector(ExampleConnectIntegrationTest.java:129){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


TSU NOTIFICATION - Encryption

2019-02-25 Thread Rajini Sivaram
SUBMISSION TYPE:  TSU


SUBMITTED BY: Rajini Sivaram


SUBMITTED FOR:The Apache Software Foundation


POINT OF CONTACT: Secretary, The Apache Software Foundation


FAX:  +1-919-573-9199


MANUFACTURER(S):  The Apache Software Foundation, Oracle


PRODUCT NAME/MODEL #: Apache Kafka


ECCN: 5D002


NOTIFICATION: http://www.apache.org/licenses/exports/


[jira] [Created] (KAFKA-7998) Windows Quickstart script fails

2019-02-25 Thread Dieter De Paepe (JIRA)
Dieter De Paepe created KAFKA-7998:
--

 Summary: Windows Quickstart script fails
 Key: KAFKA-7998
 URL: https://issues.apache.org/jira/browse/KAFKA-7998
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.1.0
Reporter: Dieter De Paepe


Following the [Quickstart|http://kafka.apache.org/quickstart] guide on Windows, 
I received an error in the script to start ZooKeeper:
{noformat}
The input line is too long.
The syntax of the command is incorrect.{noformat}
The cause is in the long CLASSPATH being constructed, resulting in a very long 
string:
{noformat}
for %%i in ("%BASE_DIR%\libs\*") do (
call :concat "%%i"
)

...

:concat
IF not defined CLASSPATH (
  set CLASSPATH="%~1"
) ELSE (
  set CLASSPATH=%CLASSPATH%;"%~1"
){noformat}

A simple fix is to change the "kafka-run-class.bat" as follows (for all similar 
loops):
{noformat}for %%i in ("%BASE_DIR%\libs\*") do (
call :concat "%%i"
){noformat}
should become
{noformat}call :concat "%BASE_DIR%\libs\*"{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7268) Review Handling Crypto Rules and update ECCN page if needed

2019-02-25 Thread Henri Yandell (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henri Yandell resolved KAFKA-7268.
--
Resolution: Fixed

> Review Handling Crypto Rules and update ECCN page if needed
> ---
>
> Key: KAFKA-7268
> URL: https://issues.apache.org/jira/browse/KAFKA-7268
> Project: Kafka
>  Issue Type: Task
>Reporter: Henri Yandell
>Assignee: Rajini Sivaram
>Priority: Blocker
>
> It is suggested in LEGAL-358 that Kafka contains/uses cryptographic 
> functions and does not have an entry on the ECCN page ( 
> [http://www.apache.org/licenses/exports/] ).
> See [http://www.apache.org/dev/crypto.html] to review and confirm whether you 
> should add something to the ECCN page, and if needed, please do so.
> The text in LEGAL-358 was:
> [~zznate] added a comment - 18/Jan/18 16:59
> [~gregSwilliam] I think [~gshapira_impala_35cc] worked on some of that (def. 
> on kafka client SSL stuff) and is on the Kafka PMC. She can probably help. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-131 - Add access to OffsetStorageReader from SourceConnector

2019-02-25 Thread Randall Hauch
+1 (binding)

On Mon, Feb 25, 2019 at 4:32 AM Florian Hussonnois 
wrote:

> Hi Kafka Team,
>
> I'd like to bring this thread back to the top of the email stack to get a
> chance to see this KIP merged in the next minor/major release.
>
> Thanks.
>
> Le ven. 18 janv. 2019 à 01:20, Florian Hussonnois 
> a
> écrit :
>
> > Hey folks,
> >
> > This KIP was started a while ago but has never been merged.
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-131+-+Add+access+to+OffsetStorageReader+from+SourceConnector
> > Would it be possible to restart the vote ? I still think this KIP could
> be
> > useful to implement connectors (the PR has been rebased)
> >
> > Thanks,
> >
> > Le ven. 22 sept. 2017 à 09:36, Florian Hussonnois  >
> > a écrit :
> >
> >> Hi team,
> >>
> >> Are there any more votes  ? Thanks
> >>
> >> Le 12 sept. 2017 20:18, "Gwen Shapira"  a écrit :
> >>
> >>> Thanks for clarifying.
> >>>
> >>> +1 again :)
> >>>
> >>>
> >>>
> >>> On Sat, Sep 9, 2017 at 6:57 AM Randall Hauch  wrote:
> >>>
> >>> > Gwen,
> >>> >
> >>> > I've had more time to look into the code. First, the
> >>> OffsetStorageReader
> >>> > JavaDoc says: "OffsetStorageReader provides access to the offset
> >>> storage
> >>> > used by sources. This can be used by connectors to determine offsets
> to
> >>> > start consuming data from. This is most commonly used during
> >>> initialization
> >>> > of a task, but can also be used during runtime, e.g. when
> >>> reconfiguring a
> >>> > task."
> >>> >
> >>> > Second, this KIP allows the SourceConnector implementations to access
> >>> the
> >>> > OffsetStorageReader, but really the SourceConnector methods are
> *only*
> >>> > related to lifecycle changes when using the OffsetStorageReader would
> >>> be
> >>> > appropriate per the comment above.
> >>> >
> >>> > In summary, I don't think there is any concern about the
> >>> > OffsetStorageReader being used inappropriate by the SourceConnector
> >>> > implementations.
> >>> >
> >>> > Randall
> >>> >
> >>> > On Fri, Sep 8, 2017 at 9:46 AM, Gwen Shapira 
> >>> wrote:
> >>> >
> >>> > > Basically, you are saying that the part where the comment says:
> >>> "Offset
> >>> > > data should only be read during startup or reconfiguration of a
> >>> task."
> >>> > > is incorrect? because the API extension allows reading offset data
> >>> at any
> >>> > > point in the lifecycle, right?
> >>> > >
> >>> > > On Fri, Sep 8, 2017 at 5:18 AM Florian Hussonnois <
> >>> fhussonn...@gmail.com
> >>> > >
> >>> > > wrote:
> >>> > >
> >>> > > > Hi Shapira,
> >>> > > >
> >>> > > > We only expose the OffsetStorageReader to connector which relies
> on
> >>> > > > KafkaOffsetBackingStore. The store continuesly consumes offsets
> >>> from
> >>> > > kafka
> >>> > > > so I think we can't have stale data.
> >>> > > >
> >>> > > >
> >>> > > > Le 8 sept. 2017 06:13, "Randall Hauch"  a
> écrit
> >>> :
> >>> > > >
> >>> > > > > The KIP and PR expose the OffsetStorageReader, which is already
> >>> > exposed
> >>> > > > to
> >>> > > > > the tasks. The OffsetStorageWriter is part of the
> >>> implementation, but
> >>> > > was
> >>> > > > > not and is not exposed thru the API.
> >>> > > > >
> >>> > > > > > On Sep 7, 2017, at 9:04 PM, Gwen Shapira 
> >>> > wrote:
> >>> > > > > >
> >>> > > > > > I just re-read the code for the OffsetStorageWriter, and ran
> >>> into
> >>> > > this
> >>> > > > > > comment:
> >>> > > > > >
> >>> > > > > > * Note that this only provides write functionality. This is
> >>> > > > > > intentional to ensure stale data is
> >>> > > > > > * never read. Offset data should only be read during startup
> or
> >>> > > > > > reconfiguration of a task. By
> >>> > > > > > * always serving those requests by reading the values from
> the
> >>> > > backing
> >>> > > > > > store, we ensure we never
> >>> > > > > > * accidentally use stale data. (One example of how this can
> >>> occur:
> >>> > a
> >>> > > > > > task is processing input
> >>> > > > > > * partition A, writing offsets; reconfiguration causes
> >>> partition A
> >>> > to
> >>> > > > > > be reassigned elsewhere;
> >>> > > > > > * reconfiguration causes partition A to be reassigned to this
> >>> node,
> >>> > > > > > but now the offset data is out
> >>> > > > > > * of date). Since these offsets are created and managed by
> the
> >>> > > > > > connector itself, there's no way
> >>> > > > > > * for the offset management layer to know which keys are
> >>> "owned" by
> >>> > > > > > which tasks at any given
> >>> > > > > > * time.
> >>> > > > > >
> >>> > > > > >
> >>> > > > > > I can't figure out how the KIP avoids the stale-reads problem
> >>> > > explained
> >>> > > > > here.
> >>> > > > > >
> >>> > > > > > Can you talk me through it? I'm cancelling my vote since
> right
> >>> now
> >>> > > > > > exposing this interface sounds risky and misleading.
> >>> > > > > >
> >>> > > > > >
> >>> > > > > > Gwen
> >>> > > > > >
> >>> > > > > >
> >>> > > > > >> On Thu, Sep 7, 2017 at 5:04 PM Gwen Shapira <
> >>> 

Re: [VOTE] KIP-430 - Return Authorized Operations in Describe Responses

2019-02-25 Thread Manikumar
Hi,

Thanks for the KIP. +1( binding).

Thanks,


On Mon, Feb 25, 2019 at 6:28 PM Rajini Sivaram 
wrote:

> Thanks Mickael.
>
> I thought I had to list the type only once when the same field appears
> twice (the field itself is listed both for cluster and topic). But you are
> the second person who brought this up, so I must be mistaken. I have added
> the type in both places to avoid confusion.
>
> On Mon, Feb 25, 2019 at 12:38 PM Mickael Maison 
> wrote:
>
> > +1 (non binding)
> > In the Metadata v8 section, it looks like the "authorized_operations"
> > field is missing under "topic_metadata". There's only the top-level
> > "authorized_operations" field.
> >
> > On Mon, Feb 25, 2019 at 12:11 PM Rajini Sivaram  >
> > wrote:
> > >
> > > Hi Colin,
> > >
> > > Yes, it makes sense to reduce response size by using bit fields.
> Updated
> > > the KIP.
> > >
> > > I have also updated the KIP to say that clients will ignore any bits
> set
> > by
> > > the broker that are unknown to the client, so there will be no UNKNOWN
> > > operations in the set returned by AdminClient. Brokers may however set
> > bits
> > > regardless of client version. Does that match your expectation?
> > >
> > > Thank you,
> > >
> > > Rajini
> > >
> > >
> > > On Sat, Feb 23, 2019 at 1:03 AM Colin McCabe 
> wrote:
> > >
> > > > Hi Rajini,
> > > >
> > > > Thanks for the explanations.
> > > >
> > > > On Fri, Feb 22, 2019, at 11:59, Rajini Sivaram wrote:
> > > > > Hi Colin,
> > > > >
> > > > > Thanks for the review. Sorry I meant that an array of INT8's, each
> of
> > > > which
> > > > > is an AclOperation code will be returned. I have clarified that in
> > the
> > > > KIP.
> > > >
> > > > Do you think it's worth considering a bitfield here still?  An array
> > will
> > > > take up at least 4 bytes for the length, plus whatever length the
> > elements
> > > > are.  A 32-bit bitfield would pretty much always take up less space.
> > And
> > > > we can have a new version of the RPC with 64 bits or whatever if we
> > outgrow
> > > > 32 operations.  MetadataResponse for a big cluster could contain
> quite
> > a
> > > > lot of topics, tens or hundreds of thousands.  So the space savings
> > could
> > > > be considerable.
> > > >
> > > > >
> > > > > All permitted operations will be returned from the set of supported
> > > > > operations on each resource. This is regardless of whether the
> > access was
> > > > > implicitly or explicitly granted. Have clarified that in the KIP.
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > > Since the values returned are INT8 codes, clients can simply ignore
> > any
> > > > > they don't recognize. Java clients convert these into
> > > > AclOperation.UNKNOWN.
> > > > > That way we don't need to update Metadata/describe request versions
> > when
> > > > > new operations are added to a resource. This is consistent with
> > > > > DescribeAcls behaviour. Have added this to the compatibility
> section
> > of
> > > > the
> > > > > KIP.
> > > >
> > > > Displaying "unknown" for new AclOperations made sense for
> DescribeAcls,
> > > > since the ACL is explicitly referencing the new AclOperation.   For
> > > > example, if you upgrade your Kafka cluster to a new version that
> > supports
> > > > DESCRIBE_CONFIGS, your old ACLs still don't reference
> DESCRIBE_CONFIGS.
> > > >
> > > > In contrast, in the case here, existing topics (or other resources)
> > might
> > > > pick up the new ACLOperation just by upgrading Kafka.  For example,
> if
> > you
> > > > had ALL permission on a topic and you upgrade to a new version with
> > > > DESCRIBE_CONFIGS, you now have DESCRIBE_CONFIGS permission on that
> > topic.
> > > > This would result in a lot of "unknowns" being displayed here, which
> > might
> > > > not be ideal.
> > > >
> > > > Also, there is an argument from intent-- the intention here is to let
> > you
> > > > know what you can do with a resource that already exists.  Knowing
> > that you
> > > > can do an unknown thing isn't very useful.  In contrast, for
> > DescribeAcls,
> > > > knowing that an ACL references an operation your software is too old
> to
> > > > understand is useful (you may choose not to modify that ACL, since
> you
> > > > don't know what it does, for example.)  What do you think?
> > > >
> > > > cheers,
> > > > Colin
> > > >
> > > >
> > > > >
> > > > > Thank you,
> > > > >
> > > > > Rajini
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Feb 22, 2019 at 6:46 PM Colin McCabe 
> > wrote:
> > > > >
> > > > > > Hi Rajini,
> > > > > >
> > > > > > Thanks for the KIP!
> > > > > >
> > > > > > The KIP specifies that "Authorized operations will be returned as
> > [an]
> > > > > > INT8 consistent with [the] AclOperation used in ACL requests and
> > > > > > responses."  But there may be more than one AclOperation that is
> > > > applied to
> > > > > > a given resource.  For example, a principal may have both READ
> and
> > > > WRITE
> > > > > > permission on a topic.
> > > > > >
> > > > > > One option for 

Re: [VOTE] 2.2.0 RC0

2019-02-25 Thread Viktor Somogyi-Vass
Hi Matthias,

I've noticed a minor line break issue in the upgrade docs. I've created a
small PR for that: https://github.com/apache/kafka/pull/6320

Best,
Viktor

On Sun, Feb 24, 2019 at 10:16 PM Stephane Maarek 
wrote:

> Hi Matthias
>
> Thanks for this
> Running through the list of KIPs. I think this is not included in 2.2:
>
> - Allow clients to suppress auto-topic-creation
>
> Regards
> Stephane
>
> On Sun, Feb 24, 2019 at 1:03 AM Matthias J. Sax 
> wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the first candidate for the release of Apache Kafka 2.2.0.
> >
> > This is a minor release with the following highlights:
> >
> >  - Added SSL support for custom principal name
> >  - Allow SASL connections to periodically re-authenticate
> >  - Improved consumer group management
> >- default group.id is `null` instead of empty string
> >  - Add --under-min-isr option to describe topics command
> >  - Allow clients to suppress auto-topic-creation
> >  - API improvement
> >- Producer: introduce close(Duration)
> >- AdminClient: introduce close(Duration)
> >- Kafka Streams: new flatTransform() operator in Streams DSL
> >- KafkaStreams (and other classes) now implement AutoCloseable to
> > support try-with-resources
> >- New Serdes and default method implementations
> >  - Kafka Streams exposed internal client.id via ThreadMetadata
> >  - Metric improvements:  All `-min`, `-avg` and `-max` metrics will now
> > output `NaN` as default value
> >
> >
> > Release notes for the 2.2.0 release:
> > http://home.apache.org/~mjsax/kafka-2.2.0-rc0/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Friday, March 1, 9am PST.
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > http://home.apache.org/~mjsax/kafka-2.2.0-rc0/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * Javadoc:
> > http://home.apache.org/~mjsax/kafka-2.2.0-rc0/javadoc/
> >
> > * Tag to be voted upon (off 2.2 branch) is the 2.2.0 tag:
> > https://github.com/apache/kafka/releases/tag/2.2.0-rc0
> >
> > * Documentation:
> > https://kafka.apache.org/22/documentation.html
> >
> > * Protocol:
> > https://kafka.apache.org/22/protocol.html
> >
> > * Successful Jenkins builds for the 2.2 branch:
> > Unit/integration tests: https://builds.apache.org/job/kafka-2.2-jdk8/31/
> >
> > * System tests:
> > https://jenkins.confluent.io/job/system-test-kafka/job/2.2/
> >
> >
> >
> >
> > Thanks,
> >
> > -Matthias
> >
> >
>


Re: [VOTE] KIP-430 - Return Authorized Operations in Describe Responses

2019-02-25 Thread Rajini Sivaram
Thanks Mickael.

I thought I had to list the type only once when the same field appears
twice (the field itself is listed both for cluster and topic). But you are
the second person who brought this up, so I must be mistaken. I have added
the type in both places to avoid confusion.

On Mon, Feb 25, 2019 at 12:38 PM Mickael Maison 
wrote:

> +1 (non binding)
> In the Metadata v8 section, it looks like the "authorized_operations"
> field is missing under "topic_metadata". There's only the top-level
> "authorized_operations" field.
>
> On Mon, Feb 25, 2019 at 12:11 PM Rajini Sivaram 
> wrote:
> >
> > Hi Colin,
> >
> > Yes, it makes sense to reduce response size by using bit fields. Updated
> > the KIP.
> >
> > I have also updated the KIP to say that clients will ignore any bits set
> by
> > the broker that are unknown to the client, so there will be no UNKNOWN
> > operations in the set returned by AdminClient. Brokers may however set
> bits
> > regardless of client version. Does that match your expectation?
> >
> > Thank you,
> >
> > Rajini
> >
> >
> > On Sat, Feb 23, 2019 at 1:03 AM Colin McCabe  wrote:
> >
> > > Hi Rajini,
> > >
> > > Thanks for the explanations.
> > >
> > > On Fri, Feb 22, 2019, at 11:59, Rajini Sivaram wrote:
> > > > Hi Colin,
> > > >
> > > > Thanks for the review. Sorry I meant that an array of INT8's, each of
> > > which
> > > > is an AclOperation code will be returned. I have clarified that in
> the
> > > KIP.
> > >
> > > Do you think it's worth considering a bitfield here still?  An array
> will
> > > take up at least 4 bytes for the length, plus whatever length the
> elements
> > > are.  A 32-bit bitfield would pretty much always take up less space.
> And
> > > we can have a new version of the RPC with 64 bits or whatever if we
> outgrow
> > > 32 operations.  MetadataResponse for a big cluster could contain quite
> a
> > > lot of topics, tens or hundreds of thousands.  So the space savings
> could
> > > be considerable.
> > >
> > > >
> > > > All permitted operations will be returned from the set of supported
> > > > operations on each resource. This is regardless of whether the
> access was
> > > > implicitly or explicitly granted. Have clarified that in the KIP.
> > >
> > > Thanks.
> > >
> > > >
> > > > Since the values returned are INT8 codes, clients can simply ignore
> any
> > > > they don't recognize. Java clients convert these into
> > > AclOperation.UNKNOWN.
> > > > That way we don't need to update Metadata/describe request versions
> when
> > > > new operations are added to a resource. This is consistent with
> > > > DescribeAcls behaviour. Have added this to the compatibility section
> of
> > > the
> > > > KIP.
> > >
> > > Displaying "unknown" for new AclOperations made sense for DescribeAcls,
> > > since the ACL is explicitly referencing the new AclOperation.   For
> > > example, if you upgrade your Kafka cluster to a new version that
> supports
> > > DESCRIBE_CONFIGS, your old ACLs still don't reference DESCRIBE_CONFIGS.
> > >
> > > In contrast, in the case here, existing topics (or other resources)
> might
> > > pick up the new ACLOperation just by upgrading Kafka.  For example, if
> you
> > > had ALL permission on a topic and you upgrade to a new version with
> > > DESCRIBE_CONFIGS, you now have DESCRIBE_CONFIGS permission on that
> topic.
> > > This would result in a lot of "unknowns" being displayed here, which
> might
> > > not be ideal.
> > >
> > > Also, there is an argument from intent-- the intention here is to let
> you
> > > know what you can do with a resource that already exists.  Knowing
> that you
> > > can do an unknown thing isn't very useful.  In contrast, for
> DescribeAcls,
> > > knowing that an ACL references an operation your software is too old to
> > > understand is useful (you may choose not to modify that ACL, since you
> > > don't know what it does, for example.)  What do you think?
> > >
> > > cheers,
> > > Colin
> > >
> > >
> > > >
> > > > Thank you,
> > > >
> > > > Rajini
> > > >
> > > >
> > > >
> > > > On Fri, Feb 22, 2019 at 6:46 PM Colin McCabe 
> wrote:
> > > >
> > > > > Hi Rajini,
> > > > >
> > > > > Thanks for the KIP!
> > > > >
> > > > > The KIP specifies that "Authorized operations will be returned as
> [an]
> > > > > INT8 consistent with [the] AclOperation used in ACL requests and
> > > > > responses."  But there may be more than one AclOperation that is
> > > applied to
> > > > > a given resource.  For example, a principal may have both READ and
> > > WRITE
> > > > > permission on a topic.
> > > > >
> > > > > One option for representing this would be a bitfield.  A 32-bit
> > > bitfield
> > > > > could have the appropriate bits set.  For example, if READ and
> WRITE
> > > > > operations were permitted, bits 3 and 4 could be set.
> > > > >
> > > > > Another thing to think about here is that certain AclOperations
> imply
> > > > > certain others.  For example, having WRITE on a topic gives you
> > > DESCRIBE on
> > > > > that topic as 

Re: [VOTE] KIP-430 - Return Authorized Operations in Describe Responses

2019-02-25 Thread Mickael Maison
+1 (non binding)
In the Metadata v8 section, it looks like the "authorized_operations"
field is missing under "topic_metadata". There's only the top-level
"authorized_operations" field.

On Mon, Feb 25, 2019 at 12:11 PM Rajini Sivaram  wrote:
>
> Hi Colin,
>
> Yes, it makes sense to reduce response size by using bit fields. Updated
> the KIP.
>
> I have also updated the KIP to say that clients will ignore any bits set by
> the broker that are unknown to the client, so there will be no UNKNOWN
> operations in the set returned by AdminClient. Brokers may however set bits
> regardless of client version. Does that match your expectation?
>
> Thank you,
>
> Rajini
>
>
> On Sat, Feb 23, 2019 at 1:03 AM Colin McCabe  wrote:
>
> > Hi Rajini,
> >
> > Thanks for the explanations.
> >
> > On Fri, Feb 22, 2019, at 11:59, Rajini Sivaram wrote:
> > > Hi Colin,
> > >
> > > Thanks for the review. Sorry I meant that an array of INT8's, each of
> > which
> > > is an AclOperation code will be returned. I have clarified that in the
> > KIP.
> >
> > Do you think it's worth considering a bitfield here still?  An array will
> > take up at least 4 bytes for the length, plus whatever length the elements
> > are.  A 32-bit bitfield would pretty much always take up less space.  And
> > we can have a new version of the RPC with 64 bits or whatever if we outgrow
> > 32 operations.  MetadataResponse for a big cluster could contain quite a
> > lot of topics, tens or hundreds of thousands.  So the space savings could
> > be considerable.
> >
> > >
> > > All permitted operations will be returned from the set of supported
> > > operations on each resource. This is regardless of whether the access was
> > > implicitly or explicitly granted. Have clarified that in the KIP.
> >
> > Thanks.
> >
> > >
> > > Since the values returned are INT8 codes, clients can simply ignore any
> > > they don't recognize. Java clients convert these into
> > AclOperation.UNKNOWN.
> > > That way we don't need to update Metadata/describe request versions when
> > > new operations are added to a resource. This is consistent with
> > > DescribeAcls behaviour. Have added this to the compatibility section of
> > the
> > > KIP.
> >
> > Displaying "unknown" for new AclOperations made sense for DescribeAcls,
> > since the ACL is explicitly referencing the new AclOperation.   For
> > example, if you upgrade your Kafka cluster to a new version that supports
> > DESCRIBE_CONFIGS, your old ACLs still don't reference DESCRIBE_CONFIGS.
> >
> > In contrast, in the case here, existing topics (or other resources) might
> > pick up the new ACLOperation just by upgrading Kafka.  For example, if you
> > had ALL permission on a topic and you upgrade to a new version with
> > DESCRIBE_CONFIGS, you now have DESCRIBE_CONFIGS permission on that topic.
> > This would result in a lot of "unknowns" being displayed here, which might
> > not be ideal.
> >
> > Also, there is an argument from intent-- the intention here is to let you
> > know what you can do with a resource that already exists.  Knowing that you
> > can do an unknown thing isn't very useful.  In contrast, for DescribeAcls,
> > knowing that an ACL references an operation your software is too old to
> > understand is useful (you may choose not to modify that ACL, since you
> > don't know what it does, for example.)  What do you think?
> >
> > cheers,
> > Colin
> >
> >
> > >
> > > Thank you,
> > >
> > > Rajini
> > >
> > >
> > >
> > > On Fri, Feb 22, 2019 at 6:46 PM Colin McCabe  wrote:
> > >
> > > > Hi Rajini,
> > > >
> > > > Thanks for the KIP!
> > > >
> > > > The KIP specifies that "Authorized operations will be returned as [an]
> > > > INT8 consistent with [the] AclOperation used in ACL requests and
> > > > responses."  But there may be more than one AclOperation that is
> > applied to
> > > > a given resource.  For example, a principal may have both READ and
> > WRITE
> > > > permission on a topic.
> > > >
> > > > One option for representing this would be a bitfield.  A 32-bit
> > bitfield
> > > > could have the appropriate bits set.  For example, if READ and WRITE
> > > > operations were permitted, bits 3 and 4 could be set.
> > > >
> > > > Another thing to think about here is that certain AclOperations imply
> > > > certain others.  For example, having WRITE on a topic gives you
> > DESCRIBE on
> > > > that topic as well automatically.  Does that mean that a topic with
> > WRITE
> > > > on it should automatically get DESCRIBE set in the bitfield?  I would
> > argue
> > > > that the answer is yes, for consistency's sake.
> > > >
> > > > We will inevitably add new AclOperations over time, and we have to
> > think
> > > > about how to do this in a compatible way.  The simplest approach would
> > be
> > > > to just leave out the new AclOperations when a describe request comes
> > in
> > > > from an older version client.  This should be spelled out in the
> > > > compatibility section.
> > > >
> > > > best,
> > > > Colin
> > > >

Re: [VOTE] KIP-430 - Return Authorized Operations in Describe Responses

2019-02-25 Thread Rajini Sivaram
Hi Colin,

Yes, it makes sense to reduce response size by using bit fields. Updated
the KIP.

I have also updated the KIP to say that clients will ignore any bits set by
the broker that are unknown to the client, so there will be no UNKNOWN
operations in the set returned by AdminClient. Brokers may however set bits
regardless of client version. Does that match your expectation?
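
For illustration, here is a minimal sketch of the bitfield approach agreed above, assuming the bit index equals the AclOperation code (so READ and WRITE map to bits 3 and 4 as in Colin's example). The helper class is hypothetical, not part of the KIP or the Kafka codebase; the decode step simply drops any bit whose code the client does not know, so AdminClient never surfaces UNKNOWN.

import java.util.EnumSet;
import java.util.Set;

import org.apache.kafka.common.acl.AclOperation;

// Hypothetical helper: authorized operations as a 32-bit field, bit index = AclOperation code.
public final class AuthorizedOperationsBitField {

    private AuthorizedOperationsBitField() {}

    // Broker side: set bit op.code() for every permitted operation.
    public static int encode(Set<AclOperation> operations) {
        int bits = 0;
        for (AclOperation op : operations) {
            bits |= 1 << op.code();
        }
        return bits;
    }

    // Client side: only bits that map to an AclOperation known to this client
    // are kept; unknown bits are silently ignored.
    public static Set<AclOperation> decode(int bits) {
        Set<AclOperation> operations = EnumSet.noneOf(AclOperation.class);
        for (AclOperation op : AclOperation.values()) {
            if (op == AclOperation.UNKNOWN || op == AclOperation.ANY) {
                continue;
            }
            if ((bits & (1 << op.code())) != 0) {
                operations.add(op);
            }
        }
        return operations;
    }
}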

Thank you,

Rajini


On Sat, Feb 23, 2019 at 1:03 AM Colin McCabe  wrote:

> Hi Rajini,
>
> Thanks for the explanations.
>
> On Fri, Feb 22, 2019, at 11:59, Rajini Sivaram wrote:
> > Hi Colin,
> >
> > Thanks for the review. Sorry I meant that an array of INT8's, each of
> which
> > is an AclOperation code will be returned. I have clarified that in the
> KIP.
>
> Do you think it's worth considering a bitfield here still?  An array will
> take up at least 4 bytes for the length, plus whatever length the elements
> are.  A 32-bit bitfield would pretty much always take up less space.  And
> we can have a new version of the RPC with 64 bits or whatever if we outgrow
> 32 operations.  MetadataResponse for a big cluster could contain quite a
> lot of topics, tens or hundreds of thousands.  So the space savings could
> be considerable.
>
> >
> > All permitted operations will be returned from the set of supported
> > operations on each resource. This is regardless of whether the access was
> > implicitly or explicitly granted. Have clarified that in the KIP.
>
> Thanks.
>
> >
> > Since the values returned are INT8 codes, clients can simply ignore any
> > they don't recognize. Java clients convert these into
> AclOperation.UNKNOWN.
> > That way we don't need to update Metadata/describe request versions when
> > new operations are added to a resource. This is consistent with
> > DescribeAcls behaviour. Have added this to the compatibility section of
> the
> > KIP.
>
> Displaying "unknown" for new AclOperations made sense for DescribeAcls,
> since the ACL is explicitly referencing the new AclOperation.   For
> example, if you upgrade your Kafka cluster to a new version that supports
> DESCRIBE_CONFIGS, your old ACLs still don't reference DESCRIBE_CONFIGS.
>
> In contrast, in the case here, existing topics (or other resources) might
> pick up the new ACLOperation just by upgrading Kafka.  For example, if you
> had ALL permission on a topic and you upgrade to a new version with
> DESCRIBE_CONFIGS, you now have DESCRIBE_CONFIGS permission on that topic.
> This would result in a lot of "unknowns" being displayed here, which might
> not be ideal.
>
> Also, there is an argument from intent-- the intention here is to let you
> know what you can do with a resource that already exists.  Knowing that you
> can do an unknown thing isn't very useful.  In contrast, for DescribeAcls,
> knowing that an ACL references an operation your software is too old to
> understand is useful (you may choose not to modify that ACL, since you
> don't know what it does, for example.)  What do you think?
>
> cheers,
> Colin
>
>
> >
> > Thank you,
> >
> > Rajini
> >
> >
> >
> > On Fri, Feb 22, 2019 at 6:46 PM Colin McCabe  wrote:
> >
> > > Hi Rajini,
> > >
> > > Thanks for the KIP!
> > >
> > > The KIP specifies that "Authorized operations will be returned as [an]
> > > INT8 consistent with [the] AclOperation used in ACL requests and
> > > responses."  But there may be more than one AclOperation that is
> applied to
> > > a given resource.  For example, a principal may have both READ and
> WRITE
> > > permission on a topic.
> > >
> > > One option for representing this would be a bitfield.  A 32-bit
> bitfield
> > > could have the appropriate bits set.  For example, if READ and WRITE
> > > operations were permitted, bits 3 and 4 could be set.
> > >
> > > Another thing to think about here is that certain AclOperations imply
> > > certain others.  For example, having WRITE on a topic gives you
> DESCRIBE on
> > > that topic as well automatically.  Does that mean that a topic with
> WRITE
> > > on it should automatically get DESCRIBE set in the bitfield?  I would
> argue
> > > that the answer is yes, for consistency's sake.
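
A sketch of the implication step discussed above. The helper is hypothetical; the implication table mirrors the broker's documented behaviour (DESCRIBE implied by READ, WRITE, DELETE and ALTER; DESCRIBE_CONFIGS implied by ALTER_CONFIGS) but should be treated as illustrative rather than exhaustive.

import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

import org.apache.kafka.common.acl.AclOperation;

// Hypothetical helper: expand granted operations with the operations they imply,
// so the returned set/bitfield is the same whether access was granted directly
// or indirectly.
public final class ImpliedAclOperations {

    private static final Map<AclOperation, Set<AclOperation>> IMPLIED =
        new EnumMap<>(AclOperation.class);

    static {
        IMPLIED.put(AclOperation.READ, EnumSet.of(AclOperation.DESCRIBE));
        IMPLIED.put(AclOperation.WRITE, EnumSet.of(AclOperation.DESCRIBE));
        IMPLIED.put(AclOperation.DELETE, EnumSet.of(AclOperation.DESCRIBE));
        IMPLIED.put(AclOperation.ALTER, EnumSet.of(AclOperation.DESCRIBE));
        IMPLIED.put(AclOperation.ALTER_CONFIGS, EnumSet.of(AclOperation.DESCRIBE_CONFIGS));
    }

    private ImpliedAclOperations() {}

    public static Set<AclOperation> expand(Set<AclOperation> granted) {
        Set<AclOperation> result = EnumSet.noneOf(AclOperation.class);
        result.addAll(granted);
        for (AclOperation op : granted) {
            Set<AclOperation> implied = IMPLIED.get(op);
            if (implied != null) {
                result.addAll(implied);
            }
        }
        return result;
    }
}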
> > >
> > > We will inevitably add new AclOperations over time, and we have to
> think
> > > about how to do this in a compatible way.  The simplest approach would
> be
> > > to just leave out the new AclOperations when a describe request comes
> in
> > > from an older version client.  This should be spelled out in the
> > > compatibility section.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Thu, Feb 21, 2019, at 02:28, Rajini Sivaram wrote:
> > > > I would like to start vote on KIP-430 to optionally obtain authorized
> > > > operations when describing resources:
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-430+-+Return+Authorized+Operations+in+Describe+Responses
> > > >
> > > > Thank you,
> > > >
> > > > Rajini
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-02-25 Thread Viktor Somogyi-Vass
Hey George,

Thanks for the prompt response, it makes sense. I'll try to keep your code
changes at the top of my list and help review them. :)
Regarding the incremental reassignment: I don't mind discussing it either as
part of this KIP or in a separate conversation, but I think a separate one
could be better because both discussions can be long, and keeping them
separate would limit the scope and make them more digestible and focused.
If the community decides to discuss it here, then I think I'll put KIP-435
on hold or reject it and add my ideas here.
If the community decides to discuss it in a different KIP, I think it's a
good idea to move the planned future work part into KIP-435 and rework
that. Maybe we can co-author it, as I think the two pieces of work could be
complementary to each other.
In any case I'd be absolutely interested in what others think.

A few questions regarding the rollback algorithm:
1. At step 2 how do you elect the leader?
1.1. Would it be always the original leader?
1.2. What if some brokers that are in OAR are down?
2. I still have doubts that we need to do the reassignment backwards during
rollback. For instance if we decide to cancel the reassignment at step #8
where replicas in OAR - RAR are offline and start the rollback, then how do
we make a replica from OAR online again before electing a leader as
described in step #2 of the rollback algorithm?
3. Does the algorithm defend against crashes? Is it able to continue after
a controller failover?
4. I think it would be a good addition if you could add a few example
scenarios for rollback.

Best,
Viktor


On Fri, Feb 22, 2019 at 7:04 PM George Li  wrote:

> Hi Viktor,
>
>
> Thanks for reading and provide feedbacks on KIP-236.
>
>
> For reassignments, one can generate a json for the new assignments and another
> json with the "original" assignments for rollback purposes.  In a production
> cluster, from our experience, we need to submit the reassignments in
> batches with throttling/staggering to minimize the impact on the cluster.
> Some large topics/partitions combined with throttling can take a pretty long
> time for the new replicas to get into the ISR and complete the reassignment in
> that batch.  Currently, Kafka does not allow cancelling the pending
> reassignments cleanly during this.  Even if you have the json with the
> "original" assignments for rollback, you have to wait until the current
> reassignment completes and then submit it as a new reassignment to roll back.
> If the current reassignment is causing impact to production, we would like the
> reassignments to be cancelled/rolled back cleanly/safely/quickly.  This is
> the main goal of KIP-236.
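
For readers unfamiliar with the file being discussed, the reassignment json has the shape below (topic name and broker ids are illustrative); the rollback file is simply the same structure listing the original replica assignments.

{
  "version": 1,
  "partitions": [
    { "topic": "payments", "partition": 0, "replicas": [1, 2, 3] },
    { "topic": "payments", "partition": 1, "replicas": [2, 3, 4] }
  ]
}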
>
>
> The original KIP-236 by Tom Bentley also proposed incremental
> reassignments, i.e. submitting new reassignments while the current
> reassignment is still going on. This was scaled back and put under the
> "Planned Future Changes" section of KIP-236, so we can expedite this
> Reassignment Cancellation/Rollback feature out to the community.
>
>
> The main idea of incremental reassignment is to allow submitting new
> reassignments in another ZK node, /admin/reassign_partitions_queue, and
> merging it with the current pending reassignments in /admin/reassign_partitions.
> In case the same topic/partition appears in both ZK nodes, the conflict
> resolution is to cancel the current reassignment in /admin/reassign_partitions
> and move the same topic/partition from /admin/reassign_partitions_queue over as
> the new reassignment.
>
>
> If there is enough interest from the community, this "Planned Future
> Changes" work for incremental reassignments can also be delivered in KIP-236;
> otherwise it can go into another KIP.  The current PR:
> https://github.com/apache/kafka/pull/6296  only focuses on/addresses the
> pending Reassignment Cancellation/Rollback.
>
>
> Hope this answers your questions.
>
>
> Thanks,
> George
>
>
> On Friday, February 22, 2019, 6:51:14 AM PST, Viktor Somogyi-Vass <
> viktorsomo...@gmail.com> wrote:
>
>
> Read through the KIP and I have one comment:
>
> It seems like it is not looking strictly for cancellation but also
> implements rolling back to the original. I think it'd be much simpler to
> generate a reassignment json on cancellation that contains the original
> assignment and start a new partition reassignment completely. This way the
> reassignment algorithm (whatever it is) could be reused as a whole. Did you
> consider this or are there any obstacles that prevent doing this?
>
> Regards,
> Viktor
>
> On Fri, Feb 22, 2019 at 2:24 PM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > I've published the above mentioned KIP here:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-435%3A+Incremental+Partition+Reassignment
> > Will start a discussion about it soon.
> >
> > On Fri, Feb 22, 2019 at 12:45 PM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com> wrote:
> >
> >> Hi Folks,
> >>
> >> I also have a pending active work on the incremental partition
> >> reassignment stuff here:
> https://issues.apache.org/jira/browse/KAFKA-6794
> >> I think it would 

[jira] [Created] (KAFKA-7997) Replace SaslAuthenticate request/response with automated protocol

2019-02-25 Thread Mickael Maison (JIRA)
Mickael Maison created KAFKA-7997:
-

 Summary: Replace SaslAuthenticate request/response with automated 
protocol
 Key: KAFKA-7997
 URL: https://issues.apache.org/jira/browse/KAFKA-7997
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison
Assignee: Mickael Maison






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-131 - Add access to OffsetStorageReader from SourceConnector

2019-02-25 Thread Florian Hussonnois
Hi Kafka Team,

I'd like to bring this thread back to the top of the email stack to get a
chance to see this KIP merged in the next minor/major release.

Thanks.

Le ven. 18 janv. 2019 à 01:20, Florian Hussonnois  a
écrit :

> Hey folks,
>
> This KIP was started a while ago but has never been merged.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-131+-+Add+access+to+OffsetStorageReader+from+SourceConnector
> Would it be possible to restart the vote? I still think this KIP could be
> useful for implementing connectors (the PR has been rebased).
>
> Thanks,
>
> Le ven. 22 sept. 2017 à 09:36, Florian Hussonnois 
> a écrit :
>
>> Hi team,
>>
>> Are there any more votes? Thanks
>>
>> Le 12 sept. 2017 20:18, "Gwen Shapira"  a écrit :
>>
>>> Thanks for clarifying.
>>>
>>> +1 again :)
>>>
>>>
>>>
>>> On Sat, Sep 9, 2017 at 6:57 AM Randall Hauch  wrote:
>>>
>>> > Gwen,
>>> >
>>> > I've had more time to look into the code. First, the
>>> OffsetStorageReader
>>> > JavaDoc says: "OffsetStorageReader provides access to the offset
>>> storage
>>> > used by sources. This can be used by connectors to determine offsets to
>>> > start consuming data from. This is most commonly used during
>>> initialization
>>> > of a task, but can also be used during runtime, e.g. when
>>> reconfiguring a
>>> > task."
>>> >
>>> > Second, this KIP allows the SourceConnector implementations to access
>>> the
>>> > OffsetStorageReader, but really the SourceConnector methods are *only*
>>> > related to lifecycle changes when using the OffsetStorageReader would
>>> be
>>> > appropriate per the comment above.
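
For illustration, a sketch of the kind of lookup a connector could perform once the reader is exposed. The partition and offset keys ("filename", "position") are illustrative and the surrounding connector plumbing is omitted.

import java.util.Collections;
import java.util.Map;

import org.apache.kafka.connect.storage.OffsetStorageReader;

// Hypothetical helper: read the last committed position for a source partition,
// e.g. while deciding task assignments during startup or reconfiguration.
public final class LastCommittedPosition {

    private LastCommittedPosition() {}

    public static Long lookup(OffsetStorageReader reader, String filename) {
        Map<String, String> sourcePartition = Collections.singletonMap("filename", filename);
        Map<String, Object> offset = reader.offset(sourcePartition);
        if (offset == null) {
            return null; // nothing committed yet for this partition
        }
        Object position = offset.get("position");
        return position instanceof Long ? (Long) position : null;
    }
}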
>>> >
>>> > In summary, I don't think there is any concern about the
>>> > OffsetStorageReader being used inappropriate by the SourceConnector
>>> > implementations.
>>> >
>>> > Randall
>>> >
>>> > On Fri, Sep 8, 2017 at 9:46 AM, Gwen Shapira 
>>> wrote:
>>> >
>>> > > Basically, you are saying that the part where the comment says:
>>> "Offset
>>> > > data should only be read during startup or reconfiguration of a
>>> task."
>>> > > is incorrect? because the API extension allows reading offset data
>>> at any
>>> > > point in the lifecycle, right?
>>> > >
>>> > > On Fri, Sep 8, 2017 at 5:18 AM Florian Hussonnois <
>>> fhussonn...@gmail.com
>>> > >
>>> > > wrote:
>>> > >
>>> > > > Hi Shapira,
>>> > > >
>>> > > > We only expose the OffsetStorageReader to connectors which rely on
>>> > > > KafkaOffsetBackingStore. The store continuously consumes offsets from
>>> > > > Kafka, so I think we can't have stale data.
>>> > > >
>>> > > >
>>> > > > Le 8 sept. 2017 06:13, "Randall Hauch"  a écrit
>>> :
>>> > > >
>>> > > > > The KIP and PR expose the OffsetStorageReader, which is already
>>> > exposed
>>> > > > to
>>> > > > > the tasks. The OffsetStorageWriter is part of the
>>> implementation, but
>>> > > was
>>> > > > > not and is not exposed thru the API.
>>> > > > >
>>> > > > > > On Sep 7, 2017, at 9:04 PM, Gwen Shapira 
>>> > wrote:
>>> > > > > >
>>> > > > > > I just re-read the code for the OffsetStorageWriter, and ran
>>> into
>>> > > this
>>> > > > > > comment:
>>> > > > > >
>>> > > > > > * Note that this only provides write functionality. This is
>>> > > > > > intentional to ensure stale data is
>>> > > > > > * never read. Offset data should only be read during startup or
>>> > > > > > reconfiguration of a task. By
>>> > > > > > * always serving those requests by reading the values from the
>>> > > backing
>>> > > > > > store, we ensure we never
>>> > > > > > * accidentally use stale data. (One example of how this can
>>> occur:
>>> > a
>>> > > > > > task is processing input
>>> > > > > > * partition A, writing offsets; reconfiguration causes
>>> partition A
>>> > to
>>> > > > > > be reassigned elsewhere;
>>> > > > > > * reconfiguration causes partition A to be reassigned to this
>>> node,
>>> > > > > > but now the offset data is out
>>> > > > > > * of date). Since these offsets are created and managed by the
>>> > > > > > connector itself, there's no way
>>> > > > > > * for the offset management layer to know which keys are
>>> "owned" by
>>> > > > > > which tasks at any given
>>> > > > > > * time.
>>> > > > > >
>>> > > > > >
>>> > > > > > I can't figure out how the KIP avoids the stale-reads problem
>>> > > explained
>>> > > > > here.
>>> > > > > >
>>> > > > > > Can you talk me through it? I'm cancelling my vote since right
>>> now
>>> > > > > > exposing this interface sounds risky and misleading.
>>> > > > > >
>>> > > > > >
>>> > > > > > Gwen
>>> > > > > >
>>> > > > > >
>>> > > > > >> On Thu, Sep 7, 2017 at 5:04 PM Gwen Shapira <
>>> g...@confluent.io>
>>> > > > wrote:
>>> > > > > >>
>>> > > > > >> +1 (binding)
>>> > > > > >>
>>> > > > > >> Looking forward to see how connector implementations use this
>>> in
>>> > > > > practice
>>> > > > > >> :)
>>> > > > > >>
>>> > > > > >>> On Thu, Sep 7, 2017 at 3:49 PM Randall Hauch <
>>> rha...@gmail.com>
>>> > > > wrote:
>>> > > > > >>>
>>> > > > > >>> I'd like to open the vote for 

Re: [DISCUSS] KIP-434: Add Replica Fetcher and Log Cleaner Count Metrics

2019-02-25 Thread Viktor Somogyi-Vass
Hi Stanislav,

Thanks for the feedback and sharing that discussion thread.

I read your KIP and the discussion on it too, and it seems like that'd cover
the same motivation I had with the log-cleaner-thread-count metric. This is
supposed to report the count of alive threads, which might differ from the
config (I could've used a better name :) ). Now I'm thinking that using
uncleanable-bytes and uncleanable-partition-count together with
time-since-last-run would mostly cover the motivation I have in this KIP;
however, displaying the thread count (be it alive or dead) would still add
extra information regarding the failure, namely that a thread died during cleanup.

You had a very good idea: instead of the alive threads, display the
dead ones! That way we wouldn't need log-cleaner-current-live-thread-rate,
just a "dead-log-cleaner-thread-count", as it would make it easy to trigger
warnings based on that (if it's ever > 0 then we can say there's a
potential problem).
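
A rough sketch of the detection logic behind such a dead-thread metric, assuming we keep handles to the cleaner (or fetcher) threads. The class name is made up and this does not hook into the broker's real metrics registry; it only illustrates how the gauge value could be derived.

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical gauge backing: count monitored threads that are no longer alive,
// so an alert can fire as soon as the value is > 0.
public class DeadThreadGauge {

    private final List<Thread> monitoredThreads;
    private final AtomicInteger deadThreadCount = new AtomicInteger(0);

    public DeadThreadGauge(List<Thread> monitoredThreads) {
        this.monitoredThreads = monitoredThreads;
    }

    // Called on a timer, or whenever the gauge is sampled.
    public int refresh() {
        int dead = 0;
        for (Thread t : monitoredThreads) {
            if (!t.isAlive()) {
                dead++;
            }
        }
        deadThreadCount.set(dead);
        return dead;
    }

    public int value() {
        return deadThreadCount.get();
    }
}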
Doing this for the replica fetchers though would be a bit harder, as the
number of replica fetchers is (brokers-to-fetch-from *
fetchers-per-broker), and we don't really maintain that capacity information
or any kind of cluster information, and I'm not sure we should. It would add
too much responsibility to the class and wouldn't be a rock-solid solution,
but I guess I have to look into it more.

I don't think that restarting the cleaner threads would be helpful, as the
problems I've seen are mostly non-recoverable and require manual user
intervention, and I partly agree with what Colin said on the KIP-346 discussion
thread about the problems experienced with HDFS.

Best,
Viktor


On Fri, Feb 22, 2019 at 5:03 PM Stanislav Kozlovski 
wrote:

> Hey Viktor,
>
> First off, thanks for the KIP! I think that it is almost always a good idea
> to have more metrics. Observability never hurts.
>
> In regards to the LogCleaner:
> * Do we need to know log-cleaner-thread-count? That should always be equal
> to "log.cleaner.threads" if I'm not mistaken.
> * log-cleaner-current-live-thread-rate -  We already have the
> "time-since-last-run-ms" metric which can let you know if something is
> wrong with the log cleaning
> As you said, we would like to have these two new metrics in order to
> understand when a partial failure has happened - e.g only 1/3 log cleaner
> threads are alive. I'm wondering if it may make more sense to either:
> a) restart the threads when they die
> b) add a metric which shows the dead thread count. You should probably
> always have a low-level alert in the case that any threads have died
>
> We had discussed a similar topic about thread revival and metrics in
> KIP-346. Have you had a chance to look over that discussion? Here is the
> mailing discussion for that -
>
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201807.mbox/%3ccanzzngyr_22go9swl67hedcm90xhvpyfzy_tezhz1mrizqk...@mail.gmail.com%3E
>
> Best,
> Stanislav
>
>
>
> On Fri, Feb 22, 2019 at 11:18 AM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com> wrote:
>
> > Hi All,
> >
> > I'd like to start a discussion about exposing count gauge metrics for the
> > replica fetcher and log cleaner thread counts. It isn't a long KIP and
> the
> > motivation is very simple: monitoring the thread counts in these cases
> > would help with the investigation of various issues and might help in
> > preventing more serious issues when a broker is in a bad state. Such a
> > scenario that we've seen with users is that their disk fills up as the log
> > cleaner died for some reason and couldn't recover (like log corruption).
> In
> > this case an early warning would help in the root cause analysis process
> as
> > well as enable detecting and resolving the problem early on.
> >
> > The KIP is here:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-434%3A+Add+Replica+Fetcher+and+Log+Cleaner+Count+Metrics
> >
> > I'd be happy to receive any feedback on this.
> >
> > Regards,
> > Viktor
> >
>
>
> --
> Best,
> Stanislav
>


Re: [DISCUSS] KIP-307: Allow to define custom processor names with KStreams DSL

2019-02-25 Thread Florian Hussonnois
Hi Matthias, Bill,

I've updated the KIP with your latest feedback. However, you suggested
renaming `Suppressed#withName(String)`:
note that withName is not a static method like Joined.named was; the withName
method is part of the new NameOperation interface.

In addition, I've split the PR into 5 commits so that it will be much easier
to review.

Thanks
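
For context, a hypothetical sketch of what the naming API under discussion would look like in the DSL. Named.as and the Named overloads are assumptions based on the KIP text, not released API at the time of this thread; topic names and operator names are illustrative.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Named;
import org.apache.kafka.streams.kstream.Produced;

// Sketch: explicit processor names keep the topology description stable when
// operators are added or removed, instead of relying on auto-generated names.
public class NamedOperatorsExample {

    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input =
            builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()));

        input.filter((key, value) -> value != null && !value.isEmpty(),
                     Named.as("filter-non-empty"))
             .mapValues(String::toUpperCase, Named.as("uppercase-values"))
             .to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }
}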

Le jeu. 21 févr. 2019 à 23:54, Bill Bejeck  a écrit :

> Hi Florian,
>
> Overall the KIP LGTM.  Once you've addressed the final comments from
> Matthias I think we can put this up for a vote.
>
> Thanks,
> Bill
>
> On Wed, Feb 20, 2019 at 9:42 PM Matthias J. Sax 
> wrote:
>
> > Florian,
> >
> > thanks for updating the KIP (and no worries for late reply -- 2.2
> > release kept us busy anyway...). Overall LGTM.
> >
> > Just some nits:
> >
> >
> > KStream-Table:
> >
> > Do we need to list the existing stream-globalTable join methods in the
> > first table (thought it should only contain new/changing methods).
> >
> > typo: `join(GlobalKTbale, KeyValueMapper, ValueJoiner, Named)`
> >
> > `leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)` is missing the new
> > `Named` parameter.
> >
> > `static Joined#named(final String name)`
> >  -> should be `#as(...)` instead of `named(...)`
> >
> > flatTransform() is missing (cf. KIP-313)
> >
> >
> >
> > KTable-table:
> >
> > `Suppressed#withName(String)`
> >  -> should we change this to `#as(...)` too (similar to `named()`)
> >
> >
> >
> > -Matthias
> >
> >
> >
> > On 1/25/19 9:49 AM, Matthias J. Sax wrote:
> > > I was reading the KIP again, and there are still some open question and
> > > inconsistencies:
> > >
> > > For example for `KGroupedStream#count(Named)` the KIP says, that only
> > > the processor will be named, while the state store name will be `PREFIX
> > > + COUNT` (ie, an auto-generated name). Additionally, for
> > > `KGroupedStream#count(Named, Materialized)` the processor will be named
> > > according to `Named` and the store will be named according to
> > > `Materialized.as()`. So far so good. It implies that naming the
> > > processor and naming the store are independent. (This pattern is
> applied
> > > to all aggregation functions, for KStream and KTable).
> > >
> > > However, for `KTable#filter(Predicate, Named)` the KIP says, the
> > > processor name and the store name are set. This sound wrong (ie,
> > > inconsistent with the first paragraph from above), because there is
> also
> > > `KTable#filter(Predicate, Named, Materialized)`. Also note, for the
> > > first operator, the store might not be materialized to at all. (This
> > > issue is there for all KTable operators -- stateless and stateful).
> > >
> > > Finally, there is the following statement in the KIP:
> > >
> > >> Also, note that for all methods accepting a Materialized argument, if
> > no state store named is provided then the node named will be used to
> > generate a one. The state store name will be the node name suffixed with
> > "-table".
> > >
> > >
> > > This contradict the non-naming of stores from the very beginning.
> > >
> > >
> > > Also, the KIP still contains the question about `join(GlobalKTable,
> > > KeyValueMapper, ValueJoiner)` and `leftJoin(GlobalKTable,
> > > KeyValueMapper, ValueJoiner)`. I think a consistent approach would be
> to
> > > add one overload each that takes a `Named` parameter.
> > >
> > >
> > > Thoughts?
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 1/17/19 2:56 PM, Bill Bejeck wrote:
> > >> +1 for me on Guozhang's proposal for changes to Joined.
> > >>
> > >> Thanks,
> > >> Bill
> > >>
> > >> On Thu, Jan 17, 2019 at 5:55 PM Matthias J. Sax <
> matth...@confluent.io>
> > >> wrote:
> > >>
> > >>> Thanks for all the follow up comments!
> > >>>
> > >>> As I mentioned earlier, I am ok with adding overloads instead of
> using
> > >>> Materialized to specify the processor name. Seems this is what the
> > >>> majority of people prefers.
> > >>>
> > >>> I am also +1 on Guozhang's suggestion to deprecate `static
> > >>> Joined#named()` and replace it with `static Joined#as` for
> consistency
> > >>> and to deprecate getter `Joined#name()` for removal and introduce
> > >>> `JoinedInternal` to access the name.
> > >>>
> > >>> @Guozhang: the vote is already up :)
> > >>>
> > >>>
> > >>> -Matthias
> > >>>
> > >>> On 1/17/19 2:45 PM, Guozhang Wang wrote:
> >  Wow that's a lot of discussions in 6 days! :) Just catching up and
> > >>> sharing
> >  my two cents here:
> > 
> >  1. Materialized: I'm inclined to not let Materialized extending
> Named
> > and
> >  add the overload as well. All the rationales have been very well
> > >>> summarized
> >  before. Just to emphasize on John's points: Materialized is
> > considered as
> >  the control object being leveraged by the optimization framework to
> >  determine if the state store should be physically materialized or
> > not. So
> >  let's say if the user does not want to query the store (hence it can
> > just
> >  be locally 

Build failed in Jenkins: kafka-trunk-jdk8 #3415

2019-02-25 Thread Apache Jenkins Server
See 


Changes:

[manikumar] KAFKA-7972: Use automatic RPC generation in SaslHandshake

--
[...truncated 4.63 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED


[jira] [Created] (KAFKA-7996) KafkaStreams does not pass timeout when closing Producer

2019-02-25 Thread Patrik Kleindl (JIRA)
Patrik Kleindl created KAFKA-7996:
-

 Summary: KafkaStreams does not pass timeout when closing Producer
 Key: KAFKA-7996
 URL: https://issues.apache.org/jira/browse/KAFKA-7996
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.1.0
Reporter: Patrik Kleindl


[https://confluentcommunity.slack.com/messages/C48AHTCUQ/convo/C48AHTCUQ-1550831721.026100/]

We are running 2.1 and have a case where the shutdown of a streams application 
takes several minutes.
I noticed that although we call streams.close with a timeout of 30 seconds, the 
log says:
[Producer 
clientId=…-8be49feb-8a2e-4088-bdd7-3c197f6107bb-StreamThread-1-producer] 
Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.

Matthias J Sax [3 days ago]
I just checked the code, and yes, we don't provide a timeout for the producer 
on close()...
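
A minimal sketch of the shutdown path described in this report, using a placeholder topology and illustrative configs. The bounded close is requested on KafkaStreams, but per the report the timeout is not propagated to the internal producer's close().

import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;

public class StreamsShutdownExample {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "example-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // Trivial pass-through topology, just to have something to run.
        Topology topology = new Topology();
        topology.addSource("source", "input-topic");
        topology.addSink("sink", "output-topic", "source");

        KafkaStreams streams = new KafkaStreams(topology, props);
        streams.start();

        // Bounded close requested here; the report is that this bound does not
        // reach the internal producer, which logs Long.MAX_VALUE as its timeout.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            streams.close(Duration.ofSeconds(30))));
    }
}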



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)