Jenkins build is back to normal : kafka-trunk-jdk8 #3758

2019-06-28 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-484: Expose metrics for group and transaction metadata loading duration

2019-06-28 Thread Gwen Shapira
Hey,

Thank you for proposing this! Sounds really useful - we have
definitely seen some difficult-to-explain pauses in consumer activity,
and this metric will let us correlate those.

A few questions:
1. Did you consider adding both Max and Avg metrics? Many of our
metrics have both (batch-size and message-size for example) and it
helps put the max value in context. (See the sketch after these questions.)
2. You wrote: "Lengthening or shortening the 3 hour time window is up
for discussion (default is 30sec)."  and I'm not sure what default you
are referring to?
3. Can you also give some background on why you are proposing 3h? I'm
guessing it is because loading the state from the topic happens rarely
enough that in 3h it will probably only happen once or not at all?
Perhaps we need a rate metric to see how often it actually happens (if
we have to reload offsets very often it is a different problem).
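
For reference, pairing the two is straightforward with Kafka's own metrics
library. A rough sketch - the sensor and metric names here are illustrative,
not from the KIP:

import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Avg;
import org.apache.kafka.common.metrics.stats.Max;

Metrics metrics = new Metrics();
Sensor loadTime = metrics.sensor("partition-load-time");
loadTime.add(metrics.metricName("partition-load-time-max",
    "group-coordinator-metrics"), new Max());
loadTime.add(metrics.metricName("partition-load-time-avg",
    "group-coordinator-metrics"), new Avg());

long loadDurationMs = 1234L;  // stand-in for the measured load duration
// Record the measured duration each time a metadata partition is loaded:
loadTime.record(loadDurationMs);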

Gwen

On Tue, Jun 25, 2019 at 4:43 PM Anastasia Vela  wrote:
>
> Hi all,
>
> I'd like to discuss KIP-484:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-484%3A+Expose+metrics+for+group+and+transaction+metadata+loading+duration
>
> Let me know what you think!
>
> Thanks,
> Anastasia



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog


Build failed in Jenkins: kafka-2.3-jdk8 #57

2019-06-28 Thread Apache Jenkins Server
See 


Changes:

[cmccabe] MINOR: add upgrade text (#7013)

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on ubuntu-2 (ubuntu trusty) in workspace 

[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Deferred wipeout is used...
No credentials specified
Cloning the remote Git repository
Cloning repository https://github.com/apache/kafka.git
 > git init  # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/2.3^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/2.3^{commit} # timeout=10
Checking out Revision 78d1b06190b4a9c395e53e4728073c16e9a5d7f8 
(refs/remotes/origin/2.3)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 78d1b06190b4a9c395e53e4728073c16e9a5d7f8
Commit message: "MINOR: add upgrade text (#7013)"
 > git rev-list --no-walk dd31a1a0f44a68b99f4a7cb533e097d99a126209 # timeout=10
Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
[kafka-2.3-jdk8] $ /bin/bash -xe /tmp/jenkins2967957737735072897.sh
+ rm -rf 
+ /home/jenkins/tools/gradle/4.8.1/bin/gradle
/tmp/jenkins2967957737735072897.sh: line 4: 
/home/jenkins/tools/gradle/4.8.1/bin/gradle: No such file or directory
Build step 'Execute shell' marked build as failure
[FINDBUGS] Collecting findbugs analysis files...
Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
[FINDBUGS] Searching for all files in 
 that match the pattern 
**/build/reports/findbugs/*.xml
[FINDBUGS] No files found. Configuration error?
Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
No credentials specified
Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
 Using GitBlamer to create author and commit information for all 
warnings.
 GIT_COMMIT=78d1b06190b4a9c395e53e4728073c16e9a5d7f8, 
workspace=
[FINDBUGS] Computing warning deltas based on reference build #56
Recording test results
Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1


[jira] [Created] (KAFKA-8616) Replace ApiVersionsRequest request/response with automated protocol

2019-06-28 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8616:
--

 Summary: Replace ApiVersionsRequest request/response with 
automated protocol
 Key: KAFKA-8616
 URL: https://issues.apache.org/jira/browse/KAFKA-8616
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-28 Thread Justine Olshan
Hi all,
I made some changes to the KIP.
The idea is to clean up the code, make behavior more explicit, provide more
flexibility, and keep default behavior the same.

Now we will change the partition in onNewBatch, and specify the conditions
for this function call (non-keyed values, no explicit partitions) in
willCallOnNewBatch.
This clears up some of the issues with the implementation. I'm happy to
hear further opinions and discuss this change!
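
For those just skimming the thread, the shape of the proposal is roughly the
following (a sketch based on the method names above; the exact signatures are
my assumption, not the final interface):

import java.io.Closeable;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.Configurable;

public interface Partitioner extends Configurable, Closeable {
    int partition(String topic, Object key, byte[] keyBytes,
                  Object value, byte[] valueBytes, Cluster cluster);

    // Proposed: declares whether onNewBatch should be consulted for this record
    // (per the thread: only non-keyed values with no explicit partition).
    default boolean willCallOnNewBatch(Object key, Cluster cluster, int partition) {
        return false;
    }

    // Proposed: lets a sticky partitioner pick a fresh "sticky" partition
    // whenever the accumulator starts a new batch.
    default void onNewBatch(String topic, Cluster cluster, int prevPartition) {
    }
}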

Thank you,
Justine

On Thu, Jun 27, 2019 at 2:53 PM Colin McCabe  wrote:

> On Thu, Jun 27, 2019, at 01:31, Ismael Juma wrote:
> > Thanks for the KIP Justine. It looks pretty good. A few comments:
> >
> > 1. Should we favor partitions that are not under replicated? This is
> > something that Netflix did too.
>
> This seems like it could lead to cascading failures, right?  If a
> partition becomes under-replicated because there is too much traffic, the
> producer stops sending to it, which puts even more load on the remaining
> partitions, which are even more likely to fail then, etc.  It also will
> create unbalanced load patterns on the consumers.
>
> >
> > 2. If there's no measurable performance difference, I agree with
> Stanislav
> > that Optional would be better than Integer.
> >
> > 3. We should include the javadoc for the newly introduced method that
> > specifies it and its parameters. In particular, it would good to specify
> if
> > it gets called when an explicit partition id has been provided.
>
> Agreed.
>
> best,
> Colin
>
> >
> > Ismael
> >
> > On Mon, Jun 24, 2019, 2:04 PM Justine Olshan 
> wrote:
> >
> > > Hello,
> > > This is the discussion thread for KIP-480: Sticky Partitioner.
> > >
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > >
> > > Thank you,
> > > Justine Olshan
> > >
> >
>


Re: [DISCUSS] KIP-213: Second follow-up on Foreign Key Joins

2019-06-28 Thread John Roesler
Woah. It's your turn to make my head hurt!

I think we can make one simplifying assumption: we will probably never
need a second version. We're just hedging in case we do. Recursively,
if we need a second one, then we'll probably never need a third one,
etc. In other words, I wouldn't worry too much about the version
management problem. It's just important that we have the _ability_ to
update the protocol later on.

Second: there doesn't seem to be any utility in mapping between
protocol versions and release versions. As long as you're getting a
message with a version you understand, it's all good.

So to the main question, I don't think we need to worry about how
future versions may or may not use the instructions enum. In the
initial version, it's enough just to add the version number to the
serial format, and then do something reasonable if we get a version we
don't understand, like log an error and shut down.
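
To make that concrete, here is roughly the guard I have in mind (a sketch only;
the names are made up for illustration):

import java.nio.ByteBuffer;

static final byte LATEST_VERSION = 0;

static void checkVersion(final ByteBuffer buf) {
    final byte version = buf.get();  // the version is the first byte on the wire
    if (version > LATEST_VERSION) {
        // "Something reasonable" for v1: we can't interpret the payload, so fail fast.
        throw new UnsupportedOperationException(
            "Unknown subscription wire format version: " + version);
    }
}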

To design beyond this, we'd have to speculate about what kind of
modifications we might have to make later, and we just don't know how
accurate those speculations would be until it happens.

For the actual version, I think a numerical byte is fine; otherwise,
we'll have a proliferation of Version enums.

How does that sound?
-John

On Fri, Jun 28, 2019 at 9:59 AM Adam Bellemare  wrote:
>
> Just some thoughts around the versioning. I'm trying to work out a more
> elegant way to handle it than what I've come up with so far below.
>
>
> *1) I'm planning to use an enum for the versions, but I am not sure how to
> tie the versions to any particular release. For instance, something like
> this is doable, but may become more complex as versions go on. See 2 & 3
> below.*
>
> public enum Version {
>   V0((byte) 0x00),  //2.4.0
>   V1((byte) 0x01);  //2.5.0
>
>   private final byte code;
>   Version(byte code) { this.code = code; }
> }
>
> *2) Keeping track of compatibility is a bit tricky. For instance, how do we
> know which messages are compatible and which are breaking? Which upgrade
> paths do we handle and which ones do we not?  How long do we handle support
> for old message versions? 2 minor releases? 1 major release?*
>
> For example:
> Version 2.4.0:   V0
> Version 2.5.0:   V0, V1  (V1 processor must also handle V0).
> Version 2.6.0:   V0?, V1, V2  (Can we now drop support for V0? What happens
> if someone upgrades from 2.4.0 to 2.6.0 directly and it's not supported?)
>
> *3) Does the RHS 1:1 processor, which I currently call
> `ForeignKeySingleLookupProcessorSupplier`, basically end up looking like
> this?*
> if (value.version == Version.V0)
>   //do V0 processing
> else if (value.version == Version.V1)
>   //do V1 processing
> else if (value.version == Version.V2)
>   //do V2 processing
> 
>
> The tricky part becomes keeping all the Instructions straight for each
> Version. For instance, one option (*OPTION-A*) is:
> //Version = 2.4.0
> enum Instruction {  A, B, C; }
>
> //Version = 2.5.0
> enum Instruction {  A, B, C, //Added in 2.4.0, Value.Version = V0
>D; //Added in 2.5.0, Value.Version = V1.
> Also uses Value.Version = V0 instructions.
> }
>
> //Version = 2.6.0
> enum Instruction{  A, B, C, //Added in 2.4.0, Value.Version = V0. *Don't
> use for V2*
>D, //Added in 2.5.0, Value.Version = V1.
> Also uses Value.Version = V0 instructions. *Don't use for V2*
>E,F,G,H,I,J; //Added in 2.6.0, Value.Version
> = V2.
> }
>
> Alternatively, something like this (*OPTION-B*), where the Version and the
> Instruction are tied together in the Instruction itself.
>
> enum Instruction{  V0_A, V0_B, V0_C, //Added in 2.4.0
>   V1_A, V1_B, V1_C, V1_D, //Added in 2.5.0
>   V2_E, V2_F, V2_G, V2_H, V2_I;  //Added in
> 2.6.0
> }
> At this point our logic in the `ForeignKeySingleLookupProcessorSupplier`
> looks something like this:
> if (value.version == Version.V0) {
> if (value.instruction == Instruction.V0_A) ...
> else if (value.instruction == Instruction.V0_B) ...
> else if (value.instruction == Instruction.V0_C) ...
> else ...
> } else if (value.version == Version.V1) {
> if (value.instruction == Instruction.V1_A) ...
> else if (value.instruction == Instruction.V1_B) ...
> else if (value.instruction == Instruction.V1_C) ...
> else if (value.instruction == Instruction.V1_D) ...
> else ...
> } else if ...
>
> I prefer option B because Instruction meaning can change between versions,
> especially in scenarios where we may be reworking, say, 2 instructions into
> 4 instructions that aren't neat subsets of the original 2.
>
>
> *4) If we hard-stop on incompatible events (say we don't support that
> version), how does the user go about handling the upgrade? *
> We can't ignore the events as it would ruin the delivery guarantees. At
> this point it seems to me that they would have to do a full streams reset
> for that applicationId. Am I incorrect in this?
>
>
> Adam
>
>
> On Fri, Jun 28, 2019 at 9:19 

[jira] [Resolved] (KAFKA-8492) Modify ConsumerCoordinator Algorithm with incremental protocol (part 2)

2019-06-28 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-8492.

   Resolution: Fixed
Fix Version/s: 2.4.0

> Modify ConsumerCoordinator Algorithm with incremental protocol (part 2)
> ---
>
> Key: KAFKA-8492
> URL: https://issues.apache.org/jira/browse/KAFKA-8492
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Major
> Fix For: 2.4.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8615) Change to track partition time breaks TimestampExtractor

2019-06-28 Thread Bill Bejeck (JIRA)
Bill Bejeck created KAFKA-8615:
--

 Summary: Change to track partition time breaks TimestampExtractor
 Key: KAFKA-8615
 URL: https://issues.apache.org/jira/browse/KAFKA-8615
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Bill Bejeck
Assignee: Bill Bejeck


From the users mailing list
{noformat}
I am testing the new version 2.3, for Kafka Streams specifically. I have
noticed that now, the implementation of the method extract from the
interface org.apache.kafka.streams.processor.TimestampExtractor

public long extract(ConsumerRecord<Object, Object> record, long previousTimestamp)

is always returning -1 as the value.


The previous version, 2.2.1, was returning the correct value for the record
partition.
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-trunk-jdk8 #3757

2019-06-28 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Add compatibility tests for 2.3.0 (#6995)

--
[...truncated 2.55 MB...]

org.apache.kafka.connect.json.JsonConverterTest > stringHeaderToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonNonStringKeys STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.connect.json.JsonConverterTest > longToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > mismatchSchemaJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToConnectNonStringKeys 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToConnectNonStringKeys 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.connect.json.JsonConverterTest > bytesToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > shortToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > shortToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > intToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndNullValueToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndNullValueToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnectOptional 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnectOptional 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > structToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > structToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > stringToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndArrayToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > byteToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > byteToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaPrimitiveToConnect 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaPrimitiveToConnect 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > byteToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > intToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > intToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > dateToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
decimalToConnectOptionalWithDefaultValue STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
decimalToConnectOptionalWithDefaultValue PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonStringKeys STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonStringKeys PASSED

org.apache.kafka.connect.json.JsonConverterTest > arrayToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > arrayToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > timeToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > timeToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > structToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > structToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
testConnectSchemaMetadataTranslation STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
testConnectSchemaMetadataTranslation PASSED

org.apache.kafka.connect.json.JsonConverterTest > shortToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > shortToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > dateToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > dateToConnect PASSED


Re: [DISCUSS] KIP-478 Strongly Typed Processor API

2019-06-28 Thread Bill Bejeck
Sorry for coming late to the party.

As for the naming, I'm in favor of RecordProcessor as well.

I agree that we should not take on all of the package movements as part of
this KIP; as John has pointed out, each move will be an opportunity to
discuss some clean-up on individual classes, which I envision becoming
another somewhat involved process.

For the end goal, if possible, here's what I propose.

   1. We keep the scope of the KIP the same, *but we only implement it in
      phases*.
   2. Phase one could include what Guozhang had proposed earlier, namely:
      1.a) modifying ProcessorContext only with the output types on forward;
      1.b) modifying the Transformer signature to have generics of
      ProcessorContext, and then lifting the restriction on using punctuate:
      if users do not follow the enforced typing and just code without
      generics, they will get a warning at compile time and a run-time error
      if they forward wrong-typed records, which I think would be acceptable.
   3. Then we could tackle the other pieces in an incremental manner as we
      see what makes sense.
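
For anyone skimming, the heart of the proposal is generics on the processor's
input and output types, roughly like this (a sketch of the idea under
discussion, not the final API):

public interface RecordProcessor<KIn, VIn, KOut, VOut> {
    // The context is parameterized on the *output* types, so
    // context.forward(key, value) becomes compile-time checked.
    void init(ProcessorContext<KOut, VOut> context);
    void process(KIn key, VIn value);
    void close();
}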

Just my 2 cents

-Bill

On Mon, Jun 24, 2019 at 10:22 PM Guozhang Wang  wrote:

> Hi John,
>
> Yeah I think we should not do all the repackaging as part of this KIP as
> well (we can just do the movement of the Processor / ProcessorSupplier),
> but I think we need to discuss the end goal here since otherwise we may do
> the repackaging of Processor in this KIP, but only later on realizing that
> other re-packagings are not our favorite solutions.
>
>
> Guozhang
>
> On Mon, Jun 24, 2019 at 3:06 PM John Roesler  wrote:
>
> > Hey Guozhang,
> >
> > Thanks for the idea! I'm wondering if we could take a middle ground
> > and take your proposed layout as a "roadmap", while only actually
> > moving the classes that are already involved in this KIP.
> >
> > The reason I ask is not just to control the scope of this KIP, but
> > also, I think that if we move other classes to new packages, we might
> > also want to take the opportunity to clean up other things about them.
> > But each one of those would become a discussion point of its own, so
> > it seems the discussion would become intractable. FWIW, I do like your
> > idea for precisely this reason, it creates opportunities for us to
> > consider other changes that we are simply not able to make without
> > breaking source compatibility.
> >
> > If the others feel "kind of favorable" with this overall vision, maybe
> > we can make one or more Jira tickets to capture it, and then just
> > alter _this_ proposal to `processor.api.Processor` (etc).
> >
> > WDYT?
> > -John
> >
> > On Sun, Jun 23, 2019 at 7:17 PM Guozhang Wang 
> wrote:
> > >
> > > Hello John,
> > >
> > > Thanks for your detailed explanation, I've done some quick checks on
> some
> > > existing examples that heavily used Processor and the results also
> makes
> > me
> > > worried about my previous statements that "the breakage would not be
> > big".
> > > I agree we should maintain compatibility.
> > >
> > > About the naming itself, I'm actually a bit inclined into sub-packages
> > than
> > > renamed new classes, and my motivations are that our current packaging
> is
> > > already quite coarsen grained and sometimes ill-placed, and hence maybe
> > we
> > > can take this change along with some clean up on packages (but again,
> we
> > > should follow the deprecate - removal path). What I'm thinking is:
> > >
> > > ---
> > >
> > > processor/: StateRestoreCallback/AbstractNotifyingRestoreCallback,
> > (deprecated
> > > later, same meaning for other cross-throughs), ProcessContest,
> > > RecordContext, Punctuator, PunctuationType, To, Cancellable (are the
> only
> > > things left)
> > >
> > > (new) processor/api/: Processor, ProcessorSupplier (and of course,
> these
> > > two classes can be strong typed)
> > >
> > > state/: StateStore, BatchingStateRestoreCallback,
> > > AbstractNotifyingBatchingRestoreCallback (moved from processor/),
> > > PartitionGrouper, WindowStoreIterator, StateSerdes (this one can be
> moved
> > > into state/internals), TimestampedByteStore (we can move this to
> > internals
> > > since store types would use vat by default, see below),
> ValueAndTimestamp
> > >
> > > (new) state/factory/: Stores, StoreBuilder, StoreSupplier; *BUT* the
> new
> > > Stores would not have timestampedXXBuilder APIs since the default
> > > StoreSupplier / StoreBuilder value types are ValueAndTimestamp already.
> > >
> > > (new) state/queryable/: QueryableStoreType, QueryableStoreTypes,
> HostInfo
> > >
> > > (new) state/keyValue/: KeyValueXXX classes, and also the same for
> > > state/sessionWindow and state/timeWindow; *BUT* here we use
> > > ValueAndTimestamp as value types of those APIs directly, and also
> > > TimestampedKeyValue/WindowStore would be deprecated.
> > >
> > > (new) kstream/api/: KStream, KTable, GroupedKStream (renamed from
> > > 

Re: Request for Permission to Create KIP

2019-06-28 Thread Jun Rao
Hi, Maulin,

Thanks for your interest. Just gave you the wiki permission.

Jun

On Thu, Jun 27, 2019 at 7:16 PM Maulin Vasavada 
wrote:

> Hi
>
> Can you please give me permission to Create KIP?
>
> My username: maulin.vasavada
>
> Thank you.
> Maulin
>


Re: [VOTE] KIP-429: Kafka Consumer Incremental Rebalance Protocol

2019-06-28 Thread Sophie Blee-Goldman
It is now! I also updated the KIP to reflect that we will be adding a new
CooperativeStickyAssignor rather than making the existing StickyAssignor
cooperative, to prevent users who already use the StickyAssignor from
blindly upgrading and hitting potential problems during a rolling bounce.
On Thu, Jun 27, 2019 at 8:15 PM Boyang Chen 
wrote:

> Thank you Sophie for the update. Is this also reflected on the KIP?
>
> On Thu, Jun 27, 2019 at 3:28 PM Sophie Blee-Goldman 
> wrote:
>
> > We would like to tack on some rebalance-related metrics as part of this
> KIP
> > as well. The details can be found in the sub-task JIRA:
> > https://issues.apache.org/jira/browse/KAFKA-8609
> >
> > On Thu, May 30, 2019 at 5:09 PM Guozhang Wang 
> wrote:
> >
> > > +1 (binding) from me as well.
> > >
> > > Thanks to everyone who have voted! I'm closing this vote thread with a
> > > tally:
> > >
> > > binding +1: 3 (Guozhang, Harsha, Matthias)
> > >
> > > non-binding +1: 2 (Boyang, Liquan)
> > >
> > >
> > > Guozhang
> > >
> > > On Wed, May 22, 2019 at 9:22 PM Matthias J. Sax  >
> > > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > >
> > > > On 5/22/19 7:37 PM, Harsha wrote:
> > > > > +1 (binding). Thanks for the KIP, looking forward for this to be
> > > > > available in consumers.
> > > > >
> > > > > Thanks,
> > > > > Harsha
> > > > >
> > > > > On Wed, May 22, 2019, at 12:24 AM, Liquan Pei wrote:
> > > > >> +1 (non-binding)
> > > > >>
> > > > >> On Tue, May 21, 2019 at 11:34 PM Boyang Chen  >
> > > > wrote:
> > > > >>
> > > > >>> Thank you Guozhang for all the hard work.
> > > > >>>
> > > > >>> +1 (non-binding)
> > > > >>>
> > > > >>> 
> > > > >>> From: Guozhang Wang 
> > > > >>> Sent: Wednesday, May 22, 2019 1:32 AM
> > > > >>> To: dev
> > > > >>> Subject: [VOTE] KIP-429: Kafka Consumer Incremental Rebalance
> > > Protocol
> > > > >>>
> > > > >>> Hello folks,
> > > > >>>
> > > > >>> I'd like to start the voting for KIP-429 now, details can be
> found
> > > > here:
> > > > >>>
> > > > >>>
> > > > >>>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-429%3A+Kafka+Consumer+Incremental+Rebalance+Protocol#KIP-429:KafkaConsumerIncrementalRebalanceProtocol-RebalanceCallbackErrorHandling
> > > > >>>
> > > > >>> And the on-going PRs available for review:
> > > > >>>
> > > > >>> Part I: https://github.com/apache/kafka/pull/6528
> > > > >>> Part II: https://github.com/apache/kafka/pull/6778
> > > > >>>
> > > > >>>
> > > > >>> Thanks
> > > > >>> -- Guozhang
> > > > >>>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Liquan Pei
> > > > >> Software Engineer, Confluent Inc
> > > > >>
> > > >
> > > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>


[jira] [Created] (KAFKA-8614) Rename the `responses` field of IncrementalAlterConfigsResponse to match AlterConfigs

2019-06-28 Thread Bob Barrett (JIRA)
Bob Barrett created KAFKA-8614:
--

 Summary: Rename the `responses` field of 
IncrementalAlterConfigsResponse to match AlterConfigs
 Key: KAFKA-8614
 URL: https://issues.apache.org/jira/browse/KAFKA-8614
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 2.3.0
Reporter: Bob Barrett


IncrementalAlterConfigsResponse and AlterConfigsResponse have an identical 
structure for per-resource error codes, but in AlterConfigsResponse it is named 
`Resources` while in IncrementalAlterConfigsResponse it is named `responses`.

AlterConfigsResponse:

{ "name": "Resources", "type": "[]AlterConfigsResourceResponse", "versions": "0+",
  "about": "The responses for each resource.", "fields": [
  { "name": "ErrorCode", "type": "int16", "versions": "0+",
    "about": "The resource error code." },
  { "name": "ErrorMessage", "type": "string", "nullableVersions": "0+", "versions": "0+",
    "about": "The resource error message, or null if there was no error." },
  { "name": "ResourceType", "type": "int8", "versions": "0+",
    "about": "The resource type." },
  { "name": "ResourceName", "type": "string", "versions": "0+",
    "about": "The resource name." }
]}

IncrementalAlterConfigsResponse:

{ "name": "responses", "type": "[]AlterConfigsResourceResult", "versions": "0+",
  "about": "The responses for each resource.", "fields": [
  { "name": "ErrorCode", "type": "int16", "versions": "0+",
    "about": "The resource error code." },
  { "name": "ErrorMessage", "type": "string", "nullableVersions": "0+", "versions": "0+",
    "about": "The resource error message, or null if there was no error." },
  { "name": "ResourceType", "type": "int8", "versions": "0+",
    "about": "The resource type." },
  { "name": "ResourceName", "type": "string", "versions": "0+",
    "about": "The resource name." }
]}

We should change the field in IncrementalAlterConfigsResponse to be consistent 
with AlterConfigsResponse.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-485 Make topic optional when using through() operations in DSL

2019-06-28 Thread Matthias J. Sax
Levani,

Thanks for the KIP!

As mentioned on the Jira, I think we should discard this KIP and subsume
it in KIP-221. Nobody is actively working on KIP-221, so feel free to
pick it up, and we can move the discussion to the other thread.

-Matthias

On 6/28/19 2:58 AM, Levani Kokhreidze wrote:
> Hello,
> 
> I would like to start discussion about KIP-485 Make topic optional when using 
> through() operations in DSL 
> 
> 
> Regards,
> Levani
> 



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] KIP-435: Internal Partition Reassignment Batching

2019-06-28 Thread Stanislav Kozlovski
Hey there Viktor,

Thanks for working on this KIP! I agree with the notion that reliability,
stability and predictability of a reassignment should be a core feature of
Kafka.

Let me first explicitly confirm my understanding of the configs and the
algorithm:
* reassignment.parallel.replica.count - the maximum total number of
replicas that we can move at once, *per partition*
* reassignment.parallel.partition.count - the maximum number of partitions
we can move at once
* reassignment.parallel.leader.movements - the maximum number of leader
movements we can have at once

As far as I currently understand it, your proposed algorithm will naturally
prioritize leader movement first: e.g. if
reassignment.parallel.replica.count=1 and
reassignment.parallel.partition.count==reassignment.parallel.leader.movements,
we will always move one replica at a time - the first possible one in the
replica set (which will be the leader if it is part of the excess replica
set (ER)). Am I correct in saying that?

Regarding the KIP, I've got a couple of comments/questions:

1. Does it make sense to add `max` somewhere in the configs' names?

2. How does this KIP play along with KIP-455's notion of multiple
rebalances - do the configs apply on a
single AlterPartitionAssignmentsRequest or are they global?

3. Unless I've missed it, the algorithm does not take into account
`reassignment.parallel.leader.movements`

4. The KIP says that the order of the input has some control over how the
batches are created - i.e it's deterministic. What would the batches of the
following reassignment look like:
reassignment.parallel.replica.count=1
reassignment.parallel.partition.count=MAX_INT
reassignment.parallel.leader.movements=1
partitionA - (0,1,2) -> (3, 4, 5)
partitionB - (0,1,2) -> (3,4,5)
partitionC - (0,1,2) -> (3, 4, 5)
From my understanding, we would start with A(0->3), B(1->4) and C(1->4). Is
that correct? Would the second step then continue with B(0->3)?

If the configurations are global, I can imagine we will have a bit more
trouble in preserving the expected ordering, especially across controller
failovers -- but I'll avoid speculating until you confirm the scope of the
config.

5. Regarding the new behavior of electing the new preferred leader in the
"first step" of the reassignment - does this obey the
`auto.leader.rebalance.enable` config?
If not, I have concerns regarding how backwards compatible this might be -
e.g imagine a user does a huge reassignment (as they have always done) and
suddenly a huge leader shift happens, whereas the user expected to manually
shift preferred leaders at a slower rate via
the kafka-preferred-replica-election.sh tool.

6. What is the expected behavior if we dynamically change one of the
configs to a lower value while a reassignment is happening. Would we cancel
some of the currently reassigned partitions or would we account for the new
values on the next reassignment? I assume the latter but it's good to be
explicit

As some small nits:
- could we have a section in the KIP where we explicitly define what each
config does? This can be inferred from the KIP as is but requires careful
reading, whereas some developers might want to skim through the change to
get a quick sense. It also improves readability but that's my personal
opinion.
- could you better clarify how a reassignment step is different from the
currently existing algorithm? maybe laying both algorithms out in the KIP
would be most clear
- the names for the OngoingPartitionReassignment and
CurrentPartitionReassignment fields in the
ListPartitionReassignmentsResponse are a bit confusing to me.
Unfortunately, I don't have a better suggestion, but maybe somebody else in
the community has.

Thanks,
Stanislav

On Thu, Jun 27, 2019 at 3:24 PM Viktor Somogyi-Vass 
wrote:

> Hi All,
>
> I've renamed my KIP as its name was a bit confusing so we'll continue it in
> this thread.
> The previous thread for record:
>
> https://lists.apache.org/thread.html/0e97e30271f80540d4da1947bba94832639767e511a87bb2ba530fe7@%3Cdev.kafka.apache.org%3E
>
> A short summary of the KIP:
> In case of a vast partition reassignment (thousands of partitions at once)
> Kafka can collapse under the increased replication traffic. This KIP will
> mitigate it by introducing internal batching done by the controller.
> Besides putting a bandwidth limit on the replication it is useful to batch
> partition movements as fewer number of partitions will use the available
> bandwidth for reassignment and they finish faster.
> The main control handles are:
> - the number of parallel leader movements,
> - the number of parallel partition movements
> - and the number of parallel replica movements.
>
> Thank you for the feedback and ideas so far in the previous thread and I'm
> happy to receive more.
>
> Regards,
> Viktor
>


-- 
Best,
Stanislav


Re: [DISCUSS] KIP-213: Second follow-up on Foreign Key Joins

2019-06-28 Thread Adam Bellemare
Just some thoughts around the versioning. I'm trying to work out a more
elegant way to handle it than what I've come up with so far below.


*1) I'm planning to use an enum for the versions, but I am not sure how to
tie the versions to any particular release. For instance, something like
this is doable, but may become more complex as versions go on. See 2 & 3
below.*

public enum Version {
  V0((byte) 0x00),  //2.4.0
  V1((byte) 0x01);  //2.5.0

  private final byte code;
  Version(byte code) { this.code = code; }
}

*2) Keeping track of compatibility is a bit tricky. For instance, how do we
know which messages are compatible and which are breaking? Which upgrade
paths do we handle and which ones do we not?  How long do we handle support
for old message versions? 2 minor releases? 1 major release?*

For example:
Version 2.4.0:   V0
Version 2.5.0:   V0, V1  (V1 processor must also handle V0).
Version 2.6.0:   V0?, V1, V2  (Can we now drop support for V0? What happens
if someone upgrades from 2.4.0 to 2.6.0 directly and it's not supported?)

*3) Does the RHS 1:1 processor, which I currently call
`ForeignKeySingleLookupProcessorSupplier`, basically end up looking like
this?*
if (value.version == Version.V0)
  //do V0 processing
else if (value.version == Version.V1)
  //do V1 processing
else if (value.version == Version.V2)
  //do V2 processing


The tricky part becomes keeping all the Instructions straight for each
Version. For instance, one option (*OPTION-A*) is:
//Version = 2.4.0
enum Instruction {  A, B, C; }

//Version = 2.5.0
enum Instruction {  A, B, C,  //Added in 2.4.0, Value.Version = V0
                    D;        //Added in 2.5.0, Value.Version = V1;
                              //also uses Value.Version = V0 instructions.
}

//Version = 2.6.0
enum Instruction {  A, B, C,           //Added in 2.4.0, Value.Version = V0. *Don't use for V2*
                    D,                 //Added in 2.5.0, Value.Version = V1;
                                       //also uses Value.Version = V0 instructions. *Don't use for V2*
                    E, F, G, H, I, J;  //Added in 2.6.0, Value.Version = V2.
}

Alternatively, something like this (*OPTION-B*), where the Version and the
Instruction are tied together in the Instruction itself.

enum Instruction {  V0_A, V0_B, V0_C,              //Added in 2.4.0
                    V1_A, V1_B, V1_C, V1_D,        //Added in 2.5.0
                    V2_E, V2_F, V2_G, V2_H, V2_I;  //Added in 2.6.0
}
At this point our logic in the `ForeignKeySingleLookupProcessorSupplier`
looks something like this:
if (value.version == Version.V0) {
    if (value.instruction == Instruction.V0_A) ...
    else if (value.instruction == Instruction.V0_B) ...
    else if (value.instruction == Instruction.V0_C) ...
    else ...
} else if (value.version == Version.V1) {
    if (value.instruction == Instruction.V1_A) ...
    else if (value.instruction == Instruction.V1_B) ...
    else if (value.instruction == Instruction.V1_C) ...
    else if (value.instruction == Instruction.V1_D) ...
    else ...
} else if ...

I prefer option B because Instruction meaning can change between versions,
especially in scenarios where we may be reworking, say, 2 instructions into
4 instructions that aren't neat subsets of the original 2.
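
One way to keep option B compact on the wire would be to pack the version into
the high nibble of a single instruction byte. A sketch, purely illustrative -
not something the KIP specifies:

public enum Instruction {
    V0_A((byte) 0x00), V0_B((byte) 0x01), V0_C((byte) 0x02),
    V1_A((byte) 0x10), V1_B((byte) 0x11), V1_C((byte) 0x12), V1_D((byte) 0x13);

    private final byte code;
    Instruction(final byte code) { this.code = code; }

    public byte code() { return code; }
    public byte version() { return (byte) ((code >> 4) & 0x0F); }  // high nibble
}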


*4) If we hard-stop on incompatible events (say we don't support that
version), how does the user go about handling the upgrade? *
We can't ignore the events as it would ruin the delivery guarantees. At
this point it seems to me that they would have to do a full streams reset
for that applicationId. Am I incorrect in this?


Adam


On Fri, Jun 28, 2019 at 9:19 AM Adam Bellemare 
wrote:

> Hi Matthias
>
> Yes, thanks for the questions - I know it's hard to keep up with all of
> the various KIPs and everything.
>
> The instructions are not stored anywhere, but are simply a way of letting
> the RHS know how to handle the subscription and reply accordingly.
>
> The only case where we send an unnecessary tombstone is (that I can
> tell...) when we do the following:
> RHS:
> (1, bar)
>
> LHS
> (K,1)  -> Results in (K, 1, bar) being output
> (K,1) -> (K,2) ->  Results in (K, null) being output for INNER (no
> matching element on RHS)
> (K,2) -> (K,3) ->  Results in (K, null) being output for INNER (because we
> don't maintain state to know we already output the tombstone on the
> previous transition).
> (K,2) -> (K,9000) ->  Results in (K, null)... etc.
>
> Byte versioning is going in today, then I hope to get back to addressing a
> number of John's previous questions in the PR.
>
> Adam
>
>
> On Thu, Jun 27, 2019 at 5:47 PM Matthias J. Sax 
> wrote:
>
>> Thanks for bringing this issue to our attention. Great find @Joe!
>>
>> Adding the instruction field to the `subscription` sounds like a good
>> solution. What I don't understand atm: for which case would we need to
>> send unnecessary tombstone? I thought that the `instruction` field helps
>> to avoid any unnecessary tombstone? Seems I am missing a case?
>>
>> Also for my own understanding: the `instruction` is only part of the
>> message? It is no necessary to store it in the RHS 

Re: [DISCUSS] KIP-213: Second follow-up on Foreign Key Joins

2019-06-28 Thread Adam Bellemare
Hi Matthias

Yes, thanks for the questions - I know it's hard to keep up with all of the
various KIPs and everything.

The instructions are not stored anywhere, but are simply a way of letting
the RHS know how to handle the subscription and reply accordingly.

The only case where we send an unnecessary tombstone is (that I can
tell...) when we do the following:
RHS:
(1, bar)

LHS
(K,1)  -> Results in (K, 1, bar) being output
(K,1) -> (K,2) ->  Results in (K, null) being output for INNER (no matching
element on RHS)
(K,2) -> (K,3) ->  Results in (K, null) being output for INNER (because we
don't maintain state to know we already output the tombstone on the
previous transition).
(K,2) -> (K,9000) ->  Results in (K, null)... etc.

Byte versioning is going in today, then I hope to get back to addressing a
number of John's previous questions in the PR.

Adam


On Thu, Jun 27, 2019 at 5:47 PM Matthias J. Sax 
wrote:

> Thanks for bringing this issue to our attention. Great find @Joe!
>
> Adding the instruction field to the `subscription` sounds like a good
> solution. What I don't understand atm: for which case would we need to
> send unnecessary tombstone? I thought that the `instruction` field helps
> to avoid any unnecessary tombstone? Seems I am missing a case?
>
> Also for my own understanding: the `instruction` is only part of the
> message? It is not necessary to store it in the RHS auxiliary store, right?
>
> About right/full-outer joins. Agreed. Getting left-joins would be awesome!
>
> About upgrading: Good call John! Adding a version byte for subscription
> and response is good forward thinking. I personally prefer version
> numbers, too, as they carry more information.
>
> Thanks for all the hard work to everybody involved!
>
>
> -Matthias
>
> On 6/27/19 1:44 PM, John Roesler wrote:
> > Hi Adam,
> >
> > Hah! Yeah, I felt a headache coming on myself when I realized this
> > would be a concern.
> >
> > For what it's worth, I'd also lean toward versioning. It seems more
> > explicit and more likely to keep us all sane in the long run. Since we
> > don't _think_ our wire protocol will be subject to a lot of revisions,
> > we can just use one byte. The worst case is that we run out of numbers
> > and reserve the last one to mean, "consult another field for the
> > actual version number". It seems like a single byte on each message
> > isn't too much to pay.
> >
> > Since you point it out, we might as well put a version number on the
> > SubscriptionResponseWrapper as well. It may not be needed, but if we
> > ever need it, even just once, we'll be glad we have it.
> >
> > Regarding the instructions field, we can also serialize the enum very
> > compactly as a single byte (which is the same size a boolean takes
> > anyway), so it seems like an Enum in Java-land and a byte on the wire
> > is a good choice.
> >
> > Agreed on the right and full outer joins, it doesn't seem necessary
> > right now, although I am happy to see the left join "join" the party,
> > since as you said, we were so close to it anyway. Can you also add it
> > to the KIP?
> >
> > Thanks as always for your awesome efforts on this,
> > -John
> >
> > On Thu, Jun 27, 2019 at 3:04 PM Adam Bellemare 
> wrote:
> >>
> >> You're stretching my brain, John!
> >>
> >> I prefer STRATEGY 1 because it solves the problem in a simple way, and
> >> allows us to deprecate support for older message types as we go (ie, we
> >> only support the previous 3 versions, so V5,V4,V3, but not v2 or V1).
> >>
> >> STRATEGY 2 is akin to Avro schemas between two microservices - there are
> >> indeed cases where a breaking change must be made, and forward
> >> compatibility will provide us with no out other than requiring a full
> stop
> >> and full upgrade for all nodes, shifting us back towards STRATEGY 1.
> >>
> >> My preference is STRATEGY 1 with instructions as an ENUM, and we can
> >> certainly include a version. Would it make sense to include a version
> >> number in  SubscriptionResponseWrapper as well? Currently we don't have
> any
> >> instructions in there, as I removed the boolean, but it is certainly
> >> plausible that it could happen in the future. I don't *think* we'll need
> >> it, but I also didn't think we'd need it for SubscriptionWrapper and
> here
> >> we are.
> >>
> >> Thanks for the thoughts, and the info on the right-key. That was
> >> enlightening, though I can't think of a use-case for it *at this point
> in
> >> time*. :)
> >>
> >> Adam
> >>
> >>
> >>
> >> On Thu, Jun 27, 2019 at 12:29 PM John Roesler 
> wrote:
> >>
> >>> I think I agree with you, right joins (and therefore full outer joins)
> >>> don't make sense here, because the result is a keyed table, where the
> >>> key is the PK of the left-hand side. So, when you have a
> >>> right-hand-side record with no incoming FK references, you would want
> >>> to produce a join result like `nullKey: (null, rhsValue)`, but we
> >>> don't currently allow null keys in Streams. It actually is possible to
> 

[jira] [Created] (KAFKA-8613) Set default grace period to 0

2019-06-28 Thread Bruno Cadonna (JIRA)
Bruno Cadonna created KAFKA-8613:


 Summary: Set default grace period to 0
 Key: KAFKA-8613
 URL: https://issues.apache.org/jira/browse/KAFKA-8613
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 3.0.0
Reporter: Bruno Cadonna


Currently, the grace period is set to the retention time if the grace period is 
not specified explicitly. The reason for setting the default grace period to 
retention time was backward compatibility: topologies that were implemented 
before the introduction of the grace period added late-arriving records to a 
window as long as the window existed, i.e., as long as its retention time had 
not elapsed.

This unintuitive default grace period has already caused confusion among users.

For the next major release, we should set the default grace period to 
{{Duration.ZERO}}.
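
For reference, the explicit form that avoids the surprise today looks like this 
(API as of 2.3; the window size is chosen arbitrarily):

{noformat}
import java.time.Duration;
import org.apache.kafka.streams.kstream.TimeWindows;

TimeWindows windows = TimeWindows
    .of(Duration.ofMinutes(5))
    .grace(Duration.ZERO);  // what this ticket proposes as the default
{noformat}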



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8612) Broker removes consumers from CG, Streams app gets stuck

2019-06-28 Thread Di Campo (JIRA)
Di Campo created KAFKA-8612:
---

 Summary: Broker removes consumers from CG, Streams app gets stuck
 Key: KAFKA-8612
 URL: https://issues.apache.org/jira/browse/KAFKA-8612
 Project: Kafka
  Issue Type: Bug
  Components: clients, streams
Affects Versions: 2.1.1
Reporter: Di Campo
 Attachments: full-thread-dump-kafka-streams-stuck.log

Cluster of 5 brokers, `Kafka 2.1.1`. m5.large (2 CPU, 8GB RAM) instances. 
Kafka Streams application (`stream-processor`) cluster of 3 instances, 2 
threads each. `2.1.0` 
Consumer Store consumer group (ClickHouse Kafka Engine from `ClickHouse 
19.5.3.8`), with several tables consuming from a different topic each.
The `stream-processor` is running consuming from a source topic and running a 
topology of 26 topics (64 partitions each) with 5 state stores, 1 of them 
sessioned, 4 key-value.
Infra running on docker on AWS ECS. 
Consuming at a rate of 300-1000 events per second. Each event generates an avg 
of ~20 extra messages.

Timestamps are kept for better analysis.

`stream-processor` tasks at some point fail to produce to any partition due to 
timeouts:


{noformat}
[2019-06-28 10:04:21,113] ERROR task [1_48] Error sending record (...) to topic 
(...) due to org.apache.kafka.common.errors.TimeoutException: Expiring 44 
record(s) for (...)-48:120002 ms has passed since batch creation; No more 
records will be sent and no more offsets will be recorded for this task.
{noformat}


and "Offset commit failed" errors, in all partitions:

{noformat}
[2019-06-28 10:04:27,705] ERROR [Consumer 
clientId=stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-1-consumer,
 groupId=stream-processor-0.0.1] Offset commit failed on partition 
events-raw-63 at offset 4858803: The request timed out. 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
{noformat}

_At this point we begin seeing error messages in one of the brokers (see below, 
Broker logs section)._

More error messages are shown on the `stream-processor`: 

{noformat}
org.apache.kafka.common.errors.TimeoutException: Timeout of 6ms expired 
before successfully committing offsets 
{(topic)=OffsetAndMetadata{offset=4858803, leaderEpoch=null, metadata=''}}
{noformat}

then hundreds of messages of the following types (one per topic-partition), 
intertwined: 

{noformat}
[2019-06-28 10:05:23,608] WARN [Producer 
clientId=stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-3-producer]
 Got error produce response with correlation id 39946 on topic-partition 
(topic)-63, retrying (2 attempts left). Error: NETWORK_EXCEPTION 
(org.apache.kafka.clients.producer.internals.Sender)
{noformat}

{noformat}
[2019-06-28 10:05:23,609] WARN [Producer 
clientId=stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-3-producer]
 Received invalid metadata error in produce request on partition (topic)1-59 
due to org.apache.kafka.common.errors.NetworkException: The server disconnected 
before a response was received.. Going to request metadata update now 
(org.apache.kafka.clients.producer.internals.Sender)
{noformat}


And then: 


{noformat}
[2019-06-28 10:05:47,986] ERROR stream-thread 
[stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-4] 
Failed to commit stream task 1_57 due to the following error: 
(org.apache.kafka.streams.processor.internals.AssignedStreamsTasks)
2019-06-28 10:05:47org.apache.kafka.streams.errors.StreamsException: task 
[1_57] Abort sending since an error caught with a previous record (...) to 
topic (...) due to org.apache.kafka.common.errors.NetworkException: The server 
disconnected before a response was received.
2019-06-28 10:05:47You can increase producer parameter `retries` and 
`retry.backoff.ms` to avoid this error.
2019-06-28 10:05:47 at 
org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:133)
{noformat}


...and eventually we get to the error messages: 

{noformat}
[2019-06-28 10:05:51,198] ERROR [Producer 
clientId=stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-3-producer]
 Uncaught error in kafka producer I/O thread: 
(org.apache.kafka.clients.producer.internals.Sender)
2019-06-28 10:05:51java.util.ConcurrentModificationException
2019-06-28 10:05:51 at 
java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
{noformat}

{noformat}
[2019-06-28 10:07:18,735] ERROR task [1_63] Failed to flush state store 
orderStore: 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager) 
org.apache.kafka.streams.errors.StreamsException: task [1_63] Abort sending 
since an error caught with a previous record (...) timestamp 1561664080389) to 
topic (...) due to org.apache.kafka.common.errors.TimeoutException: Expiring 44 
record(s) for 

[DISCUSS] KIP-485 Make topic optional when using through() operations in DSL

2019-06-28 Thread Levani Kokhreidze
Hello,

I would like to start discussion about KIP-485 Make topic optional when using 
through() operations in DSL 


Regards,
Levani

[jira] [Created] (KAFKA-8611) Make topic optional when using through() operations in DSL

2019-06-28 Thread Levani Kokhreidze (JIRA)
Levani Kokhreidze created KAFKA-8611:


 Summary: Make topic optional when using through() operations in DSL
 Key: KAFKA-8611
 URL: https://issues.apache.org/jira/browse/KAFKA-8611
 Project: Kafka
  Issue Type: New Feature
  Components: streams
Reporter: Levani Kokhreidze


When using the DSL in Kafka Streams, data repartitioning happens only when a 
key-changing operation is followed by a stateful operation. On the other hand, in 
the DSL, stateful computation can also happen using the _transform()_ operation. 
The problem with this approach is that, even if an upstream operation was 
key-changing before calling _transform()_, no auto-repartitioning is triggered. If 
repartitioning is required, a call to _through(String)_ must be performed 
before _transform()_. With the current implementation, the burden of creating and 
managing the topic falls on the user, which introduces extra complexity in managing 
a Kafka Streams application.
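
For illustration, the pattern this issue targets looks roughly as follows (the 
topic name and transformer are placeholders):

{noformat}
KStream<String, String> result = source
    .selectKey((key, value) -> value)     // key-changing operation
    .through("my-app-repartition-topic")  // topic the user must create and manage today
    .transform(MyTransformer::new);       // stateful operation; no auto-repartition before it
{noformat}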



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-06-28 Thread Stanislav Kozlovski
Hi George,

I think we might want to move the discussion to KIP-455 as this KIP is very
tightly coupled to it.

One issue I see is when controller failover occurs,  the new controller
> will need to read the active reassignments by scanning the
> /brokers/topics//,  for some clusters with hundred+
> brokers and 10s thousands of partitions per broker, it might not be
> scalable for this scan?
>
As Colin had said in the KIP-455 discussion, there is no impact because we
already scan these znodes as part of the normal controller startup.

Another issue, the controller has all the active reassignment in Memory (as
> well as in ZK partition level), the non-controller brokers will maintain
> its Reassignment context in memory as well thru Controller's
> LeaderAndIsrRequest with additional reassignment data,  if the broker
> missing the LeaderAndIsrRequest, will it cause that broker's
> discrepancies?  for performance reason, we do not want the broker's to scan
> all topic/partition to find current active reassignments ?

As far as I understand, the current reassignment code also depends on the
LeaderAndIsr request, and non-controller brokers already keep the
reassignment context in memory - all through the replica set for a given
partition. The only difference is that we now separate the replica set into
two finer-grained sets - replicas and targetReplicas.
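
For concreteness, the znode both proposals are reworking has the standard shape
below (abbreviated); per the discussion, KIP-455 would instead keep equivalent
information, split into replicas and targetReplicas, at the per-partition level:

// /admin/reassign_partitions
{
  "version": 1,
  "partitions": [
    { "topic": "foo", "partition": 0, "replicas": [1, 2, 3] }
  ]
}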

That's the main purpose of KIP-236.  For data storage product/system, it's
> good to have the cancel/rollback feature.

I fully agree; I think this is a table-stakes feature.

Thanks,
Stanislav


On Wed, May 1, 2019 at 7:29 AM George Li 
wrote:

>  Hi Jason,
>
> Sorry for the late response; I've been busy. I have taken a brief look at
> Colin's KIP-455. It's a good direction for reassignments. I think KIP-236,
> KIP-435, and KIP-455 need some consensus/coordination on where the new
> reassignment data AND the original replicas are stored.
>
> The current way of storing new reassignment data  in  ZK node
> /admin/reassign_partitions Vs the KIP-455 storing in the
> /brokers/topics// level has pros and cons.
>
> Storing in the ZK node /admin/reassign_partitions makes it quick to find
> the current pending/active reassignments.  Storing at the topic/partition level
> is generally fine for controller, since the active reassignments are also
> kept in memory Controller Context.  One issue I see is when controller
> failover occurs,  the new controller will need to read the active
> reassignments by scanning the /brokers/topics//,  for
> some clusters with hundred+ brokers and 10s thousands of partitions per
> broker, it might not be scalable for this scan?  The failover controller
> needs to come up as quickly as possible?
>
> Another issue, the controller has all the active reassignment in Memory
> (as well as in ZK partition level), the non-controller brokers will
> maintain its Reassignment context in memory as well thru Controller's
> LeaderAndIsrRequest with additional reassignment data,  if the broker
> missing the LeaderAndIsrRequest, will it cause that broker's
> discrepancies?  for performance reason, we do not want the broker's to scan
> all topic/partition to find current active reassignments ?
>
> I agree this is more flexible in the long term to have the capability of
> submitting new reassignments while current batch is still active.  You
> brought up a good point of allowing canceling individual reassignment,  I
> will have the reassignment cancel RPC to specify individual partition for
> cancellation, if it's null, then cancel all pending reassignments.  In the
> current work environment, Reassignment operations are very sensitive
> operations to the cluster's latency, especially for the cluster with
> min.insync.replicas >=2, So when reassignment is kicked off and cluster
> performance is affected, the most desirable features that we would like is
> to cancel all pending reassignments in a timely/cleanly fashion so that the
> cluster performance can be restored. That's the main purpose of KIP-236.
> For data storage product/system, it's good to have the cancel/rollback
> feature.
>
> Thanks,
> George
>
> On Wednesday, April 24, 2019, 10:57:45 AM PDT, Jason Gustafson <
> ja...@confluent.io> wrote:
>
>  Hi George,
>
> I think focusing on the admin API for cancellation in this KIP is
> reasonable. Colin wrote up KIP-455 which adds APIs to submit a reassignment
> and to list reassignments in progress. Probably as long as we make these
> APIs all consistent with each other, it is fine to do them separately.
>
> The major point we need to reach agreement on is how an active reassignment
> will be stored. In the proposal here, we continue to
> use /admin/reassign_partitions and we further extend it to differentiate
> between the original replicas and the target replicas of a reassignment. In
> KIP-455, Colin proposes to store the same information, but do it at the
> partition level. Obviously we would only do one of these. I think the main
> difference is how we 

Re: [DISCUSS] KIP-477: Add PATCH method for connector config in Connect REST API

2019-06-28 Thread Ivan Yurchenko
Thank you for your feedback Ryanne!
These are all surely valid concerns, and PATCH isn't really necessary or
suitable for normal production configuration management. However, there are
cases where quick patching of the configuration is useful, such as hot
fixes in production or during development.

Overall, the change itself is really tiny and if the cost-benefit balance
is positive, I'd still like to drive it further.
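
For illustration, a patch call per the KIP's draft would look something like
this (endpoint shape per the KIP; the connector name and config key are made up):

curl -X PATCH http://localhost:8083/connectors/my-connector/config \
     -H "Content-Type: application/json" \
     -d '{"tasks.max": "4"}'

Unlike PUT, only the keys present in the body would change; every other
property keeps its current value.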

Ivan

On Wed, 26 Jun 2019 at 17:45, Ryanne Dolan  wrote:

> Ivan, I looked at adding PATCH a while ago as well. I decided not to pursue
> the idea for a few reasons:
>
> 1) PATCH is still racy. For example, if you want to add a topic to the
> "topics" property, you still need to read, modify, and write the existing
> value. To handle this, you'd need to support atomic sub-document
> operations, which I don't see happening.
>
> 2) A common pattern is to store your configurations in git or something,
> and then apply them via PUT. Throw in some triggers or jenkins etc, and you
> have a more robust solution than PATCH provides.
>
> 3) For properties that change a lot, it's possible to use an out-of-band
> data source, e.g. Kafka or Zookeeper, and then have your Connector
> subscribe to changes. I've done something like this to enable dynamic
> reconfiguration of Connectors from command-line tools and dashboards
> without involving the Connect REST API at all. Moreover, I've done so in an
> atomic, non-racy way.
>
> So I don't think PATCH is strictly necessary or sufficient for atomic
> partial updates. That said, it doesn't hurt, and I'm happy to support the
> KIP.
>
> Ryanne
>
> On Tue, Jun 25, 2019 at 12:15 PM Ivan Yurchenko 
> wrote:
>
> > Hi,
> >
> > Since Kafka 2.3 has just been release and more people may have time to
> look
> > at this now, I'd like to bump this discussion.
> > Thanks.
> >
> > Ivan
> >
> >
> > On Thu, 13 Jun 2019 at 17:20, Ivan Yurchenko 
> > wrote:
> >
> > > Hello,
> > >
> > > I'd like to start the discussion of KIP-477: Add PATCH method for
> > > connector config in Connect REST API.
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-477%3A+Add+PATCH+method+for+connector+config+in+Connect+REST+API
> > >
> > > There is also a draft PR: https://github.com/apache/kafka/pull/6934.
> > >
> > > Thank you.
> > >
> > > Ivan
> > >
> >
>


Build failed in Jenkins: kafka-trunk-jdk8 #3756

2019-06-28 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-8356: add static membership info to round robin assignor 
(#6815)

[wangguoz] KAFKA-8538 (part of KIP-345): add group.instance.id to DescribeGroup

--
[...truncated 2.55 MB...]

org.apache.kafka.connect.transforms.ExtractFieldTest > testNullWithSchema 
STARTED

org.apache.kafka.connect.transforms.ExtractFieldTest > testNullWithSchema PASSED

org.apache.kafka.connect.transforms.ExtractFieldTest > schemaless STARTED

org.apache.kafka.connect.transforms.ExtractFieldTest > schemaless PASSED

org.apache.kafka.connect.transforms.ExtractFieldTest > testNullSchemaless 
STARTED

org.apache.kafka.connect.transforms.ExtractFieldTest > testNullSchemaless PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessFieldConversion STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessFieldConversion PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testConfigInvalidTargetType STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testConfigInvalidTargetType PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessStringToTimestamp STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessStringToTimestamp PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > testKey STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > testKey PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessDateToTimestamp STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessDateToTimestamp PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimestampToString STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimestampToString PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimeToTimestamp STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimeToTimestamp PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaTimestampToDate STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaTimestampToDate PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaTimestampToTime STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaTimestampToTime PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaTimestampToUnix STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaTimestampToUnix PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaUnixToTimestamp STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaUnixToTimestamp PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessIdentity STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessIdentity PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaFieldConversion STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaFieldConversion PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaDateToTimestamp STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaDateToTimestamp PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaIdentity STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaIdentity PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimestampToDate STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimestampToDate PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimestampToTime STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimestampToTime PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimestampToUnix STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessTimestampToUnix PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testConfigMissingFormat STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testConfigMissingFormat PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testConfigNoTargetType STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testConfigNoTargetType PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaStringToTimestamp STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaStringToTimestamp PASSED


Jenkins build is back to normal : kafka-trunk-jdk11 #663

2019-06-28 Thread Apache Jenkins Server
See