[KAFKA-7388] PR Review: equal sign in password property

2018-09-12 Thread Mutasem Al Dmour
Hi all,

New to Kafka here. Can you please review this PR to fix KAFKA-7388?
https://github.com/apache/kafka/pull/5630/files
https://issues.apache.org/jira/browse/KAFKA-7388

Best,

Mutasem
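For context, the failure mode the subject line points at is most likely the classic one where a key=value config string is split on every '=', truncating any password value that itself contains '='. A minimal, hypothetical Java illustration of that pitfall and the usual fix (not the actual patch in PR 5630):

    import java.util.AbstractMap;
    import java.util.Map;

    public class EqualsInPasswordExample {
        // Naive parsing: splits on every '=', so a password containing '=' is truncated.
        static Map.Entry<String, String> parseNaive(String entry) {
            String[] parts = entry.split("=");
            return new AbstractMap.SimpleEntry<>(parts[0], parts[1]);
        }

        // Safer parsing: split only on the first '=' and keep the rest of the value intact.
        static Map.Entry<String, String> parseOnFirstEquals(String entry) {
            String[] parts = entry.split("=", 2);
            return new AbstractMap.SimpleEntry<>(parts[0], parts[1]);
        }

        public static void main(String[] args) {
            String entry = "ssl.key.password=abc=def==";
            System.out.println(parseNaive(entry));          // ssl.key.password=abc (mangled)
            System.out.println(parseOnFirstEquals(entry));  // ssl.key.password=abc=def==
        }
    }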


Re: [DISCUSS] KIP-158: Kafka Connect should allow source connectors to set topic-specific settings for new topics

2018-09-12 Thread Stephane Maarek
If an admin team gives another team the right to create and configure their
topics, I can't see why specifying topic configuration in connector
properties wouldn't be a great idea. Configuration as code: repeatable,
central and automated, easy iteration. Sign me up.
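
To make the "configuration as code" idea concrete, such a connector config might look roughly like the sketch below; the property names are illustrative only, since the exact shape is what the KIP discussion below is still settling:

    name=my-source-connector
    # Hypothetical connector class and illustrative topic-creation settings:
    connector.class=com.example.MySourceConnector
    tasks.max=1
    topic.creation.default.partitions=1
    topic.creation.default.replication.factor=3
    topic.creation.default.cleanup.policy=compact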

On Wed., 12 Sep. 2018, 4:57 am Gwen Shapira,  wrote:

> Hi Ryanne,
>
> Thanks for the feedback!
>
> Can you explain a bit more what you mean by "if we allow connectors to make
> this
> decision, they should have full control of the process."?
>
> I assume you mean, something like:
> Rather than go though the connect framework, connectors should just create
> their own AdminClient instance and create their own topics?
>
> The problem with this approach is that connectors currently don't have
> their own identity (in the authentication/authorization sense). All
> connectors share the framework identity, if the users need to start
> configuring security for both the framework and connect itself, it gets
> messy rather quickly.
> We actually already do the thing I'm imagining you suggested in some
> connectors right now (create AdminClient and configure topics), and we hope
> to use the new framework capability to clean-up the configuration mess this
> has caused. I spent 4 days trying to figure out why a specific connector
> wasn't working, just to find out that you need to give it its own security
> config because it has an AdminClient, so the configuration on the framework
> isn't enough.
>
> From my experience with a rather large number of customers, there are some
> companies where the topics are controlled by a central team that owns all
> the machinery to create and configure topics (sometimes via gitops,
> kubernetes custom resources, etc) and they would indeed be very surprised
> if a connector suddenly had opinions about topics. There are also teams
> where the application developers feel like they know their data and
> use-case the best and they are in-charge of making all topic-level
> decisions, usually automated by the app itself. Admin client was created
> for those teams and I think they'll appreciate having this capability in
> connect too. Funny thing is, customers who work with one model usually
> can't believe the other model even exists.
>
> I'd love to propose a compromise and suggest that we'll allow this
> functionality in Connect but also give ops teams the option to disable it
> and avoid surprises. But I'm afraid this won't work - too often the defaults
> are just terrible for specific connectors (CDC connectors sometimes need a
> single partition to maintain consistency) and if there is a chance the
> connector preference won't be used, connectors will have to force it via
> admin client which brings us back to the terrible config situation we
> currently have with Admin client.
>
> Gwen
>
>
> On Tue, Sep 11, 2018 at 7:23 PM, Ryanne Dolan 
> wrote:
>
> > Randall,
> >
> > I have some concerns with this proposal.
> >
> > Firstly, I don't believe it is the job of a connector to configure
> topics,
> > generally, nor for topic-specific settings to hang out in connector
> > configurations. Automatic creation of topics with default settings is an
> > established pattern elsewhere, and I don't think connectors need to
> diverge
> > from this.
> >
> > I agree there are cases where the default settings don't make sense and
> > it'd be nice to override them. But if we allow connectors to make this
> > decision, they should have full control of the process.
> >
> > Some concerns:
> > - I'd expect the cluster's default settings to apply to newly created
> > topics, regardless of who created them. I wouldn't expect source
> connectors
> > to be a special case. In particular, I'd be surprised if Kafka Connect
> were
> > to ignore my cluster's default settings and apply its own defaults.
> > - It will be possible to add a specific topic to this configuration, in
> > which case any reader would expect the topic to have the specified
> > settings. But this will not generally be true. Thus, the configuration
> will
> > end up lying and misleading, and there won't be any indication that the
> > configuration is lying.
> > - Connectors that want to control settings will end up naming topics
> > accordingly. For example, a connector that wants to control the number of
> > partitions would need a bunch of creation rules for 1 partition, 2
> > partitions and so on. This is a bad pattern to establish. A better
> pattern
> > is to let the connector control the number of partitions directly when
> that
> > feature is required.
> > - The proposal introduces 2 new interfaces to control topic creation
> > (configuration rules and TopicSettings), where there is already a
> perfectly
> > good one (AdminClient).
> >
> > Ryanne
> >
> >
> >
> >
> > On Tue, Sep 4, 2018 at 5:08 PM Randall Hauch  wrote:
> >
> > > Okay, I think I cleaned up the formatting issues in the KIP wiki page.
> > And
> > > while implementing I realized that it'd be helpful to be able to limit

Re: [VOTE] KIP-110: Add Codec for ZStandard Compression

2018-09-12 Thread Manikumar
+1 (non-binding).

Thanks for the KIP.

On Wed, Sep 12, 2018 at 10:44 PM Ismael Juma  wrote:

> Thanks for the KIP, +1 (binding).
>
> Ismael
>
> On Wed, Sep 12, 2018 at 10:02 AM Dongjin Lee  wrote:
>
> > Hello, I would like to start a VOTE on KIP-110: Add Codec for ZStandard
> > Compression.
> >
> > The KIP:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-110%3A+Add+Codec+for+ZStandard+Compression
> > Discussion thread:
> > https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html
> >
> > Thanks,
> > Dongjin
> >
> > --
> > *Dongjin Lee*
> >
> > *A hitchhiker in the mathematical world.*
> >
> > *github:  github.com/dongjinleekr
> > linkedin:
> kr.linkedin.com/in/dongjinleekr
> > slideshare:
> > www.slideshare.net/dongjinleekr
> > *
> >
>


Re: [VOTE] KIP-110: Add Codec for ZStandard Compression

2018-09-12 Thread Ismael Juma
Thanks for the KIP, +1 (binding).

Ismael

On Wed, Sep 12, 2018 at 10:02 AM Dongjin Lee  wrote:

> Hello, I would like to start a VOTE on KIP-110: Add Codec for ZStandard
> Compression.
>
> The KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-110%3A+Add+Codec+for+ZStandard+Compression
> Discussion thread:
> https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html
>
> Thanks,
> Dongjin
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>


[jira] [Resolved] (KAFKA-6699) When one of two Kafka nodes are dead, streaming API cannot handle messaging

2018-09-12 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-6699.

Resolution: Not A Bug

This ticket does not seem to be a bug but a configuration question. I am 
closing this for now. If you have further questions, please consult the user 
mailing list (https://kafka.apache.org/contact).

> When one of two Kafka nodes are dead, streaming API cannot handle messaging
> ---
>
> Key: KAFKA-6699
> URL: https://issues.apache.org/jira/browse/KAFKA-6699
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.2
>Reporter: Seweryn Habdank-Wojewodzki
>Priority: Major
>
> Dears,
> I am observing quite often that, when the Kafka broker is partly dead (*), 
> applications which use the Streams API do nothing.
> (*) Partly dead in my case means that one of the two Kafka nodes is out of 
> order. 
> Especially when the disk is full on one machine, the broker goes into some 
> strange state where the Streams API goes on vacation. The regular 
> producer/consumer API seems to have no problem in such a case.
> Can you have a look at this matter?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-12 Thread Ron Dagostino
Thanks, Rajini.  Before I digest/respond to that, here's an update that I
just completed.

I added a commit to the PR (https://github.com/apache/kafka/pull/5582/)
that implements server-side kill of expired OAUTHBEARER connections.  No
tests yet since we still haven't settled on a final approach (low-level vs.
high-level).

I also updated the KIP to reflect the latest discussion and PR as follows:

   - Include support for brokers killing connections as part of this KIP
   (rather than deferring it to a future KIP as was originally mentioned; the
   PR now includes it as mentioned -- it was very easy to add)
   - Added metrics (they will mirror existing ones related to
   authentications; I have not added those to the PR)
   - Updated the implementation description to reflect the current state of
   the PR, which is a high-level, one-size-fits-all approach (as opposed to my
   initial, even-higher-level approach)
   - Added a "Rejected Alternative" for the first version of the PR, which
   injected requests directly into synchronous I/O clients' queues
   - Added a "Rejected Alternative" for the low-level approach as
   suggested, but of course we have not formally decided to reject this
   approach or adopt the current PR implementation.

I'll think about where we stand some more before responding again.  Thanks
for the above reply.

Ron
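
To sketch what "server-side kill of expired OAUTHBEARER connections" means conceptually (this is only an illustration of the idea, not the code in PR 5582): the broker records the credential's expiration time for the session and closes the channel once that time passes without a successful re-authentication.

    // Conceptual sketch only -- not the actual KIP-368 / PR 5582 implementation.
    final class SessionExpirationCheck {
        // e.g. taken from the OAuth token's expiration claim, in epoch milliseconds
        private final long sessionExpirationTimeMs;

        SessionExpirationCheck(long sessionExpirationTimeMs) {
            this.sessionExpirationTimeMs = sessionExpirationTimeMs;
        }

        // True if the connection should be closed because its credential has expired
        // and the client has not re-authenticated in time.
        boolean shouldClose(long nowMs) {
            return nowMs >= sessionExpirationTimeMs;
        }
    }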

On Wed, Sep 12, 2018 at 1:36 PM Rajini Sivaram 
wrote:

> Hi Ron,
>
> Thank you for summarising, I think it covers the differences between the
> two approaches well.
>
> A few minor points to answer the questions in there:
>
> 1) When re-authentication is initiated in the Selector during poll(), we can
> move an idle channel to the re-authentication state. It is similar to injecting
> requests, but achieved by changing the channel back to the authenticating state.
>
> 3) To clarify why I think re-authentication should fit in with our
> authentication design: My point was not about a specific connection being
> usable or not usable. It was about what happens at the client API level.
> Our client API (producer/consumer/admin client etc.) currently assume that
> a single broker authentication failure is a fatal error that is never
> retried because we assume that broker only ever fails an authentication
> request if credentials are invalid. If we ever decide to support cases
> where broker occasionally fails an authentication request due to a
> transient failure, we need to do more around how we handle authentication
> failures in clients. We may decide that it is ok to close the connection
> for authentication and not for re-authentication as you mentioned, but we
> need to change the way this disconnection is handled by clients. So IMO, we
> should either add support for transient retriable authentication failures
> properly or not retry for any scenario. Personally, I don't think we would
> want to retry all authentication failures even if it is a
> re-authentication, I think we could (at some point in future), allow
> brokers to return an error code that indicates that it is a transient
> broker-side failure rather than invalid credentials and handle the error
> differently. I see no reason at that point why we wouldn't handle
> authentication and re-authentication in the same way.
>
> 4) As you said, the high-level approach would be bigger than the low-level
> approach in terms of LOC. But I wouldn't be too concerned about lines of
> code. My bigger concern was about modularity. Our security code is already
> complex, protocols like Kerberos and SSL that we use from the JRE make
> problem diagnosis hard. Async I/O makes the networking code complex. You
> need to understand networking layer to work with the security layer, but
> the rest of the code base doesn't rely on knowledge of network/security
> layers. My main concern about the high-level approach is that it spans
> these boundaries, making it harder to maintain in the long run.
>
>
> On Wed, Sep 12, 2018 at 10:23 AM, Stanislav Kozlovski <
> stanis...@confluent.io> wrote:
>
> > Hi Ron, Rajini
> >
> > Thanks for summarizing the discussion so far, Ron!
> >
> > 1) How often do we have such long-lived connection idleness (e.g 5-10
> > minutes) in practice?
> >
> > 3) I agree that retries for re-authentication are useful.
> >
> > 4) The interleaving of requests sounds like a great feature to have, but
> > the tradeoff against code complexity is a tough one. I would personally
> go
> > with the simpler approach since you could always add interleaving on top
> if
> > the community decides the latency should be better.
> >
> > Best,
> > Stanislav
> >
> > On Tue, Sep 11, 2018 at 5:00 AM Ron Dagostino  wrote:
> >
> > > Hi everyone.  I've updated the PR to reflect the latest conclusions
> from
> > > this ongoing discussion.  The KIP still needs the suggested updates; I
> > will
> > > do that later this week.  I agree with Rajini that some additional
> > > feedback from the community at large would be very helpful at this
> point
> > 

Re: [DISCUSS] KIP-110: Add Codec for ZStandard Compression (Updated)

2018-09-12 Thread Dongjin Lee
Hi Ismael,

Sure. Thanks.

- Dongjin

On Wed, Sep 12, 2018 at 11:56 PM Ismael Juma  wrote:

> Dongjin, can you please start a vote?
>
> Ismael
>
> On Sun, Sep 9, 2018 at 11:15 PM Dongjin Lee  wrote:
>
> > Hi Jason,
> >
> > You are right. Explicit statements are always better. I updated the
> > document following your suggestion.
> >
> > @Magnus
> >
> > Thanks for the inspection. It seems like a new error code is not a
> problem.
> >
> > Thanks,
> > Dongjin
> >
> > On Fri, Sep 7, 2018 at 3:23 AM Jason Gustafson 
> wrote:
> >
> > > Hi Dongjin,
> > >
> > > The KIP looks good to me. I'd suggest starting a vote. A couple minor
> > > points that might be worth calling out explicitly in the compatibility
> > > section:
> > >
> > > 1. Zstd will only be allowed for the bumped produce API. For older
> > > versions, we return UNSUPPORTED_COMPRESSION_TYPE regardless of the
> > message
> > > format.
> > > 2. Down-conversion of zstd-compressed records will not be supported.
> > > Instead we will return UNSUPPORTED_COMPRESSION_TYPE.
> > >
> > > Does that sound right?
> > >
> > > Thanks,
> > > Jason
> > >
> > > On Thu, Sep 6, 2018 at 1:45 AM, Magnus Edenhill 
> > > wrote:
> > >
> > > > > Ismael wrote:
> > > > > Jason, that's an interesting point regarding the Java client. Do we
> > > know
> > > > > what clients in other languages do in these cases?
> > > >
> > > > librdkafka (and its bindings) passes unknown/future errors through to
> > the
> > > > application, the error code remains intact while
> > > > the error string will be set to something like "Err-123?", which
> isn't
> > > very
> > > > helpful to the user but it at least
> > > > preserves the original error code for further troubleshooting.
> > > > For the producer any unknown error returned in the ProduceResponse
> will
> > > be
> > > > considered a permanent delivery failure (no retries),
> > > > and for the consumer any unknown FetchResponse errors will propagate
> > > > directly to the application, trigger a fetch backoff, and then
> > > > continue fetching past that offset.
> > > >
> > > > So, from the client's perspective it is not really a problem if new
> > error
> > > > codes are added to older API versions.
> > > >
> > > > /Magnus
> > > >
> > > >
> > > > Den tors 6 sep. 2018 kl 09:45 skrev Dongjin Lee  >:
> > > >
> > > > > I updated the KIP page
> > > > > <
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 110%3A+Add+Codec+for+ZStandard+Compression
> > > > > >
> > > > > following the discussion here. Please take a look when you are
> free.
> > > > > If you have any opinion, don't hesitate to give me a message.
> > > > >
> > > > > Best,
> > > > > Dongjin
> > > > >
> > > > > On Fri, Aug 31, 2018 at 11:35 PM Dongjin Lee 
> > > wrote:
> > > > >
> > > > > > I just updated the draft implementation[^1], rebasing against the
> > > > latest
> > > > > > trunk and implementing error routine (i.e., Error code 74 for
> > > > > > UnsupportedCompressionTypeException.) Since we decided to
> disallow
> > > all
> > > > > > fetch requests below version 2.1.0 for the topics specifying
> > > ZStandard,
> > > > I
> > > > > > added an error logic only.
> > > > > >
> > > > > > Please have a look when you are free.
> > > > > >
> > > > > > Thanks,
> > > > > > Dongjin
> > > > > >
> > > > > > [^1]: Please check the last commit here:
> > > > > > https://github.com/apache/kafka/pull/2267
> > > > > >
> > > > > > On Thu, Aug 23, 2018, 8:55 AM Dongjin Lee 
> > > wrote:
> > > > > >
> > > > > >> Jason,
> > > > > >>
> > > > > >> Great. +1 for UNSUPPORTED_COMPRESSION_TYPE.
> > > > > >>
> > > > > >> Best,
> > > > > >> Dongjin
> > > > > >>
> > > > > >> On Thu, Aug 23, 2018 at 8:19 AM Jason Gustafson <
> > ja...@confluent.io
> > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Hey Dongjin,
> > > > > >>>
> > > > > >>> Yeah that's right. For what it's worth, librdkafka also appears
> > to
> > > > > handle
> > > > > >>> unexpected error codes. I expect that most client
> implementations
> > > > would
> > > > > >>> either pass through the raw type or convert to an enum using
> > > > something
> > > > > >>> like
> > > > > >>> what the java client does. Since we're expecting the client to
> > fail
> > > > > >>> anyway,
> > > > > >>> I'm probably in favor of using the UNSUPPORTED_COMPRESSION_TYPE
> > > error
> > > > > >>> code.
> > > > > >>>
> > > > > >>> -Jason
> > > > > >>>
> > > > > >>> On Wed, Aug 22, 2018 at 1:46 AM, Dongjin Lee <
> dong...@apache.org
> > >
> > > > > wrote:
> > > > > >>>
> > > > > >>> > Jason and Ismael,
> > > > > >>> >
> > > > > >>> > It seems like the only thing we need to regard if we define a
> > new
> > > > > error
> > > > > >>> > code (i.e., UNSUPPORTED_COMPRESSION_TYPE) would be the
> > > > implementation
> > > > > >>> of
> > > > > >>> > the other language clients, right? At least, this strategy
> > causes no
> > > > > >>> > problem for the Java client. Do I understand correctly?
> > > > > >>> >
> > > > > >>> > Thanks,
> > > > > >>> > 

Re: [DISCUSS] KIP-110: Add Codec for ZStandard Compression (Updated)

2018-09-12 Thread Ismael Juma
Dongjin, can you please start a vote?

Ismael

On Sun, Sep 9, 2018 at 11:15 PM Dongjin Lee  wrote:

> Hi Jason,
>
> You are right. Explicit statements are always better. I updated the
> document following your suggestion.
>
> @Magnus
>
> Thanks for the inspection. It seems like a new error code is not a problem.
>
> Thanks,
> Dongjin
>
> On Fri, Sep 7, 2018 at 3:23 AM Jason Gustafson  wrote:
>
> > Hi Dongjin,
> >
> > The KIP looks good to me. I'd suggest starting a vote. A couple minor
> > points that might be worth calling out explicitly in the compatibility
> > section:
> >
> > 1. Zstd will only be allowed for the bumped produce API. For older
> > versions, we return UNSUPPORTED_COMPRESSION_TYPE regardless of the
> message
> > format.
> > 2. Down-conversion of zstd-compressed records will not be supported.
> > Instead we will return UNSUPPORTED_COMPRESSION_TYPE.
> >
> > Does that sound right?
> >
> > Thanks,
> > Jason
> >
> > On Thu, Sep 6, 2018 at 1:45 AM, Magnus Edenhill 
> > wrote:
> >
> > > > Ismael wrote:
> > > > Jason, that's an interesting point regarding the Java client. Do we
> > know
> > > > what clients in other languages do in these cases?
> > >
> > > librdkafka (and its bindings) passes unknown/future errors through to
> the
> > > application, the error code remains intact while
> > > the error string will be set to something like "Err-123?", which isn't
> > very
> > > helpful to the user but it at least
> > > preserves the original error code for further troubleshooting.
> > > For the producer any unknown error returned in the ProduceResponse will
> > be
> > > considered a permanent delivery failure (no retries),
> > > and for the consumer any unknown FetchResponse errors will propagate
> > > directly to the application, trigger a fetch backoff, and then
> > > continue fetching past that offset.
> > >
> > > So, from the client's perspective it is not really a problem if new
> error
> > > codes are added to older API versions.
> > >
> > > /Magnus
> > >
> > >
> > > Den tors 6 sep. 2018 kl 09:45 skrev Dongjin Lee :
> > >
> > > > I updated the KIP page
> > > > <
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 110%3A+Add+Codec+for+ZStandard+Compression
> > > > >
> > > > following the discussion here. Please take a look when you are free.
> > > > If you have any opinion, don't hesitate to give me a message.
> > > >
> > > > Best,
> > > > Dongjin
> > > >
> > > > On Fri, Aug 31, 2018 at 11:35 PM Dongjin Lee 
> > wrote:
> > > >
> > > > > I just updated the draft implementation[^1], rebasing against the
> > > latest
> > > > > trunk and implementing error routine (i.e., Error code 74 for
> > > > > UnsupportedCompressionTypeException.) Since we decided to disallow
> > all
> > > > > fetch requests below version 2.1.0 for the topics specifying
> > ZStandard,
> > > I
> > > > > added an error logic only.
> > > > >
> > > > > Please have a look when you are free.
> > > > >
> > > > > Thanks,
> > > > > Dongjin
> > > > >
> > > > > [^1]: Please check the last commit here:
> > > > > https://github.com/apache/kafka/pull/2267
> > > > >
> > > > > On Thu, Aug 23, 2018, 8:55 AM Dongjin Lee 
> > wrote:
> > > > >
> > > > >> Jason,
> > > > >>
> > > > >> Great. +1 for UNSUPPORTED_COMPRESSION_TYPE.
> > > > >>
> > > > >> Best,
> > > > >> Dongjin
> > > > >>
> > > > >> On Thu, Aug 23, 2018 at 8:19 AM Jason Gustafson <
> ja...@confluent.io
> > >
> > > > >> wrote:
> > > > >>
> > > > >>> Hey Dongjin,
> > > > >>>
> > > > >>> Yeah that's right. For what it's worth, librdkafka also appears
> to
> > > > handle
> > > > >>> unexpected error codes. I expect that most client implementations
> > > would
> > > > >>> either pass through the raw type or convert to an enum using
> > > something
> > > > >>> like
> > > > >>> what the java client does. Since we're expecting the client to
> fail
> > > > >>> anyway,
> > > > >>> I'm probably in favor of using the UNSUPPORTED_COMPRESSION_TYPE
> > error
> > > > >>> code.
> > > > >>>
> > > > >>> -Jason
> > > > >>>
> > > > >>> On Wed, Aug 22, 2018 at 1:46 AM, Dongjin Lee  >
> > > > wrote:
> > > > >>>
> > > > >>> > Jason and Ismael,
> > > > >>> >
> > > > >>> > It seems like the only thing we need to regard if we define a
> new
> > > > error
> > > > >>> > code (i.e., UNSUPPORTED_COMPRESSION_TYPE) would be the
> > > implementation
> > > > >>> of
> > > > >>> > the other language clients, right? At least, this strategy
> causes no
> > > > >>> > problem for the Java client. Do I understand correctly?
> > > > >>> >
> > > > >>> > Thanks,
> > > > >>> > Dongjin
> > > > >>> >
> > > > >>> > On Wed, Aug 22, 2018 at 5:43 PM Dongjin Lee <
> dong...@apache.org>
> > > > >>> wrote:
> > > > >>> >
> > > > >>> > > Jason,
> > > > >>> > >
> > > > >>> > > > I think we would only use this error code when we /know/
> that
> > > > zstd
> > > > >>> was
> > > > >>> > > in use and the client doesn't support it? This is true if
> > either
> > > 1)
> > > > >>> the
> > > > >>> > > message needs down-conversion 

[VOTE] KIP-110: Add Codec for ZStandard Compression

2018-09-12 Thread Dongjin Lee
Hello, I would like to start a VOTE on KIP-110: Add Codec for ZStandard
Compression.

The KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-110%3A+Add+Codec+for+ZStandard+Compression
Discussion thread:
https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html

Thanks,
Dongjin
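
For anyone who wants to try the codec once this lands, the expectation is that it plugs into the existing producer compression setting just like the current codecs; an illustrative Java snippet, assuming the config value ends up being simply "zstd":

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ZstdProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // Assumes the value "zstd" once KIP-110 is merged; existing values are none/gzip/snappy/lz4.
            props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            }
        }
    }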

-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*

*github:  github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
slideshare:
www.slideshare.net/dongjinleekr
*


Re: [DISCUSS] KIP-158: Kafka Connect should allow source connectors to set topic-specific settings for new topics

2018-09-12 Thread Ryanne Dolan
> Rather than go though the connect framework, connectors should just
create their own AdminClient instance and create their own topics?

Rather, can the framework be improved to expose an AdminClient ready to
use? Then connectors can use this instance without needing separate
identities/principals and associated configuration (which I totally
understand would be a nightmare). I believe that covers all the use-cases?

I just don't see how the "terrible config situation" is remedied by adding
even more configuration.

Also, I'm not sure I can conceive of a use-case in which a single connector
would need multiple default topic settings *based on the topic name*. Can
you give a real-world example? Is this something you've encountered, or are
you just trying for a flexible design?

Ryanne
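
To make the alternative concrete: what is being sketched here is the framework handing a ready-made client to the connector rather than each connector bootstrapping its own. The accessor below is purely hypothetical and does not exist in Connect today.

    import java.util.Collections;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;

    public class FrameworkAdminClientSketch {

        // Hypothetical framework-provided context; admin() does not exist in Connect.
        interface FrameworkContext {
            AdminClient admin();
        }

        private final FrameworkContext context;

        public FrameworkAdminClientSketch(FrameworkContext context) {
            this.context = context;
        }

        // The connector reuses the worker's identity and security configuration,
        // so no separate AdminClient/security config is needed in the connector.
        void ensureTopic() {
            AdminClient admin = context.admin();
            // CDC-style example: a single partition to preserve ordering.
            admin.createTopics(Collections.singleton(new NewTopic("cdc-events", 1, (short) 3)));
        }
    }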

On Tue, Sep 11, 2018 at 9:57 PM Gwen Shapira  wrote:

> Hi Ryanne,
>
> Thanks for the feedback!
>
> Can you explain a bit more what you mean by "if we allow connectors to make
> this
> decision, they should have full control of the process."?
>
> I assume you mean, something like:
> Rather than go though the connect framework, connectors should just create
> their own AdminClient instance and create their own topics?
>
> The problem with this approach is that connectors currently don't have
> their own identity (in the authentication/authorization sense). All
> connectors share the framework identity, if the users need to start
> configuring security for both the framework and connect itself, it gets
> messy rather quickly.
> We actually already do the thing I'm imagining you suggested in some
> connectors right now (create AdminClient and configure topics), and we hope
> to use the new framework capability to clean-up the configuration mess this
> has caused. I spent 4 days trying to figure out why a specific connector
> wasn't working, just to find out that you need to give it its own security
> config because it has an AdminClient, so the configuration on the framework
> isn't enough.
>
> From my experience with a rather large number of customers, there are some
> companies where the topics are controlled by a central team that owns all
> the machinery to create and configure topics (sometimes via gitops,
> kubernetes custom resources, etc) and they would indeed be very surprised
> if a connector suddenly had opinions about topics. There are also teams
> where the application developers feel like they know their data and
> use-case the best and they are in-charge of making all topic-level
> decisions, usually automated by the app itself. Admin client was created
> for those teams and I think they'll appreciate having this capability in
> connect too. Funny thing is, customers who work with one model usually
> can't believe the other model even exists.
>
> I'd love to propose a compromise and suggest that we'll allow this
> functionality in Connect but also give ops teams the option to disable it
> and avoid surprises. But I'm afraid this won't work - too often the defaults
> are just terrible for specific connectors (CDC connectors sometimes need a
> single partition to maintain consistency) and if there is a chance the
> connector preference won't be used, connectors will have to force it via
> admin client which brings us back to the terrible config situation we
> currently have with Admin client.
>
> Gwen
>
>
> On Tue, Sep 11, 2018 at 7:23 PM, Ryanne Dolan 
> wrote:
>
> > Randall,
> >
> > I have some concerns with this proposal.
> >
> > Firstly, I don't believe it is the job of a connector to configure
> topics,
> > generally, nor for topic-specific settings to hang out in connector
> > configurations. Automatic creation of topics with default settings is an
> > established pattern elsewhere, and I don't think connectors need to
> diverge
> > from this.
> >
> > I agree there are cases where the default settings don't make sense and
> > it'd be nice to override them. But if we allow connectors to make this
> > decision, they should have full control of the process.
> >
> > Some concerns:
> > - I'd expect the cluster's default settings to apply to newly created
> > topics, regardless of who created them. I wouldn't expect source
> connectors
> > to be a special case. In particular, I'd be surprised if Kafka Connect
> were
> > to ignore my cluster's default settings and apply its own defaults.
> > - It will be possible to add a specific topic to this configuration, in
> > which case any reader would expect the topic to have the specified
> > settings. But this will not generally be true. Thus, the configuration
> will
> > end up lying and misleading, and there won't be any indication that the
> > configuration is lying.
> > - Connectors that want to control settings will end up naming topics
> > accordingly. For example, a connector that wants to control the number of
> > partitions would need a bunch of creation rules for 1 partition, 2
> > partitions and so on. This is a bad pattern to establish. A better
> pattern
> > is 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-12 Thread Rajini Sivaram
Hi Ron,

Thank you for summarising, I think it covers the differences between the
two approaches well.

A few minor points to answer the questions in there:

1) When re-authentication is initiated in the Selector during poll(), we can
move an idle channel to the re-authentication state. It is similar to injecting
requests, but achieved by changing the channel back to the authenticating state.

3) To clarify why I think re-authentication should fit in with our
authentication design: My point was not about a specific connection being
usable or not usable. It was about what happens at the client API level.
Our client API (producer/consumer/admin client etc.) currently assume that
a single broker authentication failure is a fatal error that is never
retried because we assume that broker only ever fails an authentication
request if credentials are invalid. If we ever decide to support cases
where broker occasionally fails an authentication request due to a
transient failure, we need to do more around how we handle authentication
failures in clients. We may decide that it is ok to close the connection
for authentication and not for re-authentication as you mentioned, but we
need to change the way this disconnection is handled by clients. So IMO, we
should either add support for transient retriable authentication failures
properly or not retry for any scenario. Personally, I don't think we would
want to retry all authentication failures even if it is a
re-authentication, I think we could (at some point in future), allow
brokers to return an error code that indicates that it is a transient
broker-side failure rather than invalid credentials and handle the error
differently. I see no reason at that point why we wouldn't handle
authentication and re-authentication in the same way.

4) As you said, the high-level approach would be bigger than the low-level
approach in terms of LOC. But I wouldn't be too concerned about lines of
code. My bigger concern was about modularity. Our security code is already
complex, protocols like Kerberos and SSL that we use from the JRE make
problem diagnosis hard. Async I/O makes the networking code complex. You
need to understand networking layer to work with the security layer, but
the rest of the code base doesn't rely on knowledge of network/security
layers. My main concern about the high-level approach is that it spans
these boundaries, making it harder to maintain in the long run.


On Wed, Sep 12, 2018 at 10:23 AM, Stanislav Kozlovski <
stanis...@confluent.io> wrote:

> Hi Ron, Rajini
>
> Thanks for summarizing the discussion so far, Ron!
>
> 1) How often do we have such long-lived connection idleness (e.g 5-10
> minutes) in practice?
>
> 3) I agree that retries for re-authentication are useful.
>
> 4) The interleaving of requests sounds like a great feature to have, but
> the tradeoff against code complexity is a tough one. I would personally go
> with the simpler approach since you could always add interleaving on top if
> the community decides the latency should be better.
>
> Best,
> Stanislav
>
> On Tue, Sep 11, 2018 at 5:00 AM Ron Dagostino  wrote:
>
> > Hi everyone.  I've updated the PR to reflect the latest conclusions from
> > this ongoing discussion.  The KIP still needs the suggested updates; I
> will
> > do that later this week.  I agree with Rajini that some additional
> > feedback from the community at large would be very helpful at this point
> in
> > time.
> >
> > Here's where we stand.
> >
> > We have 2 separate implementations for re-authenticating existing
> > connections -- a so-called "low-level" approach and a (relatively
> speaking)
> > "high-level" approach -- and we agree on how they should be compared.
> > Specifically, Rajini has provided a mechanism that works at a relatively
> > low level in the stack by intercepting network sends and queueing them up
> > while re-authentication happens; then the queued sends are sent after
> > re-authentication succeeds, or the connection is closed if
> > re-authentication fails.  See the prototype commit at
> >
> > https://github.com/rajinisivaram/kafka/commit/b9d711907ad843
> c11d17e80d6743bfb1d4e3f3fd
> > .
> >
> > I have an implementation that works higher up in the stack; it injects
> > authentication requests into the KafkaClient via a new method added to
> that
> > interface, and the implementation (i.e. NetworkClient) sends them when
> > poll() is called.  The updated PR is available at
> > https://github.com/apache/kafka/pull/5582/.
> >
> > Here is how we think these two approaches should be compared.
> >
> > 1) *When re-authentication starts*.  The low-level approach initiates
> > re-authentication only if/when the connection is actually used, so it may
> > start after the existing credential expires; the current PR
> implementation
> > injects re-authentication requests into the existing flow, and
> > re-authentication starts immediately regardless of whether or not the
> > connection is being used for something 

Re: A question about kafka streams API

2018-09-12 Thread John Roesler
Hi!

As Adam said, if you throw an exception during processing, it should cause
Streams to shut itself down and *not* commit that message. Therefore, when
you start up again, it should again attempt to process that same message
(and shut down again).

Within a single partition, messages are processed in order, so a bad
message will block the queue, and you should not see subsequent messages
get processed.

However, if your later message "{}" goes to a different partition than the
bad message, then there's no relationship between them, and the later,
good, message might get processed.

Does that help?
-John
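
To illustrate the partition point: whether two records can affect each other's processing order depends on whether they land in the same partition, which is a function of the record key. The real default partitioner hashes the serialized key with murmur2; the sketch below uses a simple stand-in hash purely to show the mechanics.

    public class PartitionSketch {
        // Stand-in for the default partitioner's murmur2-based logic (illustrative only).
        static int partitionFor(String key, int numPartitions) {
            return (key.hashCode() & 0x7fffffff) % numPartitions;
        }

        public static void main(String[] args) {
            int partitions = 3;
            // A "bad" record only blocks records that hash to its own partition;
            // records on other partitions keep flowing.
            System.out.println("bad-key  -> partition " + partitionFor("bad-key", partitions));
            System.out.println("good-key -> partition " + partitionFor("good-key", partitions));
        }
    }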

On Wed, Sep 12, 2018 at 8:38 AM Adam Bellemare 
wrote:

> Hi Yui Yoi
>
>
> Keep in mind that Kafka Consumers don't traditionally request only a single
> message at a time, but instead requests them in batches. This allows for
> much higher throughput, but does result in the scenario of "at-least-once"
> processing. Generally what will happen in this scenario is the following:
>
> 1) Client requests the next set of messages from offset (t). For example,
> assume it gets 10 messages and message 6 is "bad".
> 2) The client's processor will then process the messages one at a time.
> Note that the offsets are not committed after the message is processed, but
> only at the end of the batch.
> 3) The bad message is hit by the processor. At this point you can decide to
> skip the message, throw an exception, etc.
> 4a) If you decide to skip the message, processing will continue. Once all
> 10 messages are processed, the new offset (t+10) is committed back
> to Kafka.
> 4b) If you decide to throw an exception and terminate your app, you will
> have still processed the messages that came before the bad message. Because
> the offset (t+10) is not committed, the next time you start the app it will
> consume from offset t, and those messages will be processed again. This is
> "at-least-once" processing.
>
>
> Now, if you need exactly-once processing, you have two choices -
> 1) Use Kafka Streams with exactly-once semantics (though, as I am not
> familiar with your framework, it may support it as well).
> 2) Use idempotent practices (ie: it doesn't matter if the same messages get
> processed more than once).
>
>
> Hope this helps -
>
> Adam
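
A minimal plain-consumer sketch of the at-least-once flow described above: offsets are committed only after the whole batch has been processed, so throwing on a bad record before the commit means the batch is re-read on restart (illustrative code, not the Spring setup from the original question):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class AtLeastOnceSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group");
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");   // commit manually, per batch
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singleton("input-topic"));
                while (true) {
                    ConsumerRecords<String, String> batch = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : batch) {
                        process(record);   // throwing here skips the commit below -> batch is re-read
                    }
                    consumer.commitSync(); // offsets advance only after the whole batch succeeded
                }
            }
        }

        static void process(ConsumerRecord<String, String> record) {
            System.out.println(record.value());
        }
    }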
>
>
> On Wed, Sep 12, 2018 at 7:59 AM, Yui Yoi  wrote:
>
> > Hi Adam,
> > Thanks a lot for the rapid response, it did help!
> >
> > Let me, though, ask one more simple question: can I make a streams
> application
> > get stuck on an invalid message and not consume any further messages?
> >
> > Thanks again
> >
> > On Wed, Sep 12, 2018 at 2:35 PM Adam Bellemare  >
> > wrote:
> >
> > > Hi Yui Yoi
> > >
> > > Preface: I am not familiar with the spring framework.
> > >
> > > "Earliest" when it comes to consuming from Kafka means, "Start reading
> > from
> > > the first message in the topic, *if there is no offset stored for that
> > > consumer group*". It sounds like you are expecting it to re-read each
> > > message whenever a new message comes in. This is not going to happen,
> as
> > > there will be a committed offset and "earliest" will no longer be used.
> > If
> > > you were to use "latest" instead, if a consumer is started that does
> not
> > > have a valid offset, it would use the very latest message in the topic
> as
> > > the starting offset for message consumption.
> > >
> > > Now, if you are using the same consumer group each time you run the
> > > application (which it seems is true, as you have "test-group" hardwired
> > in
> > > your application.yml), but you do not tear down your local cluster and
> > > clear out its state, you will indeed see the behaviour you describe.
> > > Remember that Kafka is durable, and maintains the offsets when the
> > > individual applications go away. So you are probably seeing this:
> > >
> > > 1) start application instance 1. It realizes it has no offset when it
> > tries
> > > to register as a consumer on the input topic, so it creates a new
> > consumer
> > > entry for "earliest" for your consumer group.
> > > 2) send message "asd"
> > > 3) application instance 1 receives "asd", processes it, and updates the
> > > offset (offset head = 1)
> > > 4) Terminate instance 1
> > > 5) Start application instance 2. It detects correctly that consumer
> group
> > > "test-group" is available and reads that offset as its starting point.
> > > 6) send message "{}"
> > > 7) application instance 2 receives "{}", processes it, and updates the
> > > offset (offset head = 2)
> > > *NOTE:* App instance 2 NEVER received "asd", nor should it, as it is
> > > telling the Kafka cluster that it belongs to the same consumer group as
> > > application 1.
> > >
> > > Hope this helps,
> > >
> > > Adam
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Sep 12, 2018 at 6:57 AM, Yui Yoi  wrote:
> > >
> > > > TL;DR:
> > > > my streams application skips uncommitted messages
> > > >
> > > > Hello,
> > > > I'm using streams API via spring framework and experiencing a weird

Build failed in Jenkins: kafka-trunk-jdk10 #476

2018-09-12 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H23 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Counting objects: 10373, done.
remote: Compressing objects: 100% (29/29), done.
Receiving objects:  37% (3839/10373)
[incremental fetch progress output truncated]

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-12 Thread Ron Dagostino
Hi Rajini.  Here is some feedback/some things I thought of.

First, I just realized that from a timing perspective that I am not sure
retry is going to be necessary.  The background login refresh thread
triggers re-authentication when it refreshes the client's credential.  The
OAuth infrastructure has to be available in order for this refresh to
succeed (the background thread repeatedly retries if it can't refresh the
credential, and that retry loop handles any temporary outage).  If clients
are told to re-authenticate when the credential is refreshed **and they
actually re-authenticate at that point** then it is highly unlikely that
the OAuth infrastructure would fail within those intervening milliseconds.
So we don't need re-authentication retry in this KIP as long as retry
starts immediately.  The low-level prototype as currently coded doesn't
actually start re-authentication until the connection is subsequently used,
and that could take a while.  But then again, if the connection isn't used
for some period of time, then the lost value of having to abandon the
connection is lessened anyway.  Plus, as you pointed out, OAuth
infrastructure is assumed to be highly available.

Does this make sense, and would you be willing to say that retry isn't a
necessary requirement?  I'm tempted but would like to hear your perspective
first.

Interleaving/latency and maintainability (more than lines of code) are the
two remaining areas of comparison.  I did not add the maintainability issue
to the KIP yet, but before I do I thought I would address it here first to
see if we can come to consensus on it because I'm not sure I see the
high-level approach as being hard to maintain (yet -- I could be
convinced/convince myself; we'll see).  I do want to make sure we are on
the same page about what is required to add re-authentication support in
the high-level case.  Granted, the amount is zero in the low-level case,
and it doesn't get any better that that, but the amount in the high-level
case is very low -- just a few lines of code.  For example:

KafkaAdminClient:
https://github.com/apache/kafka/pull/5582/commits/4fa70f38b9d33428ff98b64a3a2bd668f5f28c38#diff-6869b8fccf6b098cbcb0676e8ceb26a7
It is the same few lines of code for KafkaConsumer, KafkaProducer,
WorkerGroupMember, and TransactionMarkerChannelManager

The two synchronous I/O use cases are ControllerChannelManager and
ReplicaFetcherBlockingSend (via ReplicaFetcherThread), and they require a
little bit more -- but not much.

Thoughts?

Ron

On Wed, Sep 12, 2018 at 1:57 PM Ron Dagostino  wrote:

> Thanks, Rajini.  Before I digest/respond to that, here's an update that I
> just completed.
>
> I added a commit to the PR (https://github.com/apache/kafka/pull/5582/)
> that implements server-side kill of expired OAUTHBEARER connections.  No
> tests yet since we still haven't settled on a final approach (low-level vs.
> high-level).
>
> I also updated the KIP to reflect the latest discussion and PR as follows:
>
>- Include support for brokers killing connections as part of this KIP
>(rather than deferring it to a future KIP as was originally mentioned; the
>PR now includes it as mentioned -- it was very easy to add)
>- Added metrics (they will mirror existing ones related to
>authentications; I have not added those to the PR)
>- Updated the implementation description to reflect the current state
>of the PR, which is a high-level, one-size-fits-all approach (as opposed to
>my initial, even-higher-level approach)
>- Added a "Rejected Alternative" for the first version of the PR,
>which injected requests directly into synchronous I/O clients' queues
>- Added a "Rejected Alternative" for the low-level approach as
>suggested, but of course we have not formally decided to reject this
>approach or adopt the current PR implementation.
>
> I'll think about where we stand some more before responding again.  Thanks
> for the above reply.
>
> Ron
>
> On Wed, Sep 12, 2018 at 1:36 PM Rajini Sivaram 
> wrote:
>
>> Hi Ron,
>>
>> Thank you for summarising, I think it covers the differences between the
>> two approaches well.
>>
>> A few minor points to answer the questions in there:
>>
>> 1) When re-authentication is initiated in the Selector during poll(), we
>> can
>> move an idle channel to re-authentication state. It is similar to
>> injecting
>> requests, but achieved by changing channel back to authenticating state.
>>
>> 3) To clarify why I think re-authentication should fit in with our
>> authentication design: My point was not about a specific connection being
>> usable or not usable. It was about what happens at the client API level.
>> Our client API (producer/consumer/admin client etc.) currently assume that
>> a single broker authentication failure is a fatal error that is never
>> retried because we assume that broker only ever fails an authentication
>> request if credentials are invalid. If we ever decide to support 

Re: [DISCUSS] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-09-12 Thread John Roesler
Hi Nikolay,

Yes, the changes we discussed for ReadOnlyXxxStore and XxxStore should be
in this KIP.

And you're right, it seems like ReadOnlySessionStore is not necessary to
touch, since it doesn't reference any `long` timestamps.

Thanks,
-John
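
For anyone skimming the thread, the pattern under discussion looks roughly like the sketch below, modeled on ReadOnlyWindowStore/WindowStore but with illustrative names and signatures rather than the exact final ones:

    import java.time.Instant;

    // Illustrative sketch of the KIP-358 migration pattern (not the real interfaces).
    interface ReadOnlyWindowStoreSketch<K, V> {

        @Deprecated // old long-ms API, exposed via Interactive Queries
        WindowIteratorSketch<V> fetch(K key, long timeFromMs, long timeToMs);

        // New Instant-based API for Interactive Queries callers.
        default WindowIteratorSketch<V> fetch(K key, Instant from, Instant to) {
            return fetch(key, from.toEpochMilli(), to.toEpochMilli());
        }
    }

    // The read-write store keeps an efficient long-based overload for hot-path Processor code.
    interface WindowStoreSketch<K, V> extends ReadOnlyWindowStoreSketch<K, V> {
        void put(K key, V value, long windowStartTimestampMs);
    }

    interface WindowIteratorSketch<V> { /* stand-in for WindowStoreIterator */ }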

On Wed, Sep 12, 2018 at 4:36 AM Nikolay Izhikov  wrote:

> Hello, Matthias.
>
> > His proposal is to deprecate existing methods on `ReadOnlyWindowStore`
> > and `ReadOnlySessionStore` and add them to `WindowStore` and `SessionStore`.
> > Does this make sense?
>
> You both are experienced Kafka developers, so yes, it does make sense to
> me :).
> Do we want to make this change in KIP-358, or is another KIP required?
>
> > Btw: the KIP misses `ReadOnlySessionStore` atm.
>
> Sorry, but I don't understand you.
> As far as I can see, there is only 2 methods in `ReadOnlySessionStore`.
> Which method should be migrated to Duration?
>
>
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlySessionStore.java
>
> On Tue, 11/09/2018 at 09:21 -0700, Matthias J. Sax wrote:
> > I talked to John offline about his last suggestions, that I originally
> > did not fully understand.
> >
> > His proposal is, to deprecate existing methods on `ReadOnlyWindowStore`
> > and `ReadOnlySessionStore` and add them to `WindowStore` and
> > `SessionStore` (note, all singular -- not to be confused with classes
> > names plural).
> >
> > Btw: the KIP misses `ReadOnlySessionStore` atm.
> >
> > The argument is, that the `ReadOnlyXxxStore` interfaces are only exposed
> > via Interactive Queries feature and for this part, using `long` is
> > undesired. However, for a `Processor` that reads/writes stores on the
> > hot code path, we would like to avoid the object creation overhead and
> > stay with `long`. Note, that a `Processor` would use the "read-write"
> > interfaces and thus, we can add the more efficient read methods using
> > `long` there.
> >
> > Does this make sense?
> >
> >
> > -Matthias
> >
> > On 9/11/18 12:20 AM, Nikolay Izhikov wrote:
> > > Hello, Guozhang, Bill.
> > >
> > > > 1) I'd suggest keeping `Punctuator#punctuate(long timestamp)` as is
> > >
> > > I am agree with you.
> > > Currently, `Punctuator` edits are not included in KIP.
> > >
> > > > 2) I'm fine with keeping KeyValueStore extending
> ReadOnlyKeyValueStore
> > >
> > > Great, currently, there is no suggested API change in `KeyValueStore`
> or `ReadOnlyKeyValueStore`.
> > >
> > > Seems, you agree with all KIP details.
> > > Can you vote, please? [1]
> > >
> > > [1]
> https://lists.apache.org/thread.html/e976352e7e42d459091ee66ac790b6a0de7064eac0c57760d50c983b@%3Cdev.kafka.apache.org%3E
> > >
> > >
> > > On Mon, 10/09/2018 at 19:49 -0400, Bill Bejeck wrote:
> > > > Hi Nikolay,
> > > >
> > > > I'm a +1 to points 1 and 2 above from Guozhang.
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > > > On Mon, Sep 10, 2018 at 6:58 PM Guozhang Wang 
> wrote:
> > > >
> > > > > Hello Nikolay,
> > > > >
> > > > > Thanks for picking this up! Just sharing my two cents:
> > > > >
> > > > > 1) I'd suggest keeping `Punctuator#punctuate(long timestamp)` as
> is since
> > > > > comparing with other places where we are replacing with Duration
> and
> > > > > Instant, this is not a user specified value as part of the DSL but
> rather a
> > > > > passed-in parameter, plus with high punctuation frequency creating
> a new
> > > > > instance of Instant may be costly.
> > > > >
> > > > > 2) I'm fine with keeping KeyValueStore extending
> ReadOnlyKeyValueStore with
> > > > > APIs of `long` as well as inheriting APIs of `Duration`.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > >
> > > > > On Mon, Sep 10, 2018 at 11:11 AM, Nikolay Izhikov <
> nizhi...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hello, Matthias.
> > > > > >
> > > > > > > (4) While I agree that we might want to deprecate it, I am not
> sure if
> > > > > >
> > > > > > this should be part of this KIP?
> > > > > > > Seems to be unrelated?
> > > > > > > Should this have been part of KIP-319?
> > > > > > > If yes, we might still want to updated this other KIP? WDYT?
> > > > > >
> > > > > > OK, I removed this deprecation from this KIP.
> > > > > >
> > > > > > Please, tell me, is there anything else that should be improved
> to make
> > > > > > this KIP ready to be implemented.
> > > > > >
> > > > > > On Fri, 07/09/2018 at 17:06 -0700, Matthias J. Sax wrote:
> > > > > > > (1) Sounds good to me, to just use IllegalArgumentException
> for both --
> > > > > > > and thanks for pointing out that Duration can be negative and
> we need
> > > > >
> > > > > to
> > > > > > > check for this. For the KIP, it would be nice to add to all
> methods
> > > > >
> > > > > than
> > > > > > > (even if we don't do it in the code but only document in
> JavaDocs).
> > > > > > >
> > > > > > > (2) I would argue for a new single method interface. Not sure
> about the
> > > > > > > name though.
> > > > > > >
> > > > > > > (3) Even if `#fetch(K, K, long, long)` and 

[jira] [Created] (KAFKA-7405) Support for graceful handling of corrupted records in Kafka consumer

2018-09-12 Thread Eugen Feller (JIRA)
Eugen Feller created KAFKA-7405:
---

 Summary: Support for graceful handling of corrupted records in 
Kafka consumer
 Key: KAFKA-7405
 URL: https://issues.apache.org/jira/browse/KAFKA-7405
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, streams
Affects Versions: 1.1.1, 0.11.0.3, 0.10.2.2
Reporter: Eugen Feller


We have run into issues several times where corrupted records cause the Kafka 
consumer to throw an error code 2 exception in the fetch layer that cannot be 
handled gracefully. Specifically, when using Kafka Streams we run into 
KAFKA-6977, which throws an IllegalStateException. It would be great if the Kafka 
consumer could be extended with a setting similar to 
[KIP-161|https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+deserialization+exception+handlers],
 that would allow one to cleanly ignore corrupted records.
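
For reference, the KIP-161 mechanism mentioned above is configured in Kafka Streams as shown below; note that it covers deserialization failures, not the fetch-layer corruption this ticket is about:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;

    public class DeserializationHandlerConfig {
        public static Properties streamsProps() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // KIP-161: log and skip records that fail deserialization instead of dying.
            props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
                      LogAndContinueExceptionHandler.class.getName());
            return props;
        }
    }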

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-trunk-jdk10 #477

2018-09-12 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H23 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Counting objects: 10373, done.
remote: Compressing objects: 100% (29/29), done.
Receiving objects: ... [git transfer progress output truncated]

[jira] [Created] (KAFKA-7406) Naming Join and Grouping Operations

2018-09-12 Thread Bill Bejeck (JIRA)
Bill Bejeck created KAFKA-7406:
--

 Summary: Naming Join and Grouping Operations
 Key: KAFKA-7406
 URL: https://issues.apache.org/jira/browse/KAFKA-7406
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Bill Bejeck
Assignee: Bill Bejeck
 Fix For: 2.1.0


To help make Streams compatible with topology changes, we will need to give 
users the ability to name some operators so that, after adjusting the topology, a 
rolling upgrade is possible.

This Jira is the first in this effort to allow for giving operators 
deterministic names.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-12 Thread Ron Dagostino
Ok, I am tempted to just say we go with the low-level approach since it is
the quickest and seems to meet the clear requirements. We can always adjust
later if we get to clarity on other requirements or we decide we need to
approach it differently for whatever reason.  But in the meantime, before
fully committing to this decision, I would appreciate another perspective
if someone has one.

Ron

On Wed, Sep 12, 2018 at 3:15 PM Rajini Sivaram 
wrote:

> Hi Ron,
>
> Yes, I would leave out retries from this KIP for now. In the future, if
> there is a requirement for supporting retries, we can consider it. I think
> we can support retries with either approach if we needed to, but it would
> be better to do it along with other changes required to support
> authentication servers that are not highly available.
>
> For maintainability, I am biased, so it will be good to get the perspective
> of others in the community :-)
>
> On Wed, Sep 12, 2018 at 7:47 PM, Ron Dagostino  wrote:
>
> > Hi Rajini.  Here is some feedback/some things I thought of.
> >
> > First, I just realized that, from a timing perspective, I am not sure
> > retry is going to be necessary.  The background login refresh thread
> > triggers re-authentication when it refreshes the client's credential.
> The
> > OAuth infrastructure has to be available in order for this refresh to
> > succeed (the background thread repeatedly retries if it can't refresh the
> > credential, and that retry loop handles any temporary outage).  If
> clients
> > are told to re-authenticate when the credential is refreshed **and they
> > actually re-authenticate at that point** then it is highly unlikely that
> > the OAuth infrastructure would fail within those intervening
> milliseconds.
> > So we don't need re-authentication retry in this KIP as long as retry
> > starts immediately.  The low-level prototype as currently coded doesn't
> > actually start re-authentication until the connection is subsequently
> used,
> > and that could take a while.  But then again, if the connection isn't
> used
> > for some period of time, then the lost value of having to abandon the
> > connection is lessened anyway.  Plus, as you pointed out, OAuth
> > infrastructure is assumed to be highly available.
> >
> > Does this make sense, and would you be willing to say that retry isn't a
> > necessary requirement?  I'm tempted but would like to hear your
> perspective
> > first.
> >
> > Interleaving/latency and maintainability (more than lines of code) are
> the
> > two remaining areas of comparison.  I did not add the maintainability
> issue
> > to the KIP yet, but before I do I thought I would address it here first
> to
> > see if we can come to consensus on it because I'm not sure I see the
> > high-level approach as being hard to maintain (yet -- I could be
> > convinced/convince myself; we'll see).  I do want to make sure we are on
> > the same page about what is required to add re-authentication support in
> > the high-level case.  Granted, the amount is zero in the low-level case,
> > and it doesn't get any better that that, but the amount in the high-level
> > case is very low -- just a few lines of code.  For example:
> >
> > KafkaAdminClient:
> > https://github.com/apache/kafka/pull/5582/commits/4fa70f38b9
> > d33428ff98b64a3a2bd668f5f28c38#diff-6869b8fccf6b098cbcb0676e8ceb26a7
> > It is the same few lines of code for KafkaConsumer, KafkaProducer,
> > WorkerGroupMember, and TransactionMarkerChannelManager
> >
> > The two synchronous I/O use cases are ControllerChannelManager and
> > ReplicaFetcherBlockingSend (via ReplicaFetcherThread), and they require a
> > little bit more -- but not much.
> >
> > Thoughts?
> >
> > Ron
> >
> > On Wed, Sep 12, 2018 at 1:57 PM Ron Dagostino  wrote:
> >
> > > Thanks, Rajini.  Before I digest/respond to that, here's an update
> that I
> > > just completed.
> > >
> > > I added a commit to the PR (https://github.com/apache/kafka/pull/5582/
> )
> > > that implements server-side kill of expired OAUTHBEARER connections.
> No
> > > tests yet since we still haven't settled on a final approach (low-level
> > vs.
> > > high-level).
> > >
> > > I also updated the KIP to reflect the latest discussion and PR as
> > follows:
> > >
> > >- Include support for brokers killing connections as part of this
> KIP
> > >(rather than deferring it to a future KIP as was originally
> > mentioned; the
> > >PR now includes it as mentioned -- it was very easy to add)
> > >- Added metrics (they will mirror existing ones related to
> > >authentications; I have not added those to the PR)
> > >- Updated the implementation description to reflect the current
> state
> > >of the PR, which is a high-level, one-size-fits-all approach (as
> > opposed to
> > >my initial, even-higher-level approach)
> > >- Added a "Rejected Alternative" for the first version of the PR,
> > >which injected requests directly into synchronous I/O clients'
> queues
> > 
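
For readers following the refresh discussion above, the background login refresh thread Ron describes is driven by the existing sasl.login.refresh.* client settings; a hedged sketch follows (the values are illustrative, and the re-authentication trigger itself is what this KIP would add on top).

    import java.util.Properties;
    import org.apache.kafka.clients.CommonClientConfigs;
    import org.apache.kafka.common.config.SaslConfigs;

    public class OauthClientConfigSketch {
        public static Properties saslProps() {
            Properties props = new Properties();
            props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
            props.put(SaslConfigs.SASL_MECHANISM, "OAUTHBEARER");
            // Refresh the token at roughly 80% of its lifetime, with a little
            // jitter, so a re-authentication triggered at refresh time happens
            // while the old credential is still valid (illustrative values).
            props.put(SaslConfigs.SASL_LOGIN_REFRESH_WINDOW_FACTOR, "0.8");
            props.put(SaslConfigs.SASL_LOGIN_REFRESH_WINDOW_JITTER, "0.05");
            return props;
        }
    }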

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-12 Thread Rajini Sivaram
Hi Ron,

Yes, I would leave out retries from this KIP for now. In the future, if
there is a requirement for supporting retries, we can consider it. I think
we can support retries with either approach if we needed to, but it would
be better to do it along with other changes required to support
authentication servers that are not highly available.

For maintainability, I am biased, so it will be good to get the perspective
of others in the community :-)

On Wed, Sep 12, 2018 at 7:47 PM, Ron Dagostino  wrote:

> Hi Rajini.  Here is some feedback/some things I thought of.
>
> First, I just realized that, from a timing perspective, I am not sure
> retry is going to be necessary.  The background login refresh thread
> triggers re-authentication when it refreshes the client's credential.  The
> OAuth infrastructure has to be available in order for this refresh to
> succeed (the background thread repeatedly retries if it can't refresh the
> credential, and that retry loop handles any temporary outage).  If clients
> are told to re-authenticate when the credential is refreshed **and they
> actually re-authenticate at that point** then it is highly unlikely that
> the OAuth infrastructure would fail within those intervening milliseconds.
> So we don't need re-authentication retry in this KIP as long as retry
> starts immediately.  The low-level prototype as currently coded doesn't
> actually start re-authentication until the connection is subsequently used,
> and that could take a while.  But then again, if the connection isn't used
> for some period of time, then the lost value of having to abandon the
> connection is lessened anyway.  Plus, as you pointed out, OAuth
> infrastructure is assumed to be highly available.
>
> Does this make sense, and would you be willing to say that retry isn't a
> necessary requirement?  I'm tempted but would like to hear your perspective
> first.
>
> Interleaving/latency and maintainability (more than lines of code) are the
> two remaining areas of comparison.  I did not add the maintainability issue
> to the KIP yet, but before I do I thought I would address it here first to
> see if we can come to consensus on it because I'm not sure I see the
> high-level approach as being hard to maintain (yet -- I could be
> convinced/convince myself; we'll see).  I do want to make sure we are on
> the same page about what is required to add re-authentication support in
> the high-level case.  Granted, the amount is zero in the low-level case,
> and it doesn't get any better that that, but the amount in the high-level
> case is very low -- just a few lines of code.  For example:
>
> KafkaAdminClient:
> https://github.com/apache/kafka/pull/5582/commits/4fa70f38b9
> d33428ff98b64a3a2bd668f5f28c38#diff-6869b8fccf6b098cbcb0676e8ceb26a7
> It is the same few lines of code for KafkaConsumer, KafkaProducer,
> WorkerGroupMember, and TransactionMarkerChannelManager
>
> The two synchronous I/O use cases are ControllerChannelManager and
> ReplicaFetcherBlockingSend (via ReplicaFetcherThread), and they require a
> little bit more -- but not much.
>
> Thoughts?
>
> Ron
>
> On Wed, Sep 12, 2018 at 1:57 PM Ron Dagostino  wrote:
>
> > Thanks, Rajini.  Before I digest/respond to that, here's an update that I
> > just completed.
> >
> > I added a commit to the PR (https://github.com/apache/kafka/pull/5582/)
> > that implements server-side kill of expired OAUTHBEARER connections.  No
> > tests yet since we still haven't settled on a final approach (low-level
> vs.
> > high-level).
> >
> > I also updated the KIP to reflect the latest discussion and PR as
> follows:
> >
> >- Include support for brokers killing connections as part of this KIP
> >(rather than deferring it to a future KIP as was originally
> mentioned; the
> >PR now includes it as mentioned -- it was very easy to add)
> >- Added metrics (they will mirror existing ones related to
> >authentications; I have not added those to the PR)
> >- Updated the implementation description to reflect the current state
> >of the PR, which is a high-level, one-size-fits-all approach (as
> opposed to
> >my initial, even-higher-level approach)
> >- Added a "Rejected Alternative" for the first version of the PR,
> >which injected requests directly into synchronous I/O clients' queues
> >- Added a "Rejected Alternative" for the low-level approach as
> >suggested, but of course we have not formally decided to reject this
> >approach or adopt the current PR implementation.
> >
> > I'll think about where we stand some more before responding again.
> Thanks
> > for the above reply.
> >
> > Ron
> >
> > On Wed, Sep 12, 2018 at 1:36 PM Rajini Sivaram 
> > wrote:
> >
> >> Hi Ron,
> >>
> >> Thank you for summarising, I think it covers the differences between the
> >> two approaches well.
> >>
> >> A few minor points to answer the questions in there:
> >>
> >> 1) When re-authetication is initiated in the Selector during 

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-12 Thread Guozhang Wang
Hello Bill,

I made a pass over your proposal and here are some questions:

1. For Joined names, the current proposal is to define the repartition
topic names as

* [app-id]-this-[join-name]-repartition

* [app-id]-other-[join-name]-repartition


And if [join-name] not specified, stay the same, which is:

* [previous-processor-name]-repartition for both Stream-Stream (S-S) join
and S-T join

I think it is more natural to rename it to

* [app-id]-[join-name]-left-repartition

* [app-id]-[join-name]-right-repartition


2. I'd suggest to use the name to also define the corresponding processor
names accordingly, in addition to the repartition topic names. Note that
for joins, this may be overlapping with KIP-307, as it also has proposals
for defining processor names for join operators as well.

3. Could you also specify how this would affect the optimization for
merging multiple repartition topics?

4. In the "Compatibility, Deprecation, and Migration Plan" section, could
you also mention the following scenarios, if any of the upgrade path would
be changed:

 a) changing user DSL code: under which scenarios users can now do a
rolling bounce instead of resetting applications.

 b) upgrading from older version to new version, with all the names
specified, and with optimization turned on. E.g. say we have the code
written in 2.1 with all names specified, and now upgrading to 2.2 with new
optimizations that may potentially change the repartition topics. Is that
always safe to do?



Guozhang
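
To make the naming question concrete, a hypothetical sketch follows; the join itself uses the existing DSL, while the naming call and the resulting topic names are assumptions about the proposal rather than settled API.

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Joined;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;

    public class NamedJoinSketch {
        public static void build() {
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> orders = builder.stream("orders");     // illustrative topics
            KTable<String, String> customers = builder.table("customers");

            // If the stream needs repartitioning before this join, the proposal
            // would let a user-supplied name (e.g. "order-enrichment", passed via
            // the Joined config object; the exact method is what the KIP defines)
            // yield a topic like [app-id]-order-enrichment-left-repartition
            // instead of a generated processor-based name.
            KStream<String, String> enriched = orders.join(
                    customers,
                    (order, customer) -> order + "," + customer,
                    Joined.with(Serdes.String(), Serdes.String(), Serdes.String()));

            enriched.to("enriched-orders");
        }
    }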


On Wed, Sep 12, 2018 at 4:52 PM, Bill Bejeck  wrote:

> All, I'd like to start a discussion on KIP-372 for the naming of joins and
> grouping operations in Kafka Streams.
>
> The KIP page can be found here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 372%3A+Naming+Joins+and+Grouping
>
> I look forward to feedback and comments.
>
> Thanks,
> Bill
>



-- 
-- Guozhang


[DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-12 Thread Bill Bejeck
All, I'd like to start a discussion on KIP-372 for the naming of joins and
grouping operations in Kafka Streams.

The KIP page can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-372%3A+Naming+Joins+and+Grouping

I look forward to feedback and comments.

Thanks,
Bill


Build failed in Jenkins: kafka-trunk-jdk10 #478

2018-09-12 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] MINOR: Remove deprecated Metric.value() method usage (#5626)

--
[...truncated 2.22 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis STARTED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis PASSED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime STARTED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime PASSED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep PASSED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep PASSED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldStoreAndReturnStateStores STARTED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldStoreAndReturnStateStores PASSED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldCaptureOutputRecordsUsingTo STARTED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldCaptureOutputRecordsUsingTo PASSED

org.apache.kafka.streams.MockProcessorContextTest > shouldCaptureOutputRecords 
STARTED

org.apache.kafka.streams.MockProcessorContextTest > shouldCaptureOutputRecords 
PASSED

org.apache.kafka.streams.MockProcessorContextTest > 
fullConstructorShouldSetAllExpectedAttributes STARTED

org.apache.kafka.streams.MockProcessorContextTest > 
fullConstructorShouldSetAllExpectedAttributes PASSED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldCaptureCommitsAndAllowReset STARTED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldCaptureCommitsAndAllowReset PASSED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldThrowIfForwardedWithDeprecatedChildName STARTED

org.apache.kafka.streams.MockProcessorContextTest > 
shouldThrowIfForwardedWithDeprecatedChildName PASSED


Re: [VOTE] KIP-110: Add Codec for ZStandard Compression

2018-09-12 Thread Jason Gustafson
Great contribution! +1

On Wed, Sep 12, 2018 at 10:20 AM, Manikumar 
wrote:

> +1 (non-binding).
>
> Thanks for the KIP.
>
> On Wed, Sep 12, 2018 at 10:44 PM Ismael Juma  wrote:
>
> > Thanks for the KIP, +1 (binding).
> >
> > Ismael
> >
> > On Wed, Sep 12, 2018 at 10:02 AM Dongjin Lee  wrote:
> >
> > > Hello, I would like to start a VOTE on KIP-110: Add Codec for ZStandard
> > > Compression.
> > >
> > > The KIP:
> > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 110%3A+Add+Codec+for+ZStandard+Compression
> > > Discussion thread:
> > > https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html
> > >
> > > Thanks,
> > > Dongjin
> > >
> > > --
> > > *Dongjin Lee*
> > >
> > > *A hitchhiker in the mathematical world.*
> > >
> > > *github:  github.com/dongjinleekr
> > > linkedin:
> > kr.linkedin.com/in/dongjinleekr
> > > slideshare:
> > > www.slideshare.net/dongjinleekr
> > > *
> > >
> >
>
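
For context, the codec under vote would be selected through the existing producer compression setting, roughly as sketched below (it is an assumption here that the KIP exposes it as the value "zstd"; the bootstrap address is a placeholder).

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class ZstdProducerConfigSketch {
        public static Properties props() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            // Once the KIP lands, "zstd" would join gzip/snappy/lz4 as a valid value.
            props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");
            return props;
        }
    }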


Re: [DISCUSS] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-09-12 Thread Matthias J. Sax
Great!

I did not double check the ReadOnlySessionStore interface before, and
just assumed it would take a timestamp, too. My bad.

Please update the KIP for ReadOnlyWindowStore and WindowStore.

I like the KIP as-is. Feel free to start a VOTE thread. Even if there
might be minor follow up comments, we can vote in parallel.


-Matthias

On 9/12/18 1:06 PM, John Roesler wrote:
> Hi Nikolay,
> 
> Yes, the changes we discussed for ReadOnlyXxxStore and XxxStore should be
> in this KIP.
> 
> And you're right, it seems like ReadOnlySessionStore is not necessary to
> touch, since it doesn't reference any `long` timestamps.
> 
> Thanks,
> -John
> 
> On Wed, Sep 12, 2018 at 4:36 AM Nikolay Izhikov  wrote:
> 
>> Hello, Matthias.
>>
>>> His proposal is, to deprecate existing methods on `ReadOnlyWindowStore`
>>> and `ReadOnlySessionStore` and add them to `WindowStore` and `SessionStore`.
>>> Does this make sense?
>>
>> You both are experienced Kafka developers, so yes, it does make sense to
>> me :).
>> Do we want to make this change in KIP-358, or does it require another KIP?
>>
>>> Btw: the KIP misses `ReadOnlySessionStore` atm.
>>
>> Sorry, but I don't understand you.
>> As far as I can see, there are only 2 methods in `ReadOnlySessionStore`.
>> Which method should be migrated to Duration?
>>
>>
>> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlySessionStore.java
>>
>> On Tue, 11/09/2018 at 09:21 -0700, Matthias J. Sax wrote:
>>> I talked to John offline about his last suggestions, that I originally
>>> did not fully understand.
>>>
>>> His proposal is, to deprecate existing methods on `ReadOnlyWindowStore`
>>> and `ReadOnlySessionStore` and add them to `WindowStore` and
>>> `SessionStore` (note, all singular -- not to be confused with classes
>>> names plural).
>>>
>>> Btw: the KIP misses `ReadOnlySessionStore` atm.
>>>
>>> The argument is, that the `ReadOnlyXxxStore` interfaces are only exposed
>>> via Interactive Queries feature and for this part, using `long` is
>>> undesired. However, for a `Processor` that reads/writes stores on the
>>> hot code path, we would like to avoid the object creation overhead and
>>> stay with `long`. Note, that a `Processor` would use the "read-write"
>>> interfaces and thus, we can add the more efficient read methods using
>>> `long` there.
>>>
>>> Does this make sense?
>>>
>>>
>>> -Matthias
>>>
>>> On 9/11/18 12:20 AM, Nikolay Izhikov wrote:
 Hello, Guozhang, Bill.

> 1) I'd suggest keeping `Punctuator#punctuate(long timestamp)` as is

 I agree with you.
 Currently, `Punctuator` edits are not included in KIP.

> 2) I'm fine with keeping KeyValueStore extending
>> ReadOnlyKeyValueStore

 Great, currently, there is no suggested API change in `KeyValueStore`
>> or `ReadOnlyKeyValueStore`.

 Seems, you agree with all KIP details.
 Can you vote, please? [1]

 [1]
>> https://lists.apache.org/thread.html/e976352e7e42d459091ee66ac790b6a0de7064eac0c57760d50c983b@%3Cdev.kafka.apache.org%3E


 On Mon, 10/09/2018 at 19:49 -0400, Bill Bejeck wrote:
> Hi Nikolay,
>
> I'm a +1 to points 1 and 2 above from Guozhang.
>
> Thanks,
> Bill
>
> On Mon, Sep 10, 2018 at 6:58 PM Guozhang Wang 
>> wrote:
>
>> Hello Nikolay,
>>
>> Thanks for picking this up! Just sharing my two cents:
>>
>> 1) I'd suggest keeping `Punctuator#punctuate(long timestamp)` as
>> is since
>> comparing with other places where we are replacing with Duration
>> and
>> Instant, this is not a user specified value as part of the DSL but
>> rather a
>> passed-in parameter, plus with high punctuation frequency creating
>> a new
>> instance of Instant may be costly.
>>
>> 2) I'm fine with keeping KeyValueStore extending
>> ReadOnlyKeyValueStore with
>> APIs of `long` as well as inheriting APIs of `Duration`.
>>
>>
>> Guozhang
>>
>>
>> On Mon, Sep 10, 2018 at 11:11 AM, Nikolay Izhikov <
>> nizhi...@apache.org>
>> wrote:
>>
>>> Hello, Matthias.
>>>
 (4) While I agree that we might want to deprecate it, I am not
>> sure if
>>>
>>> this should be part of this KIP?
 Seems to be unrelated?
 Should this have been part of KIP-319?
 If yes, we might still want to update this other KIP? WDYT?
>>>
>>> OK, I removed this deprecation from this KIP.
>>>
>>> Please, tell me, is there anything else that should be improved
>> to make
>>> this KIP ready to be implemented.
>>>
>>> On Fri, 07/09/2018 at 17:06 -0700, Matthias J. Sax wrote:
 (1) Sounds good to me, to just use IllegalArgumentException
>> for both --
 and thanks for pointing out that Duration can be negative and
>> we need
>>
>> to
 check for this. For the KIP, it would be nice to add to all
>> methods
>>
>> than
 (even if we don't do it in 
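
A minimal sketch of the negative-Duration check discussed in point (1), assuming each Duration-accepting method validates before converting to milliseconds; the helper name and message format are illustrative, not the KIP's final API.

    import java.time.Duration;

    public final class DurationValidation {
        // Reject null or negative Durations with an IllegalArgumentException,
        // then hand back the millisecond value used internally.
        public static long validateMillis(final Duration duration, final String name) {
            if (duration == null) {
                throw new IllegalArgumentException(name + " cannot be null");
            }
            if (duration.isNegative()) {
                throw new IllegalArgumentException(name + " cannot be negative");
            }
            return duration.toMillis();
        }
    }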

Build failed in Jenkins: kafka-trunk-jdk8 #2961

2018-09-12 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] MINOR: Remove deprecated Metric.value() method usage (#5626)

--
[...truncated 2.69 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 

Re: [VOTE] KIP-110: Add Codec for ZStandard Compression

2018-09-12 Thread Harsha
+1 (binding).

Thanks,
Harsha

On Wed, Sep 12, 2018, at 4:56 PM, Jason Gustafson wrote:
> Great contribution! +1
> 
> On Wed, Sep 12, 2018 at 10:20 AM, Manikumar 
> wrote:
> 
> > +1 (non-binding).
> >
> > Thanks for the KIP.
> >
> > On Wed, Sep 12, 2018 at 10:44 PM Ismael Juma  wrote:
> >
> > > Thanks for the KIP, +1 (binding).
> > >
> > > Ismael
> > >
> > > On Wed, Sep 12, 2018 at 10:02 AM Dongjin Lee  wrote:
> > >
> > > > Hello, I would like to start a VOTE on KIP-110: Add Codec for ZStandard
> > > > Compression.
> > > >
> > > > The KIP:
> > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 110%3A+Add+Codec+for+ZStandard+Compression
> > > > Discussion thread:
> > > > https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html
> > > >
> > > > Thanks,
> > > > Dongjin
> > > >
> > > > --
> > > > *Dongjin Lee*
> > > >
> > > > *A hitchhiker in the mathematical world.*
> > > >
> > > > *github:  github.com/dongjinleekr
> > > > linkedin:
> > > kr.linkedin.com/in/dongjinleekr
> > > > slideshare:
> > > > www.slideshare.net/dongjinleekr
> > > > *
> > > >
> > >
> >


Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-09-12 Thread Jan Filipiak


On 11.09.2018 18:00, Adam Bellemare wrote:

Hi Guozhang

Current highwater mark implementation would grow endlessly based on 
primary key of original event. It is a pair of (key>, ). This is used to 
differentiate between late arrivals and new updates. My newest 
proposal would be to replace it with a Windowed state store of 
Duration N. This would allow the same behaviour, but cap the size 
based on time. This should allow for all late-arriving events to be 
processed, and should be customizable by the user to tailor to their 
own needs (ie: perhaps just 10 minutes of window, or perhaps 7 days...).
Hi Adam, using time-based retention can do the trick here, even if I 
would still like to see the automatic repartitioning be optional, since I 
would just reshuffle again. With a windowed store I am a little bit 
sceptical about how to determine the window, so essentially one could run 
into problems when a rapid change happens near a window border. I will 
check your implementation in detail; if it's problematic, we could still 
check _all_ windows on read with not too bad a performance impact, I guess. 
I will let you know whether the implementation is correct as is. I 
would not like to assume that offset(A) < offset(B) => timestamp(A) 
< timestamp(B); I think we can't expect that.
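
A minimal sketch of the time-bounded highwater-mark store discussed above, assuming the Duration-based Stores.persistentWindowStore overload; the store name, retention, window size, and value type are illustrative only.

    import java.time.Duration;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.state.StoreBuilder;
    import org.apache.kafka.streams.state.Stores;
    import org.apache.kafka.streams.state.WindowStore;

    public class TimeBoundedHighwaterStore {
        public static StoreBuilder<WindowStore<String, Long>> builder() {
            return Stores.windowStoreBuilder(
                    Stores.persistentWindowStore(
                            "join-highwater",        // hypothetical store name
                            Duration.ofDays(7),      // retention: how long late arrivals stay resolvable
                            Duration.ofMinutes(10),  // window size
                            false),                  // do not retain duplicates
                    Serdes.String(),                 // original primary key
                    Serdes.Long());                  // highest offset seen for that key (assumed)
        }
    }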



@Jan
I believe I understand what you mean now - thanks for the diagram, it 
did really help. You are correct that I do not have the original 
primary key available, and I can see that if it was available then you 
would be able to add and remove events from the Map. That being said, 
I encourage you to finish your diagrams / charts just for clarity for 
everyone else.


Yeah, 100%, this giphy thing is just really hard work, but I understand 
the benefits for the rest. Sorry about the original primary key: we have 
join and group-by implemented on our own in the PAPI and are basically not 
using any DSL (just the abstraction). I completely missed that in the 
original DSL it's not there and just assumed it. Total brain mess-up on my 
end. I will finish the chart as soon as I get a quiet evening this week.


My follow up question for you is, won't the Map stay inside the State 
Store indefinitely after all of the changes have propagated? Isn't 
this effectively the same as a highwater mark state store?
The thing is that if the map is empty, the subtractor is going to return `null` 
and the key is removed from the keyspace. But there is going to be a 
store, 100%. The good thing is that I can use this store directly for
materialize() / enableSendingOldValues(); it is a regular store, satisfying 
all guarantees needed for a further groupBy / join. The windowed store is 
not keeping the values, so for the next stateful operation we would
need to instantiate an extra store, or have the window store also 
keep the values.


Long story short: if we can flip in a custom group-by before 
repartitioning to the original primary key, I think it would help users 
big time in building efficient apps. Given the original primary 
key issue, I understand that we do not have a solid foundation to build on.
Leaving the primary-key carry-along to the user is very unfortunate. I can 
understand how the decision goes like that, but I do not think it's a good decision.







Thanks
Adam






On Tue, Sep 11, 2018 at 10:07 AM, Prajakta Dumbre 
<dumbreprajakta...@gmail.com> wrote:


please remove me from this group

On Tue, Sep 11, 2018 at 1:29 PM Jan Filipiak
<jan.filip...@trivago.com> wrote:

> Hi Adam,
>
> give me some time, will make such a chart. last time i didn't
get along
> well with giphy and ruined all your charts.
> Hopefully i can get it done today
>
> On 08.09.2018 16:00, Adam Bellemare wrote:
> > Hi Jan
> >
> > I have included a diagram of what I attempted on the KIP.
> >
>

https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable#KIP-213Supportnon-keyjoininginKTable-GroupBy+Reduce/Aggregate


> >
> > I attempted this back at the start of my own implementation of
this
> > solution, and since I could not get it to work I have since
discarded the
> > code. At this point in time, if you wish to continue pursuing
for your
> > groupBy solution, I ask that you please create a diagram on
the KIP
> > carefully explaining your solution. Please feel free to use
the image I
> > just posted as a starting point. I am having trouble
understanding your
> > explanations but I think that a carefully constructed diagram
will clear
> up
> > any misunderstandings. Alternately, please post a
comprehensive PR with
> > your solution. I can only guess at what you mean, and since I
value my
> own
> > time as much as you value yours, I believe it 

Re: [VOTE] KIP-367 Introduce close(Duration) to Producer and AdminClient instead of close(long, TimeUnit)

2018-09-12 Thread vito jeng
+1



---
Vito

On Mon, Sep 10, 2018 at 4:52 PM, Dongjin Lee  wrote:

> +1. (Non-binding)
>
> On Mon, Sep 10, 2018 at 4:13 AM Matthias J. Sax 
> wrote:
>
> > Thanks a lot for the KIP.
> >
> > +1 (binding)
> >
> >
> > -Matthias
> >
> >
> > On 9/8/18 11:27 AM, Chia-Ping Tsai wrote:
> > > Hi All,
> > >
> > > I'd like to put KIP-367 to the vote.
> > >
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=89070496
> > >
> > > --
> > > Chia-Ping
> > >
> >
> >
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>
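
For reference, the change under vote replaces close(long, TimeUnit) with a Duration overload; a minimal sketch (serdes and bootstrap address are placeholders):

    import java.time.Duration;
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class CloseWithDuration {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            // Wait up to 5 seconds for in-flight sends, instead of close(5, TimeUnit.SECONDS).
            producer.close(Duration.ofSeconds(5));
        }
    }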


Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-12 Thread Matthias J. Sax
Follow up comments:

1) We should either use `[app-id]-this|other-[join-name]-repartition` or
`[app-id]-[join-name]-left|right-repartition`, but we should not change
the pattern depending on whether the user specifies a name or not. I am fine
with both patterns---just want to make sure we stick with one.

2) I don't see why we would need to do this in this KIP. KIP-307 seems
to be orthogonal, and thus KIP-372 should not change any processor
names; instead, KIP-307 should define a holistic strategy for all processors.
Otherwise, we might end up with different strategies or revert what we
decide in this KIP if it's not compatible with KIP-307.


-Matthias


On 9/12/18 6:28 PM, Guozhang Wang wrote:
> Hello Bill,
> 
> I made a pass over your proposal and here are some questions:
> 
> 1. For Joined names, the current proposal is to define the repartition
> topic names as
> 
> * [app-id]-this-[join-name]-repartition
> 
> * [app-id]-other-[join-name]-repartition
> 
> 
> And if [join-name] not specified, stay the same, which is:
> 
> * [previous-processor-name]-repartition for both Stream-Stream (S-S) join
> and S-T join
> 
> I think it is more natural to rename it to
> 
> * [app-id]-[join-name]-left-repartition
> 
> * [app-id]-[join-name]-right-repartition
> 
> 
> 2. I'd suggest to use the name to also define the corresponding processor
> names accordingly, in addition to the repartition topic names. Note that
> for joins, this may be overlapping with KIP-307, as it also has proposals
> for defining processor names for join operators as well.
> 
> 3. Could you also specify how this would affect the optimization for
> merging multiple repartition topics?
> 
> 4. In the "Compatibility, Deprecation, and Migration Plan" section, could
> you also mention the following scenarios, if any of the upgrade path would
> be changed:
> 
>  a) changing user DSL code: under which scenarios users can now do a
> rolling bounce instead of resetting applications.
> 
>  b) upgrading from older version to new version, with all the names
> specified, and with optimization turned on. E.g. say we have the code
> written in 2.1 with all names specified, and now upgrading to 2.2 with new
> optimizations that may potentially change the repartition topics. Is that
> always safe to do?
> 
> 
> 
> Guozhang
> 
> 
> On Wed, Sep 12, 2018 at 4:52 PM, Bill Bejeck  wrote:
> 
>> All, I'd like to start a discussion on KIP-372 for the naming of joins and
>> grouping operations in Kafka Streams.
>>
>> The KIP page can be found here:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 372%3A+Naming+Joins+and+Grouping
>>
>> I look forward to feedback and comments.
>>
>> Thanks,
>> Bill
>>
> 
> 
> 



signature.asc
Description: OpenPGP digital signature


Build failed in Jenkins: kafka-trunk-jdk10 #479

2018-09-12 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H23 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: remote: Counting objects: 10412, done.
remote: Compressing objects: 100% (46/46), done.
Receiving objects: ... [git transfer progress output truncated]

Re: [VOTE] KIP-367 Introduce close(Duration) to Producer and AdminClient instead of close(long, TimeUnit)

2018-09-12 Thread Harsha
+1 (Binding).
Thanks,
Harsha

On Wed, Sep 12, 2018, at 9:06 PM, vito jeng wrote:
> +1
> 
> 
> 
> ---
> Vito
> 
> On Mon, Sep 10, 2018 at 4:52 PM, Dongjin Lee  wrote:
> 
> > +1. (Non-binding)
> >
> > On Mon, Sep 10, 2018 at 4:13 AM Matthias J. Sax 
> > wrote:
> >
> > > Thanks a lot for the KIP.
> > >
> > > +1 (binding)
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 9/8/18 11:27 AM, Chia-Ping Tsai wrote:
> > > > Hi All,
> > > >
> > > > I'd like to put KIP-367 to the vote.
> > > >
> > > >
> > > https://cwiki.apache.org/confluence/pages/viewpage.
> > action?pageId=89070496
> > > >
> > > > --
> > > > Chia-Ping
> > > >
> > >
> > >
> >
> > --
> > *Dongjin Lee*
> >
> > *A hitchhiker in the mathematical world.*
> >
> > *github:  github.com/dongjinleekr
> > linkedin: kr.linkedin.com/in/dongjinleekr
> > slideshare:
> > www.slideshare.net/dongjinleekr
> > *
> >


Re: A question about kafka streams API

2018-09-12 Thread Adam Bellemare
Hi Yui Yoi

Preface: I am not familiar with the spring framework.

"Earliest" when it comes to consuming from Kafka means, "Start reading from
the first message in the topic, *if there is no offset stored for that
consumer group*". It sounds like you are expecting it to re-read each
message whenever a new message comes in. This is not going to happen, as
there will be a committed offset and "earliest" will no longer be used. If
you were to use "latest" instead, if a consumer is started that does not
have a valid offset, it would use the very latest message in the topic as
the starting offset for message consumption.

Now, if you are using the same consumer group each time you run the
application (which it seems is true, as you have "test-group" hardwired in
your application.yml), but you do not tear down your local cluster and
clear out its state, you will indeed see the behaviour you describe.
Remember that Kafka is durable, and maintains the offsets when the
individual applications go away. So you are probably seeing this:

1) start application instance 1. It realizes it has no offset when it tries
to register as a consumer on the input topic, so it creates a new consumer
entry for "earliest" for your consumer group.
2) send message "asd"
3) application instance 1 receives "asd", processes it, and updates the
offset (offset head = 1)
4) Terminate instance 1
5) Start application instance 2. It detects correctly that consumer group
"test-group" is available and reads that offset as its starting point.
6) send message "{}"
7) application instance 2 receives "{}", processes it, and updates the
offset (offset head = 2)
*NOTE:* App instance 2 NEVER received "asd", nor should it, as it is
telling the Kafka cluster that it belongs to the same consumer group as
application 1.

Hope this helps,

Adam
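
A plain-consumer sketch of the two settings in play here; the same semantics apply to the consumer embedded in a Streams or Spring application (the group id matches the one mentioned above, the bootstrap address is a placeholder).

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class OffsetResetSketch {
        public static KafkaConsumer<String, String> consumer() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group");
            // Only used when the group has no committed offset for a partition;
            // once "test-group" has committed, restarts resume from that offset,
            // not from the beginning of the topic.
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            return new KafkaConsumer<>(props);
        }
    }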





On Wed, Sep 12, 2018 at 6:57 AM, Yui Yoi  wrote:

> TL;DR:
> my streams application skips uncommitted messages
>
> Hello,
> I'm using the streams API via the Spring framework and experiencing a weird
> behavior which I would like to get an explanation for:
> First of all: the attached zip is my test project; I used the Kafka CLI to run
> a localhost broker and zookeeper.
>
> What is happening is as follows:
> 1. I send an invalid message, such as "asd", and my consumer has a lag and
> error message as expected
> 2. I send a valid message such as "{}", but instead of rereading the first
> message as expected from an "earliest" configured application - my
> application reads the latest message, commits it, and ignores the one in
> error, thus I have no lag!
> 3. When I'm running my application when there are uncommitted messages -
> my application reads the FIRST not-committed message, as if it IS an
> "earliest" configured application!
>
> In your documentation you assure "at least once" behavior, but according
> to point 2 above, my application does not receive those messages
> even once (as I said, those messages are uncommitted)
>
> My guess is that it has something to do with the stream's cache... I would
> very much like to have an explanation or even a solution
>
> I'm turning to you as a last resort, after long weeks of research and
> experiments
>
> Thanks a lot
>


Re: A question about kafka streams API

2018-09-12 Thread Yui Yoi
Hi Adam,
Thanks a lot for the rapid response, it did help!

Let me ask one more simple question, though: can I make a streams application
get stuck on an invalid message and not consume any further messages?

Thanks again

On Wed, Sep 12, 2018 at 2:35 PM Adam Bellemare 
wrote:

> Hi Yui Yoi
>
> Preface: I am not familiar with the spring framework.
>
> "Earliest" when it comes to consuming from Kafka means, "Start reading from
> the first message in the topic, *if there is no offset stored for that
> consumer group*". It sounds like you are expecting it to re-read each
> message whenever a new message comes in. This is not going to happen, as
> there will be a committed offset and "earliest" will no longer be used. If
> you were to use "latest" instead, if a consumer is started that does not
> have a valid offset, it would use the very latest message in the topic as
> the starting offset for message consumption.
>
> Now, if you are using the same consumer group each time you run the
> application (which it seems is true, as you have "test-group" hardwired in
> your application.yml), but you do not tear down your local cluster and
> clear out its state, you will indeed see the behaviour you describe.
> Remember that Kafka is durable, and maintains the offsets when the
> individual applications go away. So you are probably seeing this:
>
> 1) start application instance 1. It realizes it has no offset when it tries
> to register as a consumer on the input topic, so it creates a new consumer
> entry for "earliest" for your consumer group.
> 2) send message "asd"
> 3) application instance 1 receives "asd", processes it, and updates the
> offset (offset head = 1)
> 4) Terminate instance 1
> 5) Start application instance 2. It detects correctly that consumer group
> "test-group" is available and reads that offset as its starting point.
> 6) send message "{}"
> 7) application instance 2 receives "{}", processes it, and updates the
> offset (offset head = 2)
> *NOTE:* App instance 2 NEVER received "asd", nor should it, as it is
> telling the Kafka cluster that it belongs to the same consumer group as
> application 1.
>
> Hope this helps,
>
> Adam
>
>
>
>
>
> On Wed, Sep 12, 2018 at 6:57 AM, Yui Yoi  wrote:
>
> > TL;DR:
> > my streams application skips uncommitted messages
> >
> > Hello,
> > I'm using streams API via spring framework and experiencing a weird
> > behavior which I would like to get an explanation to:
> > First of all: The attached zip is my test project, I used kafka cli to
> run
> > a localhost broker and zookeeper
> >
> > what is happening is as follows:
> > 1. I send an invalid message, such as "asd", and my consumer has a lag
> and
> > error message as expected
> > 2. I send a valid message such as "{}", but instead of rereading the
> first
> > message as expected from an "earliest" configured application - my
> > application reads the latest message, commits it and ignoring the one in
> > error, thus i have no lag!
> > 3. When I'm running my application when there are uncommitted messages -
> > my application reads the FIRST not committed message, as if it IS an
> > "earliest" configured application!
> >
> > In your documentation you assure "at least once" behavior, but according
> > to section 2. it happens so my application does not receive those
> messages
> > not even once (as i said, those messages are uncommitted)
> >
> > My guess is that it has something to do with the stream's cache... I
> would
> > very like to have an explanation or even a solution
> >
> > I'm turning to you as a last resort, after long weeks of research and
> > experiments
> >
> > Thanks alot
> >
>


Re: A question about kafka streams API

2018-09-12 Thread Adam Bellemare
Hi Yui Yoi


Keep in mind that Kafka Consumers don't traditionally request only a single
message at a time, but instead requests them in batches. This allows for
much higher throughput, but does result in the scenario of "at-least-once"
processing. Generally what will happen in this scenario is the following:

1) Client requests the next set of messages from offset (t). For example,
assume it gets 10 messages and message 6 is "bad".
2) The client's processor will then process the messages one at a time.
Note that the offsets are not committed after the message is processed, but
only at the end of the batch.
3) The bad message is hit by the processor. At this point you can decide to
skip the message, throw an exception, etc.
4a) If you decide to skip the message, processing will continue. Once all
10 messages are processed, the new offset (t+10) offset is committed back
to Kafka.
4b) If you decide to throw an exception and terminate your app, you will
have still processed the messages that came before the bad message. Because
the offset (t+10) is not committed, the next time you start the app it will
consume from offset t, and those messages will be processed again. This is
"at-least-once" processing.


Now, if you need exactly-once processing, you have two choices -
1) Use Kafka Streams with exactly-once semantics (though, as I am not
familiar with your framework, it may support it as well).
2) Use idempotent practices (ie: it doesn't matter if the same messages get
processed more than once).


Hope this helps -

Adam
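
A minimal sketch of the batch-then-commit pattern described in steps 1-4 above, assuming a consumer with enable.auto.commit=false that is already subscribed; the process() placeholder stands in for user logic.

    import java.time.Duration;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class AtLeastOnceLoop {
        public static void run(KafkaConsumer<String, String> consumer) {
            while (true) {
                ConsumerRecords<String, String> batch = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : batch) {
                    try {
                        process(record);                 // user logic (placeholder)
                    } catch (RuntimeException bad) {
                        // Step 3/4a above: either skip the bad record here, or
                        // rethrow before the commit so the whole batch is re-read
                        // on restart (at-least-once).
                    }
                }
                // Offsets are committed only after the whole batch, so records
                // processed before a crash in this loop will be seen again.
                consumer.commitSync();
            }
        }

        private static void process(ConsumerRecord<String, String> record) {
            // placeholder for user processing logic
        }
    }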


On Wed, Sep 12, 2018 at 7:59 AM, Yui Yoi  wrote:

> Hi Adam,
> Thanks a lot for the rapid response, it did help!
>
> Let me ask one more simple question, though: can I make a streams application
> get stuck on an invalid message and not consume any further messages?
>
> Thanks again
>
> On Wed, Sep 12, 2018 at 2:35 PM Adam Bellemare 
> wrote:
>
> > Hi Yui Yoi
> >
> > Preface: I am not familiar with the spring framework.
> >
> > "Earliest" when it comes to consuming from Kafka means, "Start reading
> from
> > the first message in the topic, *if there is no offset stored for that
> > consumer group*". It sounds like you are expecting it to re-read each
> > message whenever a new message comes in. This is not going to happen, as
> > there will be a committed offset and "earliest" will no longer be used.
> If
> > you were to use "latest" instead, if a consumer is started that does not
> > have a valid offset, it would use the very latest message in the topic as
> > the starting offset for message consumption.
> >
> > Now, if you are using the same consumer group each time you run the
> > application (which it seems is true, as you have "test-group" hardwired
> in
> > your application.yml), but you do not tear down your local cluster and
> > clear out its state, you will indeed see the behaviour you describe.
> > Remember that Kafka is durable, and maintains the offsets when the
> > individual applications go away. So you are probably seeing this:
> >
> > 1) start application instance 1. It realizes it has no offset when it
> tries
> > to register as a consumer on the input topic, so it creates a new
> consumer
> > entry for "earliest" for your consumer group.
> > 2) send message "asd"
> > 3) application instance 1 receives "asd", processes it, and updates the
> > offset (offset head = 1)
> > 4) Terminate instance 1
> > 5) Start application instance 2. It detects correctly that consumer group
> > "test-group" is available and reads that offset as its starting point.
> > 6) send message "{}"
> > 7) application instance 2 receives "{}", processes it, and updates the
> > offset (offset head = 2)
> > *NOTE:* App instance 2 NEVER received "asd", nor should it, as it is
> > telling the Kafka cluster that it belongs to the same consumer group as
> > application 1.
> >
> > Hope this helps,
> >
> > Adam
> >
> >
> >
> >
> >
> > On Wed, Sep 12, 2018 at 6:57 AM, Yui Yoi  wrote:
> >
> > > TL;DR:
> > > my streams application skips uncommitted messages
> > >
> > > Hello,
> > > I'm using streams API via spring framework and experiencing a weird
> > > behavior which I would like to get an explanation to:
> > > First of all: The attached zip is my test project, I used kafka cli to
> > run
> > > a localhost broker and zookeeper
> > >
> > > what is happening is as follows:
> > > 1. I send an invalid message, such as "asd", and my consumer has a lag
> > and
> > > error message as expected
> > > 2. I send a valid message such as "{}", but instead of rereading the
> > first
> > > message as expected from an "earliest" configured application - my
> > > application reads the latest message, commits it and ignores the one in
> > > error, thus I have no lag!
> > > 3. When I'm running my application while there are uncommitted messages -
> > > my application reads the FIRST uncommitted message, as if it IS an
> > > "earliest" configured application!
> > >
> > > In your documentation you assure "at least once" 

Re: [DISCUSS] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-09-12 Thread Nikolay Izhikov
Hello, Matthias.

> His proposal is, to deprecate existing methods on `ReadOnlyWindowStore` and
> `ReadOnlySessionStore` and add them to `WindowStore` and `SessionStore`
> Does this make sense?

You both are experienced Kafka developers, so yes, it does make sense to me :).
Do we want to make this change in KIP-358, or does it require another KIP?

> Btw: the KIP misses `ReadOnlySessionStore` atm.

Sorry, but I don't understand you.
As far as I can see, there are only 2 methods in `ReadOnlySessionStore`.
Which method should be migrated to Duration?

https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlySessionStore.java
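
To make sure I understand the proposal, here is a rough sketch of the split
described in the quoted mail below. The names and signatures are illustrative
only, not the final KIP API: Duration/Instant overloads stay on the read-only
interfaces used by Interactive Queries, while the `long`-based methods live on
the read-write stores used by Processors on the hot path.

import java.time.Instant;
import org.apache.kafka.streams.state.KeyValueIterator;

// Illustrative only -- not the actual KIP-358 interfaces.
interface ReadOnlyWindowStoreSketch<K, V> {
    // Exposed through Interactive Queries; takes Instant instead of long.
    KeyValueIterator<Long, V> fetch(K key, Instant from, Instant to);
}

interface WindowStoreSketch<K, V> extends ReadOnlyWindowStoreSketch<K, V> {
    // Kept on the read-write interface for Processors on the hot path,
    // avoiding per-call Instant allocation.
    KeyValueIterator<Long, V> fetch(K key, long timeFrom, long timeTo);
}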

On Tue, 11/09/2018 at 09:21 -0700, Matthias J. Sax wrote:
> I talked to John offline about his last suggestions, that I originally
> did not fully understand.
> 
> His proposal is, to deprecate existing methods on `ReadOnlyWindowStore`
> and `ReadOnlySessionStore` and add them to `WindowStore` and
> `SessionStore` (note, all singular -- not to be confused with classes
> names plural).
> 
> Btw: the KIP misses `ReadOnlySessionStore` atm.
> 
> The argument is, that the `ReadOnlyXxxStore` interfaces are only exposed
> via Interactive Queries feature and for this part, using `long` is
> undesired. However, for a `Processor` that reads/writes stores on the
> hot code path, we would like to avoid the object creation overhead and
> stay with `long`. Note, that a `Processor` would use the "read-write"
> interfaces and thus, we can add the more efficient read methods using
> `long` there.
> 
> Does this make sense?
> 
> 
> -Matthias
> 
> On 9/11/18 12:20 AM, Nikolay Izhikov wrote:
> > Hello, Guozhang, Bill.
> > 
> > > 1) I'd suggest keeping `Punctuator#punctuate(long timestamp)` as is
> > 
> > I agree with you.
> > Currently, `Punctuator` edits are not included in KIP.
> > 
> > > 2) I'm fine with keeping KeyValueStore extending ReadOnlyKeyValueStore
> > 
> > Great, currently, there is no suggested API change in `KeyValueStore` or 
> > `ReadOnlyKeyValueStore`.
> > 
> > Seems, you agree with all KIP details.
> > Can you vote, please? [1]
> > 
> > [1] 
> > https://lists.apache.org/thread.html/e976352e7e42d459091ee66ac790b6a0de7064eac0c57760d50c983b@%3Cdev.kafka.apache.org%3E
> > 
> > 
> > On Mon, 10/09/2018 at 19:49 -0400, Bill Bejeck wrote:
> > > Hi Nikolay,
> > > 
> > > I'm a +1 to points 1 and 2 above from Guozhang.
> > > 
> > > Thanks,
> > > Bill
> > > 
> > > On Mon, Sep 10, 2018 at 6:58 PM Guozhang Wang  wrote:
> > > 
> > > > Hello Nikolay,
> > > > 
> > > > Thanks for picking this up! Just sharing my two cents:
> > > > 
> > > > 1) I'd suggest keeping `Punctuator#punctuate(long timestamp)` as is 
> > > > since
> > > > comparing with other places where we are replacing with Duration and
> > > > Instant, this is not a user specified value as part of the DSL but 
> > > > rather a
> > > > passed-in parameter, plus with high punctuation frequency creating a new
> > > > instance of Instant may be costly.
> > > > 
> > > > 2) I'm fine with keeping KeyValueStore extending ReadOnlyKeyValueStore 
> > > > with
> > > > APIs of `long` as well as inheriting APIs of `Duration`.
> > > > 
> > > > 
> > > > Guozhang
> > > > 
> > > > 
> > > > On Mon, Sep 10, 2018 at 11:11 AM, Nikolay Izhikov 
> > > > wrote:
> > > > 
> > > > > Hello, Matthias.
> > > > > 
> > > > > > (4) While I agree that we might want to deprecate it, I am not sure 
> > > > > > if
> > > > > 
> > > > > this should be part of this KIP?
> > > > > > Seems to be unrelated?
> > > > > > Should this have been part of KIP-319?
> > > > > > If yes, we might still want to update this other KIP? WDYT?
> > > > > 
> > > > > OK, I removed this deprecation from this KIP.
> > > > > 
> > > > > Please, tell me, is there anything else that should be improved to 
> > > > > make
> > > > > this KIP ready to be implemented.
> > > > > 
> > > > > On Fri, 07/09/2018 at 17:06 -0700, Matthias J. Sax wrote:
> > > > > > (1) Sounds good to me, to just use IllegalArgumentException for 
> > > > > > both --
> > > > > > and thanks for pointing out that Duration can be negative and we 
> > > > > > need
> > > > 
> > > > to
> > > > > > check for this. For the KIP, it would be nice to add to all methods
> > > > 
> > > > than
> > > > > > (even if we don't do it in the code but only document in JavaDocs).
> > > > > > 
> > > > > > (2) I would argue for a new single method interface. Not sure about 
> > > > > > the
> > > > > > name though.
> > > > > > 
> > > > > > (3) Even if `#fetch(K, K, long, long)` and `#fetchAll(long, long)` are
> > > > > > _currently_ not used internally, I would still argue they are both dual
> > > > > > use -- we might add a new DSL operator at any point that uses those
> > > > > > methods. Thus to be "future proof" I would consider them dual use.
> > > > > > 
> > > > > > > Since the ReadOnlyWindowStore is only used by IQ,
> > > > > > 
> > > > > > This contradicts your other statement:
> > > > > > 
> 

[jira] [Resolved] (KAFKA-3657) NewProducer NullPointerException on ProduceRequest

2018-09-12 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-3657.
--
Resolution: Cannot Reproduce

Closing inactive issue.  Please reopen if  the issue still exists on newer 
versions.

> NewProducer NullPointerException on ProduceRequest
> --
>
> Key: KAFKA-3657
> URL: https://issues.apache.org/jira/browse/KAFKA-3657
> Project: Kafka
>  Issue Type: Bug
>  Components: network, producer 
>Affects Versions: 0.8.2.1, 0.9.0.1
> Environment: linux 3.2.0 debian7
>Reporter: Vamsi Subhash Achanta
>Assignee: Jun Rao
>Priority: Major
>  Labels: reliability
>
> The producer upon send.get() on the future appends to the accumulator the 
> record batches and the Sender.java (separate thread) flushes it to the 
> server. The produce request waits on the countDownLatch in the 
> FutureRecordMetadata:
> public RecordMetadata get() throws InterruptedException, 
> ExecutionException {
> this.result.await();
> In this case, the client thread is blocked forever (as it is get() without 
> timeout) for the response and the response upon poll by the Sender returns an 
> attachment with the batch value as null. The batch is processed and the 
> request is errored out. The Sender catches a global level exception and then 
> goes ahead. As the accumulator is drained, the response will never be 
> returned and the producer client thread calling get() is blocked forever on 
> the latch await call.
> I checked at the server end but still haven't found the reason for null 
> batch. Any pointers on this?
> ERROR [2016-05-01 21:00:09,256] [kafka-producer-network-thread |producer-app] 
> [Sender] message_id: group_id: : Uncaught error in kafka producer I/O thread:
> ! java.lang.NullPointerException: null
> ! at 
> org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:266)
> ! at 
> org.apache.kafka.clients.producer.internals.Sender.handleResponse(Sender.java:236)
> ! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:196)
> ! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
> ! at java.lang.Thread.run(Thread.java:745)
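
A side note for anyone hitting a similar hang: the producer's send() returns a
standard java.util.concurrent.Future, so the caller can bound the wait instead
of blocking indefinitely. A minimal sketch (broker address, topic and timeout
are placeholders):

import java.util.Properties;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class BoundedSendSketch {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            try {
                // Bound the wait so a lost response cannot block the calling
                // thread forever, unlike get() with no timeout.
                RecordMetadata md = producer
                        .send(new ProducerRecord<>("my-topic", "key", "value"))
                        .get(30, TimeUnit.SECONDS);          // placeholder timeout
                System.out.println("Appended at offset " + md.offset());
            } catch (TimeoutException e) {
                // Alert / retry here rather than hanging indefinitely.
            }
        }
    }
}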



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-4030) Update older quickstart documents to clarify which version they relate to

2018-09-12 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4030.
--
Resolution: Not A Problem

Closing as per PR comment.

> Update older quickstart documents to clarify which version they relate to
> -
>
> Key: KAFKA-4030
> URL: https://issues.apache.org/jira/browse/KAFKA-4030
> Project: Kafka
>  Issue Type: Improvement
>  Components: website
>Reporter: Todd Snyder
>Priority: Major
>  Labels: documentation, website
>
> If you search for 'kafka quickstart' it takes you to 
> kafka.apache.org/07/quickstart.html which is, unclearly, for release 0.7 and 
> not the current release.
> [~gwenshap] suggested a ticket and a note added to the 0.7 (and likely 0.8 
> and 0.9) quickstart guides directing people to use ~current for the latest 
> release documentation.
> I'll submit a PR shortly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review and merge - KAFKA-6764

2018-09-12 Thread Suman B N
Team,

Review and merge below pull request.
Merge Request: https://github.com/apache/kafka/pull/5637
Jira: https://issues.apache.org/jira/browse/KAFKA-6764

-- 
*Suman*
*OlaCabs*


A question about kafka streams API

2018-09-12 Thread Yui Yoi
TL;DR:
my streams application skips uncommitted messages

Hello,
I'm using streams API via spring framework and experiencing a weird
behavior which I would like to get an explanation to:
First of all: The attached zip is my test project, I used kafka cli to run
a localhost broker and zookeeper

what is happening is as follows:
1. I send an invalid message, such as "asd", and my consumer has a lag and
error message as expected
2. I send a valid message such as "{}", but instead of rereading the first
message as expected from an "earliest" configured application - my
application reads the latest message, commits it and ignores the one in
error, thus I have no lag!
3. When I'm running my application while there are uncommitted messages - my
application reads the FIRST uncommitted message, as if it IS an
"earliest" configured application!

In your documentation you assure "at least once" behavior, but according to
section 2 my application does not receive those messages even once (as I
said, those messages are uncommitted)

My guess is that it has something to do with the stream's cache... I would
very much like to have an explanation or even a solution

I'm turning to you as a last resort, after long weeks of research and
experiments

Thanks a lot
<>


[jira] [Resolved] (KAFKA-4910) kafka consumer not receiving messages

2018-09-12 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4910.
--
Resolution: Cannot Reproduce

Closing inactive issue.  Please reopen if the issue still exists on newer 
versions.

> kafka consumer not receiving messages
> -
>
> Key: KAFKA-4910
> URL: https://issues.apache.org/jira/browse/KAFKA-4910
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0
>Reporter: zhiwei
>Priority: Major
> Attachments: 128314.jstack
>
>
> kafka consumer not receiving messages
> consumer log:
> "2017-03-10 14:35:34,448" | INFO  | [Thread-5-KafkaSpout] | Revoking 
> previously assigned partitions [MAFS_BPIS_ICSWIPE_IC-0] for group 
> data_storm_hw_tianlu | 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator 
> (ConsumerCoordinator.java:291) 
> "2017-03-10 14:35:34,451" | INFO  | [Thread-5-KafkaSpout] | (Re-)joining 
> group data_storm_hw_tianlu | 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> (AbstractCoordinator.java:326) 
> "2017-03-10 14:35:34,456" | INFO  | [Thread-5-KafkaSpout] | Successfully 
> joined group data_storm_hw_tianlu with generation 7 | 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> (AbstractCoordinator.java:434) 
> "2017-03-10 14:35:34,456" | INFO  | [Thread-5-KafkaSpout] | Setting newly 
> assigned partitions [MAFS_BPIS_ICSWIPE_IC-0] for group data_storm_hw_tianlu | 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator 
> (ConsumerCoordinator.java:230) 
> "2017-03-10 14:36:10,297" | INFO  | [Thread-7-__system] | Getting metrics for 
> server on port 29104 | backtype.storm.messaging.netty.Server 
> (Server.java:321) 
> "2017-03-10 14:36:10,298" | INFO  | [Thread-7-__system] | Getting metrics for 
> client connection to Netty-Client-streaming96/10.55.45.96:29106 | 
> backtype.storm.messaging.netty.Client (Client.java:405) 
> "2017-03-10 14:37:10,297" | INFO  | [Thread-7-__system] | Getting metrics for 
> server on port 29104 | backtype.storm.messaging.netty.Server 
> (Server.java:321) 
> "2017-03-10 14:37:10,298" | INFO  | [Thread-7-__system] | Getting metrics for 
> client connection to Netty-Client-streaming96/10.55.45.96:29106 | 
> backtype.storm.messaging.netty.Client (Client.java:405) 
> "2017-03-10 14:38:10,297" | INFO  | [Thread-7-__system] | Getting metrics for 
> server on port 29104 | backtype.storm.messaging.netty.Server 
> (Server.java:321) 
> "2017-03-10 14:38:10,297" | INFO  | [Thread-7-__system] | Getting metrics for 
> client connection to Netty-Client-streaming96/10.55.45.96:29106 | 
> backtype.storm.messaging.netty.Client (Client.java:405) 
> "2017-03-10 14:39:10,297" | INFO  | [Thread-7-__system] | Getting metrics for 
> server on port 29104 | backtype.storm.messaging.netty.Server 
> (Server.java:321) 
> "2017-03-10 14:39:10,298" | INFO  | [Thread-7-__system] | Getting metrics for 
> client connection to Netty-Client-streaming96/10.55.45.96:29106 | 
> backtype.storm.messaging.netty.Client (Client.java:405) 
> "2017-03-10 14:40:10,298" | INFO  | [Thread-7-__system] | Getting metrics for 
> server on port 29104 | backtype.storm.messaging.netty.Server 
> (Server.java:321) 
> "2017-03-10 14:40:10,298" | INFO  | [Thread-7-__system] | Getting metrics for 
> client connection to Netty-Client-streaming96/10.55.45.96:29106 | 
> backtype.storm.messaging.netty.Client (Client.java:405) 
> "2017-03-10 14:40:34,454" | INFO  | [Thread-5-KafkaSpout] | Revoking 
> previously assigned partitions [MAFS_BPIS_ICSWIPE_IC-0] for group 
> data_storm_hw_tianlu | 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator 
> (ConsumerCoordinator.java:291) 
> "2017-03-10 14:40:34,457" | INFO  | [Thread-5-KafkaSpout] | (Re-)joining 
> group data_storm_hw_tianlu | 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> (AbstractCoordinator.java:326) 
> "2017-03-10 14:40:36,458" | INFO  | [Thread-5-KafkaSpout] | Successfully 
> joined group data_storm_hw_tianlu with generation 8 | 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> (AbstractCoordinator.java:434) 
> "2017-03-10 14:40:36,459" | INFO  | [Thread-5-KafkaSpout] | Setting newly 
> assigned partitions [MAFS_BPIS_ICSWIPE_IC-0] for group data_storm_hw_tianlu | 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator 
> (ConsumerCoordinator.java:230) 
> "2017-03-10 14:41:10,297" | INFO  | [Thread-7-__system] | Getting metrics for 
> server on port 29104 | backtype.storm.messaging.netty.Server 
> (Server.java:321) 
> "2017-03-10 14:41:10,298" | INFO  | [Thread-7-__system] | Getting metrics for 
> client connection to Netty-Client-streaming96/10.55.45.96:29106 | 
> backtype.storm.messaging.netty.Client (Client.java:405) 
> "2017-03-10 14:42:10,297" | 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-12 Thread Stanislav Kozlovski
Hi Ron, Rajini

Thanks for summarizing the discussion so far, Ron!

1) How often do we have such long-lived connection idleness (e.g. 5-10
minutes) in practice?

3) I agree that retries for re-authentication are useful.

4) The interleaving of requests sounds like a great feature to have, but
the tradeoff against code complexity is a tough one. I would personally go
with the simpler approach since you could always add interleaving on top if
the community decides the latency should be better.

Best,
Stanislav

On Tue, Sep 11, 2018 at 5:00 AM Ron Dagostino  wrote:

> Hi everyone.  I've updated the PR to reflect the latest conclusions from
> this ongoing discussion.  The KIP still needs the suggested updates; I will
> do that later this week.  II agree with Rajini that some additional
> feedback from the community at large would be very helpful at this point in
> time.
>
> Here's where we stand.
>
> We have 2 separate implementations for re-authenticating existing
> connections -- a so-called "low-level" approach and a (relatively speaking)
> "high-level" approach -- and we agree on how they should be compared.
> Specifically, Rajini has provided a mechanism that works at a relatively
> low level in the stack by intercepting network sends and queueing them up
> while re-authentication happens; then the queued sends are sent after
> re-authentication succeeds, or the connection is closed if
> re-authentication fails.  See the prototype commit at
>
> https://github.com/rajinisivaram/kafka/commit/b9d711907ad843c11d17e80d6743bfb1d4e3f3fd
> .
>
> I have an implementation that works higher up in the stack; it injects
> authentication requests into the KafkaClient via a new method added to that
> interface, and the implementation (i.e. NetworkClient) sends them when
> poll() is called.  The updated PR is available at
> https://github.com/apache/kafka/pull/5582/.
>
> Here is how we think these two approaches should be compared.
>
> 1) *When re-authentication starts*.  The low-level approach initiates
> re-authentication only if/when the connection is actually used, so it may
> start after the existing credential expires; the current PR implementation
> injects re-authentication requests into the existing flow, and
> re-authentication starts immediately regardless of whether or not the
> connection is being used for something else.  Rajini believes the low-level
> approach can be adjusted to re-authenticate idle connections (Rajini, I
> don't see how this can be done without injecting requests, which is what
> the high-level connection is doing, but I am probably missing something --
> no need to go into it unless it is easy to explain.)
>
> 2) *When requests not related to re-authentication are able to use the
> re-authenticating connection*.  The low-level approach finishes
> re-authentication completely before allowing anything else to traverse the
> connection, and we agree that this is how the low-level implementation must
> work without significant work/changes.  The current PR implementation
> interleaves re-authentication requests with the existing flow in the
> asynchronous I/O use cases (Consumer, Producer, Admin Client, etc.), and
> it allows the two synchronous use cases (ControllerChannelManager and
> ReplicaFetcherBlockingSend/ReplicaFetcherThread) to decide which style they
> want.  I (somewhat arbitrarily, to prove out the flexibility of the
> high-level approach) coded ControllerChannelManager to do complete,
> non-interleaved re-authentication and ReplicaFetcherBlockingSend/
> ReplicaFetcherThread to interleave requests.  The approach has impacts on
> the maximum size of latency spikes: interleaving requests can decrease the
> latency.  The benefit of interleaving is tough to evaluate because it isn't
> clear what the latency requirement really is.  Note that re-authentication
> can entail several network round-trips between the client and the broker.
> Comments in this area would be especially appreciated.
>
> 3) *What happens if re-authentication fails (i.e. retry capability)*.  I
> think this is where we have not yet settled on what the requirement is
> (even more so than the issue of potential latency mitigation
> requirements).  Rajini, you had mentioned that re-authentication should
> work the same way as authentication, but I see the two situations as being
> asymmetric.  In the authentication case, if authentication fails, the
> connection was never fully established and could not be used, and it is
> closed -- TLS negotiation may have taken place, so that effort is lost, but
> beyond that nothing else is lost.  In the re-authentication case the
> connection is fully established, it is in use, and in fact it can remain in
> use for some time going forward as long as the existing credential remains
> unexpired; abandoning it at this point due to a re-authentication failure
> (which can occur due to no fault of the client -- i.e. a remote directory
> or OAuth server being temporarily down)