Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

2017-08-09 Thread Apurva Mehta
Thanks for your email Becket.

Your observations around using acks=1 and acks=-1 are correct. Do note that
getting an OutOfOrderSequence means that acknowledged data has been lost.
This could be due to a weaker acks setting like acks=1 or due to a topic
which is not configured to handle broker failures cleanly (unclean leader
election is enabled, etc.). Either way, you are right in observing that if
an app is very serious about having exactly one copy of each ack'd message
in the log, it is a significant effort to recover from this error.

However, I propose an alternate way of thinking about this: is it
worthwhile shipping Kafka with the defaults tuned for strong semantics?
That is essentially what is being proposed here, and of course there will
be tradeoffs with performance and deployment costs-- you can't have your
cake and eat it too.

And if we want to ship Kafka with strong semantics by default, we might
want to make the default topic level settings as well as the client
settings more robust. This means, for instance, disabling unclean leader
election by default. If there are other configs we need to change on the
broker side to ensure that ack'd messages are not lost due to transient
failures, we should change those as well as part of a future KIP.

Personally, I think that the defaults should provide robust guarantees.

And this brings me to another point: these are just proposed defaults.
Nothing is being taken away in terms of flexibility to tune for different
behavior.

Finally, the way idempotence is implemented means that there needs to be
some cap on max.in.flight when idempotence is enabled -- that is just a
tradeoff of the feature. Do we have any data showing that there are
installations which benefit greatly from a value of max.in.flight > 5? For instance,
LinkedIn probably has the largest and most demanding deployment of Kafka.
Are there any applications which use max.in.flight > 5? That would be good
data to have.
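
As a concrete illustration, here is a minimal sketch of what these proposed
defaults would amount to if set explicitly on a producer today; the values are
illustrative, and the exact defaults are what is still under discussion in this
KIP:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;

public class StrongDefaultsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // Settings in the spirit of what KIP-185 proposes as defaults:
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");  // exactly once, in order per partition
        props.put(ProducerConfig.ACKS_CONFIG, "all");                 // wait for the full ISR before acking
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "5"); // cap required by idempotence
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}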

Thanks,
Apurva





On Wed, Aug 9, 2017 at 2:59 PM, Becket Qin  wrote:

> Thanks for the KIP, Apurva. It is a good time to review the configurations
> to see if we can improve the user experience. We also might need to think
> from the users' standpoint about the out-of-the-box experience.
>
> 01. Generally speaking, I think it makes sense to make idempotence=true so
> we can enable producer-side pipelining without ordering issues. However, the
> impact is that users may occasionally receive an OutOfOrderSequenceException.
> In this case, there is not much a user can do if they want to ensure
> ordering. They basically have to close the producer in the callback and
> resend all the records that are in the RecordAccumulator. This is very
> involved. And the users may not have a way to retrieve the records in the
> accumulator anymore. So for the users who really want to achieve the
> exactly-once semantics, there is actually still a lot of work to do even
> with those defaults. For the rest of the users, they need to handle one more
> exception, which might not be a big deal.
>
> 02. Setting acks=-1 would significantly reduce the likelihood of
> OutOfOrderSequenceException happening. However, the latency/throughput
> impact and the additional purgatory burden on the broker are big concerns. And
> it does not really guarantee exactly once without broker-side
> configurations, i.e. unclean.leader.election, min.isr, etc. I am not sure if
> it is worth making acks=-1 the default instead of letting the users who
> really care about this configure it correctly.
>
> 03. Regarding retries, I think we had some discussion in KIP-91. The
> problem of setting retries to max integer is that producer.flush() may take
> forever. Will this KIP depend on KIP-91?
>
> I am not sure about having a cap on max.in.flight.requests. It seems
> that on some long-RTT links, sending more requests in the pipeline would be
> the only way to keep the latency close to the RTT.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Wed, Aug 9, 2017 at 11:28 AM, Apurva Mehta  wrote:
>
> > Thanks for the comments Ismael and Jason.
> >
> > Regarding the OutOfOrderSequenceException, it is more likely when you
> > enable idempotence and have acks=1, simply because you have a greater
> > probability of losing acknowledged data with acks=1, and the error code
> > indicates that.
> >
> > The particular scenario is that a broker acknowledges a message with
> > sequence N before replication happens, and then crashes. Since the message
> > was acknowledged, the producer increments its sequence to N+1. The new
> > leader would not have received the message, and still expects sequence N
> > from the producer. When it receives N+1 for the next message, it will
> > return an OutOfOrderSequenceNumber error, correctly indicating that some
> > previously acknowledged messages are missing.
> >
> > For the idempotent producer alone, the OutOfOrderSequenceException is
> > returned in the Future and Callback, 
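
(A minimal sketch, assuming the standard org.apache.kafka.clients.producer API,
of how this error surfaces in a send callback; closing the producer is the
coarse recovery discussed above, not a prescription, and the topic name and
serializers here are made up:)

import java.util.Properties;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;

public class OutOfOrderSequenceSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
            if (exception instanceof OutOfOrderSequenceException) {
                // Acknowledged data may have been lost on the broker. An app that
                // insists on exactly one copy of every ack'd message has to stop,
                // close the producer, and re-send from its own source of truth;
                // records still sitting in the accumulator cannot be retrieved.
                producer.close(0, TimeUnit.MILLISECONDS); // non-blocking: we are on the sender's callback thread
            }
        });
        producer.close();
    }
}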

Re: [DISCUSS] KIP-91 Provide Intuitive User Timeouts in The Producer

2017-08-09 Thread Sumant Tambe
On Wed, Aug 9, 2017 at 1:28 PM Apurva Mehta  wrote:

> > > There seems to be no relationship with cluster metadata availability or
> > > staleness. Expiry is just based on the time since the batch has been
> > ready.
> > > Please correct me if I am wrong.
> > >
> >
> > I was not very specific about where we do expiration. I glossed over some
> > details because (again) we have other mechanisms to detect non-progress.
> > The condition (!muted.contains(tp) && (isMetadataStale ||
> > cluster.leaderFor(tp) == null)) is used in
> > RecordAccumulator.expiredBatches:
> > https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java#L443
> >
> >
> > Effectively, we expire in all the following cases:
> > 1) producer is partitioned from the brokers, i.e. when metadata age grows
> > beyond 3x its max value. It's safe to say that we're not talking to the
> > brokers at all. Report.
> > 2) fresh metadata && leader for a partition is not known && a batch is
> > sitting there for longer than request.timeout.ms. This is one case we
> > would like to improve and use batch.expiry.ms for, because
> > request.timeout.ms is too small.
> > 3) fresh metadata && leader for a partition is known && a batch is sitting
> > there for longer than batch.expiry.ms. This is a new case that is
> > different from #2. This is the catch-up mode case. Things are moving too
> > slowly. Pipeline SLAs are broken. Report and shut down kmm.
> >
> > The second and the third cases are useful to a real-time app for a
> > completely different reason. Report, forget about the batch, and just
> move
> > on (without shutting down).
> >
> >
> If I understand correctly, you are talking about a fork of Apache Kafka
> which has these additional conditions? Because that check doesn't exist on
> trunk today.

Right. It is our internal release in LinkedIn.

> Or are you proposing to change the behavior of expiry to
> account for stale metadata and partitioned producers as part of this KIP?


No. It's our temporary solution in the absence of KIP-91. Note that we don't
like increasing request.timeout.ms. Without our extra conditions, our
batches expire too soon -- a problem in kmm catch-up mode.

If we get batch.expiry.ms, we will configure it to 20 minutes. maybeExpire
will use that config instead of request.timeout.ms. The extra conditions will be
unnecessary. All three cases shall be covered via the batch.expiry timeout.
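
(A rough sketch of the two expiry behaviours described above; the names are
illustrative and this is not the actual RecordAccumulator code:)

public class BatchExpirySketch {
    // LinkedIn's interim patch: only consider expiring a batch when the producer
    // looks stuck (stale metadata or unknown leader), then apply request.timeout.ms.
    static boolean expireInterim(boolean muted, boolean isMetadataStale, boolean leaderUnknown,
                                 long waitedMs, long requestTimeoutMs) {
        return !muted && (isMetadataStale || leaderUnknown) && waitedMs > requestTimeoutMs;
    }

    // With KIP-91's proposed batch.expiry.ms (e.g. 20 minutes), a single, larger
    // clock covers all three cases listed above and the extra gating goes away.
    static boolean expireWithBatchExpiry(long waitedMs, long batchExpiryMs) {
        return waitedMs > batchExpiryMs;
    }
}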

>
>


Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-08-09 Thread Dong Lin
Hey Jun,

I have been thinking about whether it is better to return lag (i.e. HW -
LEO) instead of LEO. Note that the lag in the DescribeDirsResponse may be
negative if LEO > HW. It will almost always be negative for leader and
in-sync replicas. Note that we cannot calculate the lag as max(0, HW - LEO)
because we still need the difference between two lags to measure the
progress of intra-broker replica movement. The AdminClient API can choose
to return max(0, HW - LEO) depending on whether it is used for tracking
progress of inter-broker reassignment or intra-broker movement. Is that OK?
If so, I will update KIP-113 accordingly to return the lag in the
DescribeDirsResponse.
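
(To make the arithmetic concrete, a small sketch with made-up numbers:)

public class LagSketch {
    public static void main(String[] args) {
        long hw = 1000, leoSource = 1050, leoDest = 880;   // source = replica being moved from
        long lagSource = hw - leoSource;                   // -50: negative for leader / in-sync replicas
        long lagDest = hw - leoDest;                       // 120
        // Progress of an intra-broker move is the difference of the two lags,
        // which is why each lag cannot be clamped with max(0, ...) before it is returned:
        System.out.println(lagDest - lagSource);                           // 170 = leoSource - leoDest
        System.out.println(Math.max(0, lagDest) - Math.max(0, lagSource)); // 120 -- understates the gap
    }
}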

Thanks,
Dong





On Wed, Aug 9, 2017 at 5:06 PM, Jun Rao  wrote:

> Hi, Dong,
>
> Yes, the lag in a replica is calculated as the difference of LEO of the
> replica and the HW. So, as long as a replica is in sync, the lag is always
> 0.
>
> So, I was suggesting to return lag instead of LEO in DescribeDirsResponse
> for each replica. I am not sure if we need to return HW though.
>
> Thanks,
>
> Jun
>
> On Wed, Aug 9, 2017 at 5:01 PM, Dong Lin  wrote:
>
> > Hey Jun,
> >
> > It just came to me that you may be assuming that follower_lag = HW -
> > follower_LEO. If that is the case, then we need to have new
> > request/response to retrieve this lag since the DescribeDirsResponse
> > doesn't even include HW. It seems that KIP-179 does not explicitly
> specify
> > the definition of this lag.
> >
> > I have been assuming that follower_lag = leader_LEO - follower_LEO given
> that
> > the request is used to query the reassignment status. Strictly speaking
> the
> > difference between leader_LEO and the HW is limited by the amount of data
> > produced in KafkaConfig.replicaLagTimeMaxMs, which is 10 seconds. I also
> > assumed that 10 seconds is probably not a big deal given the typical time
> > length of the reassignment.
> >
> > Thanks,
> > Dong
> >
> > On Wed, Aug 9, 2017 at 4:40 PM, Dong Lin  wrote:
> >
> > > Hey Jun,
> > >
> > > If I understand you right, you are suggesting that, in the case when
> > there
> > > is continuous incoming traffic, the approach in the KIP-179 will report
> > lag
> > > as 0 whereas the approach using DescribeDirsRequest will report lag as
> > > non-zero. But I think the approach in KIP-179 will also report non-zero
> > lag
> > > when there is continuous traffic. This is because at the time the
> leader
> > > receives ReplicaStatusRequest, it is likely that some data has been
> > > appended to the partition after the last FetchRequest from the
> follower.
> > > Does this make sense?
> > >
> > > Thanks,
> > > Dong
> > >
> > >
> > >
> > > On Wed, Aug 9, 2017 at 4:24 PM, Jun Rao  wrote:
> > >
> > >> Hi, Dong,
> > >>
> > >> As for whether to return LEO or lag, my point was the following. What
> > you
> > >> are concerned about is that an in-sync replica could become out of
> sync
> > >> again. However, the more common case is that once a replica is caught
> > up,
> > >> it will stay in sync afterwards. In that case, once the reassignment
> > >> process completes, if we report based on lag, all lags will be 0. If
> we
> > >> report based on Math.max(0, leaderLEO - followerLEO), the value may
> not
> > be
> > >> 0 if there is continuous incoming traffic, which will be confusing to
> > the
> > >> user.
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >>
> > >>
> > >> On Tue, Aug 8, 2017 at 6:26 PM, Dong Lin  wrote:
> > >>
> > >> > Hey Jun,
> > >> >
> > >> > Thanks for the comment!
> > >> >
> > >> > Yes, it should work. The tool can send request to any broker and
> > broker
> > >> can
> > >> > just write the reassignment znode. My previous intuition is that it
> > may
> > >> be
> > >> > better to only send this request to controller. But I don't have
> good
> > >> > reasons for this restriction.
> > >> >
> > >> > My intuition is that we can keep them separate as well. Becket and I
> > >> have
> > >> > discussed this both offline and in https://github.com/apache/
> > >> > kafka/pull/3621.
> > >> > Currently I don't have a strong opinion on this and I am open to
> using
> > >> only
> > >> > one API to do both if someone can come up with a reasonable API
> > >> signature
> > >> > for this method. For now I have added the method alterReplicaDir()
> in
> > >> > KafkaAdminClient instead of the AdminClient interface so that the
> > >> > reassignment script can use this method without concluding what the
> > API
> > >> > would look like in AdminClient in the future.
> > >> >
> > >> > Regarding DescribeDirsResponse, I think it is probably OK to have
> > >> 

[jira] [Resolved] (KAFKA-5077) Make server start script work against Java 9

2017-08-09 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5077.

Resolution: Fixed

> Make server start script work against Java 9
> 
>
> Key: KAFKA-5077
> URL: https://issues.apache.org/jira/browse/KAFKA-5077
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.11.0.0
>Reporter: Xavier Léauté
>Assignee: Xavier Léauté
>Priority: Minor
> Fix For: 1.0.0
>
>
> Current start script fails with {{Unrecognized VM option 
> 'PrintGCDateStamps'}} using Java 9-ea



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #2863: KAFKA-5077 fix GC logging arguments for Java 9

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2863


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3651: MINOR: Add missing deprecations on old request obj...

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3651




Jenkins build is back to normal : kafka-trunk-jdk8 #1896

2017-08-09 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-5721) Kafka broker stops after network failure

2017-08-09 Thread Val Feron (JIRA)
Val Feron created KAFKA-5721:


 Summary: Kafka broker stops after network failure
 Key: KAFKA-5721
 URL: https://issues.apache.org/jira/browse/KAFKA-5721
 Project: Kafka
  Issue Type: Bug
Reporter: Val Feron
 Attachments: thread_dump_kafka.out

+Cluster description+
* 3 brokers
* version 0.10.1.1
* running on AWS

+Description+

The following will happen at random intervals, on random brokers

From the logs, here is the information I could gather:

# Shrinking Intra-cluster replication on a random broker (I suppose it could be 
a temporary network failure but couldn't produce evidence of it)
# System starts showing close to no activity @02:27:20 (note that it's not load 
related as it happens at very quiet times)
!https://i.stack.imgur.com/g1Pem.png!
# From there, this kafka broker doesn't process messages which is expected IMO 
as it dropped out of the cluster replication.
# Now the real issue appears as the number of connections in CLOSE_WAIT is 
constantly increasing until it reaches the configured ulimit of the 
system/process, ending up crashing the kafka process.

Now, I've been changing limits to see if Kafka would eventually join the ISR 
again before crashing, but even with a limit that's very high, Kafka just seems 
stuck in a weird state and never recovers.

Note that between the time when the faulty broker is on its own and the time it 
crashes, the Kafka process is still listening and producers still connect to it.

For this single crash, I could see 320 errors like this from the producers:

{code}java.util.concurrent.ExecutionException: 
org.springframework.kafka.core.KafkaProducerException: Failed to send; nested 
exception is org.apache.kafka.common.errors.NotLeaderForPartitionException: 
This server is not the leader for that topic-partition.
{code}



The configuration being the default one and the use being quite standard, I'm 
wondering if I missed something.

I put in place a script that checks the number of Kafka file descriptors and 
restarts the service when it gets abnormally high, which does the trick for now, 
but I still lose messages when it crashes.

I'm available to make any verification / test you need.

Could see some similarities with ticket KAFKA-5007 too.

cc [~junrao]

Confirming the logs seen at the time of failure:
{code}
 /var/log/kafka/kafkaServer.out:[2017-08-09 22:13:29,045] WARN 
[ReplicaFetcherThread-0-2], Error in fetch 
kafka.server.ReplicaFetcherThread$FetchRequest@4a63f79 
(kafka.server.ReplicaFetcherThread)
/var/log/kafka/kafkaServer.out-java.io.IOException: Connection to 2 was 
disconnected before the response was read
/var/log/kafka/kafkaServer.out- at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:115)
/var/log/kafka/kafkaServer.out- at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:112)
/var/log/kafka/kafkaServer.out- at 
scala.Option.foreach(Option.scala:257)
/var/log/kafka/kafkaServer.out- at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:112)
/var/log/kafka/kafkaServer.out- at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:108)
/var/log/kafka/kafkaServer.out- at 
kafka.utils.NetworkClientBlockingOps$.recursivePoll$1(NetworkClientBlockingOps.scala:137)
/var/log/kafka/kafkaServer.out- at 
kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
/var/log/kafka/kafkaServer.out- at 
kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:108)
/var/log/kafka/kafkaServer.out- at 
kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:253)
/var/log/kafka/kafkaServer.out- at 
kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
/var/log/kafka/kafkaServer.out- at 
kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
/var/log/kafka/kafkaServer.out- at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
/var/log/kafka/kafkaServer.out- at 
kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
/var/log/kafka/kafkaServer.out- at 
kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)

{code}

{code}

/var/log/kafka/kafkaServer.out:[2017-08-09 22:17:39,940] WARN 
[ReplicaFetcherThread-0-2], Error in fetch 
kafka.server.ReplicaFetcherThread$FetchRequest@3a8493a8 
(kafka.server.ReplicaFetcherThread)
/var/log/kafka/kafkaServer.out-java.io.IOException: Connection 

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-08-09 Thread Jun Rao
Hi, Dong,

Yes, the lag in a replica is calculated as the difference of LEO of the
replica and the HW. So, as long as a replica is in sync, the lag is always
0.

So, I was suggesting to return lag instead of LEO in DescribeDirsResponse
for each replica. I am not sure if we need to return HW though.

Thanks,

Jun

On Wed, Aug 9, 2017 at 5:01 PM, Dong Lin  wrote:

> Hey Jun,
>
> It just came to me that you may be assuming that follower_lag = HW -
> follower_LEO. If that is the case, then we need to have new
> request/response to retrieve this lag since the DescribeDirsResponse
> doesn't even include HW. It seems that KIP-179 does not explicitly specify
> the definition of this lag.
>
> I have been assuming that follower_lag = leader_LEO - follower_LEO given that
> the request is used to query the reassignment status. Strictly speaking the
> difference between leader_LEO and the HW is limited by the amount of data
> produced in KafkaConfig.replicaLagTimeMaxMs, which is 10 seconds. I also
> assumed that 10 seconds is probably not a big deal given the typical time
> length of the reassignment.
>
> Thanks,
> Dong
>
> On Wed, Aug 9, 2017 at 4:40 PM, Dong Lin  wrote:
>
> > Hey Jun,
> >
> > If I understand you right, you are suggesting that, in the case when
> there
> > is continuous incoming traffic, the approach in the KIP-179 will report
> lag
> > as 0 whereas the approach using DescribeDirsRequest will report lag as
> > non-zero. But I think the approach in KIP-179 will also report non-zero
> lag
> > when there is continuous traffic. This is because at the time the leader
> > receives ReplicaStatusRequest, it is likely that some data has been
> > appended to the partition after the last FetchRequest from the follower.
> > Does this make sense?
> >
> > Thanks,
> > Dong
> >
> >
> >
> > On Wed, Aug 9, 2017 at 4:24 PM, Jun Rao  wrote:
> >
> >> Hi, Dong,
> >>
> >> As for whether to return LEO or lag, my point was the following. What
> you
> >> are concerned about is that an in-sync replica could become out of sync
> >> again. However, the more common case is that once a replica is caught
> up,
> >> it will stay in sync afterwards. In that case, once the reassignment
> >> process completes, if we report based on lag, all lags will be 0. If we
> >> report based on Math.max(0, leaderLEO - followerLEO), the value may not
> be
> >> 0 if there is continuous incoming traffic, which will be confusing to
> the
> >> user.
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >>
> >>
> >> On Tue, Aug 8, 2017 at 6:26 PM, Dong Lin  wrote:
> >>
> >> > Hey Jun,
> >> >
> >> > Thanks for the comment!
> >> >
> >> > Yes, it should work. The tool can send request to any broker and
> broker
> >> can
> >> > just write the reassignment znode. My previous intuition is that it
> may
> >> be
> >> > better to only send this request to controller. But I don't have good
> >> > reasons for this restriction.
> >> >
> >> > My intuition is that we can keep them separate as well. Becket and I
> >> have
> >> > discussed this both offline and in https://github.com/apache/
> >> > kafka/pull/3621.
> >> > Currently I don't have a strong opinion on this and I am open to using
> >> only
> >> > one API to do both if someone can come up with a reasonable API
> >> signature
> >> > for this method. For now I have added the method alterReplicaDir() in
> >> > KafkaAdminClient instead of the AdminClient interface so that the
> >> > reassignment script can use this method without concluding what the
> API
> >> > would look like in AdminClient in the future.
> >> >
> >> > Regarding DescribeDirsResponse, I think it is probably OK to have
> >> slightly
> >> > more lag. The script can calculate the lag of the follower replica as
> >> > Math.max(0, leaderLEO - followerLEO). I agree that it will be slightly
> >> less
> >> > accurate than the current approach in KIP-179. But even with the
> current
> >> > approach in KIP-179, the result provided by the script is an
> >> approximation
> >> > anyway, since there is delay from the time that leader returns
> response
> >> to
> >> > the time that the script collects response from all brokers and prints
> >> > result to user. I think if the slight difference in the accuracy
> between
> >> > the two approaches does not make a difference to the intended use-case
> >> of
> >> > this API, then we probably want to re-use the existing request/response
> >> to
> >> > keep the protocol simple.
> >> >
> >> > Thanks,
> >> > Dong
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Tue, Aug 8, 2017 at 5:56 PM, Jun Rao  wrote:
> >> >
> >> > > Hi, Dong,
> >> > >
> >> > > I think Tom was suggesting to have the AlterTopicsRequest sent to
> any
> >> > > broker, which just writes the reassignment json to ZK. The
> controller
> >> > will
> >> > > pick up the reassignment and act on it as usual. This should work,
> >> right?
> >> > >
> >> > > Having a separate 

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-08-09 Thread Dong Lin
Hey Jun,

It just came to me that you may be assuming that follower_lag = HW -
follower_LEO. If that is the case, then we need to have new
request/response to retrieve this lag since the DescribeDirsResponse
doesn't even include HW. It seems that KIP-179 does not explicitly specify
the definition of this lag.

I have been assuming that follower_lag = leader_LEO - follower_LEO given that
the request is used to query the reassignment status. Strictly speaking the
difference between leader_LEO and the HW is limited by the amount of data
produced in KafkaConfig.replicaLagTimeMaxMs, which is 10 seconds. I also
assumed that 10 seconds is probably not a big deal given the typical time
length of the reassignment.

Thanks,
Dong

On Wed, Aug 9, 2017 at 4:40 PM, Dong Lin  wrote:

> Hey Jun,
>
> If I understand you right, you are suggesting that, in the case when there
> is continuous incoming traffic, the approach in the KIP-179 will report lag
> as 0 whereas the approach using DescribeDirsRequest will report lag as
> non-zero. But I think the approach in KIP-179 will also report non-zero lag
> when there is continuous traffic. This is because at the time the leader
> receives ReplicaStatusRequest, it is likely that some data has been
> appended to the partition after the last FetchRequest from the follower.
> Does this make sense?
>
> Thanks,
> Dong
>
>
>
> On Wed, Aug 9, 2017 at 4:24 PM, Jun Rao  wrote:
>
>> Hi, Dong,
>>
>> As for whether to return LEO or lag, my point was the following. What you
>> are concerned about is that an in-sync replica could become out of sync
>> again. However, the more common case is that once a replica is caught up,
>> it will stay in sync afterwards. In that case, once the reassignment
>> process completes, if we report based on lag, all lags will be 0. If we
>> report based on Math.max(0, leaderLEO - followerLEO), the value may not be
>> 0 if there is continuous incoming traffic, which will be confusing to the
>> user.
>>
>> Thanks,
>>
>> Jun
>>
>>
>>
>> On Tue, Aug 8, 2017 at 6:26 PM, Dong Lin  wrote:
>>
>> > Hey Jun,
>> >
>> > Thanks for the comment!
>> >
>> > Yes, it should work. The tool can send request to any broker and broker
>> can
>> > just write the reassignment znode. My previous intuition is that it may
>> be
>> > better to only send this request to controller. But I don't have good
>> > reasons for this restriction.
>> >
>> > My intuition is that we can keep them separate as well. Becket and I
>> have
>> > discussed this both offline and in https://github.com/apache/
>> > kafka/pull/3621.
>> > Currently I don't have a strong opinion on this and I am open to using
>> only
>> > one API to do both if someone can come up with a reasonable API
>> signature
>> > for this method. For now I have added the method alterReplicaDir() in
>> > KafkaAdminClient instead of the AdminClient interface so that the
>> > reassignment script can use this method without concluding what the API
>> > would look like in AdminClient in the future.
>> >
>> > Regarding DescribeDirsResponse, I think it is probably OK to have
>> slightly
>> > more lag. The script can calculate the lag of the follower replica as
>> > Math.max(0, leaderLEO - followerLEO). I agree that it will be slightly
>> less
>> > accurate than the current approach in KIP-179. But even with the current
>> > approach in KIP-179, the result provided by the script is an
>> approximation
>> > anyway, since there is delay from the time that leader returns response
>> to
>> > the time that the script collects response from all brokers and prints
>> > result to user. I think if the slight difference in the accuracy between
>> > the two approaches does not make a difference to the intended use-case
>> of
>> > this API, then we probably want to re-use the existing request/response
>> to
>> > keep the protocol simple.
>> >
>> > Thanks,
>> > Dong
>> >
>> >
>> >
>> >
>> >
>> > On Tue, Aug 8, 2017 at 5:56 PM, Jun Rao  wrote:
>> >
>> > > Hi, Dong,
>> > >
>> > > I think Tom was suggesting to have the AlterTopicsRequest sent to any
>> > > broker, which just writes the reassignment json to ZK. The controller
>> > will
>> > > pick up the reassignment and act on it as usual. This should work,
>> right?
>> > >
>> > > Having a separate AlterTopicsRequest and AlterReplicaDirRequest seems
>> > > simpler to me. The former is handled by the controller and the latter
>> is
>> > > handled by the affected broker. They don't always have to be done
>> > together.
>> > > Merging the two into a single request probably will make both the api
>> and
>> > > the implementation a bit more complicated. If we do keep the two
>> separate
>> > > requests, it seems that we should just add AlterReplicaDirRequest to
>> the
>> > > AdminClient interface?
>> > >
>> > > Now, regarding DescribeDirsResponse. I agree that it can be used for
>> the
>> > > status reporting in KIP-179 as well. However, it seems 

Re: [DISCUSS] KIP-186: Increase offsets retention default to 7 days

2017-08-09 Thread Stephane Maarek
Any interest in having offsets.retention.minutes default to 
log.retention.(ms|minutes|hours) as a dynamic setting when not set explicitly, 
with the option to override it to a constant value?
That would also address the different types of deployments that modify the 
default log retention period
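
(For context, a hedged server.properties sketch: 1440 minutes / 24 hours is 
today's default, and 10080 minutes / 7 days is what KIP-186 proposes; the 
dynamic tie to log retention suggested above is only an idea, not an existing 
option.)

# offsets.retention.minutes=1440     <- current default (24 hours)
offsets.retention.minutes=10080      # KIP-186 proposal: 7 days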

On 10/8/17, 5:11 am, "Apurva Mehta"  wrote:

Thanks for the KIP. +1 from me.

On Tue, Aug 8, 2017 at 5:24 PM, Ewen Cheslack-Postava 
wrote:

> Hi all,
>
> I posted a simple new KIP for a problem we see with a lot of users:
> KIP-186: Increase offsets retention default to 7 days
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 186%3A+Increase+offsets+retention+default+to+7+days
>
> Note that in addition to the KIP text itself, the linked JIRA already
> existed and has a bunch of discussion on the subject.
>
> -Ewen
>





Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-08-09 Thread Dong Lin
Hey Jun,

If I understand you right, you are suggesting that, in the case when there
is continuous incoming traffic, the approach in the KIP-179 will report lag
as 0 whereas the approach using DescribeDirsRequest will report lag as
non-zero. But I think the approach in KIP-179 will also report non-zero lag
when there is continuous traffic. This is because at the time the leader
receives ReplicaStatusRequest, it is likely that some data has been
appended to the partition after the last FetchRequest from the follower.
Does this make sense?

Thanks,
Dong



On Wed, Aug 9, 2017 at 4:24 PM, Jun Rao  wrote:

> Hi, Dong,
>
> As for whether to return LEO or lag, my point was the following. What you
> are concerned about is that an in-sync replica could become out of sync
> again. However, the more common case is that once a replica is caught up,
> it will stay in sync afterwards. In that case, once the reassignment
> process completes, if we report based on lag, all lags will be 0. If we
> report based on Math.max(0, leaderLEO - followerLEO), the value may not be
> 0 if there is continuous incoming traffic, which will be confusing to the
> user.
>
> Thanks,
>
> Jun
>
>
>
> On Tue, Aug 8, 2017 at 6:26 PM, Dong Lin  wrote:
>
> > Hey Jun,
> >
> > Thanks for the comment!
> >
> > Yes, it should work. The tool can send request to any broker and broker
> can
> > just write the reassignment znode. My previous intuition is that it may
> be
> > better to only send this request to controller. But I don't have good
> > reasons for this restriction.
> >
> > My intuition is that we can keep them separate as well. Becket and I have
> > discussed this both offline and in https://github.com/apache/
> > kafka/pull/3621.
> > Currently I don't have a strong opinion on this and I am open to using
> only
> > one API to do both if someone can come up with a reasonable API signature
> > for this method. For now I have added the method alterReplicaDir() in
> > KafkaAdminClient instead of the AdminClient interface so that the
> > reassignment script can use this method without concluding what the API
> > would look like in AdminClient in the future.
> >
> > Regarding DescribeDirsResponse, I think it is probably OK to have
> slightly
> > more lag. The script can calculate the lag of the follower replica as
> > Math.max(0, leaderLEO - followerLEO). I agree that it will be slightly
> less
> > accurate than the current approach in KIP-179. But even with the current
> > approach in KIP-179, the result provided by the script is an
> approximation
> > anyway, since there is delay from the time that leader returns response
> to
> > the time that the script collects response from all brokers and prints
> > result to user. I think if the slight difference in the accuracy between
> > the two approaches does not make a difference to the intended use-case of
> > this API, then we probably want to re-use the existing request/response to
> > keep the protocol simple.
> >
> > Thanks,
> > Dong
> >
> >
> >
> >
> >
> > On Tue, Aug 8, 2017 at 5:56 PM, Jun Rao  wrote:
> >
> > > Hi, Dong,
> > >
> > > I think Tom was suggesting to have the AlterTopicsRequest sent to any
> > > broker, which just writes the reassignment json to ZK. The controller
> > will
> > > pick up the reassignment and act on it as usual. This should work,
> right?
> > >
> > > Having a separate AlterTopicsRequest and AlterReplicaDirRequest seems
> > > simpler to me. The former is handled by the controller and the latter
> is
> > > handled by the affected broker. They don't always have to be done
> > together.
> > > Merging the two into a single request probably will make both the api
> and
> > > the implementation a bit more complicated. If we do keep the two
> separate
> > > requests, it seems that we should just add AlterReplicaDirRequest to
> the
> > > AdminClient interface?
> > >
> > > Now, regarding DescribeDirsResponse. I agree that it can be used for
> the
> > > status reporting in KIP-179 as well. However, it seems that reporting
> the
> > > log end offset of each replica may not be easy to use. The log end
> offset
> > > will be returned from different brokers in slightly different time. If
> > > there is continuous producing traffic, the difference in log end offset
> > > between the leader and the follower could be larger than 0 even if the
> > > follower has fully caught up. I am wondering if it's better to instead
> > > return the lag in offset per replica. This way, the status can probably
> > be
> > > reported more reliably.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Aug 8, 2017 at 11:23 AM, Dong Lin  wrote:
> > >
> > > > Hey Tom,
> > > >
> > > > Thanks for the quick reply. Please see my comment inline.
> > > >
> > > > On Tue, Aug 8, 2017 at 11:06 AM, Tom Bentley 
> > > > wrote:
> > > >
> > > > > Hi Dong,
> > > > >
> > > > > Replies inline, as usual
> > > > >
> > > 

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-08-09 Thread Jun Rao
Hi, Dong,

As for whether to return LEO or lag, my point was the following. What you
are concerned about is that an in-sync replica could become out of sync
again. However, the more common case is that once a replica is caught up,
it will stay in sync afterwards. In that case, once the reassignment
process completes, if we report based on lag, all lags will be 0. If we
report based on Math.max(0, leaderLEO - followerLEO), the value may not be
0 if there is continuous incoming traffic, which will be confusing to the
user.
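
(A small sketch with made-up numbers of the caught-up case described above, 
where the follower is in sync but the leader keeps taking new writes between 
the two reads:)

public class CaughtUpLagSketch {
    public static void main(String[] args) {
        long leaderLeo = 5_000_120;   // read a moment after the follower's LEO was read
        long followerLeo = 5_000_000; // fully caught up (within replica.lag.time.max.ms)
        long hw = 5_000_000;
        System.out.println(Math.max(0, leaderLeo - followerLeo)); // 120 -- looks "behind" to the user
        System.out.println(Math.max(0, hw - followerLeo));        // 0   -- lag-based report says caught up
    }
}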

Thanks,

Jun



On Tue, Aug 8, 2017 at 6:26 PM, Dong Lin  wrote:

> Hey Jun,
>
> Thanks for the comment!
>
> Yes, it should work. The tool can send request to any broker and broker can
> just write the reassignment znode. My previous intuition is that it may be
> better to only send this request to controller. But I don't have good
> reasons for this restriction.
>
> My intuition is that we can keep them separate as well. Becket and I have
> discussed this both offline and in https://github.com/apache/
> kafka/pull/3621.
> Currently I don't have a strong opinion on this and I am open to using only
> one API to do both if someone can come up with a reasonable API signature
> for this method. For now I have added the method alterReplicaDir() in
> KafkaAdminClient instead of the AdminClient interface so that the
> reassignment script can use this method without concluding what the API
> would look like in AdminClient in the future.
>
> Regarding DescribeDirsResponse, I think it is probably OK to have slightly
> more lag. The script can calculate the lag of the follower replica as
> Math.max(0, leaderLEO - followerLEO). I agree that it will be slightly less
> accurate than the current approach in KIP-179. But even with the current
> approach in KIP-179, the result provided by the script is an approximation
> anyway, since there is delay from the time that leader returns response to
> the time that the script collects response from all brokers and prints
> result to user. I think if the slight difference in the accuracy between
> the two approaches does not make a difference to the intended use-case of
> this API, then we probably want to re-use the existing request/response to
> keep the protocol simple.
>
> Thanks,
> Dong
>
>
>
>
>
> On Tue, Aug 8, 2017 at 5:56 PM, Jun Rao  wrote:
>
> > Hi, Dong,
> >
> > I think Tom was suggesting to have the AlterTopicsRequest sent to any
> > broker, which just writes the reassignment json to ZK. The controller
> will
> > pick up the reassignment and act on it as usual. This should work, right?
> >
> > Having a separate AlterTopicsRequest and AlterReplicaDirRequest seems
> > simpler to me. The former is handled by the controller and the latter is
> > handled by the affected broker. They don't always have to be done
> together.
> > Merging the two into a single request probably will make both the api and
> > the implementation a bit more complicated. If we do keep the two separate
> > requests, it seems that we should just add AlterReplicaDirRequest to the
> > AdminClient interface?
> >
> > Now, regarding DescribeDirsResponse. I agree that it can be used for the
> > status reporting in KIP-179 as well. However, it seems that reporting the
> > log end offset of each replica may not be easy to use. The log end offset
> > will be returned from different brokers in slightly different time. If
> > there is continuous producing traffic, the difference in log end offset
> > between the leader and the follower could be larger than 0 even if the
> > follower has fully caught up. I am wondering if it's better to instead
> > return the lag in offset per replica. This way, the status can probably
> be
> > reported more reliably.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Aug 8, 2017 at 11:23 AM, Dong Lin  wrote:
> >
> > > Hey Tom,
> > >
> > > Thanks for the quick reply. Please see my comment inline.
> > >
> > > On Tue, Aug 8, 2017 at 11:06 AM, Tom Bentley 
> > > wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > Replies inline, as usual
> > > >
> > > > > As I originally envisaged it, KIP-179's support for reassigning
> > > > partitions
> > > > >
> > > > > would have more-or-less taken the logic currently in the
> > > > > > ReassignPartitionsCommand (that is, writing JSON to the
> > > > > > ZkUtils.ReassignPartitionsPath)
> > > > > > and put it behind a suitable network protocol API. Thus it
> wouldn't
> > > > > matter
> > > > > > which broker received the protocol call: It would be acted on by
> > > > brokers
> > > > > > being notified of the change in the ZK path, just as currently.
> > This
> > > > > would
> > > > > > have kept the ReassignPartitionsCommand relatively simple, as it
> > > > > currently
> > > > > > is.
> > > > > >
> > > > >
> > > > > I am not sure I fully understand your proposal. I think you are
> > saying
> > > > that
> > > > > any broker can receive and handle the 

[GitHub] kafka pull request #3421: KAFKA-5507: Check classpath empty in kafka-run-cla...

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3421




Re: [DISCUSS] KIP-180: Add a broker metric specifying the number of consumer group rebalances in progress

2017-08-09 Thread Ismael Juma
Sounds good to me.

On Wed, Aug 9, 2017 at 11:32 PM, Colin McCabe  wrote:

> Hi Apurva,
>
> Thanks for taking another look.  Responses below:
>
> On Mon, Aug 7, 2017, at 17:47, Apurva Mehta wrote:
> > The KIP looks good to me. In your latest proposal, the change of state
> > would be captured as followed in the metrics for groups using Kafka for
> > membership management:
> >
> > PreparingRebalance -> CompletingRebalance -> Stable -> Dead?
>
> Right.  As described in
> core/src/main/scala/kafka/coordinator/group/GroupMetadata.scala
>
> >
> > If a group is just being used to store offsets, then it is always Empty?
>
> I believe so.  Groups can also be Empty if they have no more members,
> but the offsets have not yet expired.
>
> Of course, this KIP is just exposing the metrics, not changing how
> groups work in any way.
>
> Maybe we should start a vote, if this looks good to everyone?
>
> best,
> Colin
>
>
> >
> > If so, this makes sense to me.
> >
> > Thanks,
> > Apurva
> >
> > On Mon, Aug 7, 2017 at 5:09 PM, Colin McCabe  wrote:
> >
> > > How about PreparingRebalance / CompletingRebalance?
> > >
> > > cheers,
> > > Colin
> > >
> > >
> > > On Fri, Aug 4, 2017, at 09:03, Ismael Juma wrote:
> > > > I agree that we should make them consistent. I think RebalanceJoin
> and
> > > > RebalanceAssignment are reasonable names. I think they are a bit more
> > > > descriptive than `PreparingRebalance` and `CompletingRebalance`. If
> we
> > > > need
> > > > to add more states, it seems a little easier to do if the states are
> a
> > > > bit
> > > > more descriptive. I am OK with either of the 2 options as I think
> they
> > > > are
> > > > both better than the status quo.
> > > >
> > > > Ismael
> > > >
> > > > On Fri, Aug 4, 2017 at 4:52 PM, Jason Gustafson 
> > > > wrote:
> > > >
> > > > > Hey Guozhang,
> > > > >
> > > > > Usually I think such naming inconsistencies are best avoided. It
> adds
> > > > > another level of confusion for people who have to dip into the
> code,
> > > figure
> > > > > out a problem, and ultimately explain it. Since we already have the
> > > > > PreparingRebalance state, maybe we could just rename the
> AwaitingSync
> > > state
> > > > > to CompletingRebalance?
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Thu, Aug 3, 2017 at 6:09 PM, Guozhang Wang 
> > > wrote:
> > > > >
> > > > > > From an ops person's view, who is most likely watching the metrics,
> > > > > > these names may not be very clear, as people may not know the
> > > > > > internals well. I'd prefer PrepareRebalance and CompleteRebalance
> > > > > > since they may be easier to understand, though not a 100 percent
> > > > > > accurate match to the internal implementation.
> > > > > > implementation.
> > > > > >
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > >
> > > > > > On Thu, Aug 3, 2017 at 4:14 PM, Jason Gustafson <
> ja...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Colin, Guozhang,
> > > > > > >
> > > > > > > I agree the current state names are not ideal for end users. I
> > > tend to
> > > > > > see
> > > > > > > the rebalance as joining the group and receiving the
> assignment.
> > > Maybe
> > > > > > the
> > > > > > > states could be named in those terms? For example:
> RebalanceJoin
> > > and
> > > > > > > RebalanceAssignment. What do you think?
> > > > > > >
> > > > > > > -Jason
> > > > > > >
> > > > > > > On Fri, Jul 28, 2017 at 11:18 AM, Guozhang Wang <
> > > wangg...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I feel we can change `AwaitSync` to `completeRebalance` while
> > > keeping
> > > > > > the
> > > > > > > > other as is.
> > > > > > > >
> > > > > > > > cc Jason?
> > > > > > > >
> > > > > > > > Guozhang
> > > > > > > >
> > > > > > > > On Fri, Jul 28, 2017 at 10:08 AM, Colin McCabe <
> > > cmcc...@apache.org>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >> Thanks for the explanation.  I guess maybe we should just
> keep
> > > the
> > > > > > group
> > > > > > > >> names as they are, then?
> > > > > > > >>
> > > > > > > >> best,
> > > > > > > >> Colin
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> On Wed, Jul 26, 2017, at 11:25, Guozhang Wang wrote:
> > > > > > > >> > To me `PreparingRebalance` sounds better than
> > > `StartingRebalance`
> > > > > > > since
> > > > > > > >> > only by the end of that stage we have formed a new group.
> More
> > > > > > >> > specifically, this is the workflow from the coordinator's
> > > > > > >> > point of view:
> > > > > > > >> >
> > > > > > > >> > 1. decided to trigger a rebalance, enter
> PreparingRebalance
> > > phase;
> > > > > > > >> >   |
> > > > > > > >> >   |   send out error code for all
> heartbeat
> > > > > reponses
> > > > > > > >> >  \|/
> > > > > > > >> >   |
> > > > > > > >> >   |   waiting for join group requests from
> 

[GitHub] kafka pull request #3371: KAFKA-5470: Replace -XX:+DisableExplicitGC with -X...

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3371




[GitHub] kafka-site issue #70: Add learn more verbiage on ctas

2017-08-09 Thread guozhangwang
Github user guozhangwang commented on the issue:

https://github.com/apache/kafka-site/pull/70
  
LGTM. Merged to `asf-site`.




[GitHub] kafka-site pull request #70: Add learn more verbiage on ctas

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka-site/pull/70




[GitHub] kafka pull request #3649: MINOR: Add one more instruction

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3649




Re: [DISCUSS] KIP-180: Add a broker metric specifying the number of consumer group rebalances in progress

2017-08-09 Thread Colin McCabe
Hi Apurva,

Thanks for taking another look.  Responses below:

On Mon, Aug 7, 2017, at 17:47, Apurva Mehta wrote:
> The KIP looks good to me. In your latest proposal, the change of state
> would be captured as followed in the metrics for groups using Kafka for
> membership management:
> 
> PreparingRebalance -> CompletingRebalance -> Stable -> Dead?

Right.  As described in
core/src/main/scala/kafka/coordinator/group/GroupMetadata.scala
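
(For readers following the metric names, a rough Java mirror of the lifecycle
discussed in this thread; the authoritative states are Scala case objects in
GroupMetadata.scala, and CompletingRebalance reflects the rename proposed
below -- the code today still calls that state AwaitingSync:)

enum GroupState {
    EMPTY,                // no members; the group may exist only to store offsets
    PREPARING_REBALANCE,  // a rebalance was triggered; waiting for JoinGroup requests
    COMPLETING_REBALANCE, // members have joined; waiting for the leader's assignment (SyncGroup)
    STABLE,               // assignment distributed; members are processing
    DEAD                  // group removed, e.g. after its offsets expired
}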

> 
> If a group is just being used to store offsets, then it is always Empty?

I believe so.  Groups can also be Empty if they have no more members,
but the offsets have not yet expired.

Of course, this KIP is just exposing the metrics, not changing how
groups work in any way.

Maybe we should start a vote, if this looks good to everyone?

best,
Colin


> 
> If so, this makes sense to me.
> 
> Thanks,
> Apurva
> 
> On Mon, Aug 7, 2017 at 5:09 PM, Colin McCabe  wrote:
> 
> > How about PreparingRebalance / CompletingRebalance?
> >
> > cheers,
> > Colin
> >
> >
> > On Fri, Aug 4, 2017, at 09:03, Ismael Juma wrote:
> > > I agree that we should make them consistent. I think RebalanceJoin and
> > > RebalanceAssignment are reasonable names. I think they are a bit more
> > > descriptive than `PreparingRebalance` and `CompletingRebalance`. If we
> > > need
> > > to add more states, it seems a little easier to do if the states are a
> > > bit
> > > more descriptive. I am OK with either of the 2 options as I think they
> > > are
> > > both better than the status quo.
> > >
> > > Ismael
> > >
> > > On Fri, Aug 4, 2017 at 4:52 PM, Jason Gustafson 
> > > wrote:
> > >
> > > > Hey Guozhang,
> > > >
> > > > Usually I think such naming inconsistencies are best avoided. It adds
> > > > another level of confusion for people who have to dip into the code,
> > figure
> > > > out a problem, and ultimately explain it. Since we already have the
> > > > PreparingRebalance state, maybe we could just rename the AwaitingSync
> > state
> > > > to CompletingRebalance?
> > > >
> > > > -Jason
> > > >
> > > > On Thu, Aug 3, 2017 at 6:09 PM, Guozhang Wang 
> > wrote:
> > > >
> > > > > From an ops person's view, who is most likely watching the metrics,
> > > > > these names may not be very clear, as people may not know the
> > > > > internals well. I'd prefer PrepareRebalance and CompleteRebalance
> > > > > since they may be easier to understand, though not a 100 percent
> > > > > accurate match to the internal implementation.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > >
> > > > > On Thu, Aug 3, 2017 at 4:14 PM, Jason Gustafson 
> > > > > wrote:
> > > > >
> > > > > > Hey Colin, Guozhang,
> > > > > >
> > > > > > I agree the current state names are not ideal for end users. I
> > tend to
> > > > > see
> > > > > > the rebalance as joining the group and receiving the assignment.
> > Maybe
> > > > > the
> > > > > > states could be named in those terms? For example: RebalanceJoin
> > and
> > > > > > RebalanceAssignment. What do you think?
> > > > > >
> > > > > > -Jason
> > > > > >
> > > > > > On Fri, Jul 28, 2017 at 11:18 AM, Guozhang Wang <
> > wangg...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I feel we can change `AwaitSync` to `completeRebalance` while
> > keeping
> > > > > the
> > > > > > > other as is.
> > > > > > >
> > > > > > > cc Jason?
> > > > > > >
> > > > > > > Guozhang
> > > > > > >
> > > > > > > On Fri, Jul 28, 2017 at 10:08 AM, Colin McCabe <
> > cmcc...@apache.org>
> > > > > > wrote:
> > > > > > >
> > > > > > >> Thanks for the explanation.  I guess maybe we should just keep
> > the
> > > > > group
> > > > > > >> names as they are, then?
> > > > > > >>
> > > > > > >> best,
> > > > > > >> Colin
> > > > > > >>
> > > > > > >>
> > > > > > >> On Wed, Jul 26, 2017, at 11:25, Guozhang Wang wrote:
> > > > > > >> > To me `PreparingRebalance` sounds better than
> > `StartingRebalance`
> > > > > > since
> > > > > > >> > only by the end of that stage we have formed a new group. More
> > > > > > >> > specifically, this is the workflow from the coordinator's
> > > > > > >> > point of view:
> > > > > > >> >
> > > > > > >> > 1. decided to trigger a rebalance, enter PreparingRebalance
> > phase;
> > > > > > >> >   |
> > > > > > >> >   |   send out error code for all heartbeat
> > > > reponses
> > > > > > >> >  \|/
> > > > > > >> >   |
> > > > > > >> >   |   waiting for join group requests from
> > members
> > > > > > >> >  \|/
> > > > > > >> > 2. formed a new group, increment the generation number, now
> > start
> > > > > > >> > rebalancing, entering AwaitSync phase:
> > > > > > >> >   |
> > > > > > >> >   |   send out the join group responses for
> > > > whoever
> > > > > > >> > requested join
> > > > > > >> >  \|/
> > > > > > >> >

Usage of SSL_ENDPOINT_IDENTIFICATION_ALGORITHM_CONFIG

2017-08-09 Thread M. Manna
Hello,

I have been trying to find the usage of this property within Kafka source.
All I am trying to understand is how this is looked up for hostname
verification.

I used Notepad++ to search for this, and the only hit was in the KafkaConfig.scala
file - after that I cannot see this Scala property being used anywhere
directly.

Could anyone please tell me the names of some source files which use
this for hostname verification? It will be highly appreciated.

Kindest Regards,


[jira] [Created] (KAFKA-5720) In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with org.apache.kafka.common.errors.TimeoutException

2017-08-09 Thread Colin P. McCabe (JIRA)
Colin P. McCabe created KAFKA-5720:
--

 Summary: In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest 
failed with org.apache.kafka.common.errors.TimeoutException
 Key: KAFKA-5720
 URL: https://issues.apache.org/jira/browse/KAFKA-5720
 Project: Kafka
  Issue Type: Bug
Reporter: Colin P. McCabe
Priority: Minor


In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with 
org.apache.kafka.common.errors.TimeoutException.

{code}
java.util.concurrent.ExecutionException: 
org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout.
at 
org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at 
org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:213)
at 
kafka.api.AdminClientIntegrationTest.testCallInFlightTimeouts(AdminClientIntegrationTest.scala:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.TimeoutException: Aborted due to 
timeout.
{code}

It's unclear whether this was an environment error or test bug.





Build failed in Jenkins: kafka-trunk-jdk8 #1895

2017-08-09 Thread Apache Jenkins Server
See 

--
[...truncated 4.20 MB...]


Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

2017-08-09 Thread Becket Qin
Thanks for the KIP, Apurva. It is a good time to review the configurations
to see if we can improve the user experience. We also might need to think
from users standpoint about the out of the box experience.

01. Generally speaking, I think it makes sense to make idempotence=true so
we can enable producer side pipeline without ordering issue. However, the
impact is that users may occasionally receive OutOfOrderSequencException.
In this case, there is not much user can do if they want to ensure
ordering. They basically have to close the producer in the call back and
resend all the records that are in the RecordAccumulator. This is very
involved. And the users may not have a way to retrieve the Records in the
accumulator anymore. So for the users who really want to achieve the
exactly once semantic, there are actually still a lot of work to do even
with those default. For the rest of the users, they need to handle one more
exception, which might not be a big deal.

02. Setting acks=-1 would significantly reduce the likelihood of
OutOfOrderSequenceException from happening. However, the latency/throughput
impact and additional purgatory burden on the broker are big concerns. And
it does not really guarantee exactly once without broker side
configurations. i.e unclean.leader.election, min.isr, etc. I am not sure if
it is worth making acks=-1 a global config instead of letting the users who
really care about this configure it correctly.

03. Regarding retries, I think we had some discussion in KIP-91. The
problem of setting retries to max integer is that producer.flush() may take
forever. Will this KIP depend on KIP-91?

I am not sure about having a cap on the max.in.flight.requests. It seems
that on some long RTT link, sending more requests in the pipeline would be
the only way to keep the latency close to RTT.

Thanks,

Jiangjie (Becket) Qin


On Wed, Aug 9, 2017 at 11:28 AM, Apurva Mehta  wrote:

> Thanks for the comments Ismael and Jason.
>
> Regarding the OutOfOrderSequenceException, it is more likely when you
> enable idempotence and have acks=1, simply because you have a greater
> probability of losing acknowledged data with acks=1, and the error code
> indicates that.
>
> The particular scenario is that a broker acknowledges a message with
> sequence N before replication happens, and then crashes. Since the message
> was acknowledged the producer increments its sequence to N+1. The new
> leader would not have received the message, and still expects sequence N
> from the producer. When it receives N+1 for the next message, it will
> return an OutOfOrderSequenceNumber, correctly indicating some previously
> acknowledged messages are missing.
>
> For the idempotent producer alone, the OutOfOrderSequenceException is
> returned in the Future and Callback, indicating to the application that
> some acknowledged data was lost. However, the application can continue
> producing data using the producer instance. The only compatibility issue
> here is that the application will now see a new exception for a state which
> went previously undetected.
>
> For a transactional producer, an OutOfOrderSequenceException is fatal and
> the application must use a new instance of the producer.
>
> Another point about acks=1 with enable.idempotence=true. What semantics are
> we promising here? Essentially we are saying that the default mode would be
> 'if a message is in the log, it will occur only once, but all acknowledged
> messages may not make it to the log'. I don't think that this is a
> desirable default guarantee.
>
> I will update the KIP to indicate that with the new default, applications
> might get a new 'OutOfOrderSequenceException'.
>
> Thanks,
> Apurva
>
> On Wed, Aug 9, 2017 at 9:33 AM, Ismael Juma  wrote:
>
> > Hi Jason,
> >
> > Thanks for the correction. See inline.
> >
> > On Wed, Aug 9, 2017 at 5:13 PM, Jason Gustafson 
> > wrote:
> >
> > > Minor correction: the OutOfOrderSequenceException is not fatal for the
> > > idempotent producer and it is not necessarily tied to the acks setting
> > > (though it is more likely to be thrown with acks=1).
> >
> >
> > Right, it would be worth expanding on the specifics of this. My
> > understanding is that common failure scenarios could trigger it.
> >
> >
> > > It is used to signal
> > > the user that there was a gap in the delivery of messages. You can hit
> > this
> > > if there is a pause on the producer and the topic retention kicks in
> and
> > > deletes the last records the producer had written. However, it is
> > possible
> > > for the user to catch it and simply keep producing (internally the
> > producer
> > > will generate a new ProducerId).
> >
> >
> > I see, our documentation states that it's fatal in the following example
> > and in the `send` method. I had overlooked that this was mentioned in the
> > context of transactions. If we were to enable idempotence by default,
> we'd
> 

[jira] [Created] (KAFKA-5719) Create a quickstart archetype project for Kafka Streams

2017-08-09 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-5719:


 Summary: Create a quickstart archetype project for Kafka Streams
 Key: KAFKA-5719
 URL: https://issues.apache.org/jira/browse/KAFKA-5719
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Guozhang Wang
Assignee: Guozhang Wang


People have been suggesting that we add some tutorial-like web doc sections on how 
to implement a stream processing application using the Kafka Streams library. 
Plus, the current o.a.k.streams.examples package is part of the o.a.k package, 
which conflicts with the common development cycle experience. So I'd like to 
propose adding a Maven archetype project for Kafka Streams along with additional 
tutorial / quickstart web doc sections. Here is a step-by-step plan for the project:

1. Add the archetype project and include it in the release (both in the dist.apache 
and repository.apache maven repos); also add a {{Write your own Streams 
application}} section on the web docs along with the existing {{Play with a demo 
Streams application}}.

2. Modify {{Play with a demo Streams application}} to be based on the archetype 
project as well.

3. Migrate all example code from {{o.a.k.streams.example}} to the archetype 
project and remove this package.

4. Moving forward, we can add more complicated examples to the archetype 
project, such as integration with Connect, interactive queries, etc., and augment 
the demo and tutorial web doc sections accordingly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : kafka-0.11.0-jdk7 #266

2017-08-09 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-175: Additional '--describe' views for ConsumerGroupCommand

2017-08-09 Thread Jason Gustafson
Thanks for the KIP. +1

On Thu, Jul 27, 2017 at 2:04 PM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi all,
>
> Thanks to everyone who participated in the discussion on KIP-175, and
> provided feedback.
> The KIP can be found at
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 175%3A+Additional+%27--describe%27+views+for+ConsumerGroupCommand
> .
> I believe the concerns have been addressed in the recent version of the
> KIP; so I'd like to start a vote.
>
> Thanks.
> --Vahid
>
>


Re: [DISCUSS] KIP-91 Provide Intuitive User Timeouts in The Producer

2017-08-09 Thread Apurva Mehta
> > There seems to be no relationship with cluster metadata availability or
> > staleness. Expiry is just based on the time since the batch has been
> ready.
> > Please correct me if I am wrong.
> >
>
> I was not very specific about where we do expiration. I glossed over some
> details because (again) we've other mechanisms to detect non progress. The
> condition (!muted.contains(tp) && (isMetadataStale ||
> > cluster.leaderFor(tp) == null)) is used in
> RecordAccumualtor.expiredBatches:
> https://github.com/apache/kafka/blob/trunk/clients/src/
> main/java/org/apache/kafka/clients/producer/internals/
> RecordAccumulator.java#L443
>
>
> Effectively, we expire in all the following cases
> 1) producer is partitioned from the brokers. When metadata age grows beyond
> 3x its max value. It's safe to say that we're not talking to the brokers
> at all. Report.
> 2) fresh metadata && leader for a partition is not known && a batch is
> sitting there for longer than request.timeout.ms. This is one case we
> would
> like to improve and use batch.expiry.ms because request.timeout.ms is too
> small.
> 3) fresh metadata && leader for a partition is known && batch is sitting
> there for longer than batch.expiry.ms. This is a new case that is
> different
> from #2. This is the catch-up mode case. Things are moving too slowly.
> Pipeline SLAs are broken. Report and shutdown kmm.
>
> The second and the third cases are useful to a real-time app for a
> completely different reason. Report, forget about the batch, and just move
> on (without shutting down).
>
>
If I understand correctly, you are talking about a fork of Apache Kafka
which has these additional conditions? That check doesn't exist on
trunk today. Or are you proposing to change the behavior of expiry to
account for stale metadata and partitioned producers as part of this KIP?
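
A rough sketch of the three expiry cases described in the quoted text, written as a
hypothetical helper (batch.expiry.ms is a proposed setting and the names here are
illustrative; this logic is not what trunk does today):

    // Returns true if a batch that has been sitting in the accumulator should be expired.
    static boolean shouldExpire(boolean hasInFlightRequest,    // the partition is "muted"
                                boolean metadataStale,         // metadata age > 3x its max age
                                boolean leaderKnown,
                                long msSinceBatchReady,
                                long requestTimeoutMs,
                                long batchExpiryMs) {
        if (hasInFlightRequest) return false;
        if (metadataStale) return true;                                 // case 1: partitioned from the brokers
        if (!leaderKnown) return msSinceBatchReady > requestTimeoutMs;  // case 2: leader unknown for too long
        return msSinceBatchReady > batchExpiryMs;                       // case 3: catch-up mode, moving too slowly
    }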


[GitHub] kafka pull request #3566: KAFKA-5629: Changed behavior of ConsoleConsumer wh...

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3566


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-5629) Console Consumer overrides auto.offset.reset property when provided on the command line without warning about it.

2017-08-09 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-5629.

   Resolution: Fixed
Fix Version/s: 1.0.0

Issue resolved by pull request 3566
[https://github.com/apache/kafka/pull/3566]

> Console Consumer overrides auto.offset.reset property when provided on the 
> command line without warning about it.
> -
>
> Key: KAFKA-5629
> URL: https://issues.apache.org/jira/browse/KAFKA-5629
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.11.0.0
>Reporter: Sönke Liebau
>Assignee: Sönke Liebau
>Priority: Trivial
> Fix For: 1.0.0
>
>
> The console consumer allows users to specify consumer options on the command line 
> with the --consumer-property parameter.
> In the case of auto.offset.reset this parameter will always silently be 
> ignored though, because this behavior is controlled via the --from-beginning 
> parameter.
> I believe that behavior to be fine, however we should log a warning in case 
> auto.offset.reset is specified on the command line and overridden to 
> something else in the code to avoid potential confusion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : kafka-trunk-jdk7 #2622

2017-08-09 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk8 #1894

2017-08-09 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-5717; InMemoryKeyValueStore should delete keys with null 
values

--
[...truncated 3.68 MB...]


[GitHub] kafka pull request #3651: MINOR: Add missing deprecations on old request obj...

2017-08-09 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3651

MINOR: Add missing deprecations on old request objects



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka 
add-missing-request-deprecations

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3651.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3651


commit a318ae47f6515c1138d56decbf274f5196bb5411
Author: Jason Gustafson 
Date:   2017-08-09T19:49:12Z

MINOR: Add missing deprecations on old request objects




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-186: Increase offsets retention default to 7 days

2017-08-09 Thread Apurva Mehta
Thanks for the KIP. +1 from me.

On Tue, Aug 8, 2017 at 5:24 PM, Ewen Cheslack-Postava 
wrote:

> Hi all,
>
> I posted a simple new KIP for a problem we see with a lot of users:
> KIP-186: Increase offsets retention default to 7 days
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 186%3A+Increase+offsets+retention+default+to+7+days
>
> Note that in addition to the KIP text itself, the linked JIRA already
> existed and has a bunch of discussion on the subject.
>
> -Ewen
>


Re: [DISCUSS] KIP-182: Reduce Streams DSL overloads and allow easier use of custom storage engines

2017-08-09 Thread Damian Guy
On Wed, 9 Aug 2017 at 20:00 Guozhang Wang  wrote:

> >> Another comment about Printed in general is it differs with other
> options
> >> that it is a required option than optional one, since it includes
> toSysOut
> >> / toFile specs; what are the pros and cons for including these two in
> the
> >> option and hence make it a required option than leaving them at the API
> >> layer and make Printed as optional for mapper / label only?
> >>
> >>
> >It isn't required as we will still have the no-arg print() which will just
> >go to sysout as it does now.
>
> Got it. So just to clarify are we going to deprecate writeAsText or not?
>
>
Correct.


>
> On Wed, Aug 9, 2017 at 11:38 AM, Guozhang Wang  wrote:
>
> > >> The key idea is that by using the same function name string for static
> > >> constructor and member functions, users do not need to remember what
> > are
> > >> the differences but can call these functions with any ordering they
> > want,
> > >> and later calls on the same spec will win over early calls.
> > >>
> > >>
> > >That would be great if java supported it, but it doesn't. You can't have
> > >static and member functions with the same signature.
> >
> > Got it, thanks.
> >
> > Does it still make sense to have one static constructor for each spec,
> > with one constructor having only one parameter to make it more usable,
> i.e.
> > as a user I do not need to give all parameters if I only want to override
> > one of them? Maybe we can just name the constructors as `with` but I'm
> not
> > sure if Java distinguish:
> >
> > public static <K, V> Produced<K, V> with(final Serde<K> keySerde)
> > public static <K, V> Produced<K, V> with(final Serde<V> valueSerde)
> >
> > as two function signatures.
> >
> >
> > Guozhang
> >
> >
> > On Wed, Aug 9, 2017 at 6:20 AM, Damian Guy  wrote:
> >
> >> On Tue, 8 Aug 2017 at 20:11 Guozhang Wang  wrote:
> >>
> >> > Damian,
> >> >
> >> > Thanks for the proposal, I had a few comments on the APIs:
> >> >
> >> > 1. Printed#withFile seems not needed, as users should always spec if
> it
> >> is
> >> > to sysOut or to File at the beginning. In addition as a second
> thought,
> >> I
> >> > think serdes are not useful for prints anyways since we assume
> >> `toString`
> >> > is provided except for byte arrays, in which we will special handle
> it.
> >> >
> >> >
> >> +1
> >>
> >>
> >> > Another comment about Printed in general is it differs with other
> >> options
> >> > that it is a required option than optional one, since it includes
> >> toSysOut
> >> > / toFile specs; what are the pros and cons for including these two in
> >> the
> >> > option and hence make it a required option than leaving them at the
> API
> >> > layer and make Printed as optional for mapper / label only?
> >> >
> >> >
> >> It isn't required as we will still have the no-arg print() which will
> just
> >> go to sysout as it does now.
> >>
> >>
> >> >
> >> > 2.1 KStream#through / to
> >> >
> >> > We should have an overloaded function without Produced?
> >> >
> >>
> >> Yes - we already have those so they are not part of the KIP, i.e,
> >> through(topic)
> >>
> >>
> >> >
> >> > 2.2 KStream#groupBy / groupByKey
> >> >
> >> > We should have an overloaded function without Serialized?
> >> >
> >>
> >> Yes, as above
> >>
> >> >
> >> > 2.3 KGroupedStream#count / reduce / aggregate
> >> >
> >> > We should have an overloaded function without Materialized?
> >> >
> >>
> >> As above
> >>
> >> >
> >> > 2.4 KStream#join
> >> >
> >> > We should have an overloaded function without Joined?
> >> >
> >>
> >> as above
> >>
> >> >
> >> >
> >> > 2.5 Each of KTable's operators:
> >> >
> >> > We should have an overloaded function without Produced / Serialized /
> >> > Materialized?
> >> >
> >> >
> >> as above
> >>
> >>
> >> >
> >> >
> >> > 3.1 Produced: the static functions have overlaps, which seems not
> >> > necessary. I'd suggest just having the following three static with
> >> another
> >> > three similar member functions:
> >> >
> >> > public static  Produced withKeySerde(final Serde
> >> keySerde)
> >> >
> >> > public static  Produced withValueSerde(final Serde
> >> > valueSerde)
> >> >
> >> > public static  Produced withStreamPartitioner(final
> >> > StreamPartitioner partitioner)
> >> >
> >> > The key idea is that by using the same function name string for static
> >> > constructor and member functions, users do not need to remember what
> are
> >> > the differences but can call these functions with any ordering they
> >> want,
> >> > and later calls on the same spec will win over early calls.
> >> >
> >> >
> >> That would be great if java supported it, but it doesn't. You can't have
> >> static and member functions with the same signature.
> >>
> >>
> >> >
> >> > 3.2 Serialized: similarly
> >> >
> >> > public static  Serialized withKeySerde(final Serde
> >> keySerde)
> >> >
> >> > public static 

[jira] [Resolved] (KAFKA-5717) [streams] 'null' values in state stores

2017-08-09 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy resolved KAFKA-5717.
---
   Resolution: Fixed
Fix Version/s: 0.11.0.1
   1.0.0

Issue resolved by pull request 3650
[https://github.com/apache/kafka/pull/3650]

> [streams] 'null' values in state stores
> ---
>
> Key: KAFKA-5717
> URL: https://issues.apache.org/jira/browse/KAFKA-5717
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1, 0.11.0.0
>Reporter: Bart Vercammen
>Assignee: Damian Guy
> Fix For: 1.0.0, 0.11.0.1
>
>
> When restoring the state on an in-memory KeyValue store (at startup of the 
> Kafka Streams application), the _deleted_ values are put in the store as 
> _key_ with _value_ {{null}} instead of being removed from the store.
> (this happens when the underlying kafka topic segment did not get compacted 
> yet)
> After some digging I came across this in {{InMemoryKeyValueStore}}:
> {code}
> public synchronized void put(K key, V value) {
> this.map.put(key, value);
> }
> {code}
> I would assume this implementation misses the check on {{value}} being 
> {{null}} to *delete* the entry instead of just storing it.
> In the RocksDB implementation it is done correctly:
> {code}
> if (rawValue == null) {
> try {
> db.delete(wOptions, rawKey);
> {code}
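
A minimal sketch of the kind of fix described above (assumed shape only; the actual
change went in via pull request 3650):

{code}
public synchronized void put(K key, V value) {
    if (value == null) {
        // A null value is a tombstone: remove the key, mirroring the RocksDB store.
        this.map.remove(key);
    } else {
        this.map.put(key, value);
    }
}
{code}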



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3650: KAFKA-5717: InMemoryKeyValueStore should delete ke...

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3650


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-182: Reduce Streams DSL overloads and allow easier use of custom storage engines

2017-08-09 Thread Guozhang Wang
>> Another comment about Printed in general is it differs with other options
>> that it is a required option than optional one, since it includes
toSysOut
>> / toFile specs; what are the pros and cons for including these two in the
>> option and hence make it a required option than leaving them at the API
>> layer and make Printed as optional for mapper / label only?
>>
>>
>It isn't required as we will still have the no-arg print() which will just
>go to sysout as it does now.

Got it. So just to clarify are we going to deprecate writeAsText or not?

Guozhang


On Wed, Aug 9, 2017 at 11:38 AM, Guozhang Wang  wrote:

> >> The key idea is that by using the same function name string for static
> >> constructor and member functions, users do not need to remember what
> are
> >> the differences but can call these functions with any ordering they
> want,
> >> and later calls on the same spec will win over early calls.
> >>
> >>
> >That would be great if java supported it, but it doesn't. You can't have
> >static and member functions with the same signature.
>
> Got it, thanks.
>
> Does it still make sense to have one static constructor for each spec,
> with one constructor having only one parameter to make it more usable, i.e.
> as a user I do not need to give all parameters if I only want to override
> one of them? Maybe we can just name the constructors as `with` but I'm not
> sure if Java distinguish:
>
> public static <K, V> Produced<K, V> with(final Serde<K> keySerde)
> public static <K, V> Produced<K, V> with(final Serde<V> valueSerde)
>
> as two function signatures.
>
>
> Guozhang
>
>
> On Wed, Aug 9, 2017 at 6:20 AM, Damian Guy  wrote:
>
>> On Tue, 8 Aug 2017 at 20:11 Guozhang Wang  wrote:
>>
>> > Damian,
>> >
>> > Thanks for the proposal, I had a few comments on the APIs:
>> >
>> > 1. Printed#withFile seems not needed, as users should always spec if it
>> is
>> > to sysOut or to File at the beginning. In addition as a second thought,
>> I
>> > think serdes are not useful for prints anyways since we assume
>> `toString`
>> > is provided except for byte arrays, in which we will special handle it.
>> >
>> >
>> +1
>>
>>
>> > Another comment about Printed in general is it differs with other
>> options
>> > that it is a required option than optional one, since it includes
>> toSysOut
>> > / toFile specs; what are the pros and cons for including these two in
>> the
>> > option and hence make it a required option than leaving them at the API
>> > layer and make Printed as optional for mapper / label only?
>> >
>> >
>> It isn't required as we will still have the no-arg print() which will just
>> go to sysout as it does now.
>>
>>
>> >
>> > 2.1 KStream#through / to
>> >
>> > We should have an overloaded function without Produced?
>> >
>>
>> Yes - we already have those so they are not part of the KIP, i.e,
>> through(topic)
>>
>>
>> >
>> > 2.2 KStream#groupBy / groupByKey
>> >
>> > We should have an overloaded function without Serialized?
>> >
>>
>> Yes, as above
>>
>> >
>> > 2.3 KGroupedStream#count / reduce / aggregate
>> >
>> > We should have an overloaded function without Materialized?
>> >
>>
>> As above
>>
>> >
>> > 2.4 KStream#join
>> >
>> > We should have an overloaded function without Joined?
>> >
>>
>> as above
>>
>> >
>> >
>> > 2.5 Each of KTable's operators:
>> >
>> > We should have an overloaded function without Produced / Serialized /
>> > Materialized?
>> >
>> >
>> as above
>>
>>
>> >
>> >
>> > 3.1 Produced: the static functions have overlaps, which seems not
>> > necessary. I'd suggest just having the following three static with
>> another
>> > three similar member functions:
>> >
>> > public static  Produced withKeySerde(final Serde
>> keySerde)
>> >
>> > public static  Produced withValueSerde(final Serde
>> > valueSerde)
>> >
>> > public static  Produced withStreamPartitioner(final
>> > StreamPartitioner partitioner)
>> >
>> > The key idea is that by using the same function name string for static
>> > constructor and member functions, users do not need to remember what are
>> > the differences but can call these functions with any ordering they
>> want,
>> > and later calls on the same spec will win over early calls.
>> >
>> >
>> That would be great if java supported it, but it doesn't. You can't have
>> static and member functions with the same signature.
>>
>>
>> >
>> > 3.2 Serialized: similarly
>> >
>> > public static  Serialized withKeySerde(final Serde
>> keySerde)
>> >
>> > public static  Serialized withValueSerde(final Serde
>> > valueSerde)
>> >
>> > public Serialized withKeySerde(final Serde keySerde)
>> >
>> > public Serialized withValueSerde(final Serde valueSerde)
>> >
>>
>> as above
>>
>>
>> >
>> > Also it has a final Serde otherValueSerde in one of its static
>> > constructor, is that intentional?
>> >
>>
>> Nope: thanks.
>>
>> >
>> > 3.3. Joined: 

Re: [DISCUSS] KIP-182: Reduce Streams DSL overloads and allow easier use of custom storage engines

2017-08-09 Thread Guozhang Wang
>> The key idea is that by using the same function name string for static
>> constructor and member functions, users do not need to remember what are
>> the differences but can call these functions with any ordering they want,
>> and later calls on the same spec will win over early calls.
>>
>>
>That would be great if java supported it, but it doesn't. You can't have
>static and member functions with the same signature.

Got it, thanks.

Does it still make sense to have one static constructor for each spec,
with one constructor having only one parameter to make it more usable, i.e.
as a user I do not need to give all parameters if I only want to override
one of them? Maybe we can just name the constructors as `with` but I'm not
sure if Java distinguish:

public static <K, V> Produced<K, V> with(final Serde<K> keySerde)
public static <K, V> Produced<K, V> with(final Serde<V> valueSerde)

as two function signatures.
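
A small illustration of why Java cannot distinguish these two (Produced here is a
simplified stand-in, not the actual class proposed in the KIP): after type erasure
both methods have the signature with(Serde), so javac rejects the pair.

    import org.apache.kafka.common.serialization.Serde;

    // Simplified stand-in, not the actual KIP-182 class.
    class Produced<K, V> {
        public static <K, V> Produced<K, V> with(final Serde<K> keySerde) {
            return new Produced<>();
        }

        // Does not compile: "name clash: both methods have the same erasure"
        // public static <K, V> Produced<K, V> with(final Serde<V> valueSerde) {
        //     return new Produced<>();
        // }
    }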


Guozhang


On Wed, Aug 9, 2017 at 6:20 AM, Damian Guy  wrote:

> On Tue, 8 Aug 2017 at 20:11 Guozhang Wang  wrote:
>
> > Damian,
> >
> > Thanks for the proposal, I had a few comments on the APIs:
> >
> > 1. Printed#withFile seems not needed, as users should always spec if it
> is
> > to sysOut or to File at the beginning. In addition as a second thought, I
> > think serdes are not useful for prints anyways since we assume `toString`
> > is provided except for byte arrays, in which we will special handle it.
> >
> >
> +1
>
>
> > Another comment about Printed in general is it differs with other options
> > that it is a required option than optional one, since it includes
> toSysOut
> > / toFile specs; what are the pros and cons for including these two in the
> > option and hence make it a required option than leaving them at the API
> > layer and make Printed as optional for mapper / label only?
> >
> >
> It isn't required as we will still have the no-arg print() which will just
> go to sysout as it does now.
>
>
> >
> > 2.1 KStream#through / to
> >
> > We should have an overloaded function without Produced?
> >
>
> Yes - we already have those so they are not part of the KIP, i.e,
> through(topic)
>
>
> >
> > 2.2 KStream#groupBy / groupByKey
> >
> > We should have an overloaded function without Serialized?
> >
>
> Yes, as above
>
> >
> > 2.3 KGroupedStream#count / reduce / aggregate
> >
> > We should have an overloaded function without Materialized?
> >
>
> As above
>
> >
> > 2.4 KStream#join
> >
> > We should have an overloaded function without Joined?
> >
>
> as above
>
> >
> >
> > 2.5 Each of KTable's operators:
> >
> > We should have an overloaded function without Produced / Serialized /
> > Materialized?
> >
> >
> as above
>
>
> >
> >
> > 3.1 Produced: the static functions have overlaps, which seems not
> > necessary. I'd suggest just having the following three static with another
> > three similar member functions:
> >
> > public static  Produced withKeySerde(final Serde keySerde)
> >
> > public static  Produced withValueSerde(final Serde
> > valueSerde)
> >
> > public static  Produced withStreamPartitioner(final
> > StreamPartitioner partitioner)
> >
> > The key idea is that by using the same function name string for static
> > constructor and member functions, users do not need to remember what are
> > the differences but can call these functions with any ordering they want,
> > and later calls on the same spec will win over early calls.
> >
> >
> That would be great if java supported it, but it doesn't. You can't have
> static and member functions with the same signature.
>
>
> >
> > 3.2 Serialized: similarly
> >
> > public static  Serialized withKeySerde(final Serde
> keySerde)
> >
> > public static  Serialized withValueSerde(final Serde
> > valueSerde)
> >
> > public Serialized withKeySerde(final Serde keySerde)
> >
> > public Serialized withValueSerde(final Serde valueSerde)
> >
>
> as above
>
>
> >
> > Also it has a final Serde otherValueSerde in one of its static
> > constructor, is that intentional?
> >
>
> Nope: thanks.
>
> >
> > 3.3. Joined: similarly, keep the static constructor signatures the same
> as
> > its corresponding member fields.
> >
> >
> As above
>
>
> > 3.4 Materialized: it is a bit special, and I think we can keep its static
> > constructors with only two `as` as they are today.
> >
> >
> > 4. Are there any modifications on StateStoreSupplier? Is it replaced by
> > BytesStoreSupplier? Seems some more descriptions are lacking here. Also
> in
> >
> >
> No modifications to StateStoreSupplier. It is superseded by
> BytesStoreSupplier.
>
>
>
> > public static  Materialized
> > as(final StateStoreSupplier
> > supplier)
> >
> > Is the parameter in type of BytesStoreSupplier?
> >
>
> Yep - thanks
>
>
> >
> >
> >
> >
> > Guozhang
> >
> >
> > On Thu, Jul 27, 2017 at 5:26 AM, Damian Guy 
> wrote:
> >
> > > 

Re: [DISCUSS] KIP-184 Rename LogCleaner and related classes to LogCompactor

2017-08-09 Thread Jason Gustafson
Hey Pranav,

Let's see what others think before closing the KIP. If there are strong
reasons for the renaming, I would reconsider.

As far as deprecating `log.cleaner.enable`, I think it's a good idea and
can be done in a separate KIP. Guozhang's suggestion seems reasonable, but
I'd just turn it on always (it won't cause much harm if there are no topics
enabled for compaction). This is an implementation detail which probably
doesn't need to be included in the KIP.
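
As a point of reference, topic-level compaction is driven by cleanup.policy, which is
the knob that would remain once log.cleaner.enable goes away. A minimal sketch with a
placeholder topic name and settings (not a proposal, just the current AdminClient API):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CompactedTopicSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                NewTopic topic = new NewTopic("changelog-topic", 1, (short) 1)
                    // "compact,delete" is also valid: compaction plus time/size based deletion.
                    .configs(Collections.singletonMap("cleanup.policy", "compact"));
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
        }
    }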

-Jason

On Wed, Aug 9, 2017 at 10:47 AM, Pranav Maniar  wrote:

> Thanks Ismael, Jason for the suggestion.
> My bad. I should have followed up on mail-list discussion before starting
> KIP. Apologies.
>
> I am relatively new, so I do not know if any confusion was reported in the past
> due to terminology. Maybe others can chime in.
> If the old naming is fine with the majority then no changes will be needed. I
> will mark the JIRA as won't-fix and close the KIP!
>
> Ismael, Jason,
> There was another suggestion from Guozhang on deprecating and eventually
> removing the log.cleaner.enable property altogether and always enabling the log
> cleaner if "log.cleanup.policy=compact".
> What are your suggestions on this?
>
>
> Thanks,
> Pranav
>
> On Wed, Aug 9, 2017 at 10:27 PM, Jason Gustafson 
> wrote:
>
> > Yes, as Ismael noted above, I am not fond of this renaming. Keep in mind
> > that the LogCleaner does not only handle compaction. It is possible to
> > configure a cleanup policy of "compact" and "delete," in which case the
> > LogCleaner also handles removal of old segments. Hence the more general
> > LogCleaner name is more appropriate in my opinion.
> >
> > -Jason
> >
> > On Wed, Aug 9, 2017 at 9:49 AM, Pranav Maniar 
> > wrote:
> >
> > > Thanks Ewen for the suggestions.
> > > I have updated KIP-184. Updates done are :
> > >
> > > 1. If deprecated property is encountered currently, then its value will
> > be
> > > considered while enabling compactor.
> > > 2.  log.compactor.min.compaction.lag.ms updated it to be
> > > log.compactor.min.lag.ms ( Other naming suggestions are also welcomed)
> > > 3. Removed implementation details from KIP
> > >
> > > ~Pranav
> > >
> > > On Wed, Aug 9, 2017 at 11:19 AM, Ewen Cheslack-Postava <
> > e...@confluent.io>
> > > wrote:
> > >
> > >> A simple log message is standard, but the KIP should probably specify
> > what
> > >> happens when the deprecated config is encountered.
> > >>
> > >> Other than that, the change LGTM. Other things that might be worth
> > >> addressing
> > >>
> > >> * log.compactor.min.compaction.lag.ms seems a bit redundant with
> > >> compactor
> > >> and compaction. Not sure if we'd want to tweak the new version.
> > >> * The class renaming doesn't even need to be in the KIP as it is an
> > >> implementation detail.
> > >>
> > >> -Ewen
> > >>
> > >> On Tue, Aug 8, 2017 at 10:17 PM, Pranav Maniar 
> > >> wrote:
> > >>
> > >> > Thanks Guozhang for the suggestion.
> > >> >
> > >> > For now, I have updated KIP incorporating your suggestion.
> > >> > Personally I think implicitly enabling compaction whenever policy is
> > >> set to
> > >> > compact is more appropriate. Because new users like me will always
> > >> assume
> > >> > that setting policy to compact will enable compaction.
> > >> >
> > >> > But having said that, It will be interesting to know, if there are
> any
> > >> > use-cases where user would explicitly want to turn off the
> compactor.
> > >> >
> > >> > Thanks,
> > >> > Pranav
> > >> >
> > >> > On Tue, Aug 8, 2017 at 2:20 AM, Guozhang Wang 
> > >> wrote:
> > >> >
> > >> > > Thanks for the KIP proposal,
> > >> > >
> > >> > > I thought one suggestion before this discussion is to deprecate
> the
> > "
> > >> > > log.cleaner.enable" and always turn on compaction for those topics
> > >> that
> > >> > > have compact policies?
> > >> > >
> > >> > >
> > >> > > Guozhang
> > >> > >
> > >> > > On Sat, Aug 5, 2017 at 9:36 AM, Pranav Maniar <
> pranav9...@gmail.com
> > >
> > >> > > wrote:
> > >> > >
> > >> > > > Hi All,
> > >> > > >
> > >> > > > Following a discussion on JIRA KAFKA-1944
> > >> > > >  . I have
> > created
> > >> > > > KIP-184
> > >> > > >  > >> > > > 184%3A+Rename+LogCleaner+and+related+classes+to+LogCompactor>
> > >> > > > as
> > >> > > > it will require configuration change.
> > >> > > >
> > >> > > > As per the process I am starting Discussion on mail thread for
> > >> KIP-184.
> > >> > > >
> > >> > > > Renaming of configuration "log.cleaner.enable" is discussed on
> > >> > > KAFKA-1944.
> > >> > > > But other log.cleaner configuration also seems to be used by
> > cleaner
> > >> > > only.
> > >> > > > So to maintain naming consistency, I have proposed to rename all
> > >> these
> > >> > > > configuration.
> > >> > > >
> > >> > > > Please provide your suggestion/views for the same. 

Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

2017-08-09 Thread Apurva Mehta
Thanks for the comments Ismael and Jason.

Regarding the OutOfOrderSequenceException, it is more likely when you
enable idempotence and have acks=1, simply because you have a greater
probability of losing acknowledged data with acks=1, and the error code
indicates that.

The particular scenario is that a broker acknowledges a message with
sequence N before replication happens, and then crashes. Since the message
was acknowledged the producer increments its sequence to N+1. The new
leader would not have received the message, and still expects sequence N
from the producer. When it receives N+1 for the next message, it will
return an OutOfOrderSequenceNumber, correctly indicating some previously
acknowledged messages are missing.

For the idempotent producer alone, the OutOfOrderSequenceException is
returned in the Future and Callback, indicating to the application that
some acknowledged data was lost. However, the application can continue
producing data using the producer instance. The only compatibility issue
here is that the application will now see a new exception for a state which
went previously undetected.
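
As a concrete illustration (not part of the KIP itself), a minimal sketch of an
idempotent producer that treats the exception as a data-loss signal in the callback
and keeps producing; the topic name and bootstrap address are placeholders:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.errors.OutOfOrderSequenceException;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class IdempotentSendSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                for (int i = 0; i < 100; i++) {
                    producer.send(new ProducerRecord<>("my-topic", Integer.toString(i)),
                        (metadata, exception) -> {
                            if (exception instanceof OutOfOrderSequenceException) {
                                // Previously acknowledged data was lost. Log or alert, but the same
                                // (non-transactional) producer instance can keep sending.
                                System.err.println("Acknowledged data lost: " + exception.getMessage());
                            }
                        });
                }
            }
        }
    }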

For a transactional producer, an OutOfOrderSequenceException is fatal and
the application must use a new instance of the producer.

Another point about acks=1 with enable.idempotence=true. What semantics are
we promising here? Essentially we are saying that the default mode would be
'if a message is in the log, it will occur only once, but all acknowledged
messages may not make it to the log'. I don't think that this is a
desirable default guarantee.

I will update the KIP to indicate that with the new default, applications
might get a new 'OutOfOrderSequenceException'.

Thanks,
Apurva

On Wed, Aug 9, 2017 at 9:33 AM, Ismael Juma  wrote:

> Hi Jason,
>
> Thanks for the correction. See inline.
>
> On Wed, Aug 9, 2017 at 5:13 PM, Jason Gustafson 
> wrote:
>
> > Minor correction: the OutOfOrderSequenceException is not fatal for the
> > idempotent producer and it is not necessarily tied to the acks setting
> > (though it is more likely to be thrown with acks=1).
>
>
> Right, it would be worth expanding on the specifics of this. My
> understanding is that common failure scenarios could trigger it.
>
>
> > It is used to signal
> > the user that there was a gap in the delivery of messages. You can hit
> this
> > if there is a pause on the producer and the topic retention kicks in and
> > deletes the last records the producer had written. However, it is
> possible
> > for the user to catch it and simply keep producing (internally the
> producer
> > will generate a new ProducerId).
>
>
> I see, our documentation states that it's fatal in the following example
> and in the `send` method. I had overlooked that this was mentioned in the
> context of transactions. If we were to enable idempotence by default, we'd
> want to flesh out the docs for idempotence without transactions.
>
> try {
>     producer.beginTransaction();
>     for (int i = 0; i < 100; i++)
>         producer.send(new ProducerRecord<>("my-topic", Integer.toString(i), Integer.toString(i)));
>     producer.commitTransaction();
> } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
>     // We can't recover from these exceptions, so our only option is to close the producer and exit.
>     producer.close();
> } catch (KafkaException e) {
>     // For all other exceptions, just abort the transaction and try again.
>     producer.abortTransaction();
> }
> producer.close();
>
> Nevertheless, pre-idempotent-producer code
> > won't be expecting this exception, and that may cause it to break in
> cases
> > where it previously wouldn't. This is probably the biggest risk of the
> > change.
> >
>
> This is a good point and we should include it in the KIP.
>
> Ismael
>


Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

2017-08-09 Thread Apurva Mehta
Thanks for the comments, Ewen. Responses inline.


> 1. Re: the mention of exactly once, this is within a producer session,
> right? And so really only idempotent. Applications still need to take extra
> steps for exactly once if they, e.g., are producing data from some other
> log like a DB txn log.
>
>
Yes, this is really idempotence. What it means that a single acknowledged
send will result in a single copy of the message in the log. Further, a
sequence of sends to a partition will appear in the log in the order they
were sent.

I think it is worth clarifying this in the KIP, and will do so.


> 2.
>
> > Further, the results above show that there is a large improvement in
> throughput and latency when we go from max.in.flight=1 to max.in.flight=2,
> but then there no discernible difference for higher values of this setting.
>
> If in the tests there's no difference with higher values, perhaps leaving
> it alone is better. There are a bunch of other configs we expose and this
> test only evaluates one environment. Without testing things like cross-DC
> traffic, I'd be wary of jumping to the conclusion that max.in.flight > 2
> never makes a difference and that some people aren't already relying on a
> larger default OOTB.
>
>
Are you suggesting we leave the default at 5? I think that is fair.
However, we would need to bound the value of this variable in order to
make idempotence actually work. In particular, if max.inflight = N, then we
need to preserve the record metadata (offset, timestamp, sequence) of the
last N appends per partition in order to be able to de-duplicate.

The proposal is to default this value to 2, but allow any value up to 5.
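
For concreteness, a rough sketch of what the proposed defaults correspond to if an
application were to opt in explicitly today (the retries value is illustrative, and
the exact max.in.flight default is what is being discussed in this thread):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class ProposedDefaultsSketch {
        public static Properties proposedDefaults() {
            Properties props = new Properties();
            props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true"); // idempotent, in order per partition
            props.put(ProducerConfig.ACKS_CONFIG, "all");                // acknowledged data survives leader failover
            props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "2"); // capped so brokers can de-duplicate
            props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE)); // illustrative; see KIP-91
            return props;
        }
    }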

I would also be interested to learn if there are actual cross-dc use cases
which set max.inflight > 1 today since any higher value means that you lose
ordering guarantees, meaning the two DC's are not guaranteed to be exact
replicas of each other.


> 3. The acks=all change is actually unrelated to the title of the KIP and
> orthogonal to all the other changes. It's also the most risky since
> acks=all needs more network round trips. And while I think it makes sense
> to have the more durable default, this seems like it's actually fairly
> likely to break things for some people (even if a minority of people). This
> one seems like a setting change that needs more sensitive handling, e.g.
> both release notes and log notification that the default is going to
> change, followed by the actual change later.
>
>
Not sure I follow. The KIP proposes to make exactly once delivery the
default behavior. As mentioned above, this means that each acknowledged
send should result in exactly one copy of the message in the log. With
acks=1, we can only ever have at-most once delivery, ie. an acknowledged
send could result in 0 copies of the message in the log. Please let me know
if I have missed something.

Thanks,
Apurva


> -Ewen
>
> On Tue, Aug 8, 2017 at 5:23 PM, Apurva Mehta  wrote:
>
> > Hi,
> >
> > I've put together a new KIP which proposes to ship Kafka with its
> strongest
> > delivery guarantees by default.
> >
> > We currently ship with at most once semantics and don't provide any
> > ordering guarantees per partition. The proposal is is to provide exactly
> > once in order delivery per partition by default in the upcoming 1.0.0
> > release.
> >
> > The KIP linked to below also outlines the performance characteristics of
> > the proposed default.
> >
> > The KIP is here:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 185%3A+Make+exactly+once+in+order+delivery+per+partition+
> > the+default+producer+setting
> >
> > Please have a look, I would love your feedback!
> >
> > Thanks,
> > Apurva
> >
>


Re: [DISCUSS] KIP-184 Rename LogCleaner and related classes to LogCompactor

2017-08-09 Thread Pranav Maniar
Thanks Ismael, Jason for the suggestion.
My bad. I should have followed up on mail-list discussion before starting
KIP. Apologies.

I am relatively new, so I do not know if any confusion was reported in the past
due to terminology. Maybe others can chime in.
If the old naming is fine with the majority then no changes will be needed. I
will mark the JIRA as won't-fix and close the KIP!

Ismael, Jason,
There was another suggestion from Guozhang on deprecating and eventually
removing the log.cleaner.enable property altogether and always enabling the log
cleaner if "log.cleanup.policy=compact".
What are your suggestions on this?


Thanks,
Pranav

On Wed, Aug 9, 2017 at 10:27 PM, Jason Gustafson  wrote:

> Yes, as Ismael noted above, I am not fond of this renaming. Keep in mind
> that the LogCleaner does not only handle compaction. It is possible to
> configure a cleanup policy of "compact" and "delete," in which case the
> LogCleaner also handles removal of old segments. Hence the more general
> LogCleaner name is more appropriate in my opinion.
>
> -Jason
>
> On Wed, Aug 9, 2017 at 9:49 AM, Pranav Maniar 
> wrote:
>
> > Thanks Ewen for the suggestions.
> > I have updated KIP-184. Updates done are :
> >
> > 1. If deprecated property is encountered currently, then its value will
> be
> > considered while enabling compactor.
> > 2.  log.compactor.min.compaction.lag.ms updated it to be
> > log.compactor.min.lag.ms ( Other naming suggestions are also welcomed)
> > 3. Removed implementation details from KIP
> >
> > ~Pranav
> >
> > On Wed, Aug 9, 2017 at 11:19 AM, Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >
> >> A simple log message is standard, but the KIP should probably specify
> what
> >> happens when the deprecated config is encountered.
> >>
> >> Other than that, the change LGTM. Other things that might be worth
> >> addressing
> >>
> >> * log.compactor.min.compaction.lag.ms seems a bit redundant with
> >> compactor
> >> and compaction. Not sure if we'd want to tweak the new version.
> >> * The class renaming doesn't even need to be in the KIP as it is an
> >> implementation detail.
> >>
> >> -Ewen
> >>
> >> On Tue, Aug 8, 2017 at 10:17 PM, Pranav Maniar 
> >> wrote:
> >>
> >> > Thanks Guozhang for the suggestion.
> >> >
> >> > For now, I have updated KIP incorporating your suggestion.
> >> > Personally I think implicitly enabling compaction whenever policy is
> >> set to
> >> > compact is more appropriate. Because new users like me will always
> >> assume
> >> > that setting policy to compact will enable compaction.
> >> >
> >> > But having said that, It will be interesting to know, if there are any
> >> > use-cases where user would explicitly want to turn off the compactor.
> >> >
> >> > Thanks,
> >> > Pranav
> >> >
> >> > On Tue, Aug 8, 2017 at 2:20 AM, Guozhang Wang 
> >> wrote:
> >> >
> >> > > Thanks for the KIP proposal,
> >> > >
> >> > > I thought one suggestion before this discussion is to deprecate the
> "
> >> > > log.cleaner.enable" and always turn on compaction for those topics
> >> that
> >> > > have compact policies?
> >> > >
> >> > >
> >> > > Guozhang
> >> > >
> >> > > On Sat, Aug 5, 2017 at 9:36 AM, Pranav Maniar  >
> >> > > wrote:
> >> > >
> >> > > > Hi All,
> >> > > >
> >> > > > Following a discussion on JIRA KAFKA-1944
> >> > > >  . I have
> created
> >> > > > KIP-184
> >> > > >  >> > > > 184%3A+Rename+LogCleaner+and+related+classes+to+LogCompactor>
> >> > > > as
> >> > > > it will require configuration change.
> >> > > >
> >> > > > As per the process I am starting Discussion on mail thread for
> >> KIP-184.
> >> > > >
> >> > > > Renaming of configuration "log.cleaner.enable" is discussed on
> >> > > KAFKA-1944.
> >> > > > But other log.cleaner configuration also seems to be used by
> cleaner
> >> > > only.
> >> > > > So to maintain naming consistency, I have proposed to rename all
> >> these
> >> > > > configuration.
> >> > > >
> >> > > > Please provide your suggestion/views for the same. Thanks !
> >> > > >
> >> > > >
> >> > > > Thanks,
> >> > > > Pranav
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > -- Guozhang
> >> > >
> >> >
> >>
> >
> >
>


Re: [DISCUSS] KIP-91 Provide Intuitive User Timeouts in The Producer

2017-08-09 Thread Sumant Tambe
Responses inline.

> > However, one thing which has not come out of the JIRA discussion is the
> > > actual use cases for batch expiry.
> >
> > There are two usecases I can think of for batch expiry mechanism
> > irrespective of how we try to bound the time (batch.expiry.ms or
> > max.message.delivery.wait.ms). Let's call it X.
> >
> > 1. A real-time app (e.g., periodic healthcheck producer, temperature
> sensor
> > producer) has a soft upper bound on both message delivery and failure
> > notification of message delivery. In both cases, it wants to know. Such
> an
> > app does not close the producer on the first error reported (due to batch
> > expiry) because there's data lined up right behind. It's ok to lose a few
> > samples of temperature measurement (IoT scenario). So it simply drops it
> > and moves on. May be when drop rate is like 70% it would close it. Such
> an
> > app may use acks=0. In this case, X will have some value in single digit
> > minutes. But X=MAX_LONG is not suitable.
> >
>
> I guess my question is: batches would only start expiring if partition(s)
> are unavailable. What would 'moving on' mean for a real time app in this
> case? How would that help?
>
Partition unavailability is one of the cases. The other one is KMM "catch-up"
mode, where the producer's outbound throughput is lower than the consumer's
inbound throughput. This can happen when the KMM consumers consume locally but
the producer produces remotely. In such a case the pipeline latency SLA starts
to break. Batches remain in the accumulator for too long, but there's no
unavailable partition. Because it's KMM, it can't just give up on the batches;
it will shut itself down and alerts will fire.

By "moving on", I mean fire and forget. For a real-time app, in both cases
(a partition is unavailable, or simply just outbound tput is low), it's
important to deliver newer data rather than trying hard to delivery old,
stale data.
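
To make "fire and forget" concrete, here is a minimal sketch of what such a
real-time producer could look like. It is illustrative only: batch.expiry.ms
is the config being proposed in this discussion, not an existing producer
setting, the topic and serializers are made up, and expired batches simply
surface as a TimeoutException in the send callback, which the app counts and
drops instead of closing the producer.

import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.TimeoutException;

public class FireAndForgetSensorProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "0");  // real-time app: don't wait for acknowledgements
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("batch.expiry.ms", "30000");  // hypothetical: the config proposed in this discussion

        final AtomicLong dropped = new AtomicLong();
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 1000; i++) {
                ProducerRecord<String, String> sample =
                        new ProducerRecord<>("temperature", "sensor-1", Double.toString(20 + Math.random()));
                producer.send(sample, (metadata, exception) -> {
                    // An expired batch completes its records with a TimeoutException.
                    // A real-time app drops the stale sample and keeps producing newer data.
                    if (exception instanceof TimeoutException)
                        dropped.incrementAndGet();
                });
            }
        }
        System.out.println("Dropped (expired) samples: " + dropped.get());
    }
}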

>> After expiring one set of batches, the next set
>> would also be expired until the cluster comes back. So what is achieved by
>> expiring the batches at all?
Yes, expiring is important. Even the next series of batches should be expired
if they have crossed batch.expiry.ms. That means the data is stale, and it's a
hint to the middleware not to bother with this stale data but to spend its
resources on newer data further back in the queue. An example of such an app
would be a healthcheck sensor thread that publishes application-specific
metrics periodically. A correct answer that arrives too late is a wrong
answer.


>
> >
> > 2. Today we run KMM in Linkedin as if batch.expiry==MAX_LONG. We expire
> > under the condition: (!muted.contains(tp) && (isMetadataStale ||
> > cluster.leaderFor(tp) == null)) In essence, as long as the partition is
> > making progress (even if it's a trickle), the producer keeps on going.
> > We've other internal systems to detect whether a pipeline is making
> > *sufficient* progress or not. We're not dependent on the producer to tell
> > us that it's not making progress on a certain partition.
> >
> > This is less than ideal though. We would be happy to configure
> > batch.expiry.ms to 1800,000 or so and, upon notification of expiry,
> > restart the process and what not. It can also tell us which specific
> > partitions of a specific topic are falling behind. We achieve a similar
> > effect via alternative mechanisms.
>
>
> I presume you are referring to partitions here? If some partitions are
> unavailable, what could mirror maker do with that knowledge? Presumably
> focus on the partitions which are online. But does that really help?  Today
> the expiry message already contains the partition information, is that
> being used? If not, how would accurate expiry times with something like
> batch.expiry.ms change that fact?
>
You are right, KMM can't do much with the partition information. My point is
this: if we assume for a moment that the producer is the only piece that can
tell us whether a KMM pipeline is making progress or not, would you configure
it to wait forever (batch.expiry=max_long) or to some small value? It's the
latter we prefer. However, we really don't need the producer to tell us,
because we have other infrastructure pieces to tell us about non-progress.

>
>
> >
> > > Also, the KIP document states the
> > > following:
> > >
> > > *The per message timeout is easy to compute - linger.ms
> > > >  + (retries + 1) * request.timeout.ms
> > > > ". *This is false.
> > >
> >
> > > Why is the statement false? Doesn't that provide an accurate upperbound
> > on
> > > the timeout for a produce request today?
> > >
> > The KIP-91 write-up describes the reasons why. Just reiterating the
> reason:
> > "the condition that if the metadata for a partition is known then we do
> not
> > expire its batches even if they are ready".  Do you not agree with the
> > explanation? If not, what part?
> >
> >
> Unless I have missed something, the producer doesn't seem to rely 

Build failed in Jenkins: kafka-trunk-jdk7 #2621

2017-08-09 Thread Apache Jenkins Server
See 


Changes:

[me] MINOR: Standardize logging of Worker-level messages from Tasks and

--
[...truncated 1.97 MB...]
[test listing: org.apache.kafka.common RecordHeadersTest, ClusterTest,
LRUCacheTest, ShellTest, UtilsTest, JavaTest and ChecksumsTest -- all listed
tests PASSED]


Build failed in Jenkins: kafka-trunk-jdk8 #1893

2017-08-09 Thread Apache Jenkins Server
See 


Changes:

[me] MINOR: Standardize logging of Worker-level messages from Tasks and

--
[...truncated 875.49 KB...]
[test listing: kafka.integration AutoOffsetResetTest, TopicMetadataTest,
FetcherTest, UncleanLeaderElectionTest, MinIsrConfigTest and
unit.kafka.server.KafkaApisTest -- all listed tests PASSED except two
UncleanLeaderElectionTest cases marked SKIPPED; listing truncated]

Re: [DISCUSS] KIP-184 Rename LogCleaner and related classes to LogCompactor

2017-08-09 Thread Jason Gustafson
Yes, as Ismael noted above, I am not fond of this renaming. Keep in mind
that the LogCleaner does not only handle compaction. It is possible to
configure a cleanup policy of "compact" and "delete," in which case the
LogCleaner also handles removal of old segments. Hence the more general
LogCleaner name is more appropriate in my opinion.
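
For anyone unfamiliar with the combined policy, it is just a topic-level
config; a minimal AdminClient sketch (the topic name and bootstrap address are
made up for illustration):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class CompactAndDeleteExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Old segments of this topic are compacted and, once past retention, deleted.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-changelog-topic");
            Config cleanupPolicy = new Config(
                    Collections.singleton(new ConfigEntry("cleanup.policy", "compact,delete")));
            admin.alterConfigs(Collections.singletonMap(topic, cleanupPolicy)).all().get();
        }
    }
}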

-Jason

On Wed, Aug 9, 2017 at 9:49 AM, Pranav Maniar  wrote:

> Thanks Ewen for the suggestions.
> I have updated KIP-184. The updates are:
>
> 1. If the deprecated property is encountered, its value will still be
> taken into account when enabling the compactor.
> 2. log.compactor.min.compaction.lag.ms has been updated to
> log.compactor.min.lag.ms (other naming suggestions are also welcome).
> 3. Removed the implementation details from the KIP.
>
> ~Pranav
>
> On Wed, Aug 9, 2017 at 11:19 AM, Ewen Cheslack-Postava 
> wrote:
>
>> A simple log message is standard, but the KIP should probably specify what
>> happens when the deprecated config is encountered.
>>
>> Other than that, the change LGTM. Other things that might be worth
>> addressing
>>
>> * log.compactor.min.compaction.lag.ms seems a bit redundant with
>> compactor
>> and compaction. Not sure if we'd want to tweak the new version.
>> * The class renaming doesn't even need to be in the KIP as it is an
>> implementation detail.
>>
>> -Ewen
>>
>> On Tue, Aug 8, 2017 at 10:17 PM, Pranav Maniar 
>> wrote:
>>
>> > Thanks Guozhang for the suggestion.
>> >
>> > For now, I have updated KIP incorporating your suggestion.
>> > Personally I think implicitly enabling compaction whenever policy is
>> set to
>> > compact is more appropriate. Because new users like me will always
>> assume
>> > that setting policy to compact will enable compaction.
>> >
>> > But having said that, It will be interesting to know, if there are any
>> > use-cases where user would explicitly want to turn off the compactor.
>> >
>> > Thanks,
>> > Pranav
>> >
>> > On Tue, Aug 8, 2017 at 2:20 AM, Guozhang Wang 
>> wrote:
>> >
>> > > Thanks for the KIP proposal,
>> > >
>> > > I thought one suggestion before this discussion is to deprecate the "
>> > > log.cleaner.enable" and always turn on compaction for those topics
>> that
>> > > have compact policies?
>> > >
>> > >
>> > > Guozhang
>> > >
>> > > On Sat, Aug 5, 2017 at 9:36 AM, Pranav Maniar 
>> > > wrote:
>> > >
>> > > > Hi All,
>> > > >
>> > > > Following a discussion on JIRA KAFKA-1944
>> > > >  . I have created
>> > > > KIP-184
>> > > > > > > > 184%3A+Rename+LogCleaner+and+related+classes+to+LogCompactor>
>> > > > as
>> > > > it will require configuration change.
>> > > >
>> > > > As per the process I am starting Discussion on mail thread for
>> KIP-184.
>> > > >
>> > > > Renaming of configuration "log.cleaner.enable" is discussed on
>> > > KAFKA-1944.
>> > > > But other log.cleaner configuration also seems to be used by cleaner
>> > > only.
>> > > > So to maintain naming consistency, I have proposed to rename all
>> these
>> > > > configuration.
>> > > >
>> > > > Please provide your suggestion/views for the same. Thanks !
>> > > >
>> > > >
>> > > > Thanks,
>> > > > Pranav
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > -- Guozhang
>> > >
>> >
>>
>
>


Re: [DISCUSS] KIP-184 Rename LogCleaner and related classes to LogCompactor

2017-08-09 Thread Pranav Maniar
Thanks Ewen for the suggestions.
I have updated KIP-184. The updates are:

1. If the deprecated property is encountered, its value will still be
taken into account when enabling the compactor.
2. log.compactor.min.compaction.lag.ms has been updated to
log.compactor.min.lag.ms (other naming suggestions are also welcome).
3. Removed the implementation details from the KIP.

~Pranav

On Wed, Aug 9, 2017 at 11:19 AM, Ewen Cheslack-Postava 
wrote:

> A simple log message is standard, but the KIP should probably specify what
> happens when the deprecated config is encountered.
>
> Other than that, the change LGTM. Other things that might be worth
> addressing
>
> * log.compactor.min.compaction.lag.ms seems a bit redundant with compactor
> and compaction. Not sure if we'd want to tweak the new version.
> * The class renaming doesn't even need to be in the KIP as it is an
> implementation detail.
>
> -Ewen
>
> On Tue, Aug 8, 2017 at 10:17 PM, Pranav Maniar 
> wrote:
>
> > Thanks Guozhang for the suggestion.
> >
> > For now, I have updated KIP incorporating your suggestion.
> > Personally I think implicitly enabling compaction whenever policy is set
> to
> > compact is more appropriate. Because new users like me will always assume
> > that setting policy to compact will enable compaction.
> >
> > But having said that, It will be interesting to know, if there are any
> > use-cases where user would explicitly want to turn off the compactor.
> >
> > Thanks,
> > Pranav
> >
> > On Tue, Aug 8, 2017 at 2:20 AM, Guozhang Wang 
> wrote:
> >
> > > Thanks for the KIP proposal,
> > >
> > > I thought one suggestion before this discussion is to deprecate the "
> > > log.cleaner.enable" and always turn on compaction for those topics that
> > > have compact policies?
> > >
> > >
> > > Guozhang
> > >
> > > On Sat, Aug 5, 2017 at 9:36 AM, Pranav Maniar 
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > Following a discussion on JIRA KAFKA-1944
> > > >  . I have created
> > > > KIP-184
> > > >  > > > 184%3A+Rename+LogCleaner+and+related+classes+to+LogCompactor>
> > > > as
> > > > it will require configuration change.
> > > >
> > > > As per the process I am starting Discussion on mail thread for
> KIP-184.
> > > >
> > > > Renaming of configuration "log.cleaner.enable" is discussed on
> > > KAFKA-1944.
> > > > But other log.cleaner configuration also seems to be used by cleaner
> > > only.
> > > > So to maintain naming consistency, I have proposed to rename all
> these
> > > > configuration.
> > > >
> > > > Please provide your suggestion/views for the same. Thanks !
> > > >
> > > >
> > > > Thanks,
> > > > Pranav
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>


[GitHub] kafka-site pull request #70: Add learn more verbiage on ctas

2017-08-09 Thread derrickdoo
GitHub user derrickdoo opened a pull request:

https://github.com/apache/kafka-site/pull/70

Add learn more verbiage on ctas

Added 'learn more' to the 3 CTAs at the very top of the homepage

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/derrickdoo/kafka-site home-ctas

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/70.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #70


commit 8f8e080e88e7acbc6d42143b77bac04a4981d51b
Author: Derrick Or 
Date:   2017-08-09T16:44:56Z

add learn more verbiage on ctas




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-186: Increase offsets retention default to 7 days

2017-08-09 Thread Jason Gustafson
+1 on the bump to 7 days. Wanted to mention one minor point. The
OffsetCommit RPC still provides the ability to set the retention time from
the client, but we do not use it in the consumer. Should we consider adding
a consumer config to set this? Given the problems people had with the old
default, such a config would probably have gotten a fair bit of use. Maybe
it's less necessary with the new default, but there may be situations where
you don't want to keep the offsets for too long. For example, the console
consumer commits offsets with a generated group id. We might want to set a
low retention time to keep it from filling the offset cache with garbage
from such groups.
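
If such a consumer config were added, usage might look roughly like the sketch
below. To be clear, the config name in it is made up purely for illustration;
no such consumer setting exists today, only the broker-side
offsets.retention.minutes does.

import java.util.Properties;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class ShortLivedGroupExample {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // A throwaway group id, much like the one the console consumer generates.
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "console-consumer-" + new Random().nextInt(100000));
        // Hypothetical config, named here only for illustration: ask the broker to keep
        // this group's offsets for just one hour instead of the cluster-wide default.
        props.put("offsets.retention.ms", String.valueOf(TimeUnit.HOURS.toMillis(1)));
        return props;
    }
}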

-Jason

On Wed, Aug 9, 2017 at 5:24 AM, Sönke Liebau <
soenke.lie...@opencore.com.invalid> wrote:

> Just had this create issues at a customer as well, +1
>
> On Wed, Aug 9, 2017 at 11:46 AM, Mickael Maison 
> wrote:
>
> > Yes the current default is too short, +1
> >
> > On Wed, Aug 9, 2017 at 8:56 AM, Ismael Juma  wrote:
> > > Thanks for the KIP, +1 from me.
> > >
> > > Ismael
> > >
> > > On Wed, Aug 9, 2017 at 1:24 AM, Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > > wrote:
> > >
> > >> Hi all,
> > >>
> > >> I posted a simple new KIP for a problem we see with a lot of users:
> > >> KIP-186: Increase offsets retention default to 7 days
> > >>
> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> 186%3A+Increase+offsets+retention+default+to+7+days
> > >>
> > >> Note that in addition to the KIP text itself, the linked JIRA already
> > >> existed and has a bunch of discussion on the subject.
> > >>
> > >> -Ewen
> > >>
> >
>
>
>
> --
> Sönke Liebau
> Partner
> Tel. +49 179 7940878
> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>


Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

2017-08-09 Thread Ismael Juma
Hi Jason,

Thanks for the correction. See inline.

On Wed, Aug 9, 2017 at 5:13 PM, Jason Gustafson  wrote:

> Minor correction: the OutOfOrderSequenceException is not fatal for the
> idempotent producer and it is not necessarily tied to the acks setting
> (though it is more likely to be thrown with acks=1).


Right, it would be worth expanding on the specifics of this. My
understanding is that common failure scenarios could trigger it.


> It is used to signal
> the user that there was a gap in the delivery of messages. You can hit this
> if there is a pause on the producer and the topic retention kicks in and
> deletes the last records the producer had written. However, it is possible
> for the user to catch it and simply keep producing (internally the producer
> will generate a new ProducerId).


I see, our documentation states that it's fatal in the following example
and in the `send` method. I had overlooked that this was mentioned in the
context of transactions. If we were to enable idempotence by default, we'd
want to flesh out the docs for idempotence without transactions.

try {
    producer.beginTransaction();
    for (int i = 0; i < 100; i++)
        producer.send(new ProducerRecord<>("my-topic", Integer.toString(i), Integer.toString(i)));
    producer.commitTransaction();
} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
    // We can't recover from these exceptions, so our only option is to close the producer and exit.
    producer.close();
} catch (KafkaException e) {
    // For all other exceptions, just abort the transaction and try again.
    producer.abortTransaction();
}
producer.close();
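
For the idempotent-but-non-transactional case described above, the handling
might look roughly like the following sketch (illustrative only; it assumes
the application can tolerate the potential gap and simply keeps producing):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;

public class IdempotentSendHelper {
    // Send with a callback that treats OutOfOrderSequenceException as a logged gap
    // rather than a reason to close the producer (non-transactional idempotent case).
    public static void sendTolerantOfGaps(KafkaProducer<String, String> producer,
                                          ProducerRecord<String, String> record) {
        producer.send(record, (metadata, exception) -> {
            if (exception instanceof OutOfOrderSequenceException) {
                // Acknowledged data may have been lost (e.g. a long pause plus topic retention).
                // The producer obtains a new ProducerId internally, so an application that can
                // tolerate the gap just records the fact and keeps producing.
                System.err.println("Possible gap in delivered messages: " + exception);
            } else if (exception != null) {
                System.err.println("Send failed: " + exception);
            }
        });
    }
}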

Nevertheless, pre-idempotent-producer code
> won't be expecting this exception, and that may cause it to break in cases
> where it previously wouldn't. This is probably the biggest risk of the
> change.
>

This is a good point and we should include it in the KIP.

Ismael


Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

2017-08-09 Thread Jason Gustafson
Minor correction: the OutOfOrderSequenceException is not fatal for the
idempotent producer and it is not necessarily tied to the acks setting
(though it is more likely to be thrown with acks=1). It is used to signal
the user that there was a gap in the delivery of messages. You can hit this
if there is a pause on the producer and the topic retention kicks in and
deletes the last records the producer had written. However, it is possible
for the user to catch it and simply keep producing (internally the producer
will generate a new ProducerId). Nevertheless, pre-idempotent-producer code
won't be expecting this exception, and that may cause it to break in cases
where it previously wouldn't. This is probably the biggest risk of the
change.

-Jason

On Wed, Aug 9, 2017 at 6:33 AM, Ismael Juma  wrote:

> Thanks for the KIP, Apurva. In general, I think it's a good idea to
> strengthen the guarantees we provide by default. And people who are willing
> to trade correctness for performance can then change the configs to suit
> them. I will comment on the KIP specifics in more detail later, but one
> additional comment inline:
>
> On Wed, Aug 9, 2017 at 7:11 AM, Ewen Cheslack-Postava 
> wrote:
>
> > 3. The acks=all change is actually unrelated to the title of the KIP and
> > orthogonal to all the other changes. It's also the most risky since
> > acks=all needs more network round trips. And while I think it makes sense
> > to have the more durable default, this seems like it's actually fairly
> > likely to break things for some people (even if a minority of people).
> This
> > one seems like a setting change that needs more sensitive handling, e.g.
> > both release notes and log notification that the default is going to
> > change, followed by the actual change later.
> >
>
> The issue is that with acks=1 and idempotence, OutOfOrderSequenceException
> may be thrown, which is a fatal error for the Producer (it needs to be
> closed and restarted). I'll leave it to Apurva to explain this in more
> detail.
>
> I wanted to comment on the "log notification" suggestion. Why do you think
> this is needed since users can just change the config back to `acks=1` (or
> 0)? We haven't done this in the past when changing defaults, so it would be
> good to understand it. Given that the next release is 1.0.0, I think it
> would be OK to just change it and advertise it well. Logging warnings for
> deprecated configs makes sense because:
>
> 1. The config will go away and there may not be an exact replacement, so
> good to give some time for users to transition
> 2. Users should not be using the config, so it's OK to spam their logs
>
> Neither of those is true when we change defaults. Having said that, Git
> does what you are suggesting and I agree that the impact can be negative if
> people don't read the upgrade notes. Not sure what's the best way to solve
> that.
>
> Ismael
>


[GitHub] kafka pull request #3639: MINOR: Standardize logging of Worker-level message...

2017-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3639




Kafka Producer/Consumer APIs vs Apache Camel

2017-08-09 Thread mayank rathi
Hello All,

Are there any advantages of using Kafka Producer/Consumer APIs over Apache
Camel?

Thanks!!



[jira] [Resolved] (KAFKA-2360) The kafka-consumer-perf-test.sh script help information print useless parameters.

2017-08-09 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-2360.

   Resolution: Fixed
Fix Version/s: 1.0.0

> The kafka-consumer-perf-test.sh script help information print useless 
> parameters.
> -
>
> Key: KAFKA-2360
> URL: https://issues.apache.org/jira/browse/KAFKA-2360
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.8.2.1
> Environment: Linux
>Reporter: Bo Wang
>Assignee: huxihx
>Priority: Minor
>  Labels: newbie
> Fix For: 1.0.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Run kafka-consumer-perf-test.sh --help to show help information,  but found 3 
> parameters useless : 
> --batch-size and --batch-size --messages
> Those are producer parameters.
> bin]# ./kafka-consumer-perf-test.sh --help
> Missing required argument "[topic]"
> Option               Description
> ------               -----------
> --batch-size         Number of messages to write in a single batch. (default: 200)
> --compression-codec  supported codec: NoCompressionCodec as 0, GZIPCompressionCodec as 1,
>                      SnappyCompressionCodec as 2, LZ4CompressionCodec as 3 (default: 0)
> --date-format        The date format to use for formatting the time field. See
>                      java.text.SimpleDateFormat for options. (default: -MM-dd HH:mm:ss:SSS)
> --fetch-size         The amount of data to fetch in a single request. (default: 1048576)
> --messages           The number of messages to send or consume (default: 9223372036854775807)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-0.11.0-jdk7 #265

2017-08-09 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] MINOR: streams memory management docs

--
[...truncated 2.42 MB...]

[test listing: org.apache.kafka.streams.state.internals
ChangeLoggingKeyValueBytesStoreTest and RocksDBSegmentedBytesStoreTest, plus
org.apache.kafka.streams.StreamsConfigTest -- all listed tests PASSED; listing
truncated]

Re: [DISCUSS] KAFKA-4930 & KAFKA 4938 - Treatment of name parameter in create connector requests

2017-08-09 Thread Sönke Liebau
Could someone have a look at the PR for KAFKA-4930 if they get the chance
(not necessarily you Gwen, just bumping in general)? I've updated it
according to the latest comments a little while ago and would like to get
this done, before I forget what I did in case more changes are necessary :)

Thanks!

Jira: https://issues.apache.org/jira/browse/KAFKA-4930
PR: https://github.com/apache/kafka/pull/2755
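
For readers who haven't followed the JIRA, the validation discussed in the
quoted history below has roughly the following shape. This is an illustrative
sketch only -- it is not the code from the PR, and the character blacklist
shown is a placeholder rather than the list actually proposed:

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;

public class ConnectorNameValidator implements ConfigDef.Validator {
    // Placeholder blacklist for illustration; the real set of illegal characters may differ.
    private static final String ILLEGAL_CHARS = "/\\?%#";

    @Override
    public void ensureValid(String name, Object value) {
        String connectorName = (String) value;
        if (connectorName == null || connectorName.trim().isEmpty())
            throw new ConfigException(name, value, "Connector name must not be null or empty");
        for (char c : connectorName.toCharArray()) {
            if (ILLEGAL_CHARS.indexOf(c) >= 0)
                throw new ConfigException(name, value, "Connector name contains illegal character '" + c + "'");
        }
    }
}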

On Thu, Jul 6, 2017 at 8:19 PM, Gwen Shapira  wrote:

> This sounds great. I'll try to review later today :)
>
> On Thu, Jul 6, 2017 at 12:35 AM Sönke Liebau
>  wrote:
>
> > I've updated the pull request to behave as follows:
> >  - reject create requests that contain no "name" element with a
> > BadRequestException
> >  - reject name that are empty or contain illegal characters with a
> > ConfigException
> >  - leave current logic around when to copy the name from the create
> request
> > to the config element intact
> >  - added unit tests for the validator to check that illegal characters
> are
> > correctly identified
> >
> > The list of illegal characters is the result of some quick testing I did,
> > all of the characters in the list currently cause issues when used in a
> > connector name (similar to KAFKA-4827), so this should not break anything
> > that anybody relies on.
> > I think we may want to start a larger discussion around connector names,
> > allowed characters, max length, ..  to come up with an airtight set of
> > rules that we can then enforce, I am sure this is currently not perfect
> as
> > is.
> >
> > Best regards,
> > Sönke
> >
> > On Wed, Jul 5, 2017 at 9:31 AM, Sönke Liebau  >
> > wrote:
> >
> > > Hi,
> > >
> > > regarding "breaking existing functionality" .. yes...that was me
> getting
> > > confused about intended and existing functionality :)
> > > You are right, this won't break anything that is currently working.
> > >
> > > I'll leave placement of "name" parameter as is and open a new issue to
> > > clarify this later on.
> > >
> > > Kind regards,
> > > Sönke
> > >
> > > On Wed, Jul 5, 2017 at 5:41 AM, Gwen Shapira 
> wrote:
> > >
> > >> Hey,
> > >>
> > >> Nice research and summary.
> > >>
> > >> Regarding the ability to have a "nameless" connector - I'm pretty sure
> > we
> > >> never intended to allow that.
> > >> I'm confused about breaking something that currently works though -
> > since
> > >> we get NPE, how will giving more intentional exceptions break
> anything?
> > >>
> > >> Regarding the placing of the name - inside or outside the config. It
> > looks
> > >> messy and I'm as confused as you are. I think Konstantine had some
> ideas
> > >> how this should be resolved. I hope he responds, but I think that for
> > your
> > >> PR, just accept current mess as given...
> > >>
> > >> Gwen
> > >>
> > >> On Tue, Jul 4, 2017 at 3:28 AM Sönke Liebau
> > >>  wrote:
> > >>
> > >> > While working on KAFKA-4930 and KAFKA-4938 I came across some sort
> of
> > >> > fundamental questions about the rest api for creating connectors in
> > >> Kafka
> > >> > Connect that I'd like to put up for discussion.
> > >> >
> > >> > Currently requests that do not contain a "name" element on the top
> > level
> > >> > are not accepted by the API, but that is due to a
> NullPointerException
> > >> [1]
> > >> > so not entirely intentional. Previous (and current if the lines
> > causing
> > >> the
> > >> > Exception are removed) functionality was to create a connector named
> > >> "null"
> > >> > if that parameter was missing. I am not sure if this is a good
> thing,
> > as
> > >> > for example that connector will be overwritten every time a new
> > request
> > >> > without a name is sent, as opposed to the expected warning that a
> > >> connector
> > >> > of that name already exists.
> > >> >
> > >> > I would propose to reject api calls without a name provided on the
> top
> > >> > level, but this might break requests that currently work, so should
> > >> > probably be mentioned in the release notes.
> > >> >
> > >> > 
> > >> >
> > >> > Additionally, the "name" parameter is also copied into the "config"
> > >> > sub-element of the connector request - unless a name parameter was
> > >> provided
> > >> > there in the original request[2].
> > >> >
> > >> > So this:
> > >> >
> > >> > {
> > >> >   "name": "connectorname",
> > >> >   "config": {
> > >> > "connector.class":
> > >> > "org.apache.kafka.connect.tools.MockSourceConnector",
> > >> > "tasks.max": "1",
> > >> > "topics": "test-topic"
> > >> >   }
> > >> > }
> > >> >
> > >> > would become this:
> > >> > {
> > >> >   "name": "connectorname",
> > >> >   "config": {
> > >> > "name": "connectorname",
> > >> > "connector.class":
> > >> > "org.apache.kafka.connect.tools.MockSourceConnector",
> > >> > "tasks.max": "1",
> > >> > "topics": "test-topic"
> > >> >   }
> > >> > }
> > >> >
> 

Re: [DISCUSS] KIP-179: Change ReassignPartitionsCommand to use AdminClient

2017-08-09 Thread Tom Bentley
Hi Dong and Jun,

Thanks again for your input in this discussion and on KIP-113. It's
difficult that discussion is split between this thread and the one for
KIP-113, but I'll try to respond on this thread to questions asked on this
thread.

It seems there is some consensus that the alterTopic() API is the wrong
thing, and it would make more sense to separate the different kinds of
alteration into separate APIs. It seems to me there is then a choice.

1. Have separate alterPartitionCount(), alterReplicationFactor() and
reassignPartitions() methods. This would most closely match the facilities
currently offered by kafka-alter-topics and kafka-reassign-partitions.
2. Just provide a reassignPartitions() method and infer from the shape of
the data passed to that that a change in replication factor and/or
partition count is required, as suggested by Dong.

The choice we make here is also relevant to KIP-113 and KIP-178. By
choosing (1) we can change the replication factor or partition count
without providing an assignment and therefore are necessarily requiring the
controller to make a decision for us about which broker (and, for 113,
which log directory and thus which disk) the replica should reside.
There is then the matter of what criteria the controller should use to make
that decision (a decision is also required on topic creation of course, but
I'll try not to go there right now).

Choosing (2), on the other hand, forces us to make an assignment, and
currently the AdminClient doesn't provide the APIs necessary to make a very
informed decision. To do the job properly we'd need APIs to enumerate the
permissible log directories on each broker, know the current disk usage
etc. These just don't exist today, and I think it would be a lot of work to
bite off to specify them. We've already got three KIPs on the go.

So, for the moment, I think we have to choose 1, for pragmatic reasons.
There is nothing to stop us deprecating alterPartitionCount() and
alterReplicationFactor() and moving to (2) at a later date.
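
To make the shape of option 1 a bit more concrete, the signatures might look
something like the sketch below. This is purely hypothetical -- none of these
methods exist in AdminClient today, the result type is simplified to
KafkaFuture<Void>, and the real KIP may choose different names and option
objects:

import java.util.List;
import java.util.Map;
import org.apache.kafka.common.KafkaFuture;
import org.apache.kafka.common.TopicPartition;

public interface TopicAlterationSketch {
    // Change partition counts; the controller decides where the new partitions live.
    KafkaFuture<Void> alterPartitionCount(Map<String, Integer> newPartitionCounts);

    // Change replication factors; the controller decides which brokers host the new replicas.
    KafkaFuture<Void> alterReplicationFactor(Map<String, Short> newReplicationFactors);

    // Explicit reassignment: for each partition, the full target list of broker ids.
    KafkaFuture<Void> reassignPartitions(Map<TopicPartition, List<Integer>> targetAssignment);
}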

To answer a specific question of Jun's:

3. It's not very clear to me what status_time in ReplicaStatusResponse
> means.


It corresponds to the ReplicaStatus.statusTime, and the idea was it was the
epoch offset at which the controller's notion of the lag was current (which
I think is the epoch offset of the last FetchRequest from the follower). At
one point I was considering some other ways of determining progress and
completion, and TBH I think it was more pertinent to those rejected
alternatives.

I've already made some changes to KIP-179. As suggested in the thread for
KIP-113, I will be changing some more, since I think we can use DescribeDirsRequest
instead of ReplicaStatusResponse to implement the --progress option.

Cheers,

Tom



On 9 August 2017 at 02:28, Jun Rao  wrote:

> Hi, Tom,
>
> Thanks for the KIP. A few minor comments below.
>
> 1. Implementation wise, the broker handles adding partitions differently
> from changing replica assignment. For the former, we directly update the
> topic path in ZK with the new partitions. For the latter, we write the new
> partition reassignment in the partition reassignment path. Changing the
> replication factor is handled in the same way as changing replica
> assignment. So, it would be useful to document how the broker handles these
> different cases accordingly. I think it's simpler to just allow one change
> (partition, replication factor, or replica assignment) in a request.
>
> 2. Currently, we only allow adding partitions. We probably want to document
> the restriction in the api.
>
> 3. It's not very clear to me what status_time in ReplicaStatusResponse
> means.
>
> Jun
>
>
>
> On Fri, Aug 4, 2017 at 10:04 AM, Dong Lin  wrote:
>
> > Hey Tom,
> >
> > Thanks for your reply. Here are my thoughts:
> >
> > 1) I think the DescribeDirsResponse can be used by AdminClient to query
> the
> > lag of follower replica as well. Here is how it works:
> >
> > - AdminClient sends DescribeDirsRequest to both the leader and the
> follower
> > of the partition.
> > - DescribeDirsResponse from both leader and follower shows the LEO of the
> > leader replica and the follower replica.
> > - Lag of the follower replica is the difference in the LEO between leader
> > and follower.
> >
> > In comparison to ReplicaStatusRequest, DescribeDirsRequest needs to be
> send
> > to each replica of the partition whereas ReplicaStatusRequest only needs
> to
> > be sent to the leader of the partition. It doesn't seem to make much
> > difference though. In practice we probably want to query the replica lag
> of
> > many partitions and AminClient needs to send exactly one request to each
> > broker with either solution. Does this make sense?
> >
> >
> > 2) KIP-179 proposes to add the following AdminClient API:
> >
> > alterTopics(Collection alteredTopics, AlterTopicsOptions
> > options)
> >
> > Where AlteredTopic includes the following fields
> >
> > 

[jira] [Created] (KAFKA-5718) Better document what LogAppendTime means

2017-08-09 Thread Dustin Cote (JIRA)
Dustin Cote created KAFKA-5718:
--

 Summary: Better document what LogAppendTime means
 Key: KAFKA-5718
 URL: https://issues.apache.org/jira/browse/KAFKA-5718
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Affects Versions: 0.11.0.0
Reporter: Dustin Cote
Priority: Trivial


There isn't a good description of LogAppendTime in the documentation. It would 
be nice to add this in somewhere to say something like:

LogAppendTime is some time between when the partition leader receives the
request and when it writes it to its local log.

There are two important distinctions that trip people up:
1) This timestamp is not when the consumer could have first consumed the 
message. This instead requires min.insync.replicas to have been satisfied.
2) This is not precisely when the leader wrote to its log; there can be delays
along the path between receiving the request and writing to the log.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-08-09 Thread Tom Bentley
Hi Dong and Jun,

Thanks for your responses!

Jun's interpretation of how AlterTopicsRequest could be sent to any broker
is indeed what I meant. Since the data has to get persisted in ZK anyway,
it doesn't really matter whether we send it to the controller (it will still
have to write it to the znode). And while we still support --zookeeper the
controller will have to remain a listener to that znode anyway. Requiring
the AlterTopicsRequest to be sent to the controller might make sense if we
can foresee some way to take ZK out of the equation in the future (a la
KIP-183). But I think we have to keep ZK in the picture to make it
resilient, and therefore I see no value in requiring the receiver of the
AlterTopicsRequest to be the controller.

I will have a go at putting together a "unified API" (for reassigning partitions
between brokers and to particular log directories), so we have something
concrete to discuss, though we may well conclude separate APIs make more
sense.

Finally about measuring progress, Dong said:

I think if the slight difference in the accuracy between
> the two approaches does not make a difference to the intended use-case of
> this API


Lacking data to evaluate the "if", I guess we could go with
DescribeDirsResponse and change it in a future KIP if it turned out to be
inadequate. But if anyone is able to give insight into what the difference
is, that would be better.
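
For what it's worth, the client-side arithmetic being compared is tiny either
way; a sketch of the DescribeDirsResponse-based calculation (assuming the
leader's and follower's log end offsets have already been collected, which is
itself part of the KIP-113 proposal rather than an existing AdminClient API):

final class ReplicaLagSketch {
    private ReplicaLagSketch() { }

    // Approximate follower lag from two separately collected log end offsets (LEOs).
    static long approximateLag(long leaderLogEndOffset, long followerLogEndOffset) {
        // The two responses are collected at slightly different times, so under continuous
        // produce traffic the raw difference can be non-zero even for a caught-up follower;
        // clamp at zero so the reported lag is never negative.
        return Math.max(0L, leaderLogEndOffset - followerLogEndOffset);
    }
}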

Thanks again for the feedback.


On 9 August 2017 at 02:26, Dong Lin  wrote:

> Hey Jun,
>
> Thanks for the comment!
>
> Yes, it should work. The tool can send request to any broker and broker can
> just write the reassignment znode. My previous intuition is that it may be
> better to only send this request to controller. But I don't have good
> reasons for this restriction.
>
> My intuition is that we can keep them separate as well. Becket and I have
> discussed this both offline and in https://github.com/apache/
> kafka/pull/3621.
> Currently I don't have a strong opinion on this and I am open to using only
> one API to do both if someone can come up with a reasonable API signature
> for this method. For now I have added the method alterReplicaDir() in
> KafkaAdminClient instead of the AdminClient interface so that the
> reassignment script can use this method without concluding what the API
> would look like in AdminClient in the future.
>
> Regarding DescribeDirsResponse, I think it is probably OK to have slightly
> more lag. The script can calculate the lag of the follower replica as
> Math.max(0, leaderLEO - followerLEO). I agree that it will be slightly less
> accurate than the current approach in KIP-179. But even with the current
> approach in KIP-179, the result provided by the script is an approximation
> anyway, since there is delay from the time that leader returns response to
> the time that the script collects response from all brokers and prints
> result to user. I think if the slight difference in the accuracy between
> the two approaches does not make a difference to the intended use-case of
> this API, then we probably want to re-use the exiting request/response to
> keep the protocol simple.
>
> Thanks,
> Dong
>
>
>
>
>
> On Tue, Aug 8, 2017 at 5:56 PM, Jun Rao  wrote:
>
> > Hi, Dong,
> >
> > I think Tom was suggesting to have the AlterTopicsRequest sent to any
> > broker, which just writes the reassignment json to ZK. The controller
> will
> > pick up the reassignment and act on it as usual. This should work, right?
> >
> > Having a separate AlterTopicsRequest and AlterReplicaDirRequest seems
> > simpler to me. The former is handled by the controller and the latter is
> > handled by the affected broker. They don't always have to be done
> together.
> > Merging the two into a single request probably will make both the api and
> > the implementation a bit more complicated. If we do keep the two separate
> > requests, it seems that we should just add AlterReplicaDirRequest to the
> > AdminClient interface?
> >
> > Now, regarding DescribeDirsResponse. I agree that it can be used for the
> > status reporting in KIP-179 as well. However, it seems that reporting the
> > log end offset of each replica may not be easy to use. The log end offset
> > will be returned from different brokers in slightly different time. If
> > there is continuous producing traffic, the difference in log end offset
> > between the leader and the follower could be larger than 0 even if the
> > follower has fully caught up. I am wondering if it's better to instead
> > return the lag in offset per replica. This way, the status can probably
> be
> > reported more reliably.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Aug 8, 2017 at 11:23 AM, Dong Lin  wrote:
> >
> > > Hey Tom,
> > >
> > > Thanks for the quick reply. Please see my comment inline.
> > >
> > > On Tue, Aug 8, 2017 at 11:06 AM, Tom Bentley 
> > > wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > Replies 

Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

2017-08-09 Thread Ismael Juma
Thanks for the KIP, Apurva. In general, I think it's a good idea to
strengthen the guarantees we provide by default. And people who are willing
to trade correctness for performance can then change the configs to suit
them. I will comment on the KIP specifics in more detail later, but one
additional comment inline:

On Wed, Aug 9, 2017 at 7:11 AM, Ewen Cheslack-Postava 
wrote:

> 3. The acks=all change is actually unrelated to the title of the KIP and
> orthogonal to all the other changes. It's also the most risky since
> acks=all needs more network round trips. And while I think it makes sense
> to have the more durable default, this seems like it's actually fairly
> likely to break things for some people (even if a minority of people). This
> one seems like a setting change that needs more sensitive handling, e.g.
> both release notes and log notification that the default is going to
> change, followed by the actual change later.
>

The issue is that with acks=1 and idempotence, OutOfOrderSequenceException
may be thrown, which is a fatal error for the Producer (it needs to be
closed and restarted). I'll leave it to Apurva to explain this in more
detail.
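
For readers skimming the thread, the knobs in question are just producer
configs. The sketch below shows the direction the KIP argues for (current 0.11
defaults are acks=1 and enable.idempotence=false); it is illustrative rather
than a statement of the final proposed values:

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class StrongDefaultsExample {
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // What the KIP argues the out-of-the-box experience should look like;
        // today a user has to opt in to these settings explicitly.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        return props;
    }
}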

I wanted to comment on the "log notification" suggestion. Why do you think
this is needed since users can just change the config back to `acks=1` (or
0)? We haven't done this in the past when changing defaults, so it would be
good to understand it. Given that the next release is 1.0.0, I think it
would be OK to just change it and advertise it well. Logging warnings for
deprecated configs makes sense because:

1. The config will go away and there may not be an exact replacement, so
good to give some time for users to transition
2. Users should not be using the config, so it's OK to spam their logs

Neither of those is true when we change defaults. Having said that, Git
does what you are suggesting and I agree that the impact can be negative if
people don't read the upgrade notes. Not sure what's the best way to solve
that.

Ismael


[GitHub] kafka pull request #3633: MINOR: streams memory management docs

2017-08-09 Thread dguy
Github user dguy closed the pull request at:

https://github.com/apache/kafka/pull/3633




[GitHub] kafka pull request #3650: KAFKA-5717: InMemoryKeyValueStore should delete ke...

2017-08-09 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/3650

KAFKA-5717: InMemoryKeyValueStore should delete keys with null values 
during restore

Fixed a bug in the InMemoryKeyValueStore restoration where a key with a
`null` value was written into the map rather than being deleted.
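
Presumably the fix boils down to treating a null value as a tombstone during
restore; a sketch of that shape (not the actual patch, and the store internals
here are simplified):

import java.util.NavigableMap;
import java.util.TreeMap;
import org.apache.kafka.common.utils.Bytes;

class InMemoryRestoreSketch {
    private final NavigableMap<Bytes, byte[]> map = new TreeMap<>();

    // Restoration callback: a null value is a tombstone and must delete the key
    // instead of being stored in the map.
    void restore(byte[] key, byte[] value) {
        if (value == null)
            map.remove(Bytes.wrap(key));
        else
            map.put(Bytes.wrap(key), value);
    }
}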

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-5717

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3650.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3650








Re: [DISCUSS] KIP-182: Reduce Streams DSL overloads and allow easier use of custom storage engines

2017-08-09 Thread Damian Guy
On Tue, 8 Aug 2017 at 20:11 Guozhang Wang  wrote:

> Damian,
>
> Thanks for the proposal, I had a few comments on the APIs:
>
> 1. Printed#withFile seems not needed, as users should always spec if it is
> to sysOut or to File at the beginning. In addition as a second thought, I
> think serdes are not useful for prints anyways since we assume `toString`
> is provided except for byte arrays, in which we will special handle it.
>
>
+1


> Another comment about Printed in general is it differs with other options
> that it is a required option than optional one, since it includes toSysOut
> / toFile specs; what are the pros and cons for including these two in the
> option and hence make it a required option than leaving them at the API
> layer and make Printed as optional for mapper / label only?
>
>
It isn't required as we will still have the no-arg print() which will just
go to sysout as it does now.


>
> 2.1 KStream#through / to
>
> We should have an overloaded function without Produced?
>

Yes - we already have those so they are not part of the KIP, i.e.,
through(topic)


>
> 2.2 KStream#groupBy / groupByKey
>
> We should have an overloaded function without Serialized?
>

Yes, as above

>
> 2.3 KGroupedStream#count / reduce / aggregate
>
> We should have an overloaded function without Materialized?
>

As above

>
> 2.4 KStream#join
>
> We should have an overloaded function without Joined?
>

as above

>
>
> 2.5 Each of KTable's operators:
>
> We should have an overloaded function without Produced / Serialized /
> Materialized?
>
>
as above


>
>
> 3.1 Produced: the static functions have overlaps, which seems unnecessary.
> I'd suggest just having the following three static functions with another
> three similar member functions:
>
> public static <K, V> Produced<K, V> withKeySerde(final Serde<K> keySerde)
>
> public static <K, V> Produced<K, V> withValueSerde(final Serde<V>
> valueSerde)
>
> public static <K, V> Produced<K, V> withStreamPartitioner(final
> StreamPartitioner<K, V> partitioner)
>
> The key idea is that by using the same function name for the static
> constructors and the member functions, users do not need to remember the
> differences but can call these functions in any order they want,
> and later calls on the same spec will win over earlier calls.
>
>
That would be great if Java supported it, but it doesn't. You can't have
static and member functions with the same signature in the same class.
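
To illustrate the constraint with a purely hypothetical sketch (not the actual
KIP-182 class; org.apache.kafka.common.serialization.Serde assumed on the
classpath): the static factory has to take a different name from the chained
instance setters, because the erased signatures would otherwise collide.

    // Hypothetical sketch only: shows why a static and an instance
    // withKeySerde(Serde) cannot coexist in the same class.
    public class ProducedSketch<K, V> {
        private Serde<K> keySerde;
        private Serde<V> valueSerde;

        public static <K, V> ProducedSketch<K, V> with(final Serde<K> keySerde,
                                                       final Serde<V> valueSerde) {
            final ProducedSketch<K, V> produced = new ProducedSketch<>();
            return produced.withKeySerde(keySerde).withValueSerde(valueSerde);
        }

        // Declaring `public static <K, V> ProducedSketch<K, V> withKeySerde(...)`
        // alongside this method would be a compile error: same erasure.
        public ProducedSketch<K, V> withKeySerde(final Serde<K> keySerde) {
            this.keySerde = keySerde;
            return this;
        }

        public ProducedSketch<K, V> withValueSerde(final Serde<V> valueSerde) {
            this.valueSerde = valueSerde;
            return this;
        }
    }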


>
> 3.2 Serialized: similarly
>
> public static <K, V> Serialized<K, V> withKeySerde(final Serde<K> keySerde)
>
> public static <K, V> Serialized<K, V> withValueSerde(final Serde<V>
> valueSerde)
>
> public Serialized<K, V> withKeySerde(final Serde<K> keySerde)
>
> public Serialized<K, V> withValueSerde(final Serde<V> valueSerde)
>

as above


>
> Also it has a final Serde otherValueSerde in one of its static
> constructors, is that intentional?
>

Nope: thanks.

>
> 3.3. Joined: similarly, keep the static constructor signatures the same as
> its corresponding member fields.
>
>
As above


> 3.4 Materialized: it is a bit special, and I think we can keep its static
> constructors with only two `as` as they are today.
>
>
> 4. Are there any modifications to StateStoreSupplier? Is it replaced by
> BytesStoreSupplier? Seems some more description is lacking here. Also in
>
>
No modifications to StateStoreSupplier. It is superseded by
BytesStoreSupplier.



> public static <K, V, S extends StateStore> Materialized<K, V, S> as(final
> StateStoreSupplier supplier)
>
> Is the parameter of type BytesStoreSupplier?
>

Yep - thanks


>
>
>
>
> Guozhang
>
>
> On Thu, Jul 27, 2017 at 5:26 AM, Damian Guy  wrote:
>
> > Updated link:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+
> > use+of+custom+storage+engines
> >
> > Thanks,
> > Damian
> >
> > On Thu, 27 Jul 2017 at 13:09 Damian Guy  wrote:
> >
> > > Hi,
> > >
> > > I've put together a KIP to make some changes to the KafkaStreams DSL
> that
> > > will hopefully allow us to:
> > > 1) reduce the explosion of overloads
> > > 2) add new features without having to continue adding more overloads
> > > 3) provide simpler ways for people to use custom storage engines and
> wrap
> > > them with logging, caching etc if desired
> > > 4) enable per-operator caching rather than global caching without
> having
> > > to resort to supplying a StateStoreSupplier when you just want to turn
> > > caching off.
> > >
> > > The KIP is here:
> > > https://cwiki.apache.org/confluence/pages/viewpage.
> > action?pageId=73631309
> > >
> > > Thanks,
> > > Damian
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-183 - Change PreferredReplicaLeaderElectionCommand to use AdminClient

2017-08-09 Thread Ismael Juma
On Wed, Aug 9, 2017 at 11:40 AM, Tom Bentley  wrote:
>
> There are responses with detailed error messages as well as the codes,
> CreateTopicsResponse, {Describe|Alter}ConfigsResponse, and the responses
> for managing ACLs for instance. To be honest, I assumed including a message
> was the norm. In this case, however, I don't think there is any extra
> detail to include beyond the error itself, so I've removed it from the KIP.
>

We started sending back actual error messages when we introduced the Create
Topic Policy as the errors are custom and much more helpful if a string can
be sent back.

In general, I think it would be better if we had an optional error message
for all request types as error codes alone sometimes result in people
having to check the broker logs. Examples that come to mind are Group
Coordinator Not Available and Invalid Request. For the latter, we want to
include what was invalid and it's not practical to have an error code for
every possible validation error (this becomes more obvious once we start
adding admin protocol APIs).

Ismael


Re: [DISCUSS] KIP-186: Increase offsets retention default to 7 days

2017-08-09 Thread Sönke Liebau
Just had this create issues at a customer as well, +1

On Wed, Aug 9, 2017 at 11:46 AM, Mickael Maison 
wrote:

> Yes the current default is too short, +1
>
> On Wed, Aug 9, 2017 at 8:56 AM, Ismael Juma  wrote:
> > Thanks for the KIP, +1 from me.
> >
> > Ismael
> >
> > On Wed, Aug 9, 2017 at 1:24 AM, Ewen Cheslack-Postava  >
> > wrote:
> >
> >> Hi all,
> >>
> >> I posted a simple new KIP for a problem we see with a lot of users:
> >> KIP-186: Increase offsets retention default to 7 days
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> 186%3A+Increase+offsets+retention+default+to+7+days
> >>
> >> Note that in addition to the KIP text itself, the linked JIRA already
> >> existed and has a bunch of discussion on the subject.
> >>
> >> -Ewen
> >>
>



-- 
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany


Re: [DISCUSS] KIP-174 - Deprecate and remove internal converter configs in WorkerConfig

2017-08-09 Thread UMESH CHAUDHARY
Thanks Ewen,
I just edited the KIP to reflect the changes.

Regards,
Umesh

On Wed, 9 Aug 2017 at 11:00 Ewen Cheslack-Postava  wrote:

> Great, looking good. I'd probably be a bit more concrete about the
> Proposed Changes (e.g., "will log a warning if the config is specified"
> and "since the JsonConverter is the default, the configs will be removed
> immediately from the example worker configuration files").
>
> Other than that this LGTM and I'll be happy to get rid of those settings!
>
> -Ewen
>
> On Tue, Aug 8, 2017 at 2:54 AM, UMESH CHAUDHARY 
> wrote:
>
>> Hi Ewen,
>> Sorry, I am bit late in responding this.
>>
>> Thanks for your inputs and I've updated the KIP by adding more details to
>> it.
>>
>> Regards,
>> Umesh
>>
>> On Mon, 31 Jul 2017 at 21:51 Ewen Cheslack-Postava 
>> wrote:
>>
>>> On Sun, Jul 30, 2017 at 10:21 PM, UMESH CHAUDHARY 
>>> wrote:
>>>
 Hi Ewen,
 Thanks for your comments.

 1) Yes, there are some test and Java classes which refer to these configs,
 so I will include them as well in the "public interface" section of the KIP. What
 should be our approach to deal with the classes and tests which use these
 configs: we need to change them to use JsonConverter when we plan for
 removal of these configs, right?

>>>
>>> I actually meant the references in config/connect-standalone.properties
>>> and config/connect-distributed.properties
>>>
>>>
 2) I believe we can target the deprecation in the 1.0.0 release, as it is
 planned for October 2017, and then removal in the next major release. Let me
 know your thoughts, as we don't have any information about the next major release
 (after 1.0.0) yet.

>>>
>>> That sounds fine. Tough to say at this point what our approach to major
>>> version bumps will be since the approach to version numbering is changing a
>>> bit.
>>>
>>>
 3) That's a good point, and the mentioned JIRA can help us validate the
 usage of any other converters. I will list this in the KIP.

 Let me know if you have some additional thoughts on this.

 Regards,
 Umesh



 On Wed, 26 Jul 2017 at 09:27 Ewen Cheslack-Postava 
 wrote:

> Umesh,
>
> Thanks for the KIP. Straightforward and I think it's a good change.
> Unfortunately it is hard to tell how many people it would affect since
> we
> can't tell how many people have adjusted that config, but I think this
> is
> the right thing to do long term.
>
> A couple of quick things that might be helpful to refine:
>
> * Note that there are also some references in the example configs that
> we
> should remove.
> * It's nice to be explicit about when the removal is planned. This
> lets us
> set expectations with users for timeframe (especially now that we have
> time
> based releases), allows us to give info about the removal timeframe in
> log
> error messages, and lets us file a JIRA against that release so we
> remember
> to follow up. Given the update to 1.0.0 for the next release, we may
> also
> need to adjust how we deal with deprecations/removal if we don't want
> to
> have to wait all the way until 2.0 to remove (though it is unclear how
> exactly we will be handling version bumps from now on).
> * Migration path -- I think this is the major missing gap in the KIP.
> Do we
> need a migration path? If not, presumably it is because people aren't
> using
> any other converters in practice. Do we have some way of validating
> this (
> https://issues.apache.org/jira/browse/KAFKA-3988 might be pretty
> convincing
> evidence)? If there are some users using other converters, how would
> they
> migrate to newer versions which would no longer support that?
>
> -Ewen
>
>
> On Fri, Jul 14, 2017 at 2:37 AM, UMESH CHAUDHARY 
> wrote:
>
> > Hi there,
> > Resending as probably missed earlier to grab your attention.
> >
> > Regards,
> > Umesh
> >
> > -- Forwarded message -
> > From: UMESH CHAUDHARY 
> > Date: Mon, 3 Jul 2017 at 11:04
> > Subject: [DISCUSS] KIP-174 - Deprecate and remove internal converter
> > configs in WorkerConfig
> > To: dev@kafka.apache.org 
> >
> >
> > Hello All,
> > I have added a KIP recently to deprecate and remove internal
> converter
> > configs in WorkerConfig.java class because these have ultimately just
> > caused a lot more trouble and confusion than it is worth.
> >
> > Please find the KIP here
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 174+-+Deprecate+and+remove+internal+converter+configs+in+WorkerConfig>
> > and
> > the related JIRA here <
> 

Re: 答复: [DISCUSS] KIP-178: Size-based log directory selection strategy

2017-08-09 Thread Tom Bentley
Hi Hu,

I wonder whether changing to, or configuring, a size-balancing strategy would
be sufficient for all users. I would expect that users might want to take
other factors into account. For example, with KIP-113, balancing IO across
the disks might also be a factor, in addition to balancing free space: an
allocation that best optimises for free space might make one disk much more
heavily utilised than another, increasing latency.

Also, an alternative to putting this logic on the server side would be
providing the necessary information via the AdminClient to allow replica
assignments to be calculated off-server. This has a couple of benefits:

1. Finding optimal or near-optimal solutions to this kind of "knapsack
problem" might well involve compute-intensive algorithms. It would be
better to not run such computations on the server if possible.
2. It's not one-size-fits-all: It would let people devise the best
algorithm for their particular needs.

If this is an alternative you've rejected, it would be good to explain why
in the rejected alternatives section.

On 9 August 2017 at 03:38, Hu Xi  wrote:

> Hi Lin,
>
>
> Yes, it is a problem since it's not easy for us to predict the amount of
> disk space each partition will occupy in the future. How about the algorithm
> below:
>
>
>   1.  Introduce a data structure to maintain the current free space for
> each log directory in `log.dirs`. Say, a ConcurrentHashMap<String, AtomicLong>
> named logFreeDiskMap
>   2.  Initialize and update `logFreeDiskMap` before batch-invoking
> `nextLogDir`. Invoke it before `makeLeaders`, for instance.
>   3.  Every time we call `nextLogDir`, check:
>  *   If `log.retention.bytes` or topic-level `retention.bytes` is
> set (> 0), that is to say the user does want to control the total disk space
> for that partition, then we select the directory with the most free space from
> `logFreeDiskMap` and subtract `retention.bytes` from the original value.
>  *   If `log.retention.bytes` or topic-level `retention.bytes` is not
> set, meaning the user does not care about disk usage or does not know how
> much total disk space the partition will occupy, then we fall back to the
> current strategy: selecting the directory with the fewest partitions
>
> The algorithm should work well for the situation you mentioned,
> namely a large number of partitions in a single LeaderAndIsr request.
> However, I have to admit that it handles the situation poorly when
> consecutive LeaderAndIsr requests come from the controller.
>
> Any comments are welcome.
>
> 
> 发件人: Dong Lin 
> 发送时间: 2017年8月5日 2:00
> 收件人: dev@kafka.apache.org
> 主题: Re: [DISCUSS] KIP-178: Size-based log directory selection strategy
>
> Hey Hu,
>
> I am not sure it is OK. Say kafka-reassign-partitions.sh is used to move
> 100 replicas to a broker. Then the controller will send a LeaderAndIsrRequest
> asking this broker to be the follower of these 100 partitions. While it is
> true that the broker will create the replicas sequentially, they will be
> created in a very short period of time (e.g. 2 seconds) and thus the
> replicas will be put in the same log directory that has the most free space
> at the time this broker receives the LeaderAndIsrRequest. Do you think this
> is a problem?
>
> Dong
>
>
> On Thu, Aug 3, 2017 at 7:36 PM, Hu Xi  wrote:
>
> > Hi Dong, some thoughts on your second mail. Since currently logs for
> > multiple partitions are created sequentially, not in parallel, it's probably
> > okay for us to simply select the directory with the most disk space in a
> > single round of `nextLogDir` calls, which can be guaranteed not to lead to
> > extreme skew. Does that make sense?
> >
> >
> > 
> > 发件人: Hu Xi 
> > 发送时间: 2017年8月3日 16:51
> > 收件人: dev@kafka.apache.org
> > 主题: 答复: 答复: [DISCUSS] KIP-178: Size-based log directory selection
> strategy
> >
> >
> > Dong, yes, many thanks for the comments from the second mail. Will take
> > some time to figure out an algorithm to better handle the situation you
> > mentioned. Thanks again.
> >
> >
> > 
> > 发件人: Dong Lin 
> > 发送时间: 2017年8月3日 12:07
> > 收件人: dev@kafka.apache.org
> > 主题: Re: 答复: [DISCUSS] KIP-178: Size-based log directory selection
> strategy
> >
> > Hu, I think this is worth discussion even if it doesn't require new
> config.
> > Could you also read my second email?
> >
> > On Wed, Aug 2, 2017 at 6:17 PM, Hu Xi  wrote:
> >
> > > Thanks Dong,  do you mean it is more like a naive improvement and no
> KIP
> > > is needed  then?
> > >
> > > 
> > > 发件人: Dong Lin 
> > > 发送时间: 2017年8月3日 9:10
> > > 收件人: dev@kafka.apache.org
> > > 主题: Re: [DISCUSS] KIP-178: Size-based log directory selection strategy
> > >
> > > Hey Xu,
> > >
> > > Thanks for the KIP. This is a very good 

[jira] [Created] (KAFKA-5717) [streams] 'null' values in state stores

2017-08-09 Thread Bart Vercammen (JIRA)
Bart Vercammen created KAFKA-5717:
-

 Summary: [streams] 'null' values in state stores
 Key: KAFKA-5717
 URL: https://issues.apache.org/jira/browse/KAFKA-5717
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.11.0.0, 0.10.2.1
Reporter: Bart Vercammen


When restoring the state of an in-memory KeyValue store (at startup of the 
Kafka Streams application), the _deleted_ values are put into the store as _key_ 
with _value_ {{null}} instead of being removed from the store.
(This happens when the underlying Kafka topic segment has not been compacted yet.)

After some digging I came across this in {{InMemoryKeyValueStore}}:
{code}
public synchronized void put(K key, V value) {
this.map.put(key, value);
}
{code}

I would assume this implementation misses the check on {{value}} being {{null}} 
to *delete* the entry instead of just storing it.

In the RocksDB implementation it is done correctly:
{code}
if (rawValue == null) {
try {
db.delete(wOptions, rawKey);
{code}
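
A minimal sketch of the fix along those lines (illustrative only; the field name 
{{map}} is taken from the snippet above):
{code}
public synchronized void put(K key, V value) {
    if (value == null) {
        // a null value is a tombstone: delete the key instead of storing null
        this.map.remove(key);
    } else {
        this.map.put(key, value);
    }
}
{code}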




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] KIP-183 - Change PreferredReplicaLeaderElectionCommand to use AdminClient

2017-08-09 Thread Tom Bentley
Hi Ewen,

Thanks for looking at the KIP. I've updated it for some of your comments,
but see also a few replies inline.

On 9 August 2017 at 06:02, Ewen Cheslack-Postava  wrote:

> Thanks for the KIP. Generally the move away from ZK and to native Kafka
> requests is good, so I'm generally +1 on this. A couple of
> comments/questions.
>
> * You gave the signature
> electPreferredReplicaLeader(Collection partitions) for the
> admin client. The old command allows not specifying the topic partitions
> which results in election for all topic partitions. How would this be
> expressed in the new API? Does the tool need to do a metadata request and
> then issue the request for all topic partitions or would there be a
> shortcut via the protocol (e.g. another electPreferredReplicaLeader() with
> no args which translates to, e.g., an empty list or null in the request)?
>

I was trying to figure out the best way to do this. It seems there are no
methods in the AdminClient setting a precedent on this. I would prefer to
avoid having to do a metadata request to discover all the partitions. So
the two ways I can think of are:

1. As you suggest, an empty or null list in the request would signify "all
partitions". This is not intuitive, and using an empty list is downright
surprising (an empty list should be a noop after all).
2. We could have an allPartitions flag in the Options class. This is also
unintuitive because it would ignore whatever partitions were in the list,
though I suppose the AdminClient itself could check for this and signal an
error.

Of these options, I think using a null list in the request is the least
surprising.
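
To make the shape concrete, here is a rough sketch of what option 1 could look
like; all of these names, including the options/result classes, are placeholders
for discussion rather than the final KIP-183 API:

    import java.util.Collection;
    import org.apache.kafka.common.TopicPartition;

    public interface PreferredLeaderElectionAdmin {

        // Placeholder option/result types, following the AdminClient *Options/*Result pattern.
        class ElectPreferredReplicaLeaderOptions { }
        class ElectPreferredLeadersResult { }

        /**
         * @param partitions the partitions to run preferred leader election for,
         *                   or null to run it for every partition in the cluster
         */
        ElectPreferredLeadersResult electPreferredReplicaLeader(
                Collection<TopicPartition> partitions,
                ElectPreferredReplicaLeaderOptions options);
    }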


> * All the existing *Options classes for the AdminClient have at least a
> timeout. Should we add that to this APIs options?
> * Other AdminClient *Result classes provide some convenience methods when
> you just want all the results. Should we have that as well?
> * I don't think any other requests include error strings in the response,
> do they? I'm pretty sure we just generate the error string on the client
> side based on the error code. If an error code was ambiguous, we should
> probably just split it across 2 or more codes.
>

There are responses with detailed error messages as well as the codes,
CreateTopicsResponse, {Describe|Alter}ConfigsResponse, and the responses
for managing ACLs for instance. To be honest, I assumed including a message
was the norm. In this case, however, I don't think there is any extra
detail to include beyond the error itself, so I've removed it from the KIP.


> * re: permissions, would there also be a per-topic permission required? (I
> haven't kept track of all the ACLs and required permissions so I'm not
> sure, but a bunch of other operations check per-resource they are modifying
> -- this operation could be considered to be against either Cluster or
> Topics.)
>

I'm also unsure, and would welcome input from others about this. Note that
there is some similarity here with some of the discussion on
https://issues.apache.org/jira/browse/KAFKA-5638: To what extent should
some permission on the Cluster imply some permission across all Topics in
the cluster? My feeling is that there should be no such implication,
because it will just make ACLs harder to reason about, and require more
documentation about what implies what else.

Getting back to this specific case, it would make some sense for Alter on
the Topic to be necessary, though it could be argued that the leader of a
partition is an operational concern, and not really part of the topic
configuration. Changing the leader of a bunch of partitions could affect
partitions of other topics. On the other hand, anyone with Alter on the
topic would presumably have the freedom to set the preferred leader for a
partition, so it would be a little strange if they couldn't actually give
effect to that choice.

On balance I think I prefer needing alter on the topic (and have adjusted
the KIP accordingly), but I'm very happy to go with the consensus view.



> * Should we consider making the request only valid against the broker and
> having the command do a metadata request to discover the controller and
> then send the request directly to it? I think this could a) simplify the
> implementation a bit by taking the ZK node out of the equation b) avoid the
> REPLICA_LEADER_ELECTION_IN_PROGRESS error since this is only required due
> to using the ZK node to communicate with the controller afaict and c)
> potentially open the door for a synchronous request/response in the future.
>
>
That's a really good point. I agree that the request should be made to the
controller, because it allows us to make improvements in future. I'm not
totally sure about taking ZK out of the equation completely though: Right
now during controller failover the new controller checks the znode and
ensures those elections get honoured. So it seems we need to keep ZK in the
picture, and without rewriting how we read and write 

[GitHub] kafka pull request #3649: MINOR: Add one more instruction

2017-08-09 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/3649

MINOR: Add one more instruction



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka minor-docker-docs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3649.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3649








[jira] [Created] (KAFKA-5716) Connect: When SourceTask.commit it is possible not everything from SourceTask.poll has been sent

2017-08-09 Thread Per Steffensen (JIRA)
Per Steffensen created KAFKA-5716:
-

 Summary: Connect: When SourceTask.commit it is possible not 
everything from SourceTask.poll has been sent
 Key: KAFKA-5716
 URL: https://issues.apache.org/jira/browse/KAFKA-5716
 Project: Kafka
  Issue Type: Bug
Reporter: Per Steffensen
Priority: Minor


Not looking at the very latest code, so the "problem" may have been corrected 
recently. If so, I apologize. I found the "problem" by code-inspection alone, 
so I may be wrong. Have not had the time to write tests to confirm.

According to java-doc on SourceTask.commit
{quote}
Commit the offsets, up to the offsets that have been returned by \{@link 
#poll()}. This
method should block until the commit is complete.

SourceTasks are not required to implement this functionality; Kafka Connect 
will record offsets
automatically. This hook is provided for systems that also need to store 
offsets internally
in their own system.
{quote}

As I read this, when commit-method is called, the SourceTask-developer is 
"told" that everything returned from poll up until "now" has been sent/stored - 
both the outgoing messages and the associated connect-offsets. Looking at the 
implementation it also seems that this is what it tries to "guarantee/achieve".

But as I read the code, it is not necessarily true.
The following threads are involved
* Task-thread: WorkerSourceTask has its own thread running 
WorkerSourceTask.execute.
* Committer-thread: From time to time SourceTaskOffsetCommitter is scheduled to 
call WorkerSourceTask.commitOffsets (from a different thread)

The two threads synchronize (on the WorkerSourceTask object) in sendRecord and 
commitOffsets respectively, preventing the task-thread from adding to 
outstandingMessages and offsetWriter while the committer-thread is marking what has 
to be flushed in the offsetWriter and waiting for outstandingMessages to be 
empty. This means that the offsets committed will be consistent with what has 
been sent out, but not necessarily with what has been polled. At least I do not see 
why the following is not possible:
* Task-thread polls something from task.poll
* Before the task-thread gets to add (all) the polled records to 
outstandingMessages and offsetWriter in sendRecords, the committer-thread kicks in 
and does its committing, while preventing the task-thread from adding the polled 
records to outstandingMessages and offsetWriter
* Consistency will not have been compromised, but the committer-thread will end up 
calling task.commit (via WorkerSourceTask.commitSourceTask) without the 
records just polled from task.poll having been sent or the corresponding 
connector-offsets flushed.

If I am right, I guess there are two ways to fix it:
* Either change the java-doc of SourceTask.commit to something like the following (which I 
do believe is true)
{quote}
Commit the offsets, up to the offsets that have been returned by \{@link 
#poll()}
*and confirmed by a call to \{@link #commitRecord(SourceRecord)}*.
This method should block until the commit is complete.

SourceTasks are not required to implement this functionality; Kafka Connect 
will record offsets
automatically. This hook is provided for systems that also need to store 
offsets internally
in their own system.
{quote}
* or, fix the "problem" so that it actually does what the java-doc says :-)
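
For illustration, a minimal sketch of a source task that stores offsets in its own 
system but only trusts what has been confirmed via commitRecord (the external store 
hook below is hypothetical; this is not a proposed framework change):
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

public abstract class ConfirmedOffsetSourceTask extends SourceTask {

    private final Map<Map<String, ?>, Map<String, ?>> confirmedOffsets = new ConcurrentHashMap<>();

    @Override
    public void commitRecord(SourceRecord record) {
        // Called once the record has actually been sent and acknowledged.
        confirmedOffsets.put(record.sourcePartition(), record.sourceOffset());
    }

    @Override
    public void commit() {
        // Only flush offsets for records known to have been sent, rather than
        // assuming everything returned from poll() has gone out.
        flushToExternalStore(confirmedOffsets);
    }

    // Hypothetical hook into the connector's own offset storage.
    protected abstract void flushToExternalStore(Map<Map<String, ?>, Map<String, ?>> offsets);
}
{code}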

If I am not right, I of course apologize for the inconvenience. I would 
appreciate an explanation of where my code-inspection is not correct, and why it 
works even though I cannot see it. I will not expect such an explanation, 
though.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] KIP-186: Increase offsets retention default to 7 days

2017-08-09 Thread Mickael Maison
Yes the current default is too short, +1

On Wed, Aug 9, 2017 at 8:56 AM, Ismael Juma  wrote:
> Thanks for the KIP, +1 from me.
>
> Ismael
>
> On Wed, Aug 9, 2017 at 1:24 AM, Ewen Cheslack-Postava 
> wrote:
>
>> Hi all,
>>
>> I posted a simple new KIP for a problem we see with a lot of users:
>> KIP-186: Increase offsets retention default to 7 days
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 186%3A+Increase+offsets+retention+default+to+7+days
>>
>> Note that in addition to the KIP text itself, the linked JIRA already
>> existed and has a bunch of discussion on the subject.
>>
>> -Ewen
>>


Build failed in Jenkins: kafka-0.11.0-jdk7 #264

2017-08-09 Thread Apache Jenkins Server
See 


Changes:

[me] KAFKA-5704: Corrected Connect distributed startup behavior to allow

--
[...truncated 861.89 KB...]

kafka.api.SaslScramSslEndToEndAuthorizationTest > testNoGroupAcl STARTED

kafka.api.SaslScramSslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.SaslScramSslEndToEndAuthorizationTest > testNoProduceWithDescribeAcl 
STARTED

kafka.api.SaslScramSslEndToEndAuthorizationTest > testNoProduceWithDescribeAcl 
PASSED

kafka.api.SaslScramSslEndToEndAuthorizationTest > 
testProduceConsumeViaSubscribe STARTED

kafka.api.SaslScramSslEndToEndAuthorizationTest > 
testProduceConsumeViaSubscribe PASSED

kafka.api.SaslScramSslEndToEndAuthorizationTest > 
testNoProduceWithoutDescribeAcl STARTED

kafka.api.SaslScramSslEndToEndAuthorizationTest > 
testNoProduceWithoutDescribeAcl PASSED

kafka.api.AdminClientWithPoliciesIntegrationTest > testInvalidAlterConfigs 
STARTED

kafka.api.AdminClientWithPoliciesIntegrationTest > testInvalidAlterConfigs 
PASSED

kafka.api.AdminClientWithPoliciesIntegrationTest > testValidAlterConfigs STARTED

kafka.api.AdminClientWithPoliciesIntegrationTest > testValidAlterConfigs PASSED

kafka.api.AdminClientWithPoliciesIntegrationTest > 
testInvalidAlterConfigsDueToPolicy STARTED

kafka.api.AdminClientWithPoliciesIntegrationTest > 
testInvalidAlterConfigsDueToPolicy PASSED

kafka.api.AdminClientIntegrationTest > testInvalidAlterConfigs STARTED

kafka.api.AdminClientIntegrationTest > testInvalidAlterConfigs PASSED

kafka.api.AdminClientIntegrationTest > testClose STARTED

kafka.api.AdminClientIntegrationTest > testClose PASSED

kafka.api.AdminClientIntegrationTest > testMinimumRequestTimeouts STARTED

kafka.api.AdminClientIntegrationTest > testMinimumRequestTimeouts PASSED

kafka.api.AdminClientIntegrationTest > testForceClose STARTED

kafka.api.AdminClientIntegrationTest > testForceClose PASSED

kafka.api.AdminClientIntegrationTest > testListNodes STARTED

kafka.api.AdminClientIntegrationTest > testListNodes PASSED

kafka.api.AdminClientIntegrationTest > testDelayedClose STARTED

kafka.api.AdminClientIntegrationTest > testDelayedClose PASSED

kafka.api.AdminClientIntegrationTest > testCreateDeleteTopics STARTED

kafka.api.AdminClientIntegrationTest > testCreateDeleteTopics PASSED

kafka.api.AdminClientIntegrationTest > testAclOperations STARTED

kafka.api.AdminClientIntegrationTest > testAclOperations PASSED

kafka.api.AdminClientIntegrationTest > testDescribeCluster STARTED

kafka.api.AdminClientIntegrationTest > testDescribeCluster PASSED

kafka.api.AdminClientIntegrationTest > testDescribeNonExistingTopic STARTED

kafka.api.AdminClientIntegrationTest > testDescribeNonExistingTopic PASSED

kafka.api.AdminClientIntegrationTest > testDescribeAndAlterConfigs STARTED

kafka.api.AdminClientIntegrationTest > testDescribeAndAlterConfigs PASSED

kafka.api.AdminClientIntegrationTest > testCallInFlightTimeouts STARTED

kafka.api.AdminClientIntegrationTest > testCallInFlightTimeouts PASSED

kafka.api.GroupCoordinatorIntegrationTest > 
testGroupCoordinatorPropagatesOfffsetsTopicCompressionCodec STARTED

kafka.api.GroupCoordinatorIntegrationTest > 
testGroupCoordinatorPropagatesOfffsetsTopicCompressionCodec PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testAcls STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testAcls PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testTwoConsumersWithDifferentSaslCredentials STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testTwoConsumersWithDifferentSaslCredentials PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaAssign STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaAssign PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaSubscribe STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaSubscribe PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaAssign STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaAssign PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoGroupAcl STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoProduceWithDescribeAcl 
STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoProduceWithDescribeAcl 
PASSED


Re: [DISCUSS] KIP-184 Rename LogCleaner and related classes to LogCompactor

2017-08-09 Thread Ismael Juma
Hi Pranav,

In the JIRA and the previous mailing list thread, some of us wondered if
the benefit is worth the transition pain. To quote Jason:

"The "log cleaner" naming may not be ideal, but
it is not incorrect and some of the terminology used elsewhere makes more
sense given this name (e.g. cleanable ratio, dirty offset). I personally
see little benefit in the name change, especially if we have to propagate
the change to configuration names (and it makes little sense if we do not
do so). My guess is that most users have already gotten used to the "log
cleaner" naming anyway, so adding new configs would just cause confusion."

Do we have any evidence that users actually get confused by the terminology?

Ismael

On Sat, Aug 5, 2017 at 5:36 PM, Pranav Maniar  wrote:

> Hi All,
>
> Following a discussion on JIRA KAFKA-1944
> <https://issues.apache.org/jira/browse/KAFKA-1944>, I have created
> KIP-184
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-184%3A+Rename+LogCleaner+and+related+classes+to+LogCompactor>
> as
> it will require a configuration change.
>
> As per the process I am starting Discussion on mail thread for KIP-184.
>
> Renaming of the configuration "log.cleaner.enable" is discussed on KAFKA-1944.
> But the other log.cleaner configurations also seem to be used by the cleaner
> only. So to maintain naming consistency, I have proposed renaming all of these
> configurations.
>
> Please provide your suggestion/views for the same. Thanks !
>
>
> Thanks,
> Pranav
>


Re: [DISCUSS] KIP-186: Increase offsets retention default to 7 days

2017-08-09 Thread Ismael Juma
Thanks for the KIP, +1 from me.

Ismael

On Wed, Aug 9, 2017 at 1:24 AM, Ewen Cheslack-Postava 
wrote:

> Hi all,
>
> I posted a simple new KIP for a problem we see with a lot of users:
> KIP-186: Increase offsets retention default to 7 days
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 186%3A+Increase+offsets+retention+default+to+7+days
>
> Note that in addition to the KIP text itself, the linked JIRA already
> existed and has a bunch of discussion on the subject.
>
> -Ewen
>


Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

2017-08-09 Thread Ewen Cheslack-Postava
Apurva,

For the benchmarking, I have a couple of questions:

1. Re: the mention of exactly once, this is within a producer session,
right? And so really only idempotent. Applications still need to take extra
steps for exactly once if they, e.g., are producing data from some other
log like a DB txn log.

2.

> Further, the results above show that there is a large improvement in
throughput and latency when we go from max.in.flight=1 to max.in.flight=2,
but then there is no discernible difference for higher values of this setting.

If in the tests there's no difference with higher values, perhaps leaving
it alone is better. There are a bunch of other configs we expose and this
test only evaluates one environment. Without testing things like cross-DC
traffic, I'd be wary of jumping to the conclusion that max.in.flight > 2
never makes a difference and that some people aren't already relying on a
larger default OOTB.

3. The acks=all change is actually unrelated to the title of the KIP and
orthogonal to all the other changes. It's also the most risky since
acks=all needs more network round trips. And while I think it makes sense
to have the more durable default, this seems like it's actually fairly
likely to break things for some people (even if a minority of people). This
one seems like a setting change that needs more sensitive handling, e.g.
both release notes and log notification that the default is going to
change, followed by the actual change later.
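
To be concrete, the combination under discussion corresponds roughly to a
producer configured like this today (illustrative snippet; imports from
org.apache.kafka.clients.producer and org.apache.kafka.common.serialization
assumed, the broker address and serializers are placeholders, and exactly how
the client validates the combination depends on the version):

    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
    props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
    props.put(ProducerConfig.ACKS_CONFIG, "all");
    // (what the default max.in.flight.requests.per.connection should be once
    // idempotence caps it is part of the discussion above)
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    Producer<String, String> producer = new KafkaProducer<>(props);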

-Ewen

On Tue, Aug 8, 2017 at 5:23 PM, Apurva Mehta  wrote:

> Hi,
>
> I've put together a new KIP which proposes to ship Kafka with its strongest
> delivery guarantees by default.
>
> We currently ship with at most once semantics and don't provide any
> ordering guarantees per partition. The proposal is to provide exactly-once,
> in-order delivery per partition by default in the upcoming 1.0.0
> release.
>
> The KIP linked to below also outlines the performance characteristics of
> the proposed default.
>
> The KIP is here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 185%3A+Make+exactly+once+in+order+delivery+per+partition+
> the+default+producer+setting
>
> Please have a look, I would love your feedback!
>
> Thanks,
> Apurva
>


Jenkins build is back to normal : kafka-trunk-jdk8 #1892

2017-08-09 Thread Apache Jenkins Server
See