Re: Mirroring multiple clusters into one

2017-07-06 Thread James Cheng
Answers inline below.

-James

Sent from my iPhone

> On Jul 7, 2017, at 1:18 AM, Vahid S Hashemian  
> wrote:
> 
> James,
> 
> Thanks for sharing your thoughts and experience.
> Could you please also confirm whether
> - you do any encryption for the mirrored data?
Not at the Kafka level. The data goes over a VPN.

> - you have a many-to-one mirroring similar to what I described?
> 

Yes, we mirror multiple source clusters to a single target cluster. We have a 
topic naming convention where our topics are prefixed with their cluster name, 
so as long as we follow that convention, each source topic gets mirrored to a 
unique target topic. That is, we try not to have multiple mirrormakers writing 
to a single target topic. 

Our topic names in the target cluster get prefixed with the string "mirror." 
And then we never mirror topics that start with "mirror." This prevents us from 
creating mirroring loops.
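
For illustration, a rough sketch (a hypothetical helper, not our actual code) of the convention described above: source topics already carry their cluster name, mirrored copies get the "mirror." prefix, and anything that already carries the prefix is never mirrored again, so two clusters mirroring each other cannot loop:

    // Hypothetical helper illustrating the naming convention described above.
    public final class MirrorTopicNaming {
        private static final String MIRROR_PREFIX = "mirror.";

        // Never mirror a topic that is itself a mirrored copy.
        public static boolean shouldMirror(String sourceTopic) {
            return !sourceTopic.startsWith(MIRROR_PREFIX);
        }

        // Source topics already carry their cluster name (e.g. "clusterA.orders"),
        // so the mirrored copy becomes "mirror.clusterA.orders".
        public static String targetTopic(String sourceTopic) {
            return MIRROR_PREFIX + sourceTopic;
        }
    }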

> Thanks.
> --Vahid
> 
> 
> 
> From:   James Cheng 
> To: us...@kafka.apache.org
> Cc: dev 
> Date:   07/06/2017 12:37 PM
> Subject:Re: Mirroring multiple clusters into one
> 
> 
> 
> I'm not sure what the "official" recommendation is. At TiVo, we *do* run 
> all our mirrormakers near the target cluster. It works fine for us, but 
> we're still fairly inexperienced, so I'm not sure how strong of a data 
> point we should be.
> 
> I think the thought process is, if you are mirroring from a source cluster 
> to a target cluster where there is a WAN between the two, then whichever 
> request goes across the WAN has a higher chance of intermittent failure 
> than the one over the LAN. That means that if mirrormaker is near the 
> source cluster, the produce request over the WAN to the target cluster may 
> fail. If the mirrormaker is near the target cluster, then the fetch 
> request over the WAN to the source cluster may fail.
> 
> Failed fetch requests don't have much impact on data replication, it just 
> delays it. Whereas a failure during a produce request may introduce 
> duplicates.
> 
> Becket Qin from LinkedIn did a presentation on tuning producer performance 
> at a meetup last year, and I remember he specifically talked about 
> producing over a WAN as one of the cases where you have to tune settings. 
> Maybe that presentation will give more ideas about what to look at. 
> https://www.slideshare.net/mobile/JiangjieQin/producer-performance-tuning-for-apache-kafka-63147600
> 
> 
> -James
> 
> Sent from my iPhone
> 
>> On Jul 6, 2017, at 1:00 AM, Vahid S Hashemian 
>  wrote:
>> 
>> The literature suggests running the MM on the target cluster when 
> possible 
>> (with the exception of when encryption is required for transferred 
> data).
>> I am wondering if this is still the recommended approach when mirroring 
>> from multiple clusters to a single cluster (i.e. multiple MM instances).
>> Is there anything in particular (metric, specification, etc.) to 
> consider 
>> before making a decision?
>> 
>> Thanks.
>> --Vahid
>> 
>> 
> 
> 
> 
> 


[GitHub] kafka pull request #3499: KAFKA-5567: Connect sink worker should commit offs...

2017-07-06 Thread kkonstantine
GitHub user kkonstantine opened a pull request:

https://github.com/apache/kafka/pull/3499

KAFKA-5567: Connect sink worker should commit offsets of original topic 
partitions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkonstantine/kafka 
KAFKA-5567-With-transformations-that-mutate-the-topic-partition-committing-offsets-should-to-refer-to-the-original-topic-partition

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3499.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3499


commit b9eeeb12812f0b237b4356717dfc5bf0e252e8e6
Author: Konstantine Karantasis 
Date:   2017-07-07T05:13:50Z

KAFKA-5567: Connect sink worker should commit offsets of original topic 
partitions.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Dealing with noisy timestamps in kafka streams + 10.1

2017-07-06 Thread Greg Fodor
I managed to answer some of my own questions :)

For future google'ers:

to deal with retention.ms see
https://issues.apache.org/jira/browse/KAFKA-4340
to deal with early rejection of bad timestamps the
message.timestamp.difference.max.ms config is relevant discussion here
https://issues.apache.org/jira/browse/KAFKA-5344

In our case, we can live with setting the retention.ms during backfills.
Still would like to know if there are any better practices for dealing with
mis-stamped records during backfills w/ state store topics.


On Thu, Jul 6, 2017 at 12:32 PM, Greg Fodor  wrote:

> Hey all, we are currently working on migrating our system to kafka 10.2
> from 10.0 and one thing that we have hit that I wanted some advice on is
> dealing with the new log retention/rolling semantics that are based on
> timestamps.
>
> We send telemetry data from installed clients into kafka via kafka REST
> proxy and the timestamps we land the messages with are "create time" based
> that are timestamped on the sender side. We try to adjust for clock skew
> but this is not perfect and in practice we end up having some small subset
> of data landing into this topic with very erroneous timestamps (for
> example, some arrive with timestamps many years in the future.)
>
> The first problem we are hitting is that these corrupt timestamps now
> influence log segment rolling. For example, when reprocessing the entire
> log, we end up seeing a bunch of segment files generated for state stores
> changelogs in kafka streams that store these events since as corrupted
> timestamps come in a single one can trigger a segment roll if they are
> timestamped far in the future due to the new heuristics. The result is we
> end up with hundreds of small segment files (which actually in our current
> configuration ends up causing kafka to run out of memory, but that's
> another story :))
>
> The second problem we are hitting is when reprocessing the full log, since
> these timestamps are in the past as we run from the beginning, if we have a
> time based retention policy set on the state store changelog topic (say, a
> week) kafka ends up just deleting segments immediately since the timestamps
> are far in the past and the segments are considered expired. Previously
> this worked fine during reprocessing since the state store changelogs were
> just going to get deleted a week after the reprocess job ran since the
> retention policy was based upon segment file timestamp.
>
> Both of these problems could potentially be compensated for by writing a
> clever timestamp extractor that tried to a) normalize timestamps that
> appear very skewed and b) for changelog entries, extract a "logged at"
> instead of "created at" timestamp when landing into the state store
> changelog. The second problem could also be addressed by temporarily
> changing the topic configuration during a reprocess to prevent "old" log
> segments from being deleted. Neither of these seem ideal.
>
> I wanted to know if there are any recommendations on how to deal with this
> -- it seems like there is a conflict between having segment file policies
> be based on message timestamps and also having message timestamps be based
> on application creation time, since origin create time can often be subject
> to noise/skew/errors. One potential path forward would be to be able to
> have topic-specific settings for log rolling (including the ability to use
> the legacy behavior for retention that relies upon filesystem timestamps)
> but I am sure there are problems with this proposal.
>
> In general, I don't really feel like I have a good sense of what a correct
> solution would be, other than messages always having two timestamps and
> being able to have control over which timestamp is authoritative for log
> segment management policies, but that obviously seems like something that
> was considered and rejected for KIP-32 already.
>


Fwd: How to fetch offset for all the partitions of one topic through single KafkaClient function call (0.10.x)?

2017-07-06 Thread Martin Peng
Hi,

I am trying to use the Apache Kafka client 0.10.2 to do batch offset fetching
for all partitions of one topic. However, I found that
KafkaConsumer.committed() only provides the ability to fetch the offset for a
single partition.

I read the KafkaConsumer class code and found that it actually does have the
batch fetching ability through coordinator.fetchCommittedOffsets(). However,
the coordinator object is private and fetchCommittedOffsets() is
not exposed either.

Could anyone tell me the reason for this design? May I add one more function
in KafkaConsumer to expose the fetchCommittedOffsets() function?

Thanks,
Martin

-- Forwarded message --
From: Martin Peng 
Date: 2017-07-06 17:36 GMT-07:00
Subject: How to fetch offset for all the partitions of one topic through
single KafkaClient function call (0.10.x)?
To: us...@kafka.apache.org


Hi,

I am using Kafka client 0.10.2. Is there a way to fetch the latest committed
offset for all the partitions in one function call?

I am calling KafkaConsumer.committed() to get this for a single partition. Is
there a simple way to batch fetch offsets for all the partitions of a single
topic in one shot? And how do I fetch all partition offsets for multiple
topics?

Thanks
Martin
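
Until such an API exists, a minimal sketch that stays on the public 0.10.2 consumer API is to loop over the topic's partitions and call committed() once per partition; the broker address, group id, and topic name below are placeholders:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.TopicPartition;

    public class FetchCommittedOffsets {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder
            props.put("group.id", "my-group");                 // group whose offsets we want
            props.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                Map<TopicPartition, OffsetAndMetadata> committed = new HashMap<>();
                for (PartitionInfo p : consumer.partitionsFor("my-topic")) {  // placeholder topic
                    TopicPartition tp = new TopicPartition(p.topic(), p.partition());
                    committed.put(tp, consumer.committed(tp));  // one request per partition
                }
                committed.forEach((tp, om) ->
                    System.out.println(tp + " -> " + (om == null ? "none" : om.offset())));
            }
        }
    }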


[GitHub] kafka pull request #3498: Kafka-4763 (Used for triggering test only)

2017-07-06 Thread lindong28
Github user lindong28 closed the pull request at:

https://github.com/apache/kafka/pull/3498


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3498: Kafka-4763 (Used for triggering test only)

2017-07-06 Thread lindong28
GitHub user lindong28 reopened a pull request:

https://github.com/apache/kafka/pull/3498

Kafka-4763 (Used for triggering test only)

This patch is used only for triggering test. No need for review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4763-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3498.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3498


commit b10e55ac5c83d0a356f79b0325d0dd8cefe00a42
Author: Dong Lin 
Date:   2017-04-03T00:46:34Z

KAFKA-4763; Handle disk failure for JBOD (KIP-112)

commit 0c97e0907e93c00fc23958478ec103f6ac49a48d
Author: Dong Lin 
Date:   2017-06-12T04:05:12Z

Address comments

commit c013a815d72fa85175dfee8816366a02323526e2
Author: Dong Lin 
Date:   2017-06-12T09:13:28Z

StopReplicaResponse should specify error if replica-to-be-deleted is not 
found and there is offline directory

commit 15ab857bb60755a1e5c296dabf59d1d43f44fc0f
Author: Dong Lin 
Date:   2017-06-13T20:15:58Z

Address comments

commit 563bf001b78662834e403619657f21415095d38c
Author: Dong Lin 
Date:   2017-06-17T02:35:18Z

Address comments

commit 94efa20ac5193b459ff33c4cbbf5118a17749354
Author: Dong Lin 
Date:   2017-06-20T23:17:14Z

Address comments

commit 96af4783655466f481d69a5852d1ddb678fde0d1
Author: Dong Lin 
Date:   2017-06-21T20:27:09Z

Close file handler of all files in the offline log directory so that the 
disk can be umounted

commit 6e80507cb507b07d7e8378a4e49e3b0273026d94
Author: Dong Lin 
Date:   2017-06-23T00:41:21Z

Address comments

commit ffc91298efa66eb02d365c5e198e033b8b098062
Author: Dong Lin 
Date:   2017-06-24T02:09:54Z

Catch and handle more IOExceptions

commit 785bd12618978b4771b30f7bafcd8b3bdf30a53c
Author: Dong Lin 
Date:   2017-06-28T00:31:34Z

Address comments

commit 8b0b9c58c68958f8558304c8fce70fecf99b3b3c
Author: Dong Lin 
Date:   2017-06-30T21:55:30Z

Address comments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5568) Transformations that mutate topic-partitions break sink connectors that manage their own configuration

2017-07-06 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-5568:


 Summary: Transformations that mutate topic-partitions break sink 
connectors that manage their own configuration
 Key: KAFKA-5568
 URL: https://issues.apache.org/jira/browse/KAFKA-5568
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.11.0.0, 0.10.2.1, 0.10.2.0
Reporter: Ewen Cheslack-Postava


KAFKA-5567 describes how offset commits for sink connectors are broken if a 
record's topic-partition is mutated by an SMT, e.g. RegexRouter or 
TimestampRouter.

This is also a problem for sink connectors that manage their own offsets, i.e. 
those that store offsets elsewhere and call SinkTaskContext.rewind(). In this 
case, the transformation has already been applied by the time the SinkTask sees 
the record, so there is no way it could correctly track offsets and call 
rewind() with valid values. For example, this would break the offset tracking 
that Confluent's HDFS connector does by encoding offsets in filenames. Even if 
the offsets were stored in a separate file rather than in the filenames, the 
connector still would never have had the correct offsets to write to that file.
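
To make the failure mode concrete, here is a hypothetical fragment of such a sink task's put(); the latestOffsets map and write() helper are made up, but topic(), kafkaPartition(), and kafkaOffset() are the real SinkRecord accessors the task has to work with:

    // Hypothetical fragment of a SinkTask that manages its own offsets.
    @Override
    public void put(Collection<SinkRecord> records) {
        for (SinkRecord record : records) {
            // After a topic-mutating SMT (e.g. RegexRouter), topic() and
            // kafkaPartition() describe the rewritten destination, not the
            // topic-partition the record was actually consumed from, so the
            // offsets tracked here can no longer be used to rewind correctly.
            TopicPartition tracked = new TopicPartition(record.topic(), record.kafkaPartition());
            latestOffsets.put(tracked, record.kafkaOffset() + 1);  // assumed field
            write(record);                                         // assumed helper
        }
    }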

There are a couple of options:

1. Decide that this is an acceptable consequence of combining SMTs with sink 
connectors and it's a limitation we accept. You can either transform the data 
via Kafka Streams instead or accept that you can't do these "routing" type 
operations in the sink connector unless it supports it natively. This *might* 
not be the wrong choice since we think there are very few connectors that track 
their own offsets. In the case of HDFS, we might rarely hit this issue because 
it supports its own file/directory partitioning schemes anyway so doing this 
via SMTs isn't as necessary there.
2. Try to expose the original record information to the sink connector via the 
records. I can think of 2 ways this could be done. The first is to attach the 
original record to each SinkRecord. The cost here is relatively high in terms 
of memory, especially for sink connectors that need to buffer data. The second 
is to add fields to SinkRecords for originalTopic() and originalPartition(). 
This feels a bit ugly to me but might be the least intrusive change API-wise 
and we can guarantee those fields aren't overwritten by not allowing public 
constructors to set them.
3. Try to expose the original record information to the sink connector via a 
new pre-processing callback. The idea is similar to preCommit, but instead 
would happen before any processing occurs. Taken to its logical conclusion this 
turns into a sort of interceptor interface (preConversion, preTransformation, 
put, and preCommit).
4. Add something to the Context that allows the connector to get back at the 
original information. Maybe some sort of IdentityMap<SinkRecord, SinkRecord> 
originalPutRecords() that would let you get a mapping back to the original 
records. One nice aspect of this is that the connector can hold onto the 
original only if it needs it.
5. A very intrusive change/extension to the SinkTask API that passes in pairs 
of <original, transformed> records. Accomplishes the same as 2 but requires 
what I think are more complicated changes. Mentioned for completeness.
6. Something else I haven't thought of?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5567) With transformations that mutate the topic-partition committing offsets should to refer to the original topic-partition

2017-07-06 Thread Konstantine Karantasis (JIRA)
Konstantine Karantasis created KAFKA-5567:
-

 Summary: With transformations that mutate the topic-partition 
committing offsets should to refer to the original topic-partition
 Key: KAFKA-5567
 URL: https://issues.apache.org/jira/browse/KAFKA-5567
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.11.0.0, 0.10.2.1, 0.10.2.0
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis
 Fix For: 0.11.0.1



  When a chain of transformations (SMTs) that mutates a record's topic-partition 
is applied, Connect is unable to map the transformed record back to its 
original topic-partition. This affects committing offsets. 

 Currently, in order to reproduce the issue one could use the 
{{TimestampRouter}} transformation with a sink connector such as the 
{{FileStreamSinkConnector}}.

  In this ticket we'll address the issue for connectors that don't 
manage/commit their offsets themselves. For the connectors that do such 
management, broader API changes are required to supply the connectors with the 
necessary information that will allow them to map a transformed record to the 
original. 




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka-site pull request #66: New landing page for Streams API

2017-07-06 Thread derrickdoo
GitHub user derrickdoo opened a pull request:

https://github.com/apache/kafka-site/pull/66

New landing page for Streams API

New Kafka Streams API landing page for desktop and mobile web.

Note: There are a couple of items that need to be filled in before we 
launch this page.

**Fill in links. Placeholders added for:**
1. Write your first app
2. View Demo
3. Tutorials

**Add desired Scala and Java 7 code snippets**

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/derrickdoo/kafka-site streams-landing

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #66


commit 2033bf094655b5f5493218d73c8ba945ecf0c128
Author: Derrick Or 
Date:   2017-07-06T23:15:02Z

new streams landing page




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Mirroring multiple clusters into one

2017-07-06 Thread Vahid S Hashemian
James,

Thanks for sharing your thoughts and experience.
Could you please also confirm whether
- you do any encryption for the mirrored data?
- you have a many-to-one mirroring similar to what I described?

Thanks.
--Vahid



From:   James Cheng 
To: us...@kafka.apache.org
Cc: dev 
Date:   07/06/2017 12:37 PM
Subject:Re: Mirroring multiple clusters into one



I'm not sure what the "official" recommendation is. At TiVo, we *do* run 
all our mirrormakers near the target cluster. It works fine for us, but 
we're still fairly inexperienced, so I'm not sure how strong of a data 
point we should be.

I think the thought process is, if you are mirroring from a source cluster 
to a target cluster where there is a WAN between the two, then whichever 
request goes across the WAN has a higher chance of intermittent failure 
than the one over the LAN. That means that if mirrormaker is near the 
source cluster, the produce request over the WAN to the target cluster may 
fail. If the mirrormaker is near the target cluster, then the fetch 
request over the WAN to the source cluster may fail.

Failed fetch requests don't have much impact on data replication, it just 
delays it. Whereas a failure during a produce request may introduce 
duplicates.

Becket Qin from LinkedIn did a presentation on tuning producer performance 
at a meetup last year, and I remember he specifically talked about 
producing over a WAN as one of the cases where you have to tune settings. 
Maybe that presentation will give more ideas about what to look at. 
https://www.slideshare.net/mobile/JiangjieQin/producer-performance-tuning-for-apache-kafka-63147600


-James

Sent from my iPhone

> On Jul 6, 2017, at 1:00 AM, Vahid S Hashemian 
 wrote:
> 
> The literature suggests running the MM on the target cluster when 
possible 
> (with the exception of when encryption is required for transferred 
data).
> I am wondering if this is still the recommended approach when mirroring 
> from multiple clusters to a single cluster (i.e. multiple MM instances).
> Is there anything in particular (metric, specification, etc.) to 
consider 
> before making a decision?
> 
> Thanks.
> --Vahid
> 
> 






Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-07-06 Thread Jeyhun Karimov
Hi Damian,

Thanks for comments.
About overrides, what other alternatives do we have? For
backwards-compatibility we have to add extra methods to the existing ones.

About ProcessorContext vs RecordContext, you are right. I think I need to
implement a prototype to understand the full picture as some parts of the
KIP might not be as straightforward as I thought.


Cheers,
Jeyhun

On Wed, Jul 5, 2017 at 10:40 AM Damian Guy  wrote:

> Hi Jeyhun,
>
> Is the intention that these methods are new overloads on the KStream,
> KTable, etc?
>
> It is worth noting that a ProcessorContext is not a RecordContext. A
> RecordContext, as it stands, only exists during the processing of a single
> record. Whereas the ProcessorContext exists for the lifetime of the
> Processor. So it doesn't make sense to cast a ProcessorContext to a
> RecordContext.
> You mentioned above passing the InternalProcessorContext to the init()
> calls. It is internal for a reason and i think it should remain that way.
> It might be better to move the recordContext() method from
> InternalProcessorContext to ProcessorContext.
>
> In the KIP you have an example showing:
> richMapper.init((RecordContext) processorContext);
> But the interface is:
> public interface RichValueMapper<V, VR> {
> VR apply(final V value, final RecordContext recordContext);
> }
> i.e., there is no init(...), besides as above this wouldn't make sense.
>
> Thanks,
> Damian
>
> On Tue, 4 Jul 2017 at 23:30 Jeyhun Karimov  wrote:
>
> > Hi Matthias,
> >
> > Actually my intent was to provide a RichInitializer, and later on we
> > could provide the context of the record as you also mentioned.
> > I removed that so as not to confuse the users.
> > Regarding the RecordContext and ProcessorContext interfaces, I just
> > realized the InternalProcessorContext class. Can't we pass this as a
> > parameter to init() method of processors? Then we would be able to get
> > RecordContext easily with just a method call.
> >
> >
> > Cheers,
> > Jeyhun
> >
> > On Thu, Jun 29, 2017 at 10:14 PM Matthias J. Sax 
> > wrote:
> >
> > > One more thing:
> > >
> > > I don't think `RichInitializer` does make sense. As we don't have any
> > > input record, there is also no context. We could of course provide the
> > > context of the record that triggers the init call, but this seems to be
> > > semantically questionable. Also, the context for this first record will
> > > be provided by the consecutive call to aggregate anyways.
> > >
> > >
> > > -Matthias
> > >
> > > On 6/29/17 1:11 PM, Matthias J. Sax wrote:
> > > > Thanks for updating the KIP.
> > > >
> > > > I have one concern with regard to backward compatibility. You suggest
> > to
> > > > use RecordContext as base interface for ProcessorContext. This will
> > > > break compatibility.
> > > >
> > > > I think, we should just have two independent interfaces. Our own
> > > > ProcessorContextImpl class would implement both. This allows us to
> cast
> > > > it to `RecordContext` and thus limit the visible scope.
> > > >
> > > >
> > > > -Matthias
> > > >
> > > >
> > > >
> > > > On 6/27/17 1:35 PM, Jeyhun Karimov wrote:
> > > >> Hi all,
> > > >>
> > > >> I updated the KIP w.r.t. discussion and comments.
> > > >> Basically I eliminated overloads for particular method if they are
> > more
> > > >> than 3.
> > > >> As we can see there are a lot of overloads (and more will come with
> > > KIP-149
> > > >> :) )
> > > >> So, is it wise to
> > > >> wait for the result of the constructive DSL thread, or
> > > >> extend the KIP to address this issue as well, or
> > > >> continue as it is?
> > > >>
> > > >> Cheers,
> > > >> Jeyhun
> > > >>
> > > >> On Wed, Jun 14, 2017 at 11:29 PM Guozhang Wang 
> > > wrote:
> > > >>
> > > >>> LGTM. Thanks!
> > > >>>
> > > >>>
> > > >>> Guozhang
> > > >>>
> > > >>> On Tue, Jun 13, 2017 at 2:20 PM, Jeyhun Karimov <
> > je.kari...@gmail.com>
> > > >>> wrote:
> > > >>>
> > >  Thanks for the comment Matthias. After all the discussion (thanks
> to
> > > all
> > >  participants), I think this (single method that passes in a
> > > RecordContext
> > >  object) is the best alternative.
> > >  Just a side note: I think KAFKA-3907 [1] can also be integrated
> into
> > > the
> > >  KIP by adding related method inside RecordContext interface.
> > > 
> > > 
> > >  [1] https://issues.apache.org/jira/browse/KAFKA-3907
> > > 
> > > 
> > >  Cheers,
> > >  Jeyhun
> > > 
> > >  On Tue, Jun 13, 2017 at 7:50 PM Matthias J. Sax <
> > > matth...@confluent.io>
> > >  wrote:
> > > 
> > > > Hi,
> > > >
> > > > I would like to push this discussion further. It seems we got
> nice
> > > > alternatives (thanks for the summary Jeyhun!).
> > > >
> > > > With respect to RichFunctions and allowing them to be stateful, I
> > > have
> > > > my doubt as expressed already. From my understanding, the idea
> was

[jira] [Resolved] (KAFKA-4726) ValueMapper should have (read) access to key

2017-07-06 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-4726.

Resolution: Duplicate

See 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner

> ValueMapper should have (read) access to key
> 
>
> Key: KAFKA-4726
> URL: https://issues.apache.org/jira/browse/KAFKA-4726
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.1.1
>Reporter: Steven Schlansker
>Assignee: Jeyhun Karimov
>  Labels: kip
>
> {{ValueMapper}} should have read-only access to the key for the value it is 
> mapping.  Sometimes the value transformation will depend on the key.
> It is possible to do this with a full blown {{KeyValueMapper}} but that loses 
> the promise that you won't change the key -- so you might introduce a 
> re-keying phase that is totally unnecessary.  It also requires you to return 
> an identity KeyValue object which costs something to construct (unless we are 
> lucky and the optimizer elides it).
> [ If mapValues() is guaranteed to be no less efficient than map() the issue 
> may be moot, but I presume there are some optimizations that are valid with 
> the former but not latter. ]
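
For context, a minimal sketch of the workaround described above using the 0.10.x Streams DSL; the topic names and the value logic are made up:

    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, Long> input = builder.stream("input-topic");    // hypothetical topic

    // Desired: a value transformation that can read the key. Workaround today:
    // use map() and return the same key, which gives up the "key unchanged"
    // guarantee and may introduce an unnecessary repartitioning step downstream.
    KStream<String, String> output = input.map(
        (key, value) -> KeyValue.pair(key, key + ":" + value));

    output.to("output-topic");                                       // hypothetical topic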



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-3745) Consider adding join key to ValueJoiner interface

2017-07-06 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-3745.

Resolution: Duplicate

See 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner

> Consider adding join key to ValueJoiner interface
> -
>
> Key: KAFKA-3745
> URL: https://issues.apache.org/jira/browse/KAFKA-3745
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Jeyhun Karimov
>Priority: Minor
>  Labels: api, needs-kip, newbie
>
> In working with Kafka Stream joining, it's sometimes the case that a join key 
> is not actually present in the values of the joins themselves (if, for 
> example, a previous transform generated an ephemeral join key.) In such 
> cases, the actual key of the join is not available in the ValueJoiner 
> implementation to be used to construct the final joined value. This can be 
> worked around by explicitly threading the join key into the value if needed, 
> but it seems like extending the interface to pass the join key along as well 
> would be helpful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5566) Instable test QueryableStateIntegrationTest.shouldAllowToQueryAfterThreadDied

2017-07-06 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-5566:
--

 Summary: Instable test 
QueryableStateIntegrationTest.shouldAllowToQueryAfterThreadDied
 Key: KAFKA-5566
 URL: https://issues.apache.org/jira/browse/KAFKA-5566
 Project: Kafka
  Issue Type: Bug
  Components: streams, unit tests
Reporter: Matthias J. Sax
Assignee: Eno Thereska


This test failed about 4 times in the last 24h. Always the same stack trace so 
far:
{noformat}
java.lang.AssertionError: Condition not met within timeout 3. wait for agg 
to be '123'
at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.shouldAllowToQueryAfterThreadDied(QueryableStateIntegrationTest.java:793)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3498: Kafka-4763 (Used for triggering test only)

2017-07-06 Thread lindong28
Github user lindong28 closed the pull request at:

https://github.com/apache/kafka/pull/3498


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3498: Kafka-4763 (Used for triggering test only)

2017-07-06 Thread lindong28
GitHub user lindong28 reopened a pull request:

https://github.com/apache/kafka/pull/3498

Kafka-4763 (Used for triggering test only)

This patch is used only for triggering test. No need for review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4763-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3498.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3498


commit b10e55ac5c83d0a356f79b0325d0dd8cefe00a42
Author: Dong Lin 
Date:   2017-04-03T00:46:34Z

KAFKA-4763; Handle disk failure for JBOD (KIP-112)

commit 0c97e0907e93c00fc23958478ec103f6ac49a48d
Author: Dong Lin 
Date:   2017-06-12T04:05:12Z

Address comments

commit c013a815d72fa85175dfee8816366a02323526e2
Author: Dong Lin 
Date:   2017-06-12T09:13:28Z

StopReplicaResponse should specify error if replica-to-be-deleted is not 
found and there is offline directory

commit 15ab857bb60755a1e5c296dabf59d1d43f44fc0f
Author: Dong Lin 
Date:   2017-06-13T20:15:58Z

Address comments

commit 563bf001b78662834e403619657f21415095d38c
Author: Dong Lin 
Date:   2017-06-17T02:35:18Z

Address comments

commit 94efa20ac5193b459ff33c4cbbf5118a17749354
Author: Dong Lin 
Date:   2017-06-20T23:17:14Z

Address comments

commit 96af4783655466f481d69a5852d1ddb678fde0d1
Author: Dong Lin 
Date:   2017-06-21T20:27:09Z

Close file handler of all files in the offline log directory so that the 
disk can be umounted

commit 6e80507cb507b07d7e8378a4e49e3b0273026d94
Author: Dong Lin 
Date:   2017-06-23T00:41:21Z

Address comments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: RocksDB flushing issue on 0.10.2 streams

2017-07-06 Thread Greg Fodor
Also sorry, to clarify the job context:

- This is a job running across 5 nodes on AWS Linux
- It is under load with a large number of partitions: approximately 700-800
topic-partition assignments in total for the entire job. The topics involved
have a large # of partitions, 128 each.
- 32 stream threads per host.
- Peak TPS seems to be approximately 5k-10k tuples/sec per node. We're
reprocessing historical data in kafka.
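
For reference, the workaround described further down this thread (effectively disabling background state-dir cleanup) amounts to something like the following in the Streams configuration; the application id and broker list are placeholders:

    Properties config = new Properties();
    config.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-job");   // placeholder
    config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");   // placeholder
    // Push the state directory cleanup delay out far enough that the cleaner
    // can never race with a running or standby task that still flushes state.
    config.put(StreamsConfig.STATE_CLEANUP_DELAY_MS_CONFIG, Long.MAX_VALUE);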

On Thu, Jul 6, 2017 at 10:45 AM, Greg Fodor  wrote:

> That's great news, thanks!
>
> On Thu, Jul 6, 2017 at 6:18 AM, Damian Guy  wrote:
>
>> Hi Greg,
>> I've been able to reproduce it by running multiple instances with standby
>> tasks and many threads. If i force some rebalances, then i see the
>> failure.
>> Now to see if i can repro in a test.
>> I think it is probably the same issue as:
>> https://issues.apache.org/jira/browse/KAFKA-5070
>>
>> On Thu, 6 Jul 2017 at 12:43 Damian Guy  wrote:
>>
>> > Greg, what OS are you running on?
>> > Are you able to reproduce this in a test at all?
>> > For instance, based on what you described it would seem that i should be
>> > able to start a streams app, wait for it to be up and running, run the
>> > state dir cleanup, see it fail. However, i can't reproduce it.
>> >
>> > On Wed, 5 Jul 2017 at 23:23 Damian Guy  wrote:
>> >
>> >> Thanks Greg. I'll look into it more tomorrow. Just finding it difficult
>> >> to reproduce in a test.
>> >> Thanks for providing the sequence, gives me something to try and repo.
>> >> Appreciated.
>> >>
>> >> Thanks,
>> >> Damian
>> >> On Wed, 5 Jul 2017 at 19:57, Greg Fodor  wrote:
>> >>
>> >>> Also, the sequence of events is:
>> >>>
>> >>> - Job starts, rebalance happens, things run along smoothly.
>> >>> - After 10 minutes (retrospectively) the cleanup task kicks on and
>> >>> removes
>> >>> some directories
>> >>> - Tasks immediately start failing when trying to flush their state
>> stores
>> >>>
>> >>>
>> >>>
>> >>> On Wed, Jul 5, 2017 at 11:55 AM, Greg Fodor  wrote:
>> >>>
>> >>> > The issue I am hitting is not the directory locking issues we've
>> seen
>> >>> in
>> >>> > the past. The issue seems to be, as you mentioned, that the state
>> dir
>> >>> is
>> >>> > getting deleted by the store cleanup process, but there are still
>> tasks
>> >>> > running that are trying to flush the state store. It seems more
>> than a
>> >>> > little scary given that right now it seems either a) there are tasks
>> >>> > running that should have been re-assigned or b) the cleanup job is
>> >>> removing
>> >>> > state directories for currently running + assigned tasks (perhaps
>> >>> during a
>> >>> > rebalance there is a race condition?) I'm guessing there's probably
>> a
>> >>> more
>> >>> > benign explanation, but that is what it looks like right now.
>> >>> >
>> >>> > On Wed, Jul 5, 2017 at 7:00 AM, Damian Guy 
>> >>> wrote:
>> >>> >
>> >>> >> BTW - i'm trying to reproduce it, but not having much luck so
>> far...
>> >>> >>
>> >>> >> On Wed, 5 Jul 2017 at 09:27 Damian Guy 
>> wrote:
>> >>> >>
>> >>> >> > Thans for the updates Greg. There were some minor changes around
>> >>> this in
>> >>> >> > 0.11.0 to make it less likely to happen, but we've only ever seen
>> >>> the
>> >>> >> > locking fail in the event of a rebalance. When everything is
>> running
>> >>> >> state
>> >>> >> > dirs shouldn't be deleted if they are being used as the lock will
>> >>> fail.
>> >>> >> >
>> >>> >> >
>> >>> >> > On Wed, 5 Jul 2017 at 08:15 Greg Fodor  wrote:
>> >>> >> >
>> >>> >> >> I can report that setting state.cleanup.delay.ms to a very
>> large
>> >>> value
>> >>> >> >> (effectively disabling it) works around the issue. It seems that
>> >>> the
>> >>> >> state
>> >>> >> >> store cleanup process can somehow get out ahead of another task
>> >>> that
>> >>> >> still
>> >>> >> >> thinks it should be writing to the state store/flushing it. In
>> my
>> >>> test
>> >>> >> >> runs, this does not seem to be happening during a rebalancing
>> >>> event,
>> >>> >> but
>> >>> >> >> after the cluster is stable.
>> >>> >> >>
>> >>> >> >> On Tue, Jul 4, 2017 at 12:29 PM, Greg Fodor 
>> >>> wrote:
>> >>> >> >>
>> >>> >> >> > Upon another run, I see the same error occur during a
>> rebalance,
>> >>> so
>> >>> >> >> either
>> >>> >> >> > my log was showing a rebalance or there is a shared underlying
>> >>> issue
>> >>> >> >> with
>> >>> >> >> > state stores.
>> >>> >> >> >
>> >>> >> >> > On Tue, Jul 4, 2017 at 11:35 AM, Greg Fodor > >
>> >>> >> wrote:
>> >>> >> >> >
>> >>> >> >> >> Also, I am on 0.10.2.1, so poll interval was already set to
>> >>> >> MAX_VALUE.
>> >>> >> >> >>
>> >>> >> >> >> On Tue, Jul 4, 2017 at 11:28 AM, Greg Fodor <
>> gfo...@gmail.com>
>> >>> >> wrote:
>> >>> >> >> >>
>> >>> >> >> >>> I've nuked the nodes this happened on, but 

Re: Wiki access

2017-07-06 Thread Jun Rao
Hi, Mitch,

Thanks for your interest. Just gave you the wiki permission.

Jun

On Thu, Jul 6, 2017 at 10:43 AM, Mitch Seymour 
wrote:

> Hello,
>
> I'd like to make some contributions to the wiki. Could someone please grant
> write permissions to the following user?
>
> Username: mitch-seymour
>
> Thanks,
>
> Mitch
>


Re: [DISCUSS] KIP-175: Additional '--describe' views for ConsumerGroupCommand

2017-07-06 Thread Jeff Widman
Thanks for the KIP Vahid. I think it'd be useful to have these filters.

That said, I also agree with Edo.

We don't currently rely on the output, but more than once while debugging an
issue I've noticed something amiss only because I saw all the data at once;
if it hadn't been present in the default view I probably would have missed it,
as I wouldn't have thought to look at that particular filter.

This would also be more consistent with the API of the kafka-topics.sh
where "--describe" gives everything and then can be filtered down.



On Tue, Jul 4, 2017 at 10:42 AM, Edoardo Comar  wrote:

> Hi Vahid,
> no we are not relying on parsing the current output.
>
> I just thought that keeping the full output isn't necessarily that bad as
> it shows some sort of history of how a group was used.
>
> ciao
> Edo
> --
>
> Edoardo Comar
>
> IBM Message Hub
>
> IBM UK Ltd, Hursley Park, SO21 2JN
>
> "Vahid S Hashemian"  wrote on 04/07/2017
> 17:11:43:
>
> > From: "Vahid S Hashemian" 
> > To: dev@kafka.apache.org
> > Cc: "Kafka User" 
> > Date: 04/07/2017 17:12
> > Subject: Re: [DISCUSS] KIP-175: Additional '--describe' views for
> > ConsumerGroupCommand
> >
> > Hi Edo,
> >
> > Thanks for reviewing the KIP.
> >
> > Modifying the default behavior of `--describe` was suggested in the
> > related JIRA.
> > We could poll the community to see whether they go for that option, or,
> > as you suggested, introduce a new `--only-xxx` (I can't think of a
> > proper name right now either :)) option instead.
> >
> > Are you making use of the current `--describe` output and relying on the
>
> > full data set?
> >
> > Thanks.
> > --Vahid
> >
> >
> >
> >
> > From:   Edoardo Comar 
> > To: dev@kafka.apache.org
> > Cc: "Kafka User" 
> > Date:   07/04/2017 03:17 AM
> > Subject:Re: [DISCUSS] KIP-175: Additional '--describe' views for
>
> > ConsumerGroupCommand
> >
> >
> >
> > Thanks Vahid, I like the KIP.
> >
> > One question - could we keep the current "--describe" behavior unchanged
>
> > and introduce "--only-xxx" options to filter down the full output as you
>
> > proposed ?
> >
> > ciao,
> > Edo
> > --
> >
> > Edoardo Comar
> >
> > IBM Message Hub
> >
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> >
> > From:   "Vahid S Hashemian" 
> > To: dev , "Kafka User"
> 
> > Date:   04/07/2017 00:06
> > Subject:[DISCUSS] KIP-175: Additional '--describe' views for
> > ConsumerGroupCommand
> >
> >
> >
> > Hi,
> >
> > I created KIP-175 to make some improvements to the ConsumerGroupCommand
> > tool.
> > The KIP can be found here:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-175%3A
> > +Additional+%27--describe%27+views+for+ConsumerGroupCommand
> >
> >
> >
> > Your review and feedback is welcome!
> >
> > Thanks.
> > --Vahid
> >
> >
> >
> >
> >
> > Unless stated otherwise above:
> > IBM United Kingdom Limited - Registered in England and Wales with number
>
> > 741598.
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
> 3AU
> >
> >
> >
> >
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>


[GitHub] kafka pull request #3498: Kafka-4763 (Used for triggering test only)

2017-07-06 Thread lindong28
Github user lindong28 closed the pull request at:

https://github.com/apache/kafka/pull/3498


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3498: Kafka-4763 (Used for triggering test only)

2017-07-06 Thread lindong28
GitHub user lindong28 reopened a pull request:

https://github.com/apache/kafka/pull/3498

Kafka-4763 (Used for triggering test only)

This patch is used only for triggering test. No need for review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4763-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3498.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3498


commit b10e55ac5c83d0a356f79b0325d0dd8cefe00a42
Author: Dong Lin 
Date:   2017-04-03T00:46:34Z

KAFKA-4763; Handle disk failure for JBOD (KIP-112)

commit 0c97e0907e93c00fc23958478ec103f6ac49a48d
Author: Dong Lin 
Date:   2017-06-12T04:05:12Z

Address comments

commit c013a815d72fa85175dfee8816366a02323526e2
Author: Dong Lin 
Date:   2017-06-12T09:13:28Z

StopReplicaResponse should specify error if replica-to-be-deleted is not 
found and there is offline directory

commit 15ab857bb60755a1e5c296dabf59d1d43f44fc0f
Author: Dong Lin 
Date:   2017-06-13T20:15:58Z

Address comments

commit 563bf001b78662834e403619657f21415095d38c
Author: Dong Lin 
Date:   2017-06-17T02:35:18Z

Address comments

commit 94efa20ac5193b459ff33c4cbbf5118a17749354
Author: Dong Lin 
Date:   2017-06-20T23:17:14Z

Address comments

commit 96af4783655466f481d69a5852d1ddb678fde0d1
Author: Dong Lin 
Date:   2017-06-21T20:27:09Z

Close file handler of all files in the offline log directory so that the 
disk can be umounted

commit 6e80507cb507b07d7e8378a4e49e3b0273026d94
Author: Dong Lin 
Date:   2017-06-23T00:41:21Z

Address comments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3498: Kafka-4763 (Used for triggering test only)

2017-07-06 Thread lindong28
GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/3498

Kafka-4763 (Used for triggering test only)

This patch is used only for triggering test. No need for review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4763-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3498.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3498


commit b10e55ac5c83d0a356f79b0325d0dd8cefe00a42
Author: Dong Lin 
Date:   2017-04-03T00:46:34Z

KAFKA-4763; Handle disk failure for JBOD (KIP-112)

commit 0c97e0907e93c00fc23958478ec103f6ac49a48d
Author: Dong Lin 
Date:   2017-06-12T04:05:12Z

Address comments

commit c013a815d72fa85175dfee8816366a02323526e2
Author: Dong Lin 
Date:   2017-06-12T09:13:28Z

StopReplicaResponse should specify error if replica-to-be-deleted is not 
found and there is offline directory

commit 15ab857bb60755a1e5c296dabf59d1d43f44fc0f
Author: Dong Lin 
Date:   2017-06-13T20:15:58Z

Address comments

commit 563bf001b78662834e403619657f21415095d38c
Author: Dong Lin 
Date:   2017-06-17T02:35:18Z

Address comments

commit 94efa20ac5193b459ff33c4cbbf5118a17749354
Author: Dong Lin 
Date:   2017-06-20T23:17:14Z

Address comments

commit 96af4783655466f481d69a5852d1ddb678fde0d1
Author: Dong Lin 
Date:   2017-06-21T20:27:09Z

Close file handler of all files in the offline log directory so that the 
disk can be umounted

commit 6e80507cb507b07d7e8378a4e49e3b0273026d94
Author: Dong Lin 
Date:   2017-06-23T00:41:21Z

Address comments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3497: HOTFIX: disable flaky system tests

2017-07-06 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3497

HOTFIX: disable flaky system tests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka disable-flaky-system-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3497.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3497


commit e3f9e81c398655146eb1aa2398894fce893d4760
Author: Matthias J. Sax 
Date:   2017-07-06T18:37:32Z

HOTFIX: disable flaky system tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-07-06 Thread Gwen Shapira
+1

On Wed, Jul 5, 2017 at 9:25 AM Matthias J. Sax 
wrote:

> +1
>
> On 6/27/17 1:41 PM, Jeyhun Karimov wrote:
> > Dear all,
> >
> > I would like to start the vote on KIP-149 [1].
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
> >
>
>


Re: Mirroring multiple clusters into one

2017-07-06 Thread James Cheng
I'm not sure what the "official" recommendation is. At TiVo, we *do* run all 
our mirrormakers near the target cluster. It works fine for us, but we're still 
fairly inexperienced, so I'm not sure how strong of a data point we should be.

I think the thought process is, if you are mirroring from a source cluster to a 
target cluster where there is a WAN between the two, then whichever request 
goes across the WAN has a higher chance of intermittent failure than the one 
over the LAN. That means that if mirrormaker is near the source cluster, the 
produce request over the WAN to the target cluster may fail. If the mirrormaker 
is near the target cluster, then the fetch request over the WAN to the source 
cluster may fail.

Failed fetch requests don't have much impact on data replication, it just 
delays it. Whereas a failure during a produce request may introduce duplicates.

Becket Qin from LinkedIn did a presentation on tuning producer performance at a 
meetup last year, and I remember he specifically talked about producing over a 
WAN as one of the cases where you have to tune settings. Maybe that 
presentation will give more ideas about what to look at. 
https://www.slideshare.net/mobile/JiangjieQin/producer-performance-tuning-for-apache-kafka-63147600
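
Not from that talk, but as a rough illustration, these are the kinds of producer settings that usually come into play when producing across a WAN; the values are placeholders to tune, not recommendations:

    Properties props = new Properties();
    props.put("bootstrap.servers", "remote-cluster:9092");      // placeholder
    props.put("acks", "all");                                   // durability vs. latency trade-off
    props.put("retries", 10);                                   // ride out intermittent WAN failures
    props.put("max.in.flight.requests.per.connection", 1);      // keep ordering when retries happen
    props.put("request.timeout.ms", 60000);                     // WAN round trips are slower
    props.put("linger.ms", 100);                                // batch more per request
    props.put("batch.size", 262144);
    props.put("compression.type", "lz4");                       // fewer bytes across the WAN
    KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(
        props, new ByteArraySerializer(), new ByteArraySerializer());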

-James

Sent from my iPhone

> On Jul 6, 2017, at 1:00 AM, Vahid S Hashemian  
> wrote:
> 
> The literature suggests running the MM on the target cluster when possible 
> (with the exception of when encryption is required for transferred data).
> I am wondering if this is still the recommended approach when mirroring 
> from multiple clusters to a single cluster (i.e. multiple MM instances).
> Is there anything in particular (metric, specification, etc.) to consider 
> before making a decision?
> 
> Thanks.
> --Vahid
> 
> 


Dealing with noisy timestamps in kafka streams + 10.1

2017-07-06 Thread Greg Fodor
Hey all, we are currently working on migrating our system to kafka 10.2
from 10.0 and one thing that we have hit that I wanted some advice on is
dealing with the new log retention/rolling semantics that are based on
timestamps.

We send telemetry data from installed clients into kafka via kafka REST
proxy and the timestamps we land the messages with are "create time" based
that are timestamped on the sender side. We try to adjust for clock skew
but this is not perfect and in practice we end up having some small subset
of data landing into this topic with very erroneous timestamps (for
example, some arrive with timestamps many years in the future.)

The first problem we are hitting is that these corrupt timestamps now
influence log segment rolling. For example, when reprocessing the entire
log, we end up seeing a bunch of segment files generated for the kafka
streams state store changelogs that store these events: as corrupted
timestamps come in, a single record stamped far in the future can trigger a
segment roll under the new heuristics. The result is we end up with hundreds
of small segment files (which, in our current configuration, actually ends
up causing kafka to run out of memory, but that's another story :))

The second problem we are hitting is when reprocessing the full log, since
these timestamps are in the past as we run from the beginning, if we have a
time based retention policy set on the state store changelog topic (say, a
week) kafka ends up just deleting segments immediately since the timestamps
are far in the past and the segments are considered expired. Previously
this worked fine during reprocessing since the state store changelogs were
just going to get deleted a week after the reprocess job ran since the
retention policy was based upon segment file timestamp.

Both of these problems could potentially be compensated for by writing a
clever timestamp extractor that tried to a) normalize timestamps that
appear very skewed and b) for changelog entries, extract a "logged at"
instead of "created at" timestamp when landing into the state store
changelog. The second problem could also be addressed by temporarily
changing the topic configuration during a reprocess to prevent "old" log
segments from being deleted. Neither of these seem ideal.
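
For what it's worth, a rough sketch of the "clever extractor" idea, assuming the single-argument TimestampExtractor interface in 0.10.2; the clamping rule and the fallback to wall-clock time are assumptions, not recommendations:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.streams.processor.TimestampExtractor;

    // Clamps obviously skewed client-side "create time" stamps so they don't
    // drive segment rolling/retention on the changelog topics.
    public class ClampingTimestampExtractor implements TimestampExtractor {
        private static final long MAX_FUTURE_SKEW_MS = 60 * 60 * 1000L;  // arbitrary threshold

        @Override
        public long extract(ConsumerRecord<Object, Object> record) {
            long now = System.currentTimeMillis();
            long ts = record.timestamp();
            if (ts < 0 || ts > now + MAX_FUTURE_SKEW_MS) {
                return now;  // treat wildly skewed stamps as "logged now"
            }
            return ts;
        }
    }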

I wanted to know if there are any recommendations on how to deal with this
-- it seems like there is a conflict between having segment file policies
be based on message timestamps and also having message timestamps be based
on application creation time, since origin create time can often be subject
to noise/skew/errors. One potential path forward would be to be able to
have topic-specific settings for log rolling (including the ability to use
the legacy behavior for retention that relies upon filesystem timestamps)
but I am sure there are problems with this proposal.

In general, I don't really feel like I have a good sense of what a correct
solution would be, other than messages always having two timestamps and
being able to have control over which timestamp is authoritative for log
segment management policies, but that obviously seems like something that
was considered and rejected for KIP-32 already.


[jira] [Created] (KAFKA-5565) Add a broker metric specifying the number of consumer group rebalances in progress

2017-07-06 Thread Colin P. McCabe (JIRA)
Colin P. McCabe created KAFKA-5565:
--

 Summary: Add a broker metric specifying the number of consumer 
group rebalances in progress
 Key: KAFKA-5565
 URL: https://issues.apache.org/jira/browse/KAFKA-5565
 Project: Kafka
  Issue Type: Improvement
Reporter: Colin P. McCabe


We should add a broker metric specifying the number of consumer group 
rebalances in progress.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] KAFKA-4930 & KAFKA 4938 - Treatment of name parameter in create connector requests

2017-07-06 Thread Gwen Shapira
This sounds great. I'll try to review later today :)

On Thu, Jul 6, 2017 at 12:35 AM Sönke Liebau
 wrote:

> I've updated the pull request to behave as follows:
>  - reject create requests that contain no "name" element with a
> BadRequestException
>  - reject names that are empty or contain illegal characters with a
> ConfigException
>  - leave current logic around when to copy the name from the create request
> to the config element intact
>  - added unit tests for the validator to check that illegal characters are
> correctly identified
>
> The list of illegal characters is the result of some quick testing I did;
> all of the characters in the list currently cause issues when used in a
> connector name (similar to KAFKA-4827), so this should not break anything
> that anybody relies on.
> I think we may want to start a larger discussion around connector names,
> allowed characters, max length, etc. to come up with an airtight set of
> rules that we can then enforce; I am sure the current rules are not perfect
> as is.
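
For illustration only (not the actual patch), a validator of roughly this shape is what the above describes; the character set here is a placeholder:

    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.common.config.ConfigException;

    // Illustrative sketch, not the code in the PR.
    public class ConnectorNameValidator implements ConfigDef.Validator {
        private static final String ILLEGAL_CHARS = "\\/:*?\"<>|";  // placeholder set

        @Override
        public void ensureValid(String name, Object value) {
            String connectorName = (String) value;
            if (connectorName == null || connectorName.trim().isEmpty()) {
                throw new ConfigException(name, value, "Connector name must not be empty");
            }
            for (char c : connectorName.toCharArray()) {
                if (ILLEGAL_CHARS.indexOf(c) >= 0) {
                    throw new ConfigException(name, value,
                        "Connector name contains illegal character '" + c + "'");
                }
            }
        }
    }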
>
> Best regards,
> Sönke
>
> On Wed, Jul 5, 2017 at 9:31 AM, Sönke Liebau 
> wrote:
>
> > Hi,
> >
> > regarding "breaking existing functionality" .. yes...that was me getting
> > confused about intended and existing functionality :)
> > You are right, this won't break anything that is currently working.
> >
> > I'll leave placement of "name" parameter as is and open a new issue to
> > clarify this later on.
> >
> > Kind regards,
> > Sönke
> >
> > On Wed, Jul 5, 2017 at 5:41 AM, Gwen Shapira  wrote:
> >
> >> Hey,
> >>
> >> Nice research and summary.
> >>
> >> Regarding the ability to have a "nameless" connector - I'm pretty sure
> we
> >> never intended to allow that.
> >> I'm confused about breaking something that currently works though -
> since
> >> we get NPE, how will giving more intentional exceptions break anything?
> >>
> >> Regarding the placing of the name - inside or outside the config. It
> looks
> >> messy and I'm as confused as you are. I think Konstantine had some ideas
> >> how this should be resolved. I hope he responds, but I think that for
> your
> >> PR, just accept current mess as given...
> >>
> >> Gwen
> >>
> >> On Tue, Jul 4, 2017 at 3:28 AM Sönke Liebau
> >>  wrote:
> >>
> >> > While working on KAFKA-4930 and KAFKA-4938 I came across some sort of
> >> > fundamental questions about the rest api for creating connectors in
> >> Kafka
> >> > Connect that I'd like to put up for discussion.
> >> >
> >> > Currently requests that do not contain a "name" element on the top
> level
> >> > are not accepted by the API, but that is due to a NullPointerException
> >> [1]
> >> > so not entirely intentional. Previous (and current if the lines
> causing
> >> the
> >> > Exception are removed) functionality was to create a connector named
> >> "null"
> >> > if that parameter was missing. I am not sure if this is a good thing,
> as
> >> > for example that connector will be overwritten every time a new
> request
> >> > without a name is sent, as opposed to the expected warning that a
> >> connector
> >> > of that name already exists.
> >> >
> >> > I would propose to reject api calls without a name provided on the top
> >> > level, but this might break requests that currently work, so should
> >> > probably be mentioned in the release notes.
> >> >
> >> > 
> >> >
> >> > Additionally, the "name" parameter is also copied into the "config"
> >> > sub-element of the connector request - unless a name parameter was
> >> provided
> >> > there in the original request[2].
> >> >
> >> > So this:
> >> >
> >> > {
> >> >   "name": "connectorname",
> >> >   "config": {
> >> >     "connector.class": "org.apache.kafka.connect.tools.MockSourceConnector",
> >> >     "tasks.max": "1",
> >> >     "topics": "test-topic"
> >> >   }
> >> > }
> >> >
> >> > would become this:
> >> > {
> >> >   "name": "connectorname",
> >> >   "config": {
> >> >     "name": "connectorname",
> >> >     "connector.class": "org.apache.kafka.connect.tools.MockSourceConnector",
> >> >     "tasks.max": "1",
> >> >     "topics": "test-topic"
> >> >   }
> >> > }
> >> >
> >> > But a request that contains two different names like this:
> >> >
> >> > {
> >> >   "name": "connectorname",
> >> >   "config": {
> >> >     "name": "differentconnectorname",
> >> >     "connector.class": "org.apache.kafka.connect.tools.MockSourceConnector",
> >> >     "tasks.max": "1",
> >> >     "topics": "test-topic"
> >> >   }
> >> > }
> >> >
> >> > would be allowed as is.
> >> >
> >> > This might be intentional behavior in order to enable Connectors to
> >> have a
> >> > "name" parameter of their own - though I couldn't find any that do,
> but
> >> I
> >> > think this has the potential for misunderstandings, especially as
> there
> >> may
> >> > be code out there that references the connector name from the config
> >> 

Re: RocksDB flushing issue on 0.10.2 streams

2017-07-06 Thread Greg Fodor
That's great news, thanks!

On Thu, Jul 6, 2017 at 6:18 AM, Damian Guy  wrote:

> Hi Greg,
> I've been able to reproduce it by running multiple instances with standby
> tasks and many threads. If i force some rebalances, then i see the failure.
> Now to see if i can repro in a test.
> I think it is probably the same issue as:
> https://issues.apache.org/jira/browse/KAFKA-5070
>
> On Thu, 6 Jul 2017 at 12:43 Damian Guy  wrote:
>
> > Greg, what OS are you running on?
> > Are you able to reproduce this in a test at all?
> > For instance, based on what you described it would seem that i should be
> > able to start a streams app, wait for it to be up and running, run the
> > state dir cleanup, see it fail. However, i can't reproduce it.
> >
> > On Wed, 5 Jul 2017 at 23:23 Damian Guy  wrote:
> >
> >> Thanks Greg. I'll look into it more tomorrow. Just finding it difficult
> >> to reproduce in a test.
> >> Thanks for providing the sequence, gives me something to try and repro.
> >> Appreciated.
> >>
> >> Thanks,
> >> Damian
> >> On Wed, 5 Jul 2017 at 19:57, Greg Fodor  wrote:
> >>
> >>> Also, the sequence of events is:
> >>>
> >>> - Job starts, rebalance happens, things run along smoothly.
> >>> - After 10 minutes (retrospectively) the cleanup task kicks on and
> >>> removes
> >>> some directories
> >>> - Tasks immediately start failing when trying to flush their state
> stores
> >>>
> >>>
> >>>
> >>> On Wed, Jul 5, 2017 at 11:55 AM, Greg Fodor  wrote:
> >>>
> >>> > The issue I am hitting is not the directory locking issues we've seen
> >>> in
> >>> > the past. The issue seems to be, as you mentioned, that the state dir
> >>> is
> >>> > getting deleted by the store cleanup process, but there are still
> tasks
> >>> > running that are trying to flush the state store. It seems more than
> a
> >>> > little scary given that right now it seems either a) there are tasks
> >>> > running that should have been re-assigned or b) the cleanup job is
> >>> removing
> >>> > state directories for currently running + assigned tasks (perhaps
> >>> during a
> >>> > rebalance there is a race condition?) I'm guessing there's probably a
> >>> more
> >>> > benign explanation, but that is what it looks like right now.
> >>> >
> >>> > On Wed, Jul 5, 2017 at 7:00 AM, Damian Guy 
> >>> wrote:
> >>> >
> >>> >> BTW - i'm trying to reproduce it, but not having much luck so far...
> >>> >>
> >>> >> On Wed, 5 Jul 2017 at 09:27 Damian Guy 
> wrote:
> >>> >>
> >>> >> > Thanks for the updates Greg. There were some minor changes around
> >>> this in
> >>> >> > 0.11.0 to make it less likely to happen, but we've only ever seen
> >>> the
> >>> >> > locking fail in the event of a rebalance. When everything is
> running
> >>> >> state
> >>> >> > dirs shouldn't be deleted if they are being used as the lock will
> >>> fail.
> >>> >> >
> >>> >> >
> >>> >> > On Wed, 5 Jul 2017 at 08:15 Greg Fodor  wrote:
> >>> >> >
> >>> >> >> I can report that setting state.cleanup.delay.ms to a very large
> >>> value
> >>> >> >> (effectively disabling it) works around the issue. It seems that
> >>> the
> >>> >> state
> >>> >> >> store cleanup process can somehow get out ahead of another task
> >>> that
> >>> >> still
> >>> >> >> thinks it should be writing to the state store/flushing it. In my
> >>> test
> >>> >> >> runs, this does not seem to be happening during a rebalancing
> >>> event,
> >>> >> but
> >>> >> >> after the cluster is stable.
> >>> >> >>
> >>> >> >> On Tue, Jul 4, 2017 at 12:29 PM, Greg Fodor 
> >>> wrote:
> >>> >> >>
> >>> >> >> > Upon another run, I see the same error occur during a
> rebalance,
> >>> so
> >>> >> >> either
> >>> >> >> > my log was showing a rebalance or there is a shared underlying
> >>> issue
> >>> >> >> with
> >>> >> >> > state stores.
> >>> >> >> >
> >>> >> >> > On Tue, Jul 4, 2017 at 11:35 AM, Greg Fodor 
> >>> >> wrote:
> >>> >> >> >
> >>> >> >> >> Also, I am on 0.10.2.1, so poll interval was already set to
> >>> >> MAX_VALUE.
> >>> >> >> >>
> >>> >> >> >> On Tue, Jul 4, 2017 at 11:28 AM, Greg Fodor  >
> >>> >> wrote:
> >>> >> >> >>
> >>> >> >> >>> I've nuked the nodes this happened on, but the job had been
> >>> running
> >>> >> >> for
> >>> >> >> >>> about 5-10 minutes across 5 nodes before this happened. Does
> >>> the
> >>> >> log
> >>> >> >> show a
> >>> >> >> >>> rebalance was happening? It looks to me like the standby task
> >>> was
> >>> >> just
> >>> >> >> >>> committing as part of normal operations.
> >>> >> >> >>>
> >>> >> >> >>> On Tue, Jul 4, 2017 at 7:40 AM, Damian Guy <
> >>> damian@gmail.com>
> >>> >> >> wrote:
> >>> >> >> >>>
> >>> >> >>  Hi Greg,
> >>> >> >> 
> >>> >> >>  Obviously a bit difficult to read the RocksDBException, but
> my
> >>> >> guess
> >>> >> >> is
> >>> >> >>  it
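
For reference, a minimal sketch of the workaround described above, i.e. pushing
state.cleanup.delay.ms so high that the periodic cleanup effectively never runs
(the application id and bootstrap servers are placeholders):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class CleanupDelayWorkaroundSketch {
        public static Properties workaroundProps() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");      // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder
            // Effectively disable periodic state directory cleanup
            props.put(StreamsConfig.STATE_CLEANUP_DELAY_MS_CONFIG, Long.MAX_VALUE);
            return props;
        }
    }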

Wiki access

2017-07-06 Thread Mitch Seymour
Hello,

I'd like to make some contributions to the wiki. Could someone please grant
write permissions to the following user?

Username: mitch-seymour

Thanks,

Mitch


[GitHub] kafka-site issue #65: MINOR: add Interactive Queries docs

2017-07-06 Thread guozhangwang
Github user guozhangwang commented on the issue:

https://github.com/apache/kafka-site/pull/65
  
Actually I think we do not need a PR for this: any committer can package the 
docs jar, paste it into `kafka-site`, and push to the `asf-site` branch. You 
can follow this step of the release process here:


https://cwiki.apache.org/confluence/display/KAFKA/Release+Process#ReleaseProcess-Websiteupdateprocess


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka-site issue #64: MINOR: Add release dates to downloads page

2017-07-06 Thread ewencp
Github user ewencp commented on the issue:

https://github.com/apache/kafka-site/pull/64
  
I grabbed the dates from that page and filled in the beta ones based on 
mailing list posts. The actual dates seem to be inconsistent and the JIRA 
release dates aren't always maintained (I know because I dug up dates and 
closed a few other releases when I last ran one), but tbh the point here is 
just the rough timing anyway.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3496: KAFKA-5464: Follow up. Increase poll timeout

2017-07-06 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3496

KAFKA-5464: Follow up. Increase poll timeout



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka KAFKA-5464-follow-up

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3496.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3496


commit 4fdca2b613d0934ded999ca8a1d445954fdcb6c4
Author: Matthias J. Sax 
Date:   2017-07-06T17:04:37Z

KAFKA-5464: Follow up. Increase poll timeout




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : kafka-trunk-jdk8 #1789

2017-07-06 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk7 #2498

2017-07-06 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] HOTFIX: fix broken streams test

--
[...truncated 2.45 MB...]

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingForwardToSourceTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingStatefulTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingStatefulTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingInternalRepartitioningTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingInternalRepartitioningTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingInternalRepartitioningForwardingTimestampTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingInternalRepartitioningForwardingTimestampTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingSimpleTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingSimpleTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingSimpleMultiSourceTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingSimpleMultiSourceTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
shouldDriveGlobalStore STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
shouldDriveGlobalStore PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
shouldCreateStringWithMultipleSourcesAndTopics STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
shouldCreateStringWithMultipleSourcesAndTopics PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testTopologyMetadata STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testTopologyMetadata PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingMultiplexByNameTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingMultiplexByNameTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
shouldCreateStringWithSourceAndTopics STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
shouldCreateStringWithSourceAndTopics PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
shouldRecursivelyPrintChildren STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
shouldRecursivelyPrintChildren PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldCloseStateManagerWithOffsets STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldCloseStateManagerWithOffsets PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldProcessRecordsForTopic STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldProcessRecordsForTopic PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeProcessorTopology STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeProcessorTopology PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeContext STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeContext PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldCheckpointOffsetsWhenStateIsFlushed STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldCheckpointOffsetsWhenStateIsFlushed PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeStateManager STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeStateManager PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldProcessRecordsForOtherTopic STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldProcessRecordsForOtherTopic PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
STARTED


[GitHub] kafka pull request #3493: HOTFIX: fix broken streams test

2017-07-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3493


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: RocksDB flushing issue on 0.10.2 streams

2017-07-06 Thread Damian Guy
Hi Greg,
I've been able to reproduce it by running multiple instances with standby
tasks and many threads. If i force some rebalances, then i see the failure.
Now to see if i can repro in a test.
I think it is probably the same issue as:
https://issues.apache.org/jira/browse/KAFKA-5070

On Thu, 6 Jul 2017 at 12:43 Damian Guy  wrote:

> Greg, what OS are you running on?
> Are you able to reproduce this in a test at all?
> For instance, based on what you described it would seem that i should be
> able to start a streams app, wait for it to be up and running, run the
> state dir cleanup, see it fail. However, i can't reproduce it.
>
> On Wed, 5 Jul 2017 at 23:23 Damian Guy  wrote:
>
>> Thanks Greg. I'll look into it more tomorrow. Just finding it difficult
>> to reproduce in a test.
> >> Thanks for providing the sequence, gives me something to try and repro.
>> Appreciated.
>>
>> Thanks,
>> Damian
>> On Wed, 5 Jul 2017 at 19:57, Greg Fodor  wrote:
>>
>>> Also, the sequence of events is:
>>>
>>> - Job starts, rebalance happens, things run along smoothly.
>>> - After 10 minutes (retrospectively) the cleanup task kicks on and
>>> removes
>>> some directories
>>> - Tasks immediately start failing when trying to flush their state stores
>>>
>>>
>>>
>>> On Wed, Jul 5, 2017 at 11:55 AM, Greg Fodor  wrote:
>>>
>>> > The issue I am hitting is not the directory locking issues we've seen
>>> in
>>> > the past. The issue seems to be, as you mentioned, that the state dir
>>> is
>>> > getting deleted by the store cleanup process, but there are still tasks
>>> > running that are trying to flush the state store. It seems more than a
>>> > little scary given that right now it seems either a) there are tasks
>>> > running that should have been re-assigned or b) the cleanup job is
>>> removing
>>> > state directories for currently running + assigned tasks (perhaps
>>> during a
>>> > rebalance there is a race condition?) I'm guessing there's probably a
>>> more
>>> > benign explanation, but that is what it looks like right now.
>>> >
>>> > On Wed, Jul 5, 2017 at 7:00 AM, Damian Guy 
>>> wrote:
>>> >
>>> >> BTW - i'm trying to reproduce it, but not having much luck so far...
>>> >>
>>> >> On Wed, 5 Jul 2017 at 09:27 Damian Guy  wrote:
>>> >>
> >>> >> > Thanks for the updates Greg. There were some minor changes around
>>> this in
>>> >> > 0.11.0 to make it less likely to happen, but we've only ever seen
>>> the
>>> >> > locking fail in the event of a rebalance. When everything is running
>>> >> state
>>> >> > dirs shouldn't be deleted if they are being used as the lock will
>>> fail.
>>> >> >
>>> >> >
>>> >> > On Wed, 5 Jul 2017 at 08:15 Greg Fodor  wrote:
>>> >> >
>>> >> >> I can report that setting state.cleanup.delay.ms to a very large
>>> value
>>> >> >> (effectively disabling it) works around the issue. It seems that
>>> the
>>> >> state
>>> >> >> store cleanup process can somehow get out ahead of another task
>>> that
>>> >> still
>>> >> >> thinks it should be writing to the state store/flushing it. In my
>>> test
>>> >> >> runs, this does not seem to be happening during a rebalancing
>>> event,
>>> >> but
>>> >> >> after the cluster is stable.
>>> >> >>
>>> >> >> On Tue, Jul 4, 2017 at 12:29 PM, Greg Fodor 
>>> wrote:
>>> >> >>
>>> >> >> > Upon another run, I see the same error occur during a rebalance,
>>> so
>>> >> >> either
>>> >> >> > my log was showing a rebalance or there is a shared underlying
>>> issue
>>> >> >> with
>>> >> >> > state stores.
>>> >> >> >
>>> >> >> > On Tue, Jul 4, 2017 at 11:35 AM, Greg Fodor 
>>> >> wrote:
>>> >> >> >
>>> >> >> >> Also, I am on 0.10.2.1, so poll interval was already set to
>>> >> MAX_VALUE.
>>> >> >> >>
>>> >> >> >> On Tue, Jul 4, 2017 at 11:28 AM, Greg Fodor 
>>> >> wrote:
>>> >> >> >>
>>> >> >> >>> I've nuked the nodes this happened on, but the job had been
>>> running
>>> >> >> for
>>> >> >> >>> about 5-10 minutes across 5 nodes before this happened. Does
>>> the
>>> >> log
>>> >> >> show a
>>> >> >> >>> rebalance was happening? It looks to me like the standby task
>>> was
>>> >> just
>>> >> >> >>> committing as part of normal operations.
>>> >> >> >>>
>>> >> >> >>> On Tue, Jul 4, 2017 at 7:40 AM, Damian Guy <
>>> damian@gmail.com>
>>> >> >> wrote:
>>> >> >> >>>
>>> >> >>  Hi Greg,
>>> >> >> 
>>> >> >>  Obviously a bit difficult to read the RocksDBException, but my
>>> >> guess
>>> >> >> is
>>> >> >>  it
>>> >> >>  is because the state directory gets deleted right before the
>>> flush
>>> >> >>  happens:
>>> >> >>  2017-07-04 10:54:46,829 [myid:] - INFO
>>> >> >> [StreamThread-21:StateDirector
>>> >> >>  y@213]
>>> >> >>  - Deleting obsolete state directory 0_10 for task 0_10
>>> >> >> 
>>> >> >>  Yes it looks like it is possibly the same bug 

[jira] [Created] (KAFKA-5564) Fail to create topics with error 'While recording the replica LEO, the partition [topic2,0] hasn't been created'

2017-07-06 Thread Klearchos Chaloulos (JIRA)
Klearchos Chaloulos created KAFKA-5564:
--

 Summary: Fail to create topics with error 'While recording the 
replica LEO, the partition [topic2,0] hasn't been created'
 Key: KAFKA-5564
 URL: https://issues.apache.org/jira/browse/KAFKA-5564
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.9.0.1
Reporter: Klearchos Chaloulos


Hello,

*Short version*
we have seen sporadic occurrences of the following issue: Topics whose leader 
is a specific broker fail to be created properly, and it is impossible to 
produce to them or consume from them.
 The following logs appears in the broker that is the leader of the faulty 
topics:
{noformat}
[2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording 
the replica LEO, the partition [topic2,0] hasn't been created. 
(kafka.server.ReplicaManager)
{noformat}

*Detailed version*:
Our setup consists of three brokers with ids 1, 2, 3. Broker 2 is the 
controller. We create 7 topics called topic1, topic2, topic3, topic4, topic5, 
topic6, topic7.

Sometimes (sporadically) some of the topics are faulty. In the particular 
example I describe here the faulty topics are topic6, topic4, topic2, and 
topic3. The faulty topics all have the same leader, broker 3.

If we do a kafka-topics.sh --describe on the topics, we see that for topics that 
do not have broker 3 as leader, the in-sync replica list shows that broker 3 is 
not in sync:
{noformat}
bin/kafka-topics.sh --describe --zookeeper zookeeper:2181/kafka
Topic:topic6    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic6    Partition: 0    Leader: 3    Replicas: 3,1,2    Isr: 3,1,2
Topic:topic5    PartitionCount:1    ReplicationFactor:3    Configs:retention.ms=30
    Topic: topic5    Partition: 0    Leader: 2    Replicas: 2,3,1    Isr: 2,1
Topic:topic7    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic7    Partition: 0    Leader: 1    Replicas: 1,3,2    Isr: 1,2
Topic:topic4    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic4    Partition: 0    Leader: 3    Replicas: 3,1,2    Isr: 3,1,2
Topic:topic1    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic1    Partition: 0    Leader: 2    Replicas: 2,1,3    Isr: 2,1
Topic:topic2    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic2    Partition: 0    Leader: 3    Replicas: 3,1,2    Isr: 3,1,2
Topic:topic3    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic3    Partition: 0    Leader: 3    Replicas: 3,1,2    Isr: 3,1,2
{noformat}
For the faulty topics, on the other hand, all replicas are reported as in sync.

Also, the topic directories under the log.dir folder were not created on the 
faulty broker 3.

We see the following logs in broker 3, which is the leader of the faulty topics:
{noformat}
[2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording 
the replica LEO, the partition [topic2,0] hasn't been created. 
(kafka.server.ReplicaManager)
{noformat}
The above log is logged continuously.

and the following error logs in the other 2 brokers, the replicas:
{noformat}
ERROR [ReplicaFetcherThread-0-3], Error for partition [topic3,0] to broker 
3:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
does not host this topic-partition
{noformat}
Again the above log is logged continuously.

The issue described above occurs immediately after the deployment of the kafka 
cluster.
A restart of the faulty broker (3 in this case) fixes the problem and the 
faulty topics work normally.

I have also attached the broker configuration we use.

Do you have any idea what might cause this issue?

Best regards,

Klearchos




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3495: KAFKA-3362: Update protocol schema and field doc s...

2017-07-06 Thread andrasbeni
GitHub user andrasbeni opened a pull request:

https://github.com/apache/kafka/pull/3495

KAFKA-3362: Update protocol schema and field doc strings

Added doc to protocol fields

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andrasbeni/kafka KAFKA-3362

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3495.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3495


commit ad144271aa8a491faa31be998bf2bfa32ac87847
Author: Andras Beni 
Date:   2017-06-22T07:10:43Z

KAFKA-3362: Update protocol schema and field doc strings

Added doc to protocol fields




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3494: MINOR: Move quickstart under streams

2017-07-06 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/3494

MINOR: Move quickstart under streams



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka minor-quickstart-docs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3494.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3494


commit efe629b31db591699df52cb73c7542ecb42d74f3
Author: Eno Thereska 
Date:   2017-07-06T13:02:08Z

Move quickstart under streams




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #1788

2017-07-06 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-5372; fixes to state transitions

[damian.guy] KAFKA-5508; Documentation for altering topics

--
[...truncated 2.46 MB...]

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > 
shouldOnlyReadRecordsWhereEarliestSpecifiedWithNoCommittedOffsetsWithDefaultGlobalAutoOffsetResetEarliest
 PASSED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowStreamsExceptionNoResetSpecified STARTED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowStreamsExceptionNoResetSpecified PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED


Re: RocksDB flushing issue on 0.10.2 streams

2017-07-06 Thread Damian Guy
Greg, what OS are you running on?
Are you able to reproduce this in a test at all?
For instance, based on what you described it would seem that i should be
able to start a streams app, wait for it to be up and running, run the
state dir cleanup, see it fail. However, i can't reproduce it.

On Wed, 5 Jul 2017 at 23:23 Damian Guy  wrote:

> Thanks Greg. I'll look into it more tomorrow. Just finding it difficult to
> reproduce in a test.
> Thanks for providing the sequence, gives me something to try and repro.
> Appreciated.
>
> Thanks,
> Damian
> On Wed, 5 Jul 2017 at 19:57, Greg Fodor  wrote:
>
>> Also, the sequence of events is:
>>
>> - Job starts, rebalance happens, things run along smoothly.
>> - After 10 minutes (retrospectively) the cleanup task kicks on and removes
>> some directories
>> - Tasks immediately start failing when trying to flush their state stores
>>
>>
>>
>> On Wed, Jul 5, 2017 at 11:55 AM, Greg Fodor  wrote:
>>
>> > The issue I am hitting is not the directory locking issues we've seen in
>> > the past. The issue seems to be, as you mentioned, that the state dir is
>> > getting deleted by the store cleanup process, but there are still tasks
>> > running that are trying to flush the state store. It seems more than a
>> > little scary given that right now it seems either a) there are tasks
>> > running that should have been re-assigned or b) the cleanup job is
>> removing
>> > state directories for currently running + assigned tasks (perhaps
>> during a
>> > rebalance there is a race condition?) I'm guessing there's probably a
>> more
>> > benign explanation, but that is what it looks like right now.
>> >
>> > On Wed, Jul 5, 2017 at 7:00 AM, Damian Guy 
>> wrote:
>> >
>> >> BTW - i'm trying to reproduce it, but not having much luck so far...
>> >>
>> >> On Wed, 5 Jul 2017 at 09:27 Damian Guy  wrote:
>> >>
> >> >> > Thanks for the updates Greg. There were some minor changes around
>> this in
>> >> > 0.11.0 to make it less likely to happen, but we've only ever seen the
>> >> > locking fail in the event of a rebalance. When everything is running
>> >> state
>> >> > dirs shouldn't be deleted if they are being used as the lock will
>> fail.
>> >> >
>> >> >
>> >> > On Wed, 5 Jul 2017 at 08:15 Greg Fodor  wrote:
>> >> >
>> >> >> I can report that setting state.cleanup.delay.ms to a very large
>> value
>> >> >> (effectively disabling it) works around the issue. It seems that the
>> >> state
>> >> >> store cleanup process can somehow get out ahead of another task that
>> >> still
>> >> >> thinks it should be writing to the state store/flushing it. In my
>> test
>> >> >> runs, this does not seem to be happening during a rebalancing event,
>> >> but
>> >> >> after the cluster is stable.
>> >> >>
>> >> >> On Tue, Jul 4, 2017 at 12:29 PM, Greg Fodor 
>> wrote:
>> >> >>
>> >> >> > Upon another run, I see the same error occur during a rebalance,
>> so
>> >> >> either
>> >> >> > my log was showing a rebalance or there is a shared underlying
>> issue
>> >> >> with
>> >> >> > state stores.
>> >> >> >
>> >> >> > On Tue, Jul 4, 2017 at 11:35 AM, Greg Fodor 
>> >> wrote:
>> >> >> >
>> >> >> >> Also, I am on 0.10.2.1, so poll interval was already set to
>> >> MAX_VALUE.
>> >> >> >>
>> >> >> >> On Tue, Jul 4, 2017 at 11:28 AM, Greg Fodor 
>> >> wrote:
>> >> >> >>
>> >> >> >>> I've nuked the nodes this happened on, but the job had been
>> running
>> >> >> for
>> >> >> >>> about 5-10 minutes across 5 nodes before this happened. Does the
>> >> log
>> >> >> show a
>> >> >> >>> rebalance was happening? It looks to me like the standby task
>> was
>> >> just
>> >> >> >>> committing as part of normal operations.
>> >> >> >>>
>> >> >> >>> On Tue, Jul 4, 2017 at 7:40 AM, Damian Guy <
>> damian@gmail.com>
>> >> >> wrote:
>> >> >> >>>
>> >> >>  Hi Greg,
>> >> >> 
>> >> >>  Obviously a bit difficult to read the RocksDBException, but my
>> >> guess
>> >> >> is
>> >> >>  it
>> >> >>  is because the state directory gets deleted right before the
>> flush
>> >> >>  happens:
>> >> >>  2017-07-04 10:54:46,829 [myid:] - INFO
>> >> >> [StreamThread-21:StateDirector
>> >> >>  y@213]
>> >> >>  - Deleting obsolete state directory 0_10 for task 0_10
>> >> >> 
>> >> >>  Yes it looks like it is possibly the same bug as KAFKA-5070.
>> >> >> 
>> >> >>  It looks like your application is constantly rebalancing during
>> >> store
>> >> >>  intialization, which may be the reason this bug comes about
>> (there
>> >> >> is a
>> >> >>  chance that the state dir lock is released so when the thread
>> >> tries
>> >> >> to
>> >> >>  removes the stale state directory it is able to get the lock).
>> You
>> >> >>  probably
>> >> >>  want to configure `max.poll.interval.ms` to be a reasonably
>> large
>> >> >> 

[GitHub] kafka pull request #3492: HOTFIX: Hotfix for trunk failing

2017-07-06 Thread enothereska
Github user enothereska closed the pull request at:

https://github.com/apache/kafka/pull/3492


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : kafka-0.11.0-jdk7 #210

2017-07-06 Thread Apache Jenkins Server
See 




[GitHub] kafka pull request #3493: HOTFIX: fix broken streams test

2017-07-06 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/3493

HOTFIX: fix broken streams test



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka hotfix-test-failure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3493.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3493


commit 0e940198076ea65387b408587e146bb08aa75e3a
Author: Damian Guy 
Date:   2017-07-06T11:05:35Z

fix broken tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk7 #2497

2017-07-06 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-5372; fixes to state transitions

[damian.guy] KAFKA-5508; Documentation for altering topics

--
[...truncated 2.46 MB...]

org.apache.kafka.streams.KafkaStreamsTest > testIllegalMetricsConfig PASSED

org.apache.kafka.streams.KafkaStreamsTest > shouldNotGetAllTasksWhenNotRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > shouldNotGetAllTasksWhenNotRunning 
PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
testInitializesAndDestroysMetricsReporters STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
testInitializesAndDestroysMetricsReporters PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetTaskWithKeyAndPartitionerWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetTaskWithKeyAndPartitionerWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testLegalMetricsConfig STARTED

org.apache.kafka.streams.KafkaStreamsTest > testLegalMetricsConfig PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetTaskWithKeyAndSerializerWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetTaskWithKeyAndSerializerWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testToString STARTED

org.apache.kafka.streams.KafkaStreamsTest > testToString PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateCloseAfterCreate STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateCloseAfterCreate PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldReturnFalseOnCloseWhenThreadsHaventTerminated STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldReturnFalseOnCloseWhenThreadsHaventTerminated PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateGlobalThreadClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateGlobalThreadClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics STARTED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateThreadClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateThreadClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateChanges STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateChanges PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
STARTED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs PASSED


[GitHub] kafka pull request #3492: HOTFIX: Hotfix for trunk failing

2017-07-06 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/3492

HOTFIX: Hotfix for trunk failing

Disable 3 tests that need rethinking. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka hotfix-trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3492.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3492


commit 78d3b1f885763434414cbc8784e91f5750043fc0
Author: Eno Thereska 
Date:   2017-07-06T10:36:50Z

Hotfix for trunk failing




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5563) Clarify handling of connector name in config

2017-07-06 Thread JIRA
Sönke Liebau created KAFKA-5563:
---

 Summary: Clarify handling of connector name in config 
 Key: KAFKA-5563
 URL: https://issues.apache.org/jira/browse/KAFKA-5563
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.11.0.0
Reporter: Sönke Liebau
Priority: Minor


The connector name is currently being stored in two places, once at the root 
level of the connector and once in the config:
{code:java}
{
    "name": "test",
    "config": {
        "connector.class": "org.apache.kafka.connect.tools.MockSinkConnector",
        "tasks.max": "3",
        "topics": "test-topic",
        "name": "test"
    },
    "tasks": [
        {
            "connector": "test",
            "task": 0
        }
    ]
}
{code}

If no name is provided in the "config" element, then the name from the root 
level is [copied there when the connector is being 
created|https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java#L95].
 If, however, a name is provided in the config, it is not touched, which 
means it is possible to create a connector with a different name at the root 
level and in the config, like this:
{code:java}
{
    "name": "test1",
    "config": {
        "connector.class": "org.apache.kafka.connect.tools.MockSinkConnector",
        "tasks.max": "3",
        "topics": "test-topic",
        "name": "differentname"
    },
    "tasks": [
        {
            "connector": "test1",
            "task": 0
        }
    ]
}
{code}

I am not aware of any issues that this currently causes, but it is at least 
confusing, probably not intended behavior, and a potential source of bugs if 
different functions take the name from different places.

Would it make sense to add a check to reject requests that provide different 
names in the request and the config section?
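
For illustration, a minimal sketch of such a check (variable and exception names 
are assumptions, not actual ConnectorsResource code):

{code:java}
import java.util.Map;

public class ConnectorNameConsistencyCheckSketch {
    // Reject create requests whose top-level name differs from the "name" entry
    // inside the config map; in Connect this would presumably surface as a
    // BadRequestException rather than an IllegalArgumentException.
    public static void check(String topLevelName, Map<String, String> config) {
        String configName = config.get("name");
        if (configName != null && !configName.equals(topLevelName)) {
            throw new IllegalArgumentException("Connector name '" + topLevelName
                + "' does not match the 'name' value '" + configName + "' in the config");
        }
    }
}
{code}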




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5562) Do streams state directory cleanup on a single thread

2017-07-06 Thread Damian Guy (JIRA)
Damian Guy created KAFKA-5562:
-

 Summary: Do streams state directory cleanup on a single thread
 Key: KAFKA-5562
 URL: https://issues.apache.org/jira/browse/KAFKA-5562
 Project: Kafka
  Issue Type: Bug
Reporter: Damian Guy
Assignee: Damian Guy


Currently in streams we clean up old state directories every so often (as 
defined by {{state.cleanup.delay.ms}}). However, every StreamThread runs the 
cleanup, which is both unnecessary and a potential source of race conditions.

It would be better to perform the state cleanup on a single thread and only 
when the {{KafkaStreams}} instance is in a running state.
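
A rough sketch of the intended shape (illustrative only, not the actual patch; 
{{CleanupAction}} stands in for the real {{StateDirectory}}):

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SingleThreadStateCleanerSketch {

    public interface CleanupAction {
        void cleanRemovedTasks(long cleanupDelayMs);
    }

    // One dedicated daemon thread owns the periodic cleanup instead of every StreamThread.
    private final ScheduledExecutorService cleaner = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "streams-state-cleaner");
        t.setDaemon(true);
        return t;
    });

    private volatile boolean running = true; // would be driven by the KafkaStreams state listener

    public void schedule(CleanupAction stateDirectory, long cleanupDelayMs) {
        cleaner.scheduleAtFixedRate(() -> {
            if (running) { // only clean up while the instance is in a running state
                stateDirectory.cleanRemovedTasks(cleanupDelayMs);
            }
        }, cleanupDelayMs, cleanupDelayMs, TimeUnit.MILLISECONDS);
    }

    public void shutdown() {
        cleaner.shutdownNow();
    }
}
{code}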



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-trunk-jdk7 #2496

2017-07-06 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-5464; StreamsKafkaClient should not use

--
[...truncated 2.45 MB...]
org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > 

[jira] [Created] (KAFKA-5561) Rewrite TopicCommand using the new Admin client

2017-07-06 Thread Paolo Patierno (JIRA)
Paolo Patierno created KAFKA-5561:
-

 Summary: Rewrite TopicCommand using the new Admin client
 Key: KAFKA-5561
 URL: https://issues.apache.org/jira/browse/KAFKA-5561
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Paolo Patierno
Assignee: Paolo Patierno


Hi, 

as suggested in https://issues.apache.org/jira/browse/KAFKA-3331, it would 
be great to have the TopicCommand use the new Admin client instead of the way 
it works today.
As encouraged by [~gwenshap] in the above JIRA, I'm going to work on it.
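
A minimal sketch of what the --create and --list paths could look like on top of 
the new client (the bootstrap server, topic name, partition count, and replication 
factor below are placeholders, not a proposed design):

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class TopicCommandAdminClientSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            // --create: topic name, partition count and replication factor are placeholders
            admin.createTopics(Collections.singleton(new NewTopic("test-topic", 1, (short) 1))).all().get();
            // --list
            System.out.println(admin.listTopics().names().get());
        }
    }
}
{code}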

Thanks,
Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-0.11.0-jdk7 #209

2017-07-06 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-5464; StreamsKafkaClient should not use

--
[...truncated 969.68 KB...]
kafka.producer.AsyncProducerTest > testBatchSize PASSED

kafka.producer.AsyncProducerTest > testSerializeEvents STARTED

kafka.producer.AsyncProducerTest > testSerializeEvents PASSED

kafka.producer.AsyncProducerTest > testProducerQueueSize STARTED

kafka.producer.AsyncProducerTest > testProducerQueueSize PASSED

kafka.producer.AsyncProducerTest > testRandomPartitioner STARTED

kafka.producer.AsyncProducerTest > testRandomPartitioner PASSED

kafka.producer.AsyncProducerTest > testInvalidConfiguration STARTED

kafka.producer.AsyncProducerTest > testInvalidConfiguration PASSED

kafka.producer.AsyncProducerTest > testInvalidPartition STARTED

kafka.producer.AsyncProducerTest > testInvalidPartition PASSED

kafka.producer.AsyncProducerTest > testNoBroker STARTED

kafka.producer.AsyncProducerTest > testNoBroker PASSED

kafka.producer.AsyncProducerTest > testProduceAfterClosed STARTED

kafka.producer.AsyncProducerTest > testProduceAfterClosed PASSED

kafka.producer.AsyncProducerTest > testJavaProducer STARTED

kafka.producer.AsyncProducerTest > testJavaProducer PASSED

kafka.producer.AsyncProducerTest > testIncompatibleEncoder STARTED

kafka.producer.AsyncProducerTest > testIncompatibleEncoder PASSED

kafka.producer.SyncProducerTest > testReachableServer STARTED

kafka.producer.SyncProducerTest > testReachableServer PASSED

kafka.producer.SyncProducerTest > testMessageSizeTooLarge STARTED

kafka.producer.SyncProducerTest > testMessageSizeTooLarge PASSED

kafka.producer.SyncProducerTest > testNotEnoughReplicas STARTED

kafka.producer.SyncProducerTest > testNotEnoughReplicas PASSED

kafka.producer.SyncProducerTest > testMessageSizeTooLargeWithAckZero STARTED

kafka.producer.SyncProducerTest > testMessageSizeTooLargeWithAckZero PASSED

kafka.producer.SyncProducerTest > testProducerCanTimeout STARTED

kafka.producer.SyncProducerTest > testProducerCanTimeout PASSED

kafka.producer.SyncProducerTest > testProduceRequestWithNoResponse STARTED

kafka.producer.SyncProducerTest > testProduceRequestWithNoResponse PASSED

kafka.producer.SyncProducerTest > testEmptyProduceRequest STARTED

kafka.producer.SyncProducerTest > testEmptyProduceRequest PASSED

kafka.producer.SyncProducerTest > testProduceCorrectlyReceivesResponse STARTED

kafka.producer.SyncProducerTest > testProduceCorrectlyReceivesResponse PASSED

kafka.producer.ProducerTest > testSendToNewTopic STARTED

kafka.producer.ProducerTest > testSendToNewTopic PASSED

kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout STARTED

kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout PASSED

kafka.producer.ProducerTest > testSendNullMessage STARTED

kafka.producer.ProducerTest > testSendNullMessage PASSED

kafka.producer.ProducerTest > testUpdateBrokerPartitionInfo STARTED

kafka.producer.ProducerTest > testUpdateBrokerPartitionInfo PASSED

kafka.producer.ProducerTest > testSendWithDeadBroker STARTED

kafka.producer.ProducerTest > testSendWithDeadBroker PASSED

unit.kafka.server.KafkaApisTest > 
shouldRespondWithUnsupportedForMessageFormatOnHandleWriteTxnMarkersWhenMagicLowerThanRequired
 STARTED

unit.kafka.server.KafkaApisTest > 
shouldRespondWithUnsupportedForMessageFormatOnHandleWriteTxnMarkersWhenMagicLowerThanRequired
 PASSED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleTxnOffsetCommitRequestWhenInterBrokerProtocolNotSupported
 STARTED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleTxnOffsetCommitRequestWhenInterBrokerProtocolNotSupported
 PASSED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleAddPartitionsToTxnRequestWhenInterBrokerProtocolNotSupported
 STARTED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleAddPartitionsToTxnRequestWhenInterBrokerProtocolNotSupported
 PASSED

unit.kafka.server.KafkaApisTest > testReadUncommittedConsumerListOffsetLatest 
STARTED

unit.kafka.server.KafkaApisTest > testReadUncommittedConsumerListOffsetLatest 
PASSED

unit.kafka.server.KafkaApisTest > 
shouldAppendToLogOnWriteTxnMarkersWhenCorrectMagicVersion STARTED

unit.kafka.server.KafkaApisTest > 
shouldAppendToLogOnWriteTxnMarkersWhenCorrectMagicVersion PASSED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleWriteTxnMarkersRequestWhenInterBrokerProtocolNotSupported
 STARTED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleWriteTxnMarkersRequestWhenInterBrokerProtocolNotSupported
 PASSED

unit.kafka.server.KafkaApisTest > 
testReadCommittedConsumerListOffsetEarliestOffsetEqualsLastStableOffset STARTED

unit.kafka.server.KafkaApisTest > 
testReadCommittedConsumerListOffsetEarliestOffsetEqualsLastStableOffset PASSED


Build failed in Jenkins: kafka-0.10.2-jdk7 #180

2017-07-06 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-5464; StreamsKafkaClient should not use

--
[...truncated 160.53 KB...]
kafka.server.AbstractFetcherThreadTest > testConsumerLagRemovedWithPartition STARTED

kafka.server.AbstractFetcherThreadTest > testConsumerLagRemovedWithPartition PASSED

kafka.server.AbstractFetcherThreadTest > testFetchRequestCorruptedMessageException STARTED

kafka.server.AbstractFetcherThreadTest > testFetchRequestCorruptedMessageException PASSED

kafka.server.AbstractFetcherThreadTest > testMetricsRemovedOnShutdown STARTED

kafka.server.AbstractFetcherThreadTest > testMetricsRemovedOnShutdown PASSED

kafka.server.CreateTopicsRequestTest > testValidCreateTopicsRequests STARTED

kafka.server.CreateTopicsRequestTest > testValidCreateTopicsRequests PASSED

kafka.server.CreateTopicsRequestTest > testErrorCreateTopicsRequests STARTED

kafka.server.CreateTopicsRequestTest > testErrorCreateTopicsRequests PASSED

kafka.server.CreateTopicsRequestTest > testInvalidCreateTopicsRequests STARTED

kafka.server.CreateTopicsRequestTest > testInvalidCreateTopicsRequests PASSED

kafka.server.CreateTopicsRequestTest > testNotController STARTED

kafka.server.CreateTopicsRequestTest > testNotController PASSED

kafka.server.LogOffsetTest > testFetchOffsetsBeforeWithChangingSegmentSize STARTED

kafka.server.LogOffsetTest > testFetchOffsetsBeforeWithChangingSegmentSize PASSED

kafka.server.LogOffsetTest > testGetOffsetsBeforeEarliestTime STARTED

kafka.server.LogOffsetTest > testGetOffsetsBeforeEarliestTime PASSED

kafka.server.LogOffsetTest > testGetOffsetsForUnknownTopic STARTED

kafka.server.LogOffsetTest > testGetOffsetsForUnknownTopic PASSED

kafka.server.LogOffsetTest > testEmptyLogsGetOffsets STARTED

kafka.server.LogOffsetTest > testEmptyLogsGetOffsets PASSED

kafka.server.LogOffsetTest > testFetchOffsetsBeforeWithChangingSegments STARTED

kafka.server.LogOffsetTest > testFetchOffsetsBeforeWithChangingSegments PASSED

kafka.server.LogOffsetTest > testGetOffsetsBeforeLatestTime STARTED

kafka.server.LogOffsetTest > testGetOffsetsBeforeLatestTime PASSED

kafka.server.LogOffsetTest > testGetOffsetsBeforeNow STARTED

kafka.server.LogOffsetTest > testGetOffsetsBeforeNow PASSED

kafka.server.ReplicaManagerTest > testHighWaterMarkDirectoryMapping STARTED

kafka.server.ReplicaManagerTest > testHighWaterMarkDirectoryMapping PASSED

kafka.server.ReplicaManagerTest > testFetchBeyondHighWatermarkReturnEmptyResponse STARTED

kafka.server.ReplicaManagerTest > testFetchBeyondHighWatermarkReturnEmptyResponse PASSED

kafka.server.ReplicaManagerTest > testIllegalRequiredAcks STARTED

kafka.server.ReplicaManagerTest > testIllegalRequiredAcks PASSED

kafka.server.ReplicaManagerTest > testClearPurgatoryOnBecomingFollower STARTED

kafka.server.ReplicaManagerTest > testClearPurgatoryOnBecomingFollower PASSED

kafka.server.ReplicaManagerTest > testHighwaterMarkRelativeDirectoryMapping STARTED

kafka.server.ReplicaManagerTest > testHighwaterMarkRelativeDirectoryMapping PASSED

kafka.server.ServerShutdownTest > testCleanShutdownAfterFailedStartup STARTED

kafka.server.ServerShutdownTest > testCleanShutdownAfterFailedStartup PASSED

kafka.server.ServerShutdownTest > testConsecutiveShutdown STARTED

kafka.server.ServerShutdownTest > testConsecutiveShutdown PASSED

kafka.server.ServerShutdownTest > testCleanShutdown STARTED

kafka.server.ServerShutdownTest > testCleanShutdown PASSED

kafka.server.ServerShutdownTest > testCleanShutdownWithDeleteTopicEnabled STARTED

kafka.server.ServerShutdownTest > testCleanShutdownWithDeleteTopicEnabled PASSED

kafka.server.ServerGenerateBrokerIdTest > testGetSequenceIdMethod STARTED

kafka.server.ServerGenerateBrokerIdTest > testGetSequenceIdMethod PASSED

kafka.server.ServerGenerateBrokerIdTest > testBrokerMetadataOnIdCollision STARTED

kafka.server.ServerGenerateBrokerIdTest > testBrokerMetadataOnIdCollision PASSED

kafka.server.ServerGenerateBrokerIdTest > testAutoGenerateBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testAutoGenerateBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testMultipleLogDirsMetaProps STARTED

kafka.server.ServerGenerateBrokerIdTest > testMultipleLogDirsMetaProps PASSED

kafka.server.ServerGenerateBrokerIdTest > testDisableGeneratedBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testDisableGeneratedBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testUserConfigAndGeneratedBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testUserConfigAndGeneratedBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testConsistentBrokerIdFromUserConfigAndMetaProps STARTED

kafka.server.ServerGenerateBrokerIdTest > testConsistentBrokerIdFromUserConfigAndMetaProps PASSED

kafka.server.DelayedOperationTest > testRequestPurge STARTED

kafka.server.DelayedOperationTest > testRequestPurge PASSED


[GitHub] kafka pull request #3429: KAFKA-5508: Documentation for altering topics

2017-07-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3429




[jira] [Resolved] (KAFKA-5508) Documentation for altering topics

2017-07-06 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy resolved KAFKA-5508.
---
   Resolution: Fixed
Fix Version/s: 0.11.0.1
   0.11.1.0

Issue resolved by pull request 3429
[https://github.com/apache/kafka/pull/3429]

> Documentation for altering topics
> -
>
> Key: KAFKA-5508
> URL: https://issues.apache.org/jira/browse/KAFKA-5508
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Reporter: Tom Bentley
>Assignee: huxihx
>Priority: Minor
> Fix For: 0.11.1.0, 0.11.0.1
>
>
> According to the upgrade documentation:
> bq. Altering topic configuration from the kafka-topics.sh script 
> (kafka.admin.TopicCommand) has been deprecated. Going forward, please use the 
> kafka-configs.sh script (kafka.admin.ConfigCommand) for this functionality. 
> But the Operations documentation still tells people to use kafka-topics.sh to 
> alter their topic configurations.





[GitHub] kafka pull request #3432: KAFKA-5372: fixes to state transitions

2017-07-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3432




[GitHub] kafka pull request #3439: KAFKA-5464: StreamsKafkaClient should not use Stre...

2017-07-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3439




Re: [ANNOUNCE] New Kafka PMC member Ismael Juma

2017-07-06 Thread Ismael Juma
Thanks everyone!

Ismael

On Wed, Jul 5, 2017 at 9:55 PM, Jun Rao  wrote:

> Hi, Everyone,
>
> Ismael Juma has been active in the Kafka community since he became
> a Kafka committer about a year ago. I am glad to announce that Ismael is
> now a member of Kafka PMC.
>
> Congratulations, Ismael!
>
> Jun
>


[GitHub] kafka-site pull request #65: MINOR: add Interactive Queries docs

2017-07-06 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka-site/pull/65

MINOR: add Interactive Queries docs



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka-site iq-doc-update

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/65.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #65


commit 1bd56196b679d2923bcf72c0d1bc61ac111f700e
Author: Damian Guy 
Date:   2017-07-06T08:19:59Z

add IQ docs






[GitHub] kafka-site issue #65: MINOR: add Interactive Queries docs

2017-07-06 Thread dguy
Github user dguy commented on the issue:

https://github.com/apache/kafka-site/pull/65
  
@guozhangwang 




[GitHub] kafka pull request #3491: HOTFIX: Fixes to metric names

2017-07-06 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/3491

HOTFIX: Fixes to metric names

A couple of fixes to metric names to match the KIP:
- Remove extra strings from the metric names that are already present in the tags
- Add a separate metric for "all" (see the sketch below)
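
For illustration, a minimal sketch of the idea using Kafka's common Metrics
API (the metric, group, and tag names below are made up and are not the ones
touched by this PR): the scope lives in tags, so the metric name itself stays
plain, and the "all" aggregate is simply a second metric whose tag value is
"all".

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;

    public class TaggedMetricSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();

            // Per-task metric: the task id goes into a tag, not into the name.
            Map<String, String> taskTags = new HashMap<>();
            taskTags.put("task-id", "0_1");
            Sensor taskSensor = metrics.sensor("commit-latency-task-0_1");
            taskSensor.add(metrics.metricName("commit-latency-avg",
                    "stream-task-metrics", "Average commit latency", taskTags),
                    new Avg());

            // Separate "all" roll-up: same name and group, tag value "all".
            Map<String, String> allTags = new HashMap<>();
            allTags.put("task-id", "all");
            Sensor allSensor = metrics.sensor("commit-latency-all");
            allSensor.add(metrics.metricName("commit-latency-avg",
                    "stream-task-metrics", "Average commit latency", allTags),
                    new Avg());

            metrics.close();
        }
    }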

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka hotfix-metric-names

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3491.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3491


commit 31edf2c02ec3d033ebcf29ebccb38c70c2851110
Author: Eno Thereska 
Date:   2017-07-06T08:08:12Z

Fixes to metric names






Re: Reply: [ANNOUNCE] New Kafka PMC member Ismael Juma

2017-07-06 Thread Paolo Patierno
Congratulations Ismael ... well deserved !


Paolo Patierno
Senior Software Engineer (IoT) @ Red Hat
Microsoft MVP on Windows Embedded & IoT
Microsoft Azure Advisor

Twitter : @ppatierno
Linkedin : paolopatierno
Blog : DevExperience



From: Mickael Maison 
Sent: Thursday, July 6, 2017 8:01 AM
To: dev@kafka.apache.org
Subject: Re: Reply: [ANNOUNCE] New Kafka PMC member Ismael Juma

Congratulations Ismael !

On Thu, Jul 6, 2017 at 8:57 AM, Hu Xi  wrote:
> Congrats, Ismael 
>
>
> 
> From: Molnár Bálint 
> Sent: July 6, 2017 15:51
> To: dev@kafka.apache.org
> Subject: Re: [ANNOUNCE] New Kafka PMC member Ismael Juma
>
> Congrats, Ismael:)
>
> 2017-07-06 2:57 GMT+02:00 Matthias J. Sax :
>
>> Congrats!
>>
>> On 7/5/17 3:42 PM, Roger Hoover wrote:
>> > Well deserved, indeed!  Congrats, Ismael.
>> >
>> > On Wed, Jul 5, 2017 at 3:24 PM, Damian Guy  wrote:
>> >
>> >> Congratulations Ismael! Very well deserved.
>> >> Cheers,
>> >> Damian
>> >> On Wed, 5 Jul 2017 at 22:54, Dong Lin  wrote:
>> >>
>> >>> Congratulations Ismael!
>> >>>
>> >>> On Wed, Jul 5, 2017 at 1:55 PM, Jun Rao  wrote:
>> >>>
>>  Hi, Everyone,
>> 
>>  Ismael Juma has been active in the Kafka community since he became
>>  a Kafka committer about a year ago. I am glad to announce that Ismael
>> >> is
>>  now a member of Kafka PMC.
>> 
>>  Congratulations, Ismael!
>> 
>>  Jun
>> 
>> >>>
>> >>
>> >
>>
>>


Re: Reply: [ANNOUNCE] New Kafka PMC member Ismael Juma

2017-07-06 Thread Mickael Maison
Congratulations Ismael !

On Thu, Jul 6, 2017 at 8:57 AM, Hu Xi  wrote:
> Congrats, Ismael 
>
>
> 
> From: Molnár Bálint 
> Sent: July 6, 2017 15:51
> To: dev@kafka.apache.org
> Subject: Re: [ANNOUNCE] New Kafka PMC member Ismael Juma
>
> Congrats, Ismael:)
>
> 2017-07-06 2:57 GMT+02:00 Matthias J. Sax :
>
>> Congrats!
>>
>> On 7/5/17 3:42 PM, Roger Hoover wrote:
>> > Well deserved, indeed!  Congrats, Ismael.
>> >
>> > On Wed, Jul 5, 2017 at 3:24 PM, Damian Guy  wrote:
>> >
>> >> Congratulations Ismael! Very well deserved.
>> >> Cheers,
>> >> Damian
>> >> On Wed, 5 Jul 2017 at 22:54, Dong Lin  wrote:
>> >>
>> >>> Congratulations Ismael!
>> >>>
>> >>> On Wed, Jul 5, 2017 at 1:55 PM, Jun Rao  wrote:
>> >>>
>>  Hi, Everyone,
>> 
>>  Ismael Juma has been active in the Kafka community since he became
>>  a Kafka committer about a year ago. I am glad to announce that Ismael
>> >> is
>>  now a member of Kafka PMC.
>> 
>>  Congratulations, Ismael!
>> 
>>  Jun
>> 
>> >>>
>> >>
>> >
>>
>>


Reply: [ANNOUNCE] New Kafka PMC member Ismael Juma

2017-07-06 Thread Hu Xi
Congrats, Ismael 



From: Molnár Bálint 
Sent: July 6, 2017 15:51
To: dev@kafka.apache.org
Subject: Re: [ANNOUNCE] New Kafka PMC member Ismael Juma

Congrats, Ismael:)

2017-07-06 2:57 GMT+02:00 Matthias J. Sax :

> Congrats!
>
> On 7/5/17 3:42 PM, Roger Hoover wrote:
> > Well deserved, indeed!  Congrats, Ismael.
> >
> > On Wed, Jul 5, 2017 at 3:24 PM, Damian Guy  wrote:
> >
> >> Congratulations Ismael! Very well deserved.
> >> Cheers,
> >> Damian
> >> On Wed, 5 Jul 2017 at 22:54, Dong Lin  wrote:
> >>
> >>> Congratulations Ismael!
> >>>
> >>> On Wed, Jul 5, 2017 at 1:55 PM, Jun Rao  wrote:
> >>>
>  Hi, Everyone,
> 
>  Ismael Juma has been active in the Kafka community since he became
>  a Kafka committer about a year ago. I am glad to announce that Ismael
> >> is
>  now a member of Kafka PMC.
> 
>  Congratulations, Ismael!
> 
>  Jun
> 
> >>>
> >>
> >
>
>


[jira] [Created] (KAFKA-5560) LogManager should be able to create new logs based on free disk space

2017-07-06 Thread huxihx (JIRA)
huxihx created KAFKA-5560:
-

 Summary: LogManager should be able to create new logs based on free disk space
 Key: KAFKA-5560
 URL: https://issues.apache.org/jira/browse/KAFKA-5560
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.11.0.0
Reporter: huxihx


Currently, the log manager chooses a directory configured in `log.dirs` by 
counting the partitions in each directory and picking the one with the fewest 
partitions. But in real production scenarios where partition data volumes are 
uneven, some disks become nearly full while others still have plenty of space, 
which leads to poor data distribution.

We should offer users a new strategy that has the log manager honor the actual 
free disk space and choose the directory with the most space available. Maybe a 
new broker configuration parameter is needed, `log.directory.strategy` for 
instance. Perhaps this also needs a new KIP.

Does it make sense?
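
For illustration only, a rough sketch in plain Java of the kind of selection
being proposed (the real LogManager is written in Scala and does not work this
way today; the directory paths below are made up):

    import java.io.File;
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    public class LogDirByFreeSpace {

        // Pick the log directory with the most usable space instead of
        // the one currently holding the fewest partitions.
        static File selectDir(List<File> logDirs) {
            return logDirs.stream()
                    .max(Comparator.comparingLong(File::getUsableSpace))
                    .orElseThrow(() -> new IllegalArgumentException("log.dirs is empty"));
        }

        public static void main(String[] args) {
            List<File> logDirs = Arrays.asList(
                    new File("/data1/kafka-logs"), new File("/data2/kafka-logs"));
            System.out.println("Next log would go to: " + selectDir(logDirs));
        }
    }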





Re: [ANNOUNCE] New Kafka PMC member Ismael Juma

2017-07-06 Thread Molnár Bálint
Congrats, Ismael:)

2017-07-06 2:57 GMT+02:00 Matthias J. Sax :

> Congrats!
>
> On 7/5/17 3:42 PM, Roger Hoover wrote:
> > Well deserved, indeed!  Congrats, Ismael.
> >
> > On Wed, Jul 5, 2017 at 3:24 PM, Damian Guy  wrote:
> >
> >> Congratulations Ismael! Very well deserved.
> >> Cheers,
> >> Damian
> >> On Wed, 5 Jul 2017 at 22:54, Dong Lin  wrote:
> >>
> >>> Congratulations Ismael!
> >>>
> >>> On Wed, Jul 5, 2017 at 1:55 PM, Jun Rao  wrote:
> >>>
>  Hi, Everyone,
> 
>  Ismael Juma has been active in the Kafka community since he became
>  a Kafka committer about a year ago. I am glad to announce that Ismael
> >> is
>  now a member of Kafka PMC.
> 
>  Congratulations, Ismael!
> 
>  Jun
> 
> >>>
> >>
> >
>
>


[GitHub] kafka-site issue #64: MINOR: Add release dates to downloads page

2017-07-06 Thread ijuma
Github user ijuma commented on the issue:

https://github.com/apache/kafka-site/pull/64
  
Good idea. As you said, the dates seem to differ slightly from: 
https://issues.apache.org/jira/browse/KAFKA/?selectedTab=com.atlassian.jira.jira-projects-plugin:versions-panel

It would be nice to get pre-0.8.2 dates since we're doing this.




Re: [DISCUSS] KAFKA-4930 & KAFKA 4938 - Treatment of name parameter in create connector requests

2017-07-06 Thread Sönke Liebau
I've updated the pull request to behave as follows:
 - reject create requests that contain no "name" element with a
BadRequestException
 - reject names that are empty or contain illegal characters with a
ConfigException
 - leave the current logic around when to copy the name from the create request
to the config element intact
 - add unit tests for the validator to check that illegal characters are
correctly identified

The list of illegal characters is the result of some quick testing I did;
all of the characters in the list currently cause issues when used in a
connector name (similar to KAFKA-4827), so this should not break anything
that anybody relies on.
I think we may want to start a larger discussion around connector names
(allowed characters, max length, ...) to come up with an airtight set of
rules that we can then enforce; I am sure the current rules are not perfect as
is.
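
For illustration, a minimal sketch of a validator along those lines (it
assumes Kafka's ConfigDef.Validator interface; the character set below is only
an example, not the exact list in the pull request):

    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.common.config.ConfigException;

    public class ConnectorNameValidator implements ConfigDef.Validator {

        // Example characters only; the authoritative list is in the PR itself.
        private static final String ILLEGAL_CHARS = "\\/:*?\"<>|";

        @Override
        public void ensureValid(String name, Object value) {
            String connectorName = (String) value;
            if (connectorName == null || connectorName.trim().isEmpty())
                throw new ConfigException(name, value, "Connector name must not be empty");
            for (char c : ILLEGAL_CHARS.toCharArray()) {
                if (connectorName.indexOf(c) >= 0)
                    throw new ConfigException(name, value,
                            "Connector name contains illegal character '" + c + "'");
            }
        }
    }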

Best regards,
Sönke

On Wed, Jul 5, 2017 at 9:31 AM, Sönke Liebau 
wrote:

> Hi,
>
> regarding "breaking existing functionality" .. yes...that was me getting
> confused about intended and existing functionality :)
> You are right, this won't break anything that is currently working.
>
> I'll leave placement of "name" parameter as is and open a new issue to
> clarify this later on.
>
> Kind regards,
> Sönke
>
> On Wed, Jul 5, 2017 at 5:41 AM, Gwen Shapira  wrote:
>
>> Hey,
>>
>> Nice research and summary.
>>
>> Regarding the ability to have a "nameless" connector - I'm pretty sure we
>> never intended to allow that.
>> I'm confused about breaking something that currently works though - since
>> we get NPE, how will giving more intentional exceptions break anything?
>>
>> Regarding the placing of the name - inside or outside the config. It looks
>> messy and I'm as confused as you are. I think Konstantine had some ideas
>> how this should be resolved. I hope he responds, but I think that for your
>> PR, just accept current mess as given...
>>
>> Gwen
>>
>> On Tue, Jul 4, 2017 at 3:28 AM Sönke Liebau
>>  wrote:
>>
>> > While working on KAFKA-4930 and KAFKA-4938 I came across some sort of
>> > fundamental questions about the rest api for creating connectors in
>> Kafka
>> > Connect that I'd like to put up for discussion.
>> >
>> > Currently requests that do not contain a "name" element on the top level
>> > are not accepted by the API, but that is due to a NullPointerException
>> [1]
>> > so not entirely intentional. Previous (and current if the lines causing
>> the
>> > Exception are removed) functionality was to create a connector named
>> "null"
>> > if that parameter was missing. I am not sure if this is a good thing, as
>> > for example that connector will be overwritten every time a new request
>> > without a name is sent, as opposed to the expected warning that a
>> connector
>> > of that name already exists.
>> >
>> > I would propose to reject api calls without a name provided on the top
>> > level, but this might break requests that currently work, so should
>> > probably be mentioned in the release notes.
>> >
>> > 
>> >
>> > Additionally, the "name" parameter is also copied into the "config"
>> > sub-element of the connector request - unless a name parameter was
>> provided
>> > there in the original request[2].
>> >
>> > So this:
>> >
>> > {
>> >   "name": "connectorname",
>> >   "config": {
>> > "connector.class":
>> > "org.apache.kafka.connect.tools.MockSourceConnector",
>> > "tasks.max": "1",
>> > "topics": "test-topic"
>> >   }
>> > }
>> >
>> > would become this:
>> > {
>> >   "name": "connectorname",
>> >   "config": {
>> > "name": "connectorname",
>> > "connector.class":
>> > "org.apache.kafka.connect.tools.MockSourceConnector",
>> > "tasks.max": "1",
>> > "topics": "test-topic"
>> >   }
>> > }
>> >
>> > But a request that contains two different names like this:
>> >
>> >  {
>> >   "name": "connectorname",
>> >   "config": {
>> > "name": "differentconnectorname",
>> > "connector.class":
>> > "org.apache.kafka.connect.tools.MockSourceConnector",
>> > "tasks.max": "1",
>> > "topics": "test-topic"
>> >   }
>> > }
>> >
>> > would be allowed as is.
>> >
>> > This might be intentional behavior in order to enable Connectors to
>> have a
>> > "name" parameter of their own - though I couldn't find any that do, but
>> I
>> > think this has the potential for misunderstandings, especially as there
>> may
>> > be code out there that references the connector name from the config
>> object
>> > and would thus grab the "wrong" one.
>> >
>> > Again, this may be intentional, so I am mostly looking for comments on
>> how
>> > to proceed here.
>> >
>> > My first instinct is to make the top-level "name" parameter mandatory as
>> > suggested above and then add a check to reject requests that contain a
>> > different "name" field in the config element.
>> >
>> > Any comments or thoughts welcome.
>> >
>> > TL/DR: