Re: [DISCUSS] KIP-199: Add Kafka Connect offset reset tool

2017-09-12 Thread Gwen Shapira
Totally agree about getting things right in the first round. In addition, I
believe it is important that the names of our tools describe what the tools
do as accurately as possible - this improves usability. Therefore, we
should shift the discussion to whether the tool should handle all offsets
or just source offsets. The name of the tool will naturally be derived from
the decision about what the tool will do.

Now that this is an important discussion about functionality and not
bike-shedding about names:

I am strongly supportive of duplicating functionality and having a Connect
tool that handles all aspects of resetting connector state. As you said,
the specifics of how Connect manages its offsets are an implementation
detail that I don't expect users to know or care about.

I also want to point out that the Streams reset tool already duplicates
some functionality and this doesn't seem to be a problem because the
context (Streams) is well defined.

Gwen



On Tue, Sep 12, 2017 at 7:30 PM Ewen Cheslack-Postava 
wrote:

> I don't think it's bikeshedding and it is a fair question. I try to avoid
> that because this is part of the point of KIPs -- avoid extra pain around
> usability, documentation, maintenance, and compatibility and deprecation by
> just trying to get things right on the first go around.
>
> It's obviously just my preference, but I would rather name it what we
> ultimately want and document limitations while we fill in the gaps. This is
> a case where I really don't know whether we'll want to extend it to
> effectively duplicate some of the behavior of an existing tool for the sake
> of consistency, so I raised it for discussion.
>
> -Ewen
>
> On Tue, Sep 12, 2017 at 10:51 AM, Gwen Shapira  wrote:
>
> > >
> > >
> > > > * re: naming, can we avoid including 'source' in the command name?
> even
> > > if
> > > > that's all it supports today, I don't think we want to restrict it.
> > While
> > > > we only *require* this for source offsets, I think for users it will,
> > > > long-term, be way more natural to consider connect offsets generally
> > and
> > > > not need to know how to drop down to consumer offsets for sink
> > > connectors.
> > > > In fact, in an ideal world, many users/operators may not even
> > > *understand*
> > > > consumer offsets deeply while still being generally familiar with
> > connect
> > > > offsets. We can always include an error message for now if they try
> to
> > do
> > > > something with a sink connector (which we presumably need to detect
> > > anyway
> > > > to correctly handle source connectors).
> > > >
> > >
> > > ack
> > >
> > >
> > Sorry about bikeshedding, but:
> > I think a general name for something that has limited functionality is
> very
> > confusing (and we've seen this play out before). Why not name the tool to
> > describe what it does and change its name if we change the functionality?
> >
> > >
> > >
> >
>


[jira] [Created] (KAFKA-5880) Transactional producer and read-committed consumer cause consumer to get stuck

2017-09-12 Thread Lae (JIRA)
Lae created KAFKA-5880:
--

 Summary: Transactional producer and read-committed consumer cause consumer to get stuck
 Key: KAFKA-5880
 URL: https://issues.apache.org/jira/browse/KAFKA-5880
 Project: Kafka
  Issue Type: Bug
Reporter: Lae
 Attachments: index-updates-3.zip

We use transactional producers, and have configured the isolation level on the 
consumer to only read committed data. The consumer has somehow got into a stuck 
state where it can no longer move forward, because the Kafka server always 
returns an empty list of records despite there being thousands more successful 
transactions after the offset.

This is example producer code:
{code:java}
import java.util.Properties;
import java.util.UUID;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties config = new Properties();
config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
config.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, UUID.randomUUID().toString());
config.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

try (Producer<String, String> producer = new KafkaProducer<>(config)) {
    producer.initTransactions();
    try {
        producer.beginTransaction();
        // Multiple producer.send(...) calls here
        producer.commitTransaction();
    } catch (Throwable e) {
        producer.abortTransaction();
    }
}
{code}

This is the test consumer code:
{code:java}
import static java.util.Locale.ENGLISH;

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.requests.IsolationLevel;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties config = new Properties();
config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
config.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
config.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG,
        IsolationLevel.READ_COMMITTED.toString().toLowerCase(ENGLISH));
config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(config)) {
    consumer.subscribe(Collections.singleton("index-updates"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(5000);
        for (ConsumerRecord<String, String> record : records) {
            System.err.println(record.value());
        }
        consumer.commitSync();
    }
}
{code}

I have also attached the problematic partition data (index-updates-3.zip). To 
reproduce the issue using the data: run a local Kafka instance, create a topic 
called "index-updates" with 3 partitions, replace the content of the 
index-updates-3 log directory with the attached content, and then run the above 
consumer code.
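While reproducing, it helps to distinguish a consumer that is stuck from one that is simply caught up: a stuck consumer keeps reporting the same position below the log-end offset while polls return zero records. A minimal helper for that check (the class and method names here are illustrative, not part of the Kafka API):

```java
// Illustrative helper (not part of the Kafka API): a consumer that keeps
// reporting the same position below the log-end offset, while polls return
// zero records, is stuck rather than merely caught up.
public class StuckPartitionCheck {
    public static boolean isStuck(long previousPosition, long currentPosition,
                                  long logEndOffset, int recordsReturned) {
        boolean noProgress = currentPosition == previousPosition;
        boolean behindLogEnd = currentPosition < logEndOffset;
        return noProgress && behindLogEnd && recordsReturned == 0;
    }

    public static void main(String[] args) {
        // Offsets as reported below: stuck at 46644 with log end 65735.
        System.out.println(isStuck(46644, 46644, 65735, 0)); // true
        // A partition at 15281 with log end 15281 is simply caught up.
        System.out.println(isStuck(15281, 15281, 15281, 0)); // false
    }
}
```

Feeding it the consumer's position (from consumer.position) and the log-end offset (from consumer.endOffsets) across successive polls makes the stuck state unambiguous.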

Then the consumer will be stuck at some point (not always at the same offset), 
making no more progress even if you send new data into the partition (other 
partitions seem fine). The following example is from when the consumer was stuck 
at offset 46644, and the Kafka server always returned an empty list of records 
when the consumer fetched from 46644:
{noformat}
root@0b1e67f0c34b:/# /opt/kafka/bin/kafka-consumer-groups.sh --describe --group my-group --bootstrap-server localhost:9092
Note: This will only show information about consumers that use the Java consumer API (non-ZooKeeper-based consumers).

TOPIC          PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG    CONSUMER-ID                                      HOST          CLIENT-ID
index-updates  0          15281           15281           0      consumer-1-c8f3fede-ef6f-4d12-a426-44aa9c47e71f  /10.100.1.97  consumer-1
index-updates  1          0               0               0      consumer-1-c8f3fede-ef6f-4d12-a426-44aa9c47e71f  /10.100.1.97  consumer-1
index-updates  2          0               0               0      consumer-1-c8f3fede-ef6f-4d12-a426-44aa9c47e71f  /10.100.1.97  consumer-1
index-updates  3          46644           65735           19091  consumer-1-c8f3fede-ef6f-4d12-a426-44aa9c47e71f  /10.100.1.97  consumer-1
index-updates  4          0               0               0      consumer-1-c8f3fede-ef6f-4d12-a426-44aa9c47e71f  /10.100.1.97  consumer-1
index-updates  5          0               0               0      consumer-1-c8f3fede-ef6f-4d12-a426-44aa9c47e71f  /10.100.1.97  consumer-1
index-updates  6          0               0               0      consumer-1-c8f3fede-ef6f-4d12-a426-44aa9c47e71f  /10.100.1.97  consumer-1
index-updates  7          0               0

[GitHub] kafka pull request #3842: [Kafka-5301] Improve exception handling on consume...

2017-09-12 Thread ConcurrencyPractitioner
GitHub user ConcurrencyPractitioner opened a pull request:

https://github.com/apache/kafka/pull/3842

[Kafka-5301] Improve exception handling on consumer path

This is an improvised approach towards fixing @guozhangwang's second 
issue. 
I have changed the method's return type, as well as its override, so that it 
returns the exception.
If the exception returned is not null (the default value), then we skip the 
callback.
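A minimal sketch of that pattern, under the assumption that the change makes the method return the exception it hit instead of throwing it; all names here are hypothetical, not the actual Kafka consumer internals:

```java
// Hypothetical sketch: the method returns the exception it encountered
// (null by default) instead of throwing, and the caller skips the callback
// when the returned exception is non-null. Names are illustrative only.
public class CallbackSkipSketch {
    public interface RebalanceCallback { void run(); }

    // Returns null on success, or the exception encountered.
    public static Exception tryCommit(boolean failCommit) {
        try {
            if (failCommit) {
                throw new IllegalStateException("commit failed");
            }
            return null;
        } catch (Exception e) {
            return e;
        }
    }

    // Returns true if the callback was invoked.
    public static boolean process(boolean failCommit, RebalanceCallback callback) {
        Exception error = tryCommit(failCommit);
        if (error != null) {
            return false; // skip the callback on failure
        }
        callback.run();
        return true;
    }

    public static void main(String[] args) {
        System.out.println(process(false, () -> {})); // true: callback ran
        System.out.println(process(true, () -> {}));  // false: callback skipped
    }
}
```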


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ConcurrencyPractitioner/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3842.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3842


commit 6290df2070f215d0b355f3e59717d911e50b8973
Author: Richard Yu 
Date:   2017-09-13T03:19:24Z

[Kafka-5301] Improve exception handling on consumer path




---


Re: [DISCUSS] KIP-199: Add Kafka Connect offset reset tool

2017-09-12 Thread Ewen Cheslack-Postava
I don't think it's bikeshedding and it is a fair question. I try to avoid
that because this is part of the point of KIPs -- avoid extra pain around
usability, documentation, maintenance, and compatibility and deprecation by
just trying to get things right on the first go around.

It's obviously just my preference, but I would rather name it what we
ultimately want and document limitations while we fill in the gaps. This is
a case where I really don't know whether we'll want to extend it to
effectively duplicate some of the behavior of an existing tool for the sake
of consistency, so I raised it for discussion.

-Ewen

On Tue, Sep 12, 2017 at 10:51 AM, Gwen Shapira  wrote:

> >
> >
> > > * re: naming, can we avoid including 'source' in the command name? even
> > if
> > > that's all it supports today, I don't think we want to restrict it.
> While
> > > we only *require* this for source offsets, I think for users it will,
> > > long-term, be way more natural to consider connect offsets generally
> and
> > > not need to know how to drop down to consumer offsets for sink
> > connectors.
> > > In fact, in an ideal world, many users/operators may not even
> > *understand*
> > > consumer offsets deeply while still being generally familiar with
> connect
> > > offsets. We can always include an error message for now if they try to
> do
> > > something with a sink connector (which we presumably need to detect
> > anyway
> > > to correctly handle source connectors).
> > >
> >
> > ack
> >
> >
> Sorry about bikeshedding, but:
> I think a general name for something that has limited functionality is very
> confusing (and we've seen this play out before). Why not name the tool to
> describe what it does and change its name if we change the functionality?
>
> >
> >
>


[jira] [Resolved] (KAFKA-2887) TopicMetadataRequest creates topic if it does not exist

2017-09-12 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-2887.
--
Resolution: Fixed

> TopicMetadataRequest creates topic if it does not exist
> ---
>
> Key: KAFKA-2887
> URL: https://issues.apache.org/jira/browse/KAFKA-2887
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.0
> Environment: Centos6, Java 1.7.0_75
>Reporter: Andrew Winterman
>Priority: Minor
>
> We wired up a probe HTTP endpoint to make TopicMetadataRequests with a 
> possible topic name. If no topic was found, we expected an empty response. 
> However, if we asked for the same topic twice, it would exist the second time!
> I think this is a bug because the purpose of the TopicMetadataRequest is to 
> provide information about the cluster, not mutate it. I can provide example 
> code if needed.
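For context, the reported behavior comes from broker-side auto topic creation: with auto.create.topics.enable=true (the broker default), a metadata request for an unknown topic triggers its creation. A sketch of the broker setting that avoids the mutation:

```properties
# server.properties - disable implicit topic creation so that metadata
# requests for unknown topics return an error instead of creating the topic
auto.create.topics.enable=false
```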



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3841: KAFKA-5833: Reset thread interrupt state in case o...

2017-09-12 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3841

KAFKA-5833: Reset thread interrupt state in case of InterruptedException
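The fix named in the title is the standard pattern of restoring the thread's interrupt status after catching InterruptedException; a minimal illustration of the pattern (not the actual patch):

```java
// Restoring the interrupt flag after catching InterruptedException, so that
// callers further up the stack can still observe the interrupt. This
// illustrates the pattern named in the PR title; it is not the patch itself.
public class InterruptRestoreSketch {
    public static boolean waitBriefly() {
        try {
            Thread.sleep(10);
            return true;  // completed normally
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // reset the interrupt state
            return false; // interrupted; flag is preserved for callers
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();       // simulate a pending interrupt
        System.out.println(waitBriefly());        // false: sleep threw immediately
        System.out.println(Thread.interrupted()); // true: the flag was restored
    }
}
```

Without the Thread.currentThread().interrupt() call, the catch block would silently swallow the interrupt and the enclosing loop or caller could never see it.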



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka kafka-5833-interrupts

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3841.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3841


commit a359ea8825e456db8f2ee5307b89b72b982e6e44
Author: Matthias J. Sax 
Date:   2017-09-13T01:34:10Z

KAFKA-5833: Reset thread interrupt state in case of InterruptedException




---


Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Sriram Subramanian
FWIW, I agree that time metrics have been very useful in the past. The
reasoning around perf overhead seems reasonable as well. Can we agree on a
subset of time metrics that we feel would be super useful for debugging?
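One such subset, per the thread below, is batch-level operations like flush: a single System.nanoTime() pair per batch is cheap relative to the work measured, unlike per-record timing around SMTs and converters. A sketch of that compromise (the recorder here is a stand-in, not the actual Connect metrics API):

```java
// Sketch of the compromise discussed in the thread: time only batch-level
// operations (e.g. flush), where one System.nanoTime() pair per batch is
// cheap relative to the work measured, and skip per-record timing for
// SMTs/converters. The static recorder is a stand-in, not the Connect API.
public class BatchTimingSketch {
    public static long totalFlushNanos = 0;
    public static int flushCount = 0;

    public static void timedFlush(Runnable flush) {
        long start = System.nanoTime(); // one timestamp pair per batch...
        flush.run();
        totalFlushNanos += System.nanoTime() - start;
        flushCount++;                   // ...not one per record or per SMT
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            timedFlush(() -> { /* write a batch to the sink */ });
        }
        System.out.println(flushCount); // 5 batches timed, 10 timestamps total
    }
}
```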

On Tue, Sep 12, 2017 at 6:08 PM, Roger Hoover 
wrote:

> Thanks, Ewen.
>
> I agree with you on the overhead of measuring time for SMTs and
> converters.  I'd still argue for keeping other metrics like flush time b/c
> even small batches should still be small overhead compared to writing to a
> sink.
>
> On Tue, Sep 12, 2017 at 3:06 PM, Ewen Cheslack-Postava 
> wrote:
>
> > Requests are generally substantial batches of data; you are not
> > guaranteed that for the processing batches, both because source connectors
> > can hand you batches of whatever size they want and because the consumer's
> > max.poll.records can be overridden.
> >
> > Both SMTs and converters are a concern because they can both be
> relatively
> > cheap such that just checking the time in between them could possibly
> dwarf
> > the cost of applying them.
> >
> > Also, another thought re: rebalance metrics: we are already getting some
> > info via AbstractCoordinator and those actually provide a bit more detail
> > in some ways (e.g. join & sync vs the entire rebalance). Not sure if we
> > want to effectively duplicate some info so it can all be located under
> > Connect names or rely on the existing metrics for some of these.
> >
> > -Ewen
> >
> > On Tue, Sep 12, 2017 at 2:05 PM, Roger Hoover 
> > wrote:
> >
> > > Ewen,
> > >
> > > I don't know the details of the perf concern.  How is it that the Kafka
> > > broker can keep latency stats per request without suffering too much
> > > performance?  Maybe SMTs are the only concern b/c they are per-message.
> > If
> > > so, let's remove those and keep timing info for everything else like
> > > flushes, which are batch-based.
> > >
> > >
> > > On Tue, Sep 12, 2017 at 1:32 PM, Ewen Cheslack-Postava <
> > e...@confluent.io>
> > > wrote:
> > >
> > > > On Tue, Sep 12, 2017 at 10:55 AM, Gwen Shapira 
> > > wrote:
> > > >
> > > > > Ewen, you gave a nice talk at Kafka Summit where you warned about
> the
> > > > > danger of SMTs that slow down the data pipe. If we don't provide
> the
> > > time
> > > > > metrics, how will users know when their SMTs are causing
> performance
> > > > > issues?
> > > > >
> > > >
> > > > Metrics aren't the only way to gain insight about performance and
> > always
> > > > measuring this even when it's not necessarily being used may not make
> > > > sense. SMT authors are much better off starting out with a JMH or
> > similar
> > > > benchmark. What I was referring to in the talk is more about
> > > understanding
> > > > that the processing for SMTs is entirely synchronous and that means
> > > certain
> > > > classes of operations will just generally be a bad idea, e.g.
> anything
> > > that
> > > > goes out over the network to another service. You don't even really
> > need
> > > > performance info to determine that that type of transformation will
> > cause
> > > > problems.
> > > >
> > > > But my point wasn't that timing info isn't useful. It's that we know
> > that
> > > > getting timestamps is pretty expensive and we'll already be doing so
> > > > elsewhere (e.g. if a source record doesn't include a timestamp). For
> > some
> > > > use cases such as ByteArrayConverter + no SMTs + lightweight
> processing
> > > > (e.g. just gets handed to a background thread that deals with sending
> > the
> > > > data), it wouldn't be out of the question that adding 4 or so more
> > calls
> > > to
> > > > get timestamps could become a bottleneck. Since I don't know if it
> > would
> > > > but we have definitely seen the issue come up before, I would be
> > > > conservative in adding the metrics unless we had some numbers showing
> > it
> > > > doesn't matter or doesn't matter much.
> > > >
> > > > In general, I don't think metrics that require always-on measurement
> > are
> > > a
> > > > good way to get fine grained performance information. Instrumenting
> > > > different phases that imply different types of performance problems
> can
> > > be
> > > > helpful (e.g. "processing time" that should be CPU/memory throughput
> > > bound
> > > > vs. "send time" that, at least for many connectors, is more likely to
> > be
> > > IO
> > > > bound), but if you want finer-grained details, you probably either
> want
> > > > something that can be toggled on/off temporarily or just use a tool
> > > that's
> > > > really designed for the job, i.e. a profiler like perf.
> > > >
> > > > -Ewen
> > > >
> > > >
> > > > >
> > > > > Gwen
> > > > >
> > > > > On Mon, Sep 11, 2017 at 7:50 PM Ewen Cheslack-Postava <
> > > e...@confluent.io
> > > > >
> > > > > wrote:
> > > > >
> > > > > > re: questions about additional metrics, I think we'll undoubtedly
> > > find
> > > > > more
> > > > > > that people want in 

Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Roger Hoover
Thanks, Ewen.

I agree with you on the overhead of measuring time for SMTs and
converters.  I'd still argue for keeping other metrics like flush time b/c
even small batches should still be small overhead compared to writing to a
sink.

On Tue, Sep 12, 2017 at 3:06 PM, Ewen Cheslack-Postava 
wrote:

> Requests are generally substantial batches of data; you are not guaranteed
> that for the processing batches, both because source connectors can hand you
> batches of whatever size they want and because the consumer's
> max.poll.records can be overridden.
>
> Both SMTs and converters are a concern because they can both be relatively
> cheap such that just checking the time in between them could possibly dwarf
> the cost of applying them.
>
> Also, another thought re: rebalance metrics: we are already getting some
> info via AbstractCoordinator and those actually provide a bit more detail
> in some ways (e.g. join & sync vs the entire rebalance). Not sure if we
> want to effectively duplicate some info so it can all be located under
> Connect names or rely on the existing metrics for some of these.
>
> -Ewen
>
> On Tue, Sep 12, 2017 at 2:05 PM, Roger Hoover 
> wrote:
>
> > Ewen,
> >
> > I don't know the details of the perf concern.  How is it that the Kafka
> > broker can keep latency stats per request without suffering too much
> > performance?  Maybe SMTs are the only concern b/c they are per-message.
> If
> > so, let's remove those and keep timing info for everything else like
> > flushes, which are batch-based.
> >
> >
> > On Tue, Sep 12, 2017 at 1:32 PM, Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >
> > > On Tue, Sep 12, 2017 at 10:55 AM, Gwen Shapira 
> > wrote:
> > >
> > > > Ewen, you gave a nice talk at Kafka Summit where you warned about the
> > > > danger of SMTs that slow down the data pipe. If we don't provide the
> > time
> > > > metrics, how will users know when their SMTs are causing performance
> > > > issues?
> > > >
> > >
> > > Metrics aren't the only way to gain insight about performance and
> always
> > > measuring this even when it's not necessarily being used may not make
> > > sense. SMT authors are much better off starting out with a JMH or
> similar
> > > benchmark. What I was referring to in the talk is more about
> > understanding
> > > that the processing for SMTs is entirely synchronous and that means
> > certain
> > > classes of operations will just generally be a bad idea, e.g. anything
> > that
> > > goes out over the network to another service. You don't even really
> need
> > > performance info to determine that that type of transformation will
> cause
> > > problems.
> > >
> > > But my point wasn't that timing info isn't useful. It's that we know
> that
> > > getting timestamps is pretty expensive and we'll already be doing so
> > > elsewhere (e.g. if a source record doesn't include a timestamp). For
> some
> > > use cases such as ByteArrayConverter + no SMTs + lightweight processing
> > > (e.g. just gets handed to a background thread that deals with sending
> the
> > > data), it wouldn't be out of the question that adding 4 or so more
> calls
> > to
> > > get timestamps could become a bottleneck. Since I don't know if it
> would
> > > but we have definitely seen the issue come up before, I would be
> > > conservative in adding the metrics unless we had some numbers showing
> it
> > > doesn't matter or doesn't matter much.
> > >
> > > In general, I don't think metrics that require always-on measurement
> are
> > a
> > > good way to get fine grained performance information. Instrumenting
> > > different phases that imply different types of performance problems can
> > be
> > > helpful (e.g. "processing time" that should be CPU/memory throughput
> > bound
> > > vs. "send time" that, at least for many connectors, is more likely to
> be
> > IO
> > > bound), but if you want finer-grained details, you probably either want
> > > something that can be toggled on/off temporarily or just use a tool
> > that's
> > > really designed for the job, i.e. a profiler like perf.
> > >
> > > -Ewen
> > >
> > >
> > > >
> > > > Gwen
> > > >
> > > > On Mon, Sep 11, 2017 at 7:50 PM Ewen Cheslack-Postava <
> > e...@confluent.io
> > > >
> > > > wrote:
> > > >
> > > > > re: questions about additional metrics, I think we'll undoubtedly
> > find
> > > > more
> > > > > that people want in practice, but as I mentioned earlier I think
> it's
> > > > > better to add the ones we know we need and then fill out the rest
> as
> > we
> > > > > figure it out. So, e.g., batch size metrics sound like they could
> be
> > > > > useful, but I'd probably wait until we have a clear use case. It
> > seems
> > > > > likely that it could be useful in diagnosing slow connectors (e.g.
> > the
> > > > > implementation just does something inefficient), but I'm not really
> > > sure
> > > > > about that yet.
> > > > >
> > > > > -Ewen
> > > > >
> > > > > On Mon, Sep 

Build failed in Jenkins: kafka-trunk-jdk8 #2013

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-4468: Correctly calculate the window end timestamp after read 
from

--
[...truncated 2.53 MB...]

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic STARTED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.cleanGlobal(ResetIntegrationTest.java:363)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:172)

org.apache.kafka.streams.integration.ResetIntegrationTest > 

Build failed in Jenkins: kafka-trunk-jdk7 #2753

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-4468: Correctly calculate the window end timestamp after read 
from

--
[...truncated 2.51 MB...]
org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegionWithZeroByteCache STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegionWithZeroByteCache PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegionWithNonZeroByteCache STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegionWithNonZeroByteCache PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED
ERROR: Could not install GRADLE_3_4_RC_2_HOME
java.lang.NullPointerException
at 
hudson.plugins.toolenv.ToolEnvBuildWrapper$1.buildEnvVars(ToolEnvBuildWrapper.java:46)
at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:886)
at 

[GitHub] kafka pull request #3840: KAFKA-5879; Controller should read the latest IsrC...

2017-09-12 Thread lindong28
GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/3840

KAFKA-5879; Controller should read the latest IsrChangeNotification znodes 
when handling IsrChangeNotification event

…

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-5879

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3840.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3840


commit 54160a4d1780e58317e14f75754b40cefb5ac79a
Author: Dong Lin 
Date:   2017-09-12T22:38:26Z

KAFKA-5879; Controller should read the latest IsrChangeNotification znodes 
when handling IsrChangeNotification event




---


[jira] [Created] (KAFKA-5879) Controller should read the latest IsrChangeNotification znodes when handling IsrChangeNotification event

2017-09-12 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-5879:
---

 Summary: Controller should read the latest IsrChangeNotification 
znodes when handling IsrChangeNotification event
 Key: KAFKA-5879
 URL: https://issues.apache.org/jira/browse/KAFKA-5879
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin
Priority: Critical


Currently the controller can be very inefficient in handling IsrChangeNotification 
events, because it may need to access znodes O(n^2) times to handle O(n) 
IsrChangeNotification znodes.

For example, say 100 IsrChangeNotification znodes are added to ZooKeeper. This 
generates 100 IsrChangeNotification events with children [1], [1, 2], 
[1, 2, 3], ... [1, 2, ... 100]. Suppose the controller now handles the 
IsrChangeNotification event with children [1, 2, ... 100]. The controller will 
read ZooKeeper 100 times and delete these 100 znodes, which in turn generates 
100 more IsrChangeNotification events with child counts from 0 to 99.

The root cause is that the controller accesses ZooKeeper n times, where n is 
the number of children at the time the IsrChangeNotification event was 
generated, even when there are no longer any IsrChangeNotification znodes 
left in ZooKeeper.
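The quadratic blow-up described above can be sketched with a quick back-of-the-envelope model (a hypothetical illustration, not Kafka code): handling the event fired after the k-th notification znode is created reads all k children, so n accumulated notifications already cost n(n+1)/2 reads before counting the further events triggered by the deletions.

```python
def znode_reads(n):
    """Total child reads across the n events generated while n
    IsrChangeNotification znodes accumulate: 1 + 2 + ... + n."""
    return sum(range(1, n + 1))

# For the 100-znode example above, the controller performs ~n^2/2 reads.
print(znode_reads(100))  # 5050
```

With batching (reading the current children once per event and deleting them all), the same scenario would need on the order of n reads instead.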



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Randall Hauch
Regarding the existing rebalance metrics under
"kafka.connect:type=connect-coordinator-metrics", I think we should just
plan on reusing them rather than duplicating them.

On Tue, Sep 12, 2017 at 5:06 PM, Ewen Cheslack-Postava 
wrote:

> Requests are generally substantial batches of data; you are not guaranteed
> that for the processing batches, both because source connectors can hand you
> batches of whatever size they want and because the consumer's
> max.poll.records can be overridden.
>
> Both SMTs and converters are a concern because they can both be relatively
> cheap such that just checking the time in between them could possibly dwarf
> the cost of applying them.
>
> Also, another thought re: rebalance metrics: we are already getting some
> info via AbstractCoordinator and those actually provide a bit more detail
> in some ways (e.g. join & sync vs the entire rebalance). Not sure if we
> want to effectively duplicate some info so it can all be located under
> Connect names or rely on the existing metrics for some of these.
>
> -Ewen
>
> On Tue, Sep 12, 2017 at 2:05 PM, Roger Hoover 
> wrote:
>
> > Ewen,
> >
> > I don't know the details of the perf concern.  How is it that the Kafka
> > broker can keep latency stats per request without suffering too much
> > performance?  Maybe SMTs are the only concern b/c they are per-message.
> If
> > so, let's remove those and keep timing info for everything else like
> > flushes, which are batch-based.
> >
> >
> > On Tue, Sep 12, 2017 at 1:32 PM, Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >
> > > On Tue, Sep 12, 2017 at 10:55 AM, Gwen Shapira 
> > wrote:
> > >
> > > > Ewen, you gave a nice talk at Kafka Summit where you warned about the
> > > > danger of SMTs that slow down the data pipe. If we don't provide the
> > time
> > > > metrics, how will users know when their SMTs are causing performance
> > > > issues?
> > > >
> > >
> > > Metrics aren't the only way to gain insight about performance and
> always
> > > measuring this even when it's not necessarily being used may not make
> > > sense. SMT authors are much better off starting out with a JMH or
> similar
> > > benchmark. What I was referring to in the talk is more about
> > understanding
> > > that the processing for SMTs is entirely synchronous and that means
> > certain
> > > classes of operations will just generally be a bad idea, e.g. anything
> > that
> > > goes out over the network to another service. You don't even really
> need
> > > performance info to determine that that type of transformation will
> cause
> > > problems.
> > >
> > > But my point wasn't that timing info isn't useful. It's that we know
> that
> > > getting timestamps is pretty expensive and we'll already be doing so
> > > elsewhere (e.g. if a source record doesn't include a timestamp). For
> some
> > > use cases such as ByteArrayConverter + no SMTs + lightweight processing
> > > (e.g. just gets handed to a background thread that deals with sending
> the
> > > data), it wouldn't be out of the question that adding 4 or so more
> calls
> > to
> > > get timestamps could become a bottleneck. Since I don't know if it
> would
> > > but we have definitely seen the issue come up before, I would be
> > > conservative in adding the metrics unless we had some numbers showing
> it
> > > doesn't matter or doesn't matter much.
> > >
> > > In general, I don't think metrics that require always-on measurement
> are
> > a
> > > good way to get fine grained performance information. Instrumenting
> > > different phases that imply different types of performance problems can
> > be
> > > helpful (e.g. "processing time" that should be CPU/memory throughput
> > bound
> > > vs. "send time" that, at least for many connectors, is more likely to
> be
> > IO
> > > bound), but if you want finer-grained details, you probably either want
> > > something that can be toggled on/off temporarily or just use a tool
> > that's
> > > really designed for the job, i.e. a profiler like perf.
> > >
> > > -Ewen
> > >
> > >
> > > >
> > > > Gwen
> > > >
> > > > On Mon, Sep 11, 2017 at 7:50 PM Ewen Cheslack-Postava <
> > e...@confluent.io
> > > >
> > > > wrote:
> > > >
> > > > > re: questions about additional metrics, I think we'll undoubtedly
> > find
> > > > more
> > > > > that people want in practice, but as I mentioned earlier I think
> it's
> > > > > better to add the ones we know we need and then fill out the rest
> as
> > we
> > > > > figure it out. So, e.g., batch size metrics sound like they could
> be
> > > > > useful, but I'd probably wait until we have a clear use case. It
> > seems
> > > > > likely that it could be useful in diagnosing slow connectors (e.g.
> > the
> > > > > implementation just does something inefficient), but I'm not really
> > > sure
> > > > > about that yet.
> > > > >
> > > > > -Ewen
> > > > >
> > > > > On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch 
> > > wrote:

[GitHub] kafka pull request #3745: [KAFKA-4468] Correctly calculate the window end ti...

2017-09-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3745


---


[jira] [Created] (KAFKA-5878) Add sensor for queue size of the controller-event-thread

2017-09-12 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-5878:
---

 Summary: Add sensor for queue size of the controller-event-thread
 Key: KAFKA-5878
 URL: https://issues.apache.org/jira/browse/KAFKA-5878
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin








Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Ewen Cheslack-Postava
Requests are generally substantial batches of data; you are not guaranteed
that for the processing batches, both because source connectors can hand you
batches of whatever size they want and because the consumer's max.poll.records
can be overridden.

Both SMTs and converters are a concern because they can both be relatively
cheap such that just checking the time in between them could possibly dwarf
the cost of applying them.

Also, another thought re: rebalance metrics: we are already getting some
info via AbstractCoordinator and those actually provide a bit more detail
in some ways (e.g. join & sync vs the entire rebalance). Not sure if we
want to effectively duplicate some info so it can all be located under
Connect names or rely on the existing metrics for some of these.

-Ewen

On Tue, Sep 12, 2017 at 2:05 PM, Roger Hoover 
wrote:

> Ewen,
>
> I don't know the details of the perf concern.  How is it that the Kafka
> broker can keep latency stats per request without suffering too much
> performance?  Maybe SMTs are the only concern b/c they are per-message.  If
> so, let's remove those and keep timing info for everything else like
> flushes, which are batch-based.
>
>
> On Tue, Sep 12, 2017 at 1:32 PM, Ewen Cheslack-Postava 
> wrote:
>
> > On Tue, Sep 12, 2017 at 10:55 AM, Gwen Shapira 
> wrote:
> >
> > > Ewen, you gave a nice talk at Kafka Summit where you warned about the
> > > danger of SMTs that slow down the data pipe. If we don't provide the
> time
> > > metrics, how will users know when their SMTs are causing performance
> > > issues?
> > >
> >
> > Metrics aren't the only way to gain insight about performance and always
> > measuring this even when it's not necessarily being used may not make
> > sense. SMT authors are much better off starting out with a JMH or similar
> > benchmark. What I was referring to in the talk is more about
> understanding
> > that the processing for SMTs is entirely synchronous and that means
> certain
> > classes of operations will just generally be a bad idea, e.g. anything
> that
> > goes out over the network to another service. You don't even really need
> > performance info to determine that that type of transformation will cause
> > problems.
> >
> > But my point wasn't that timing info isn't useful. It's that we know that
> > getting timestamps is pretty expensive and we'll already be doing so
> > elsewhere (e.g. if a source record doesn't include a timestamp). For some
> > use cases such as ByteArrayConverter + no SMTs + lightweight processing
> > (e.g. just gets handed to a background thread that deals with sending the
> > data), it wouldn't be out of the question that adding 4 or so more calls
> to
> > get timestamps could become a bottleneck. Since I don't know if it would
> > but we have definitely seen the issue come up before, I would be
> > conservative in adding the metrics unless we had some numbers showing it
> > doesn't matter or doesn't matter much.
> >
> > In general, I don't think metrics that require always-on measurement are
> a
> > good way to get fine grained performance information. Instrumenting
> > different phases that imply different types of performance problems can
> be
> > helpful (e.g. "processing time" that should be CPU/memory throughput
> bound
> > vs. "send time" that, at least for many connectors, is more likely to be
> IO
> > bound), but if you want finer-grained details, you probably either want
> > something that can be toggled on/off temporarily or just use a tool
> that's
> > really designed for the job, i.e. a profiler like perf.
> >
> > -Ewen
> >
> >
> > >
> > > Gwen
> > >
> > > On Mon, Sep 11, 2017 at 7:50 PM Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > > wrote:
> > >
> > > > re: questions about additional metrics, I think we'll undoubtedly
> find
> > > more
> > > > that people want in practice, but as I mentioned earlier I think it's
> > > > better to add the ones we know we need and then fill out the rest as
> we
> > > > figure it out. So, e.g., batch size metrics sound like they could be
> > > > useful, but I'd probably wait until we have a clear use case. It
> seems
> > > > likely that it could be useful in diagnosing slow connectors (e.g.
> the
> > > > implementation just does something inefficient), but I'm not really
> > sure
> > > > about that yet.
> > > >
> > > > -Ewen
> > > >
> > > > On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch 
> > wrote:
> > > >
> > > > > Based on Roger and Ewen's feedback, I removed the aggregate metrics
> > as
> > > > they
> > > > > would be difficult to make use of without extra work. This
> simplified
> > > > > things a great deal, and I took the opportunity to reorganize the
> > > groups
> > > > of
> > > > > metrics. Also, based upon Ewen's concerns regarding measuring
> > > > > times/durations, I removed all time-related metrics except for the
> > > offset
> > > > > commits and rebalances, which are infrequent 

[GitHub] kafka pull request #3839: KAFKA-5877; Controller should only update reassign...

2017-09-12 Thread lindong28
GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/3839

KAFKA-5877; Controller should only update reassignment znode if there is 
change in the reassignment data



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-5877

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3839.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3839


commit 4cd104b6f30d4a0203816d4daa2141b60461cc93
Author: Dong Lin 
Date:   2017-09-12T21:24:54Z

KAFKA-5877; Controller should only update reassignment znode if there is 
change in the reassignment data




---


[jira] [Created] (KAFKA-5877) Controller should only update reassignment znode if there is change in the reassignment data

2017-09-12 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-5877:
---

 Summary: Controller should only update reassignment znode if there 
is change in the reassignment data
 Key: KAFKA-5877
 URL: https://issues.apache.org/jira/browse/KAFKA-5877
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin


I encountered a scenario where the KafkaController keeps printing







Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Roger Hoover
Ewen,

I don't know the details of the perf concern.  How is it that the Kafka
broker can keep latency stats per request without suffering too much
performance?  Maybe SMTs are the only concern b/c they are per-message.  If
so, let's remove those and keep timing info for everything else like
flushes, which are batch-based.


On Tue, Sep 12, 2017 at 1:32 PM, Ewen Cheslack-Postava 
wrote:

> On Tue, Sep 12, 2017 at 10:55 AM, Gwen Shapira  wrote:
>
> > Ewen, you gave a nice talk at Kafka Summit where you warned about the
> > danger of SMTs that slow down the data pipe. If we don't provide the time
> > metrics, how will users know when their SMTs are causing performance
> > issues?
> >
>
> Metrics aren't the only way to gain insight about performance and always
> measuring this even when it's not necessarily being used may not make
> sense. SMT authors are much better off starting out with a JMH or similar
> benchmark. What I was referring to in the talk is more about understanding
> that the processing for SMTs is entirely synchronous and that means certain
> classes of operations will just generally be a bad idea, e.g. anything that
> goes out over the network to another service. You don't even really need
> performance info to determine that that type of transformation will cause
> problems.
>
> But my point wasn't that timing info isn't useful. It's that we know that
> getting timestamps is pretty expensive and we'll already be doing so
> elsewhere (e.g. if a source record doesn't include a timestamp). For some
> use cases such as ByteArrayConverter + no SMTs + lightweight processing
> (e.g. just gets handed to a background thread that deals with sending the
> data), it wouldn't be out of the question that adding 4 or so more calls to
> get timestamps could become a bottleneck. Since I don't know if it would
> but we have definitely seen the issue come up before, I would be
> conservative in adding the metrics unless we had some numbers showing it
> doesn't matter or doesn't matter much.
>
> In general, I don't think metrics that require always-on measurement are a
> good way to get fine grained performance information. Instrumenting
> different phases that imply different types of performance problems can be
> helpful (e.g. "processing time" that should be CPU/memory throughput bound
> vs. "send time" that, at least for many connectors, is more likely to be IO
> bound), but if you want finer-grained details, you probably either want
> something that can be toggled on/off temporarily or just use a tool that's
> really designed for the job, i.e. a profiler like perf.
>
> -Ewen
>
>
> >
> > Gwen
> >
> > On Mon, Sep 11, 2017 at 7:50 PM Ewen Cheslack-Postava  >
> > wrote:
> >
> > > re: questions about additional metrics, I think we'll undoubtedly find
> > more
> > > that people want in practice, but as I mentioned earlier I think it's
> > > better to add the ones we know we need and then fill out the rest as we
> > > figure it out. So, e.g., batch size metrics sound like they could be
> > > useful, but I'd probably wait until we have a clear use case. It seems
> > > likely that it could be useful in diagnosing slow connectors (e.g. the
> > > implementation just does something inefficient), but I'm not really
> sure
> > > about that yet.
> > >
> > > -Ewen
> > >
> > > On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch 
> wrote:
> > >
> > > > Based on Roger and Ewen's feedback, I removed the aggregate metrics
> as
> > > they
> > > > would be difficult to make use of without extra work. This simplified
> > > > things a great deal, and I took the opportunity to reorganize the
> > groups
> > > of
> > > > metrics. Also, based upon Ewen's concerns regarding measuring
> > > > times/durations, I removed all time-related metrics except for the
> > offset
> > > > commits and rebalances, which are infrequent enough to warrant the
> > > capture
> > > > of percentiles. Roger asked about capturing batch size metrics for
> > source
> > > > and sink tasks, and offset lag metrics for sink tasks. Finally, Ewen
> > > > pointed out that all count/total metrics are only valid since the
> most
> > > > recent rebalance and are therefore less meaningful, and were removed.
> > > >
> > > > On Mon, Sep 11, 2017 at 6:50 PM, Randall Hauch 
> > wrote:
> > > >
> > > > > Thanks, Ewen. Comments inline below.
> > > > >
> > > > > On Mon, Sep 11, 2017 at 5:46 PM, Ewen Cheslack-Postava <
> > > > e...@confluent.io>
> > > > > wrote:
> > > > >
> > > > >> Randall,
> > > > >>
> > > > >> A couple of questions:
> > > > >>
> > > > >> * Some metrics don't seem to have unique names? e.g.
> > > > >> source-record-produce-rate and source-record-produce-total seem
> like
> > > > they
> > > > >> are duplicated. Looks like maybe just an oversight that the second
> > > ones
> > > > >> should be changed from "produce" to "write".
> > > > >>
> > > > >
> > > > > Nice catch. You 

Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Ewen Cheslack-Postava
On Tue, Sep 12, 2017 at 10:55 AM, Gwen Shapira  wrote:

> Ewen, you gave a nice talk at Kafka Summit where you warned about the
> danger of SMTs that slow down the data pipe. If we don't provide the time
> metrics, how will users know when their SMTs are causing performance
> issues?
>

Metrics aren't the only way to gain insight about performance and always
measuring this even when it's not necessarily being used may not make
sense. SMT authors are much better off starting out with a JMH or similar
benchmark. What I was referring to in the talk is more about understanding
that the processing for SMTs is entirely synchronous and that means certain
classes of operations will just generally be a bad idea, e.g. anything that
goes out over the network to another service. You don't even really need
performance info to determine that that type of transformation will cause
problems.

But my point wasn't that timing info isn't useful. It's that we know that
getting timestamps is pretty expensive and we'll already be doing so
elsewhere (e.g. if a source record doesn't include a timestamp). For some
use cases such as ByteArrayConverter + no SMTs + lightweight processing
(e.g. just gets handed to a background thread that deals with sending the
data), it wouldn't be out of the question that adding 4 or so more calls to
get timestamps could become a bottleneck. Since I don't know if it would
but we have definitely seen the issue come up before, I would be
conservative in adding the metrics unless we had some numbers showing it
doesn't matter or doesn't matter much.
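As a rough illustration of the concern above (a hypothetical sketch, not code from the thread), one can estimate the per-call cost of a timestamp lookup before deciding whether a few always-on timing calls per record are affordable:

```python
import time

def cost_per_call(calls=100_000):
    """Average cost of one clock read, measured over many calls."""
    start = time.perf_counter()
    for _ in range(calls):
        time.perf_counter()  # stand-in for the timestamp lookup
    return (time.perf_counter() - start) / calls

# A handful of extra clock reads per record only matters when the
# per-record work is in the same ballpark (e.g. ByteArrayConverter
# with no SMTs), which is exactly the case described above.
per_call = cost_per_call()
print(f"~{per_call * 1e9:.0f} ns per timestamp call")
```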

In general, I don't think metrics that require always-on measurement are a
good way to get fine grained performance information. Instrumenting
different phases that imply different types of performance problems can be
helpful (e.g. "processing time" that should be CPU/memory throughput bound
vs. "send time" that, at least for many connectors, is more likely to be IO
bound), but if you want finer-grained details, you probably either want
something that can be toggled on/off temporarily or just use a tool that's
really designed for the job, i.e. a profiler like perf.

-Ewen


>
> Gwen
>
> On Mon, Sep 11, 2017 at 7:50 PM Ewen Cheslack-Postava 
> wrote:
>
> > re: questions about additional metrics, I think we'll undoubtedly find
> more
> > that people want in practice, but as I mentioned earlier I think it's
> > better to add the ones we know we need and then fill out the rest as we
> > figure it out. So, e.g., batch size metrics sound like they could be
> > useful, but I'd probably wait until we have a clear use case. It seems
> > likely that it could be useful in diagnosing slow connectors (e.g. the
> > implementation just does something inefficient), but I'm not really sure
> > about that yet.
> >
> > -Ewen
> >
> > On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch  wrote:
> >
> > > Based on Roger and Ewen's feedback, I removed the aggregate metrics as
> > they
> > > would be difficult to make use of without extra work. This simplified
> > > things a great deal, and I took the opportunity to reorganize the
> groups
> > of
> > > metrics. Also, based upon Ewen's concerns regarding measuring
> > > times/durations, I removed all time-related metrics except for the
> offset
> > > commits and rebalances, which are infrequent enough to warrant the
> > capture
> > > of percentiles. Roger asked about capturing batch size metrics for
> source
> > > and sink tasks, and offset lag metrics for sink tasks. Finally, Ewen
> > > pointed out that all count/total metrics are only valid since the most
> > > recent rebalance and are therefore less meaningful, and were removed.
> > >
> > > On Mon, Sep 11, 2017 at 6:50 PM, Randall Hauch 
> wrote:
> > >
> > > > Thanks, Ewen. Comments inline below.
> > > >
> > > > On Mon, Sep 11, 2017 at 5:46 PM, Ewen Cheslack-Postava <
> > > e...@confluent.io>
> > > > wrote:
> > > >
> > > >> Randall,
> > > >>
> > > >> A couple of questions:
> > > >>
> > > >> * Some metrics don't seem to have unique names? e.g.
> > > >> source-record-produce-rate and source-record-produce-total seem like
> > > they
> > > >> are duplicated. Looks like maybe just an oversight that the second
> > ones
> > > >> should be changed from "produce" to "write".
> > > >>
> > > >
> > > > Nice catch. You are correct - should be "write" instead of
> "produce". I
> > > > will correct.
> > > >
> > > >
> > > >> * I think there's a stray extra character in a couple of
> > > >> places: kafka.connect:type=source-task-metrics,name=source-record-
> > > >> produce-total,worker=([-.\w]+)l,connector=([-.\w]+),task=([\d]+)
> > > >> has an extra char after the worker name.
> > > >>
> > > >
> > > > Thanks. Removed in 2 places.
> > > >
> > > >
> > > >> * Are the produce totals actually useful given rebalancing would
> > cancel
> > > >> them out anyway? Doesn't seem like you could do much with them.
> > > >>
> > > >
> > > > 

Re: [VOTE] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Jason Gustafson
+1. Thanks for the KIP.

On Tue, Sep 12, 2017 at 12:42 PM, Sriram Subramanian 
wrote:

> +1
>
> On Tue, Sep 12, 2017 at 12:41 PM, Gwen Shapira  wrote:
>
> > My +1 remains :)
> >
> > On Tue, Sep 12, 2017 at 12:31 PM Randall Hauch  wrote:
> >
> > > The KIP was modified (most changes due to reorganization of metrics).
> > Feel
> > > free to re-vote if you dislike the changes.
> > >
> > > On Mon, Sep 11, 2017 at 8:40 PM, Sriram Subramanian 
> > > wrote:
> > >
> > > > +1
> > > >
> > > > On Mon, Sep 11, 2017 at 2:56 PM, Gwen Shapira 
> > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Thanks for this. Can't wait for more complete monitoring for
> Connect.
> > > > >
> > > > > On Mon, Sep 11, 2017 at 7:40 AM Randall Hauch 
> > > wrote:
> > > > >
> > > > > > I'd like to start the vote on KIP-196 to add metrics to the Kafka
> > > > Connect
> > > > > > framework so the worker processes can be measured. Details are
> > here:
> > > > > >
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 196%3A+Add+metrics+to+Kafka+Connect+framework
> > > > > >
> > > > > > Thanks, and best regards.
> > > > > >
> > > > > > Randall
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Sriram Subramanian
+1

On Tue, Sep 12, 2017 at 12:41 PM, Gwen Shapira  wrote:

> My +1 remains :)
>
> On Tue, Sep 12, 2017 at 12:31 PM Randall Hauch  wrote:
>
> > The KIP was modified (most changes due to reorganization of metrics).
> Feel
> > free to re-vote if you dislike the changes.
> >
> > On Mon, Sep 11, 2017 at 8:40 PM, Sriram Subramanian 
> > wrote:
> >
> > > +1
> > >
> > > On Mon, Sep 11, 2017 at 2:56 PM, Gwen Shapira 
> wrote:
> > >
> > > > +1
> > > >
> > > > Thanks for this. Can't wait for more complete monitoring for Connect.
> > > >
> > > > On Mon, Sep 11, 2017 at 7:40 AM Randall Hauch 
> > wrote:
> > > >
> > > > > I'd like to start the vote on KIP-196 to add metrics to the Kafka
> > > Connect
> > > > > framework so the worker processes can be measured. Details are
> here:
> > > > >
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 196%3A+Add+metrics+to+Kafka+Connect+framework
> > > > >
> > > > > Thanks, and best regards.
> > > > >
> > > > > Randall
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Gwen Shapira
My +1 remains :)

On Tue, Sep 12, 2017 at 12:31 PM Randall Hauch  wrote:

> The KIP was modified (most changes due to reorganization of metrics). Feel
> free to re-vote if you dislike the changes.
>
> On Mon, Sep 11, 2017 at 8:40 PM, Sriram Subramanian 
> wrote:
>
> > +1
> >
> > On Mon, Sep 11, 2017 at 2:56 PM, Gwen Shapira  wrote:
> >
> > > +1
> > >
> > > Thanks for this. Can't wait for more complete monitoring for Connect.
> > >
> > > On Mon, Sep 11, 2017 at 7:40 AM Randall Hauch 
> wrote:
> > >
> > > > I'd like to start the vote on KIP-196 to add metrics to the Kafka
> > Connect
> > > > framework so the worker processes can be measured. Details are here:
> > > >
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 196%3A+Add+metrics+to+Kafka+Connect+framework
> > > >
> > > > Thanks, and best regards.
> > > >
> > > > Randall
> > > >
> > >
> >
>


Re: [VOTE] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Randall Hauch
The KIP was modified (most changes due to reorganization of metrics). Feel
free to re-vote if you dislike the changes.

On Mon, Sep 11, 2017 at 8:40 PM, Sriram Subramanian 
wrote:

> +1
>
> On Mon, Sep 11, 2017 at 2:56 PM, Gwen Shapira  wrote:
>
> > +1
> >
> > Thanks for this. Can't wait for more complete monitoring for Connect.
> >
> > On Mon, Sep 11, 2017 at 7:40 AM Randall Hauch  wrote:
> >
> > > I'd like to start the vote on KIP-196 to add metrics to the Kafka
> Connect
> > > framework so the worker processes can be measured. Details are here:
> > >
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 196%3A+Add+metrics+to+Kafka+Connect+framework
> > >
> > > Thanks, and best regards.
> > >
> > > Randall
> > >
> >
>


[jira] [Resolved] (KAFKA-5621) The producer should retry expired batches when retries are enabled

2017-09-12 Thread Apurva Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta resolved KAFKA-5621.
-
Resolution: Won't Fix

> The producer should retry expired batches when retries are enabled
> --
>
> Key: KAFKA-5621
> URL: https://issues.apache.org/jira/browse/KAFKA-5621
> Project: Kafka
>  Issue Type: Bug
>Reporter: Apurva Mehta
>Assignee: Sumant Tambe
> Fix For: 1.0.0
>
>
> Today, when a batch is expired in the accumulator, a {{TimeoutException}} is 
> raised to the user.
> It might be better for the producer to retry the expired batch up to the 
> configured number of retries. This is more intuitive from the user's point of 
> view. 
> Further the proposed behavior makes it easier for applications like mirror 
> maker to provide ordering guarantees even when batches expire. Today, they 
> would resend the expired batch and it would get added to the back of the 
> queue, causing the output ordering to be different from the input ordering.
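The ordering problem described in the last paragraph can be illustrated with a toy queue model (hypothetical, not producer code): an application-level resend of an expired batch is appended behind batches that were enqueued after it.

```python
from collections import deque

q = deque(["b1", "b2", "b3"])  # batches in original send order
expired = q.popleft()          # b1 expires in the accumulator
q.append(expired)              # app-level resend lands at the back

# Output order no longer matches input order.
print(list(q))  # ['b2', 'b3', 'b1']
```

A producer-internal retry, by contrast, could keep the expired batch at the head of its partition's queue and preserve ordering.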





Re: [VOTE] KIP-195: AdminClient.createPartitions()

2017-09-12 Thread Jun Rao
Hi, Tom,

2. Thanks for the explanation. That makes sense and we can leave this as it
is.

Jun

On Tue, Sep 12, 2017 at 10:46 AM, Tom Bentley  wrote:

> Hi Jun,
>
> Thanks for the comments.
>
> On 12 September 2017 at 18:15, Jun Rao  wrote:
>
> > Hi, Tom,
> >
> > Thanks for the KIP.  +1. Just a couple of minor comments below.
> >
> > 1. The KIP has "INVALID_PARTITIONS (37) If the partition count was <= the
> > current partition count for the topic." We probably want to add one more
> > constraint: # of replicas in each new partition has to be the same as the
> > existing replication factor for the topic.
> >
>
> This is presently covered by the INVALID_REQUEST error code, but sure I can
> change it.
>
> 2. The KIP has the following.
> >
> >- REASSIGNMENT_IN_PROGRESS (new) If a partition reassignment is in
> > progress. It is necessary to prevent increasing partitions at the same
> > time so that we can be sure the partition has a meaningful replication
> > factor.
> >
> >
> > Currently, it's possible to increase the partition count while a
> > partition reassignment is in progress (adding partitions is much
> > cheaper than partition reassignment). On the other hand, if a topic is
> > being deleted, we will prevent adding new partitions. So, we probably
> > want to do the same in the KIP.
> >
>
> Increasing the partition count at the same time as a reassignment was
> covered in the [DISCUSS] thread. The problem is that during a reassignment
> there isn't a stable notion of replication factor. The brokers don't really
> know about replication factor at all, AFAICS. Currently when the partitions
> are increased we just use the replication factor inferred from partition 0.
> Ismael suggested to add this restriction to prevent this edge case.
>
> I don't see an error code specifically pertaining to topic actions on
> topics being deleted, so I guess INVALID_TOPIC_EXCEPTION would suffice for
> the deletion case.
>
> Thanks again,
>
> Tom
>
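The two constraints under discussion — the new partition count must strictly exceed the current one, and any explicit replica assignments must match the topic's existing replication factor — can be sketched in plain Java. The class, method, and error names below are illustrative stand-ins for the broker-side validation, not the actual Kafka code:

```java
import java.util.List;

public class CreatePartitionsValidation {
    // Illustrative error names mirroring the codes discussed in the thread.
    enum Error { NONE, INVALID_PARTITIONS, INVALID_REQUEST }

    /**
     * Validates a partition-count increase for a topic.
     *
     * @param currentCount      current number of partitions
     * @param requestedCount    requested total number of partitions
     * @param replicationFactor the topic's existing replication factor
     * @param newAssignments    optional explicit replica lists for the new
     *                          partitions (null means broker-assigned)
     */
    static Error validate(int currentCount, int requestedCount,
                          int replicationFactor,
                          List<List<Integer>> newAssignments) {
        if (requestedCount <= currentCount) {
            return Error.INVALID_PARTITIONS; // count must strictly increase
        }
        if (newAssignments != null) {
            if (newAssignments.size() != requestedCount - currentCount) {
                return Error.INVALID_REQUEST; // one replica list per new partition
            }
            for (List<Integer> replicas : newAssignments) {
                if (replicas.size() != replicationFactor) {
                    return Error.INVALID_REQUEST; // must match the existing RF
                }
            }
        }
        return Error.NONE;
    }

    public static void main(String[] args) {
        // 3 -> 3 partitions: rejected outright.
        System.out.println(validate(3, 3, 2, null));
        // 3 -> 5 with RF-2 assignments for both new partitions: accepted.
        System.out.println(validate(3, 5, 2,
                List.of(List.of(1, 2), List.of(2, 3))));
        // 3 -> 5 but one new partition lists three replicas: rejected.
        System.out.println(validate(3, 5, 2,
                List.of(List.of(1, 2), List.of(1, 2, 3))));
    }
}
```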


Build failed in Jenkins: kafka-trunk-jdk7 #2752

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] KAFKA-5872; Fix transient failure in 
SslSelectorTest.testMuteOnOOM

--
[...truncated 2.51 MB...]

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingUncaughtExceptionHandlerNotInCreateState PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingStateListenerNotInCreateState STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingStateListenerNotInCreateState PASSED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics STARTED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics PASSED

org.apache.kafka.streams.KafkaStreamsTest > shouldReturnThreadMetadata STARTED

org.apache.kafka.streams.KafkaStreamsTest > shouldReturnThreadMetadata PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateThreadClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateThreadClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateChanges STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateChanges PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfProducerEnableIdempotenceIsOverriddenIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfProducerEnableIdempotenceIsOverriddenIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldUseNewConfigsWhenPresent 
STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldUseNewConfigsWhenPresent 
PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldUseCorrectDefaultsWhenNoneSpecified STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldUseCorrectDefaultsWhenNoneSpecified PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
STARTED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerMaxInFlightRequestPerConnectionsWhenEosDisabled 
STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerMaxInFlightRequestPerConnectionsWhenEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowStreamsExceptionIfValueSerdeConfigFails STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowStreamsExceptionIfValueSerdeConfigFails PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerAutoCommitIsOverridden STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerAutoCommitIsOverridden PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowExceptionIfNotAtLestOnceOrExactlyOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowExceptionIfNotAtLestOnceOrExactlyOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 

Re: [VOTE] KIP-198: Remove ZK dependency from Streams Reset Tool

2017-09-12 Thread Matthias J. Sax
+1

Thanks for voting. I am closing this as accepted with

3 binding votes (Guozhang, Sriram, Damian)
2 non-binding votes (Bill, Matthias)


-Matthias



On 9/9/17 2:12 AM, Damian Guy wrote:
> +1
> On Sat, 9 Sep 2017 at 03:46, Sriram Subramanian  wrote:
> 
>> +1
>>
>> On Fri, Sep 8, 2017 at 3:04 PM, Guozhang Wang  wrote:
>>
>>> +1, thanks.
>>>
>>> On Fri, Sep 8, 2017 at 1:54 PM, Bill Bejeck  wrote:
>>>
 +1

 Thanks,
 Bill

 On Fri, Sep 8, 2017 at 4:51 PM, Matthias J. Sax >>
 wrote:

> We want to deprecate it for 1.0.0 release. Unclear how long to keep
>> it.
>
> The point is, that the parameter will just be ignored after it got
> deprecated and thus has no effect anymore when running the tool.
>> Thus,
> we can keep it as long as we think some people might have scripts
>> that
> used the parameter.
>
> Fixed the typo. Thx.
>
>
> -Matthias
>
> On 9/8/17 1:20 PM, Ted Yu wrote:
>> bq. parameter `--zookeper` that will be deprecated.
>>
>> Can you clarify in which release the parameter will be deprecated
>> and
 in
>> which release it will be removed ?
>>
>> On Fri, Sep 8, 2017 at 1:15 PM, Matthias J. Sax <
>>> matth...@confluent.io
>
>> wrote:
>>
>>> Hi,
>>>
>>> I want to propose KIP-198 for 1.0.0 release. It's about removing
>> ZK
>>> dependency from Streams application reset tool. It's a fairly
>> simply
>>> KIP, such I want to start the vote immediately.
>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>> 198%3A+Remove+ZK+dependency+from+Streams+Reset+Tool
>>>
>>>
>>> Thanks a lot!
>>>
>>> -Matthias
>>>
>>>
>>>
>>
>
>

>>>
>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>
> 



signature.asc
Description: OpenPGP digital signature
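For readers following along, a sketch of what an invocation looks like once the ZK dependency is removed — only the brokers need to be reachable. The flag names below are taken from the tool as it exists today and may differ in the final release:

```shell
# Reset a Streams application using only the Kafka bootstrap servers;
# no ZooKeeper connection string is required anymore.
bin/kafka-streams-application-reset.sh \
  --application-id my-streams-app \
  --bootstrap-servers broker1:9092,broker2:9092 \
  --input-topics input-topic-a,input-topic-b \
  --intermediate-topics rekeyed-topic

# During the deprecation period, passing --zookeeper is simply ignored
# (with a warning) rather than failing the command.
```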


Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Randall Hauch
Okay, I think I've incorporated all feedback except for Gwen and Roger, who
would like to have timing metrics. Given the deadline and Ewen's concern
about degraded performance, I think it's prudent to leave those out of this
KIP and proceed as is.



On Tue, Sep 12, 2017 at 12:48 PM, Randall Hauch  wrote:

>
>
> On Tue, Sep 12, 2017 at 10:36 AM, Roger Hoover 
> wrote:
>
>> Randall/Ewen,
>>
>> I think the timing info is still useful even if it's measured since the
>> last rebalance.  How else do you know where time is being spent?
>>
>
> I think Ewen's concern (correct me if I'm wrong) is that measuring
> time-based metrics might result in excessive performance degradation,
> especially when batch sizes are small.
>
>
>>
>> The use case for seeing the batch size is that you generally have two
>> knobs
>> to configure - max batch size and max wait time.  The batch size metrics
>> would tell you how full your batches are based on your current linger time
>> so you can adjust the config.
>>
>
> It does seem that batch sizes are useful, and the KIP includes these
> ("batch-size-max" and "batch-size-avg").
>
>
>>
>> Cheers,
>>
>> Roger
>>
>> On Mon, Sep 11, 2017 at 7:50 PM, Ewen Cheslack-Postava > >
>> wrote:
>>
>> > re: questions about additional metrics, I think we'll undoubtedly find
>> more
>> > that people want in practice, but as I mentioned earlier I think it's
>> > better to add the ones we know we need and then fill out the rest as we
>> > figure it out. So, e.g., batch size metrics sound like they could be
>> > useful, but I'd probably wait until we have a clear use case. It seems
>> > likely that it could be useful in diagnosing slow connectors (e.g. the
>> > implementation just does something inefficient), but I'm not really sure
>> > about that yet.
>> >
>> > -Ewen
>> >
>> > On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch 
>> wrote:
>> >
>> > > Based on Roger and Ewen's feedback, I removed the aggregate metrics as
>> > they
>> > > would be difficult to make use of without extra work. This simplified
>> > > things a great deal, and I took the opportunity to reorganize the
>> groups
>> > of
>> > > metrics. Also, based upon Ewen's concerns regarding measuring
>> > > times/durations, I removed all time-related metrics except for the
>> offset
>> > > commits and rebalances, which are infrequent enough to warrant the
>> > capture
>> > > of percentiles. Roger asked about capturing batch size metrics for
>> source
>> > > and sink tasks, and offset lag metrics for sink tasks. Finally, Ewen
>> > > pointed out that all count/total metrics are only valid since the most
>> > > recent rebalance and are therefore less meaningful, and were removed.
>> > >
>> > > On Mon, Sep 11, 2017 at 6:50 PM, Randall Hauch 
>> wrote:
>> > >
>> > > > Thanks, Ewen. Comments inline below.
>> > > >
>> > > > On Mon, Sep 11, 2017 at 5:46 PM, Ewen Cheslack-Postava <
>> > > e...@confluent.io>
>> > > > wrote:
>> > > >
>> > > >> Randall,
>> > > >>
>> > > >> A couple of questions:
>> > > >>
>> > > >> * Some metrics don't seem to have unique names? e.g.
>> > > >> source-record-produce-rate and source-record-produce-total seem
>> like
>> > > they
>> > > >> are duplicated. Looks like maybe just an oversight that the second
>> > ones
>> > > >> should be changed from "produce" to "write".
>> > > >>
>> > > >
>> > > > Nice catch. You are correct - should be "write" instead of
>> "produce". I
>> > > > will correct.
>> > > >
>> > > >
>> > > >> * I think there's a stray extra character in a couple of
>> > > >> places: kafka.connect:type=source-task-metrics,name=source-record-
>> > > >> produce-total,worker=([-.\w]+)l,connector=([-.\w]+),task=([\d]+)
>> > > >> has an extra char after the worker name.
>> > > >>
>> > > >
>> > > > Thanks. Removed in 2 places.
>> > > >
>> > > >
>> > > >> * Are the produce totals actually useful given rebalancing would
>> > cancel
>> > > >> them out anyway? Doesn't seem like you could do much with them.
>> > > >>
>> > > >
>> > > > Yes, the totals would be since the last rebalance. Maybe that isn't
>> > that
>> > > > useful. Might be better to capture the offsets and lag as Roger was
>> > > > suggesting. Thoughts?
>> > > >
>> > > >
>> > > >> * Why do transformations get their own metric but not converters?
>> And
>> > > are
>> > > >> we concerned at all about the performance impact of getting such
>> fine
>> > > >> grained info? Getting current time isn't free and we've seen before
>> > that
>> > > >> we
>> > > >> ended up w/ accidental performance regressions as we tried to
>> check it
>> > > too
>> > > >> frequently to enforce timeouts fine grained in the producer (iirc).
>> > > >> Batching helps w/ this, but on the consumer side, a
>> max.poll.records=1
>> > > >> setting could put you in a bad place, especially since transforms
>> > might
>> > > be
>> > > >> very lightweight (or nothing) and converters are expected to be
>> > > >> relatively cheap as well.
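The MBean naming pattern discussed in this thread can be exercised with nothing but the JDK. The sketch below builds a concrete `ObjectName` following the corrected source-task pattern — `write` rather than `produce`, and no stray character after the worker property — and checks it against a property pattern; the worker, connector, and task values are made up for illustration:

```java
import javax.management.ObjectName;

public class ConnectMetricNameCheck {
    public static void main(String[] args) throws Exception {
        // Concrete MBean name in the shape
        // kafka.connect:type=source-task-metrics,name=...,worker=...,connector=...,task=...
        ObjectName name = new ObjectName(
            "kafka.connect:type=source-task-metrics,"
            + "name=source-record-write-total,"
            + "worker=worker-1,connector=my-connector,task=0");

        // A property-pattern ObjectName, analogous to the regex form in the KIP:
        // the trailing ",*" matches any remaining key properties.
        ObjectName pattern = new ObjectName(
            "kafka.connect:type=source-task-metrics,"
            + "name=source-record-write-total,*");

        System.out.println(pattern.apply(name));        // true
        System.out.println(name.getKeyProperty("task")); // 0
    }
}
```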

Build failed in Jenkins: kafka-trunk-jdk8 #2012

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] KAFKA-5872; Fix transient failure in 
SslSelectorTest.testMuteOnOOM

--
[...truncated 2.53 MB...]

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic STARTED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.cleanGlobal(ResetIntegrationTest.java:363)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:172)

org.apache.kafka.streams.integration.ResetIntegrationTest > 

Re: [VOTE] KIP-197: Include Connector type in Connector REST API

2017-09-12 Thread Ted Yu
I was waiting for the end of day since there were two days across the
weekend.

Thanks for your reminder.

On Tue, Sep 12, 2017 at 11:23 AM, Gwen Shapira  wrote:

> Yes. I think you can close the vote (3 days passed and you have 4 binding
> votes).
>
> Gwen
>
> On Tue, Sep 12, 2017 at 11:21 AM Ted Yu  wrote:
>
> > Ismael had a vote on the DISCUSS thread:
> >
> >
> > http://search-hadoop.com/m/Kafka/uyzND18F2Bu15PQaW1?subj=
> Re+DISCUSS+KIP+197+Include+Connector+type+in+Connector+REST+API
> >
> > On Tue, Sep 12, 2017 at 11:14 AM, Gwen Shapira 
> wrote:
> >
> > > +1 (binding)
> > >
> > > On Mon, Sep 11, 2017 at 8:25 PM Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > > wrote:
> > >
> > > > +1 binding.
> > > >
> > > > Thanks for the contribution Ted! Simple addition, but makes the API
> > > > significantly more usable.
> > > >
> > > > -Ewen
> > > >
> > > > On Fri, Sep 8, 2017 at 7:46 PM, Sriram Subramanian  >
> > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > On Fri, Sep 8, 2017 at 4:33 PM, Randall Hauch 
> > > wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Randall
> > > > > >
> > > > > > On Fri, Sep 8, 2017 at 6:32 PM, Ted Yu 
> > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > > Please take a look at the following and cast your vote:
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > 197+Connect+REST+API+should+include+the+connector+type+
> > > > > > > when+describing+a+connector
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
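For reference, the change itself is small: the response from `GET /connectors/{name}` gains a `type` field whose value indicates whether the connector is a source or a sink. A sketch of a possible response body — the connector name and config values are illustrative:

```json
{
  "name": "local-file-source",
  "config": {
    "connector.class": "FileStreamSource",
    "file": "/tmp/test.txt",
    "topic": "connect-test"
  },
  "tasks": [
    { "connector": "local-file-source", "task": 0 }
  ],
  "type": "source"
}
```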


Re: [VOTE] KIP-197: Include Connector type in Connector REST API

2017-09-12 Thread Gwen Shapira
Yes. I think you can close the vote (3 days passed and you have 4 binding
votes).

Gwen

On Tue, Sep 12, 2017 at 11:21 AM Ted Yu  wrote:

> Ismael had a vote on the DISCUSS thread:
>
>
> http://search-hadoop.com/m/Kafka/uyzND18F2Bu15PQaW1?subj=Re+DISCUSS+KIP+197+Include+Connector+type+in+Connector+REST+API
>
> On Tue, Sep 12, 2017 at 11:14 AM, Gwen Shapira  wrote:
>
> > +1 (binding)
> >
> > On Mon, Sep 11, 2017 at 8:25 PM Ewen Cheslack-Postava  >
> > wrote:
> >
> > > +1 binding.
> > >
> > > Thanks for the contribution Ted! Simple addition, but makes the API
> > > significantly more usable.
> > >
> > > -Ewen
> > >
> > > On Fri, Sep 8, 2017 at 7:46 PM, Sriram Subramanian 
> > > wrote:
> > >
> > > > +1
> > > >
> > > > On Fri, Sep 8, 2017 at 4:33 PM, Randall Hauch 
> > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Randall
> > > > >
> > > > > On Fri, Sep 8, 2017 at 6:32 PM, Ted Yu 
> wrote:
> > > > >
> > > > > > Hi,
> > > > > > Please take a look at the following and cast your vote:
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > 197+Connect+REST+API+should+include+the+connector+type+
> > > > > > when+describing+a+connector
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-12 Thread Gwen Shapira
If I understand you correctly, you are saying:
1. KIP-190 will not affect anyone who doesn't use special characters in
their client IDs
2. Those who have special characters in client IDs already have tons of
metrics issues and won't be inconvenienced by a KIP that fixes them.

Did I get it right?

On Sat, Sep 9, 2017 at 9:54 AM Mickael Maison 
wrote:

> Hi Gwen, thanks for taking a look at the KIP.
>
> I understand your concern trying to make the transition as smooth as
> possible. However there are several issues with the way client-ids
> with special characters are handled:
> Client-ids that contain invalid ObjectName characters (colon, equals,
> etc) currently fail to be registered by the built-in JMX reporter so
> they already don't appear in all monitoring systems ! These also cause
> issues with Quotas.
>
> The Java clients as well as the kafka-configs.sh tool already reject
> them (even though the error you get from the Produce/Consumer is
> pretty cryptic).
>
> People currently using client-ids with special characters have to be
> running 3rd party clients and probably encounter strange quotas issues
> as well as missing metrics (if they use JMX).
>
> So if we really want to do the smallest possible change, we could only
> encode ObjectName special characters instead of all special
> characters. That way at least the JMX reporter would work correctly.
> Display a warning when using any other special characters. Then in a
> later release, encode everything like we currently do for the
> User/Principal.
>
> What do you think ?
>
> On Fri, Sep 1, 2017 at 7:33 AM, Gwen Shapira  wrote:
> > Thanks for bumping this. I do have a concern:
> >
> > This proposal changes the names of existing metrics - as such, it will
> > require all owners of monitoring systems to update their dashboards. It
> > will also complicate monitoring of multiple clusters with different
> > versions and require some modifications to existing monitoring automation
> > systems.
> >
> > What do you think of an alternative solution:
> > 1. For the next release, add the validations on the broker side and
> print a
> > "warning" that this client id is invalid, that it will break metrics and
> > that it will be rejected in newer versions.
> > 2. Few releases later, actually turn the validation on and return
> > InvalidClientID error to clients.
> >
> > We did something similar when we deprecated acks=2.
> >
> > Gwen
> >
> >
> > On Thu, Aug 31, 2017 at 12:13 PM Mickael Maison <
> mickael.mai...@gmail.com>
> > wrote:
> >
> >> Even though it's pretty non controversial, I was expecting a few
> comments.
> >> I'll wait until next week for comments then I'll start the vote.
> >>
> >> Thanks
> >>
> >> On Mon, Aug 21, 2017 at 6:51 AM, Mickael Maison
> >>  wrote:
> >> > Hi all,
> >> >
> >> > I have created a KIP to cleanup the way client-ids are handled by
> >> > brokers and clients.
> >> >
> >> > Currently the Java clients have some restrictions on the client-ids
> >> > that are not enforced by the brokers. Using 3rd party clients,
> >> > client-ids containing any characters can be used causing some strange
> >> > behaviours in the way brokers handle metrics and quotas.
> >> >
> >> > Feedback is appreciated.
> >> >
> >> > Thanks
> >>
>
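The failure mode Mickael describes is easy to reproduce with just the JDK: a client-id containing a colon or an equals sign yields a malformed `ObjectName`. In the sketch below, the metric name format is only in the style of the clients' JMX names, not the exact one they use:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class ClientIdObjectNameDemo {
    static boolean registrable(String clientId) {
        try {
            // Illustrative metric name in the style of the clients' JMX names.
            new ObjectName(
                "kafka.producer:type=producer-metrics,client-id=" + clientId);
            return true;
        } catch (MalformedObjectNameException e) {
            return false; // reserved characters break ObjectName construction
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(registrable("my-app")); // true
        System.out.println(registrable("my:app")); // false - colon is reserved
        System.out.println(registrable("my=app")); // false - equals is reserved

        // Quoting the value is one way a reporter could cope with such ids;
        // the key property then carries the quote characters.
        ObjectName quoted = new ObjectName(
            "kafka.producer:type=producer-metrics,client-id="
            + ObjectName.quote("my:app"));
        System.out.println(quoted.getKeyProperty("client-id"));
    }
}
```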


Re: [VOTE] KIP-197: Include Connector type in Connector REST API

2017-09-12 Thread Ted Yu
Ismael had a vote on the DISCUSS thread:

http://search-hadoop.com/m/Kafka/uyzND18F2Bu15PQaW1?subj=Re+DISCUSS+KIP+197+Include+Connector+type+in+Connector+REST+API

On Tue, Sep 12, 2017 at 11:14 AM, Gwen Shapira  wrote:

> +1 (binding)
>
> On Mon, Sep 11, 2017 at 8:25 PM Ewen Cheslack-Postava 
> wrote:
>
> > +1 binding.
> >
> > Thanks for the contribution Ted! Simple addition, but makes the API
> > significantly more usable.
> >
> > -Ewen
> >
> > On Fri, Sep 8, 2017 at 7:46 PM, Sriram Subramanian 
> > wrote:
> >
> > > +1
> > >
> > > On Fri, Sep 8, 2017 at 4:33 PM, Randall Hauch 
> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Randall
> > > >
> > > > On Fri, Sep 8, 2017 at 6:32 PM, Ted Yu  wrote:
> > > >
> > > > > Hi,
> > > > > Please take a look at the following and cast your vote:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 197+Connect+REST+API+should+include+the+connector+type+
> > > > > when+describing+a+connector
> > > > >
> > > > > Thanks
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] KIP-131 - Add access to OffsetStorageReader from SourceConnector

2017-09-12 Thread Gwen Shapira
Thanks for clarifying.

+1 again :)



On Sat, Sep 9, 2017 at 6:57 AM Randall Hauch  wrote:

> Gwen,
>
> I've had more time to look into the code. First, the OffsetStorageReader
> JavaDoc says: "OffsetStorageReader provides access to the offset storage
> used by sources. This can be used by connectors to determine offsets to
> start consuming data from. This is most commonly used during initialization
> of a task, but can also be used during runtime, e.g. when reconfiguring a
> task."
>
> Second, this KIP allows the SourceConnector implementations to access the
> OffsetStorageReader, but really the SourceConnector methods are *only*
> related to lifecycle changes when using the OffsetStorageReader would be
> appropriate per the comment above.
>
> In summary, I don't think there is any concern about the
> OffsetStorageReader being used inappropriate by the SourceConnector
> implementations.
>
> Randall
>
> On Fri, Sep 8, 2017 at 9:46 AM, Gwen Shapira  wrote:
>
> > Basically, you are saying that the part where the comment says: "Offset
> > data should only be read during startup or reconfiguration of a task."
> > is incorrect? because the API extension allows reading offset data at any
> > point in the lifecycle, right?
> >
> > On Fri, Sep 8, 2017 at 5:18 AM Florian Hussonnois  >
> > wrote:
> >
> > > Hi Shapira,
> > >
> > > We only expose the OffsetStorageReader to connectors, which rely on
> > > KafkaOffsetBackingStore. The store continuously consumes offsets from
> > kafka
> > > so I think we can't have stale data.
> > >
> > >
> > > Le 8 sept. 2017 06:13, "Randall Hauch"  a écrit :
> > >
> > > > The KIP and PR expose the OffsetStorageReader, which is already
> exposed
> > > to
> > > > the tasks. The OffsetStorageWriter is part of the implementation, but
> > was
> > > > not and is not exposed thru the API.
> > > >
> > > > > On Sep 7, 2017, at 9:04 PM, Gwen Shapira 
> wrote:
> > > > >
> > > > > I just re-read the code for the OffsetStorageWriter, and ran into
> > this
> > > > > comment:
> > > > >
> > > > > * Note that this only provides write functionality. This is
> > > > > intentional to ensure stale data is
> > > > > * never read. Offset data should only be read during startup or
> > > > > reconfiguration of a task. By
> > > > > * always serving those requests by reading the values from the
> > backing
> > > > > store, we ensure we never
> > > > > * accidentally use stale data. (One example of how this can occur:
> a
> > > > > task is processing input
> > > > > * partition A, writing offsets; reconfiguration causes partition A
> to
> > > > > be reassigned elsewhere;
> > > > > * reconfiguration causes partition A to be reassigned to this node,
> > > > > but now the offset data is out
> > > > > * of date). Since these offsets are created and managed by the
> > > > > connector itself, there's no way
> > > > > * for the offset management layer to know which keys are "owned" by
> > > > > which tasks at any given
> > > > > * time.
> > > > >
> > > > >
> > > > > I can't figure out how the KIP avoids the stale-reads problem
> > explained
> > > > here.
> > > > >
> > > > > Can you talk me through it? I'm cancelling my vote since right now
> > > > > exposing this interface sounds risky and misleading.
> > > > >
> > > > >
> > > > > Gwen
> > > > >
> > > > >
> > > > >> On Thu, Sep 7, 2017 at 5:04 PM Gwen Shapira 
> > > wrote:
> > > > >>
> > > > >> +1 (binding)
> > > > >>
> > > > >> Looking forward to see how connector implementations use this in
> > > > practice
> > > > >> :)
> > > > >>
> > > > >>> On Thu, Sep 7, 2017 at 3:49 PM Randall Hauch 
> > > wrote:
> > > > >>>
> > > > >>> I'd like to open the vote for KIP-131:
> > > > >>>
> > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-131+-+
> > > > Add+access+to+OffsetStorageReader+from+SourceConnector
> > > > >>>
> > > > >>> Thanks to Florian for submitting the KIP and the implementation,
> > and
> > > to
> > > > >>> everyone else that helped review.
> > > > >>>
> > > > >>> Best regards,
> > > > >>>
> > > > >>> Randall
> > > > >>>
> > > > >>
> > > >
> > >
> >
>
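To make the lifecycle usage concrete: per the JavaDoc quoted above, a connector would consult stored offsets during startup or reconfiguration to decide where its tasks should resume. The sketch below uses a pared-down stand-in for `OffsetStorageReader` (the real interface lives in `org.apache.kafka.connect.storage`), so the types and method names here are illustrative only:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Pared-down stand-in for the Connect OffsetStorageReader API.
interface OffsetReader {
    Map<String, Object> offset(Map<String, Object> sourcePartition);
}

public class ResumingSourceConnectorSketch {
    private final OffsetReader offsetReader;

    ResumingSourceConnectorSketch(OffsetReader offsetReader) {
        this.offsetReader = offsetReader;
    }

    /** On (re)configuration, consult stored offsets to build a task config. */
    Map<String, String> taskConfigFor(String table) {
        Map<String, Object> partition = Collections.singletonMap("table", table);
        Map<String, Object> offset = offsetReader.offset(partition);
        Map<String, String> config = new HashMap<>();
        config.put("table", table);
        // Resume from the stored position, or start from scratch if none.
        config.put("start.from",
            offset == null ? "0" : String.valueOf(offset.get("position")));
        return config;
    }

    public static void main(String[] args) {
        // Fake backing store holding one committed offset.
        OffsetReader reader = partition ->
            "orders".equals(partition.get("table"))
                ? Collections.singletonMap("position", 42)
                : null;
        ResumingSourceConnectorSketch connector =
            new ResumingSourceConnectorSketch(reader);
        System.out.println(connector.taskConfigFor("orders").get("start.from"));    // 42
        System.out.println(connector.taskConfigFor("customers").get("start.from")); // 0
    }
}
```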


Re: [VOTE] KIP-197: Include Connector type in Connector REST API

2017-09-12 Thread Gwen Shapira
+1 (binding)

On Mon, Sep 11, 2017 at 8:25 PM Ewen Cheslack-Postava 
wrote:

> +1 binding.
>
> Thanks for the contribution Ted! Simple addition, but makes the API
> significantly more usable.
>
> -Ewen
>
> On Fri, Sep 8, 2017 at 7:46 PM, Sriram Subramanian 
> wrote:
>
> > +1
> >
> > On Fri, Sep 8, 2017 at 4:33 PM, Randall Hauch  wrote:
> >
> > > +1 (non-binding)
> > >
> > > Randall
> > >
> > > On Fri, Sep 8, 2017 at 6:32 PM, Ted Yu  wrote:
> > >
> > > > Hi,
> > > > Please take a look at the following and cast your vote:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 197+Connect+REST+API+should+include+the+connector+type+
> > > > when+describing+a+connector
> > > >
> > > > Thanks
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Gwen Shapira
Ewen, you gave a nice talk at Kafka Summit where you warned about the
danger of SMTs that slow down the data pipe. If we don't provide the time
metrics, how will users know when their SMTs are causing performance issues?

Gwen

On Mon, Sep 11, 2017 at 7:50 PM Ewen Cheslack-Postava 
wrote:

> re: questions about additional metrics, I think we'll undoubtedly find more
> that people want in practice, but as I mentioned earlier I think it's
> better to add the ones we know we need and then fill out the rest as we
> figure it out. So, e.g., batch size metrics sound like they could be
> useful, but I'd probably wait until we have a clear use case. It seems
> likely that it could be useful in diagnosing slow connectors (e.g. the
> implementation just does something inefficient), but I'm not really sure
> about that yet.
>
> -Ewen
>
> On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch  wrote:
>
> > Based on Roger and Ewen's feedback, I removed the aggregate metrics as
> they
> > would be difficult to make use of without extra work. This simplified
> > things a great deal, and I took the opportunity to reorganize the groups
> of
> > metrics. Also, based upon Ewen's concerns regarding measuring
> > times/durations, I removed all time-related metrics except for the offset
> > commits and rebalances, which are infrequent enough to warrant the
> capture
> > of percentiles. Roger asked about capturing batch size metrics for source
> > and sink tasks, and offset lag metrics for sink tasks. Finally, Ewen
> > pointed out that all count/total metrics are only valid since the most
> > recent rebalance and are therefore less meaningful, and were removed.
> >
> > On Mon, Sep 11, 2017 at 6:50 PM, Randall Hauch  wrote:
> >
> > > Thanks, Ewen. Comments inline below.
> > >
> > > On Mon, Sep 11, 2017 at 5:46 PM, Ewen Cheslack-Postava <
> > e...@confluent.io>
> > > wrote:
> > >
> > >> Randall,
> > >>
> > >> A couple of questions:
> > >>
> > >> * Some metrics don't seem to have unique names? e.g.
> > >> source-record-produce-rate and source-record-produce-total seem like
> > they
> > >> are duplicated. Looks like maybe just an oversight that the second
> ones
> > >> should be changed from "produce" to "write".
> > >>
> > >
> > > Nice catch. You are correct - should be "write" instead of "produce". I
> > > will correct.
> > >
> > >
> > >> * I think there's a stray extra character in a couple of
> > >> places: kafka.connect:type=source-task-metrics,name=source-record-
> > >> produce-total,worker=([-.\w]+)l,connector=([-.\w]+),task=([\d]+)
> > >> has an extra char after the worker name.
> > >>
> > >
> > > Thanks. Removed in 2 places.
> > >
> > >
> > >> * Are the produce totals actually useful given rebalancing would
> cancel
> > >> them out anyway? Doesn't seem like you could do much with them.
> > >>
> > >
> > > Yes, the totals would be since the last rebalance. Maybe that isn't
> that
> > > useful. Might be better to capture the offsets and lag as Roger was
> > > suggesting. Thoughts?
> > >
> > >
> > >> * Why do transformations get their own metric but not converters? And
> > are
> > >> we concerned at all about the performance impact of getting such fine
> > >> grained info? Getting current time isn't free and we've seen before
> that
> > >> we
> > >> ended up w/ accidental performance regressions as we tried to check it
> > too
> > >> frequently to enforce timeouts fine grained in the producer (iirc).
> > >> Batching helps w/ this, but on the consumer side, a max.poll.records=1
> > >> setting could put you in a bad place, especially since transforms
> might
> > be
> > >> very lightweight (or nothing) and converters are expected to be
> > relatively
> > >> cheap as well.
> > >>
> > >
> > > We could remove the read, transform, and put time-based metrics for
> sink
> > > tasks, and poll, transform, and write time-based metrics. Can/should
> they
> > > be replaced with anything else?
> > >
> > >
> > >> * If we include the worker id everywhere and don't have metrics
> without
> > >> that included, isn't that a pain for users that dump this data into
> some
> > >> other system? They have to know which worker the connector/task is
> > >> currently on *or* need to do extra work to merge the metrics from
> across
> > >> machines. Including versions with the worker ID can make sense for
> > >> completeness and accuracy (e.g. technically there are still very slim
> > >> risks
> > >> of having a task running twice due to zombies), but it seems like bad
> > >> usability for the common case.
> > >>
> > >
> > > Part of the reason was also to help identify where each of the metrics
> > > came from, but per the next comment this may not be as useful, either.
> > > So remove the worker ID in all the task and connector metric names?
> What
> > > about the worker metrics?
> > >
> > >
> > >> * Is aggregating things like source record rate at the (worker,
> > connector)
> > >> level really useful since 

Re: [DISCUSS] KIP-199: Add Kafka Connect offset reset tool

2017-09-12 Thread Gwen Shapira
>
>
> > * re: naming, can we avoid including 'source' in the command name? even
> if
> > that's all it supports today, I don't think we want to restrict it. While
> > we only *require* this for source offsets, I think for users it will,
> > long-term, be way more natural to consider connect offsets generally and
> > not need to know how to drop down to consumer offsets for sink
> connectors.
> > In fact, in an ideal world, many users/operators may not even
> *understand*
> > consumer offsets deeply while still being generally familiar with connect
> > offsets. We can always include an error message for now if they try to do
> > something with a sink connector (which we presumably need to detect
> anyway
> > to correctly handle source connectors).
> >
>
> ack
>
>
Sorry about bikeshedding, but:
I think a general name for something that has limited functionality is very
confusing (and we've seen this play out before). Why not name the tool to
describe what it does and change its name if we change the functionality?

>
>


[GitHub] kafka pull request #3838: [KAFKA-1194 Invokes unmap if on Windows OS]

2017-09-12 Thread manmedia
GitHub user manmedia opened a pull request:

https://github.com/apache/kafka/pull/3838

[KAFKA-1194 Invokes unmap if on Windows OS]

If on Windows OS, forces unmapping of mmap so that segment suffixes can be 
changed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manmedia/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3838.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3838


commit 065e32fab15fd55dbb5fe456bbf3ebdd651d42d8
Author: M. Manna 
Date:   2017-09-12T17:16:33Z

[KAFKA-1194 Invokes unmap if on Windows OS]




---


Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Randall Hauch
On Tue, Sep 12, 2017 at 10:36 AM, Roger Hoover 
wrote:

> Randall/Ewen,
>
> I think the timing info is still useful even if it's measured since the
> last rebalance.  How else do you know where time is being spent?
>

I think Ewen's concern (correct me if I'm wrong) is that measuring
time-based metrics might result in excessive performance degradation,
especially when batch sizes are small.


>
> The use case for seeing the batch size is that you generally have two knobs
> to configure - max batch size and max wait time.  The batch size metrics
> would tell you how full your batches are based on your current linger time
> so you can adjust the config.
>

It does seem that batch sizes are useful, and the KIP includes these
("batch-size-max" and "batch-size-avg").
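
Roger's two-knob point can be illustrated with a tiny helper: given the
observed "batch-size-avg" and the configured maximum batch size, compute how
full batches are running so an operator can decide which knob to turn. The
method and threshold here are purely illustrative, not framework code.

```java
public class BatchTuning {
    /**
     * Returns observed batch fullness as a fraction of the configured cap,
     * i.e. batch-size-avg / max batch size. A value well below 1.0 suggests
     * raising the max wait (linger) time; a value near 1.0 suggests the
     * batch size cap itself is the limiting knob.
     */
    public static double batchFullness(double batchSizeAvg, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        return batchSizeAvg / maxBatchSize;
    }

    public static void main(String[] args) {
        // Example: batches average 120 records against a cap of 500.
        System.out.println(batchFullness(120.0, 500)); // prints 0.24
    }
}
```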


>
> Cheers,
>
> Roger
>
> On Mon, Sep 11, 2017 at 7:50 PM, Ewen Cheslack-Postava 
> wrote:
>
> > re: questions about additional metrics, I think we'll undoubtedly find
> more
> > that people want in practice, but as I mentioned earlier I think it's
> > better to add the ones we know we need and then fill out the rest as we
> > figure it out. So, e.g., batch size metrics sound like they could be
> > useful, but I'd probably wait until we have a clear use case. It seems
> > likely that it could be useful in diagnosing slow connectors (e.g. the
> > implementation just does something inefficient), but I'm not really sure
> > about that yet.
> >
> > -Ewen
> >
> > On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch  wrote:
> >
> > > Based on Roger and Ewen's feedback, I removed the aggregate metrics as
> > they
> > > would be difficult to make use of without extra work. This simplified
> > > things a great deal, and I took the opportunity to reorganize the
> groups
> > of
> > > metrics. Also, based upon Ewen's concerns regarding measuring
> > > times/durations, I removed all time-related metrics except for the
> offset
> > > commits and rebalances, which are infrequent enough to warrant the
> > capture
> > > of percentiles. Roger asked about capturing batch size metrics for
> source
> > > and sink tasks, and offset lag metrics for sink tasks. Finally, Ewen
> > > pointed out that all count/total metrics are only valid since the most
> > > recent rebalance and are therefore less meaningful, and were removed.
> > >
> > > On Mon, Sep 11, 2017 at 6:50 PM, Randall Hauch 
> wrote:
> > >
> > > > Thanks, Ewen. Comments inline below.
> > > >
> > > > On Mon, Sep 11, 2017 at 5:46 PM, Ewen Cheslack-Postava <
> > > e...@confluent.io>
> > > > wrote:
> > > >
> > > >> Randall,
> > > >>
> > > >> A couple of questions:
> > > >>
> > > >> * Some metrics don't seem to have unique names? e.g.
> > > >> source-record-produce-rate and source-record-produce-total seem like
> > > they
> > > >> are duplicated. Looks like maybe just an oversight that the second
> > ones
> > > >> should be changed from "produce" to "write".
> > > >>
> > > >
> > > > Nice catch. You are correct - should be "write" instead of
> "produce". I
> > > > will correct.
> > > >
> > > >
> > > >> * I think there's a stray extra character in a couple of
> > > >> places: kafka.connect:type=source-task-metrics,name=source-record-
> > > >> produce-total,worker=([-.\w]+)l,connector=([-.\w]+),task=([\d]+)
> > > >> has an extra char after the worker name.
> > > >>
> > > >
> > > > Thanks. Removed in 2 places.
> > > >
> > > >
> > > >> * Are the produce totals actually useful given rebalancing would
> > cancel
> > > >> them out anyway? Doesn't seem like you could do much with them.
> > > >>
> > > >
> > > > Yes, the totals would be since the last rebalance. Maybe that isn't
> > that
> > > > useful. Might be better to capture the offsets and lag as Roger was
> > > > suggesting. Thoughts?
> > > >
> > > >
> > > >> * Why do transformations get their own metric but not converters?
> And
> > > are
> > > >> we concerned at all about the performance impact of getting such
> fine
> > > >> grained info? Getting current time isn't free and we've seen before
> > that
> > > >> we
> > > >> ended up w/ accidental performance regressions as we tried to check
> it
> > > too
> > > >> frequently to enforce timeouts fine grained in the producer (iirc).
> > > >> Batching helps w/ this, but on the consumer side, a
> max.poll.records=1
> > > >> setting could put you in a bad place, especially since transforms
> > might
> > > be
> > > >> very lightweight (or nothing) and converters are expected to be
> > > relatively
> > > >> cheap as well.
> > > >>
> > > >
> > > > We could remove the read, transform, and put time-based metrics for
> > sink
> > > > tasks, and poll, transform, and write time-based metrics. Can/should
> > they
> > > > be replaced with anything else?
> > > >
> > > >
> > > >> * If we include the worker id everywhere and don't have metrics
> > without
> > > >> that included, isn't that a pain for users that dump this data into
> > some
> > > >> other system? 

Re: [VOTE] KIP-195: AdminClient.createPartitions()

2017-09-12 Thread Tom Bentley
Hi Jun,

Thanks for the comments.

On 12 September 2017 at 18:15, Jun Rao  wrote:

> Hi, Tom,
>
> Thanks for the KIP.  +1. Just a couple of minor comments below.
>
> 1. The KIP has "INVALID_PARTITIONS (37) If the partition count was <= the
> current partition count for the topic." We probably want to add one more
> constraint: # of replicas in each new partition has to be the same as the
> existing replication factor for the topic.
>

This is presently covered by the INVALID_REQUEST error code, but sure I can
change it.

2. The KIP has the following.
>
>- REASSIGNMENT_IN_PROGRESS (new) If a partition reassignment is in
> progress. It is necessary to prevent increasing partitions at the same
> time so that we can be sure the partition has a meaningful replication
> factor.
>
>
> Currently, it's possible to increase the partition count while a
> partition reassignment is in progress (adding partitions is much
> cheaper than partition reassignment). On the other hand, if a topic is
> being deleted, we will prevent adding new partitions. So, we probably
> want to do the same in the KIP.
>

Increasing the partition count at the same time as a reassignment was
covered in the [DISCUSS] thread. The problem is that during a reassignment
there isn't a stable notion of replication factor. The brokers don't really
know about replication factor at all, AFAICS. Currently when the partitions
are increased we just use the replication factor inferred from partition 0.
Ismael suggested to add this restriction to prevent this edge case.

I don't see an error code specifically pertaining to topic actions on
topics being deleted, so I guess INVALID_TOPIC_EXCEPTION would suffice for
the deletion case.

Thanks again,

Tom
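
The two constraints under discussion can be sketched as a standalone
validation routine. This is only an illustration of the proposed checks and
error-code mapping, not broker code; in particular, whether the replica-list
mismatch maps to INVALID_REQUEST or a more specific code is exactly what is
being decided above.

```java
public class CreatePartitionsChecks {
    /**
     * Sketch of the validation discussed in the thread: the new partition
     * count must exceed the current count, and any explicitly assigned
     * replica list must match the topic's existing replication factor.
     * Returns the (illustrative) error code name, or "NONE" if valid.
     */
    public static String validate(int currentCount, int newCount,
                                  int replicationFactor, int assignedReplicas) {
        if (newCount <= currentCount) {
            return "INVALID_PARTITIONS";
        }
        if (assignedReplicas != replicationFactor) {
            // Tom notes this case is presently covered by INVALID_REQUEST.
            return "INVALID_REQUEST";
        }
        return "NONE";
    }

    public static void main(String[] args) {
        System.out.println(validate(3, 2, 2, 2)); // shrinking the topic
        System.out.println(validate(3, 6, 2, 3)); // replica list too long
        System.out.println(validate(3, 6, 2, 2)); // valid request
    }
}
```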


Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Randall Hauch
Hi, James. I was mistaken about how the Kafka metrics are converted to
MBeans and attributes. The MBean is constructed from the group and tags,
and the metrics show up as attributes on the MBean. I'll update the KIP to
reflect this.

On Tue, Sep 12, 2017 at 1:43 AM, James Cheng  wrote:

> Thanks for the KIP, Randall.
>
> The KIP has one MBean per metric name. Can I suggest an alternate grouping?
>
> kafka.connect:type=connector-metrics,connector=([-.\w]+)
> connector-type
> connector-class
> connector-version
> status
>
> kafka.connect:type=task-metrics,connector=([-.\w]+),task=([\d]+)
> status
> pause-ratio
> offset-commit-success-percentage
> offset-commit-failure-percentage
> offset-commit-max-time
> offset-commit-99p-time
> offset-commit-95p-time
> offset-commit-90p-time
> offset-commit-75p-time
> offset-commit-50p-time
> batch-size-max
> batch-size-avg
>
> kafka.connect:type=source-task-metrics,connector=([-.\w]+),task=([\d]+)
> source-record-poll-rate
> source-record-write-rate
>
> kafka.connect:type=sink-task-metrics,connector=([-.\w]+),task=([\d]+)
> sink-record-read-rate
> sink-record-send-rate
> sink-record-lag-max
> partition-count
> offset-commit-95p-time
> offset-commit-90p-time
> offset-commit-75p-time
> offset-commit-50p-time
> batch-size-max
> batch-size-avg
>
> kafka.connect:type=sink-task-metrics,connector=([-.\w]+),
> task=([\d]+),topic=([-.\w]+),partition=([\d]+)
> sink-record-lag
> sink-record-lag-avg
> sink-record-lag-max
>
> kafka.connect:type=connect-coordinator-metrics
> task-count
> connector-count
> leader-name
> state
> rest-request-rate
>
> kafka.connect:type=connect-coordinator-metrics,name=assigned-tasks
> assigned-tasks (existing metric, so can't merge in above without
> breaking compatibility)
> kafka.connect:type=connect-coordinator-metrics,name=assigned-connectors
> assigned-connectors (existing metric, so can't merge in above
> without breaking compatibility)
>
> kafka.connect:type=connect-worker-rebalance-metrics
> rebalance-success-total
> rebalance-success-percentage
> rebalance-failure-total
> rebalance-failure-percentage
> rebalance-max-time
> rebalance-99p-time
> rebalance-95p-time
> rebalance-90p-time
> rebalance-75p-time
> rebalance-50p-time
> time-since-last-rebalance
> task-failure-rate
>
> This lets you use a single MBean selector to select multiple related
> attributes all at once. You can use JMX's wildcards to target which
> connectors or tasks or topics or partitions you care about.
>
> Also notice that for the topic and partition level metrics, the attributes
> are named identically ("sink-record-lag-avg" instead of
> "sink-record-{topic}-{partition}.records-lag-avg"), so monitoring systems
> have a consistent string they can use, instead of needing to
> prefix-and-suffix matching against the attribute name. And TBH, it
> integrates better with the work I'm doing in https://issues.apache.org/
> jira/browse/KAFKA-3480
>
> -James
>
> > On Sep 7, 2017, at 4:50 PM, Randall Hauch  wrote:
> >
> > Hi everyone.
> >
> > I've created a new KIP to add metrics to the Kafka Connect framework:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 196%3A+Add+metrics+to+Kafka+Connect+framework
> >
> > The KIP approval deadline is looming, so if you're interested in Kafka
> > Connect metrics please review and provide feedback as soon as possible.
> I'm
> > interested not only in whether the metrics are sufficient and
> appropriate,
> > but also in whether the MBean naming conventions are okay.
> >
> > Best regards,
> >
> > Randall
>
>
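
James's point about wildcard selectors can be demonstrated with the JDK's
javax.management.ObjectName alone, with no running worker; the MBean names
below are the ones proposed in his message, so treat them as tentative.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class MBeanPatternDemo {
    /** Returns true if the concrete MBean name matches the wildcard pattern. */
    public static boolean matches(String pattern, String name) {
        try {
            return new ObjectName(pattern).apply(new ObjectName(name));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // One selector covers every task of every sink connector.
        String pattern = "kafka.connect:type=sink-task-metrics,connector=*,task=*";
        System.out.println(matches(pattern,
                "kafka.connect:type=sink-task-metrics,connector=my-sink,task=0"));
        System.out.println(matches(pattern,
                "kafka.connect:type=source-task-metrics,connector=my-src,task=0"));
    }
}
```

Because each connector/task pair is one MBean with many attributes, a single
such pattern passed to MBeanServer.queryNames would return all of them at once.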


Re: [DISCUSS] KIP-199: Add Kafka Connect offset reset tool

2017-09-12 Thread Randall Hauch
Thanks for the comments. As mentioned in the VOTE thread, I've withdrawn
the vote since this proposal is obviously incomplete. I have updated the
KIP with your comments, and will discuss each inline below:

On Mon, Sep 11, 2017 at 11:30 PM, Ewen Cheslack-Postava 
wrote:

> A couple of comments:
>
> * I made some minor, non-critical updates to the motivation section to add
> a bit more color/background/clarity. In particular, clarifying how things
> are connected to consumer groups for sinks. Still, the motivation isn't
> entirely clear about how this works -- it is definitely a completely
> offline tool. Specifying that users stop a connector entirely, read/modify
> offsets, then restart, would probably be useful info for users of the
> feature.
>

ack


> * re: naming, can we avoid including 'source' in the command name? even if
> that's all it supports today, I don't think we want to restrict it. While
> we only *require* this for source offsets, I think for users it will,
> long-term, be way more natural to consider connect offsets generally and
> not need to know how to drop down to consumer offsets for sink connectors.
> In fact, in an ideal world, many users/operators may not even *understand*
> consumer offsets deeply while still being generally familiar with connect
> offsets. We can always include an error message for now if they try to do
> something with a sink connector (which we presumably need to detect anyway
> to correctly handle source connectors).
>

ack


> * re: naming of bin/ script, this is largely an artifact of my prototype
> and i don't think an explicit decision (my bad), but currently the connect
> commands don't have a kafka- prefix. I'm fine going either way, but if we
> standardize on kafka-connect-*, I would like to see a simple follow up to
> add the corresponding kafka-connect-standalone[.sh] and
> kafka-connect-distributed[.sh] commands to get things consistent, ideally
> with corresponding deprecations and plans for removal.
>

ack. Let's stick with "connect-offsets.sh" for now to follow the existing
convention.


> * Does export print all the offset commits currently available in the
> topic, or only the most recent? I am curious since this tool could
> potentially be useful for rewinding offsets if things get into a bad state,
> in which case printing all offsets still available could be useful. I think
> this should be a pretty safe extension since if you pipe all available
> offsets -> write offsets, you'll still just get the latest offset commit
> once the connector is restarted.
>

Addressed with a new option.


> * This should probably specify schemaless data. I know there were issues
> wrt internal converters that basically enforced this, but we should just be
> explicit about it moving forward.
>

I was hoping to just use the existing offset storage reader/writer
mechanism, but the OffsetStorageReader requires knowledge of the partition
information. Obviously we need to think more about this whole tool. One
option is to use a different implementation, or to enhance the existing
implementation class (not API) to allow doing this.

Anyway, if this is the way we go and we use the existing worker configs to
start the offset storage reader/writer, I'm not sure why schemaless would
matter.


> * nit, but i'd just change --connector-names -> --connectors. i don't think
> we use -names anywhere else yet and the natural plural seems... more
> natural to me
>

ack.


> * While we've talked about it for awhile in various KIP and pre-KIP
> discussions, standardizing on JSON input/output is new. I'm fine with it,
> but I do wonder if we should include a format flag by default or just add
> it as needed. All the "old" commands will need an explicit format flag if
> we want them to support JSON.
>

sure, we can discuss. Personally, I'd love to see YAML since that's less
verbose and easier to deal with, but that's not really an option since it
adds other complexity, not to mention that nothing in Kafka uses YAML to my
knowledge.


> * can we get rid of --offsets-file? isn't this the same as piping stdout to
> a file? does it benefit us some other way i am missing? I was actually
> confused at first b/c I thought --offsets-file was referring to the
> standalone mode file to *read from* not the output JSON file to *write to*.
>

ack.


> * export output: definitely a nit, but do we need to repeat the offset in
> all the entries or should the top-level just be a map of connector -> list
> of (partition, offset)? overhead here is probably minimal so I don't feel
> strongly, but I don't see a good reason not to use a more efficient
> encoding.
>

ack. I agree nesting the offsets related to a connector within a connector-level
doc is cleaner and probably easier to use. I considered that at first, but
then I thought that just having the pairs may be cleaner for representing
history. I'm not sure that's true, and the nested approach definitely seems
better. Another reason 
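
The nested per-connector encoding being discussed might look like the
following; the connector name, partition fields, and offset fields are all
made up for illustration, since the actual schema is connector-specific.

```java
import java.util.List;
import java.util.Map;

public class OffsetExportShape {
    /**
     * Hypothetical export document: connector -> list of (partition, offset)
     * pairs, rather than repeating the connector name in every entry.
     */
    public static Map<String, List<Map<String, Map<String, Object>>>> sampleExport() {
        Map<String, Object> partition = Map.of("filename", "/data/in.txt");
        Map<String, Object> offset = Map.of("position", 1234L);
        return Map.of("local-file-source",
                List.of(Map.of("partition", partition, "offset", offset)));
    }

    public static void main(String[] args) {
        // Top level is keyed by connector name.
        System.out.println(sampleExport().keySet()); // prints [local-file-source]
    }
}
```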

Build failed in Jenkins: kafka-trunk-jdk8 #2011

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-5655; materialized count, aggregate, reduce to KGroupedTable

--
[...truncated 2.53 MB...]
org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic STARTED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.cleanGlobal(ResetIntegrationTest.java:363)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:172)


Build failed in Jenkins: kafka-trunk-jdk7 #2751

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-5655; materialized count, aggregate, reduce to KGroupedTable

--
[...truncated 2.03 MB...]

org.apache.kafka.common.ClusterTest > testBootstrap PASSED

org.apache.kafka.common.cache.LRUCacheTest > testEviction STARTED

org.apache.kafka.common.cache.LRUCacheTest > testEviction PASSED

org.apache.kafka.common.cache.LRUCacheTest > testPutGet STARTED

org.apache.kafka.common.cache.LRUCacheTest > testPutGet PASSED

org.apache.kafka.common.cache.LRUCacheTest > testRemove STARTED

org.apache.kafka.common.cache.LRUCacheTest > testRemove PASSED

org.apache.kafka.common.utils.ShellTest > testEchoHello STARTED

org.apache.kafka.common.utils.ShellTest > testEchoHello PASSED

org.apache.kafka.common.utils.ShellTest > testRunProgramWithErrorReturn STARTED

org.apache.kafka.common.utils.ShellTest > testRunProgramWithErrorReturn PASSED

org.apache.kafka.common.utils.ShellTest > testHeadDevZero STARTED

org.apache.kafka.common.utils.ShellTest > testHeadDevZero PASSED

org.apache.kafka.common.utils.ShellTest > testAttemptToRunNonExistentProgram 
STARTED

org.apache.kafka.common.utils.ShellTest > testAttemptToRunNonExistentProgram 
PASSED

org.apache.kafka.common.utils.UtilsTest > testAbs STARTED

org.apache.kafka.common.utils.UtilsTest > testAbs PASSED

org.apache.kafka.common.utils.UtilsTest > testMin STARTED

org.apache.kafka.common.utils.UtilsTest > testMin PASSED

org.apache.kafka.common.utils.UtilsTest > toArray STARTED

org.apache.kafka.common.utils.UtilsTest > toArray PASSED

org.apache.kafka.common.utils.UtilsTest > utf8ByteBufferSerde STARTED

org.apache.kafka.common.utils.UtilsTest > utf8ByteBufferSerde PASSED

org.apache.kafka.common.utils.UtilsTest > testJoin STARTED

org.apache.kafka.common.utils.UtilsTest > testJoin PASSED

org.apache.kafka.common.utils.UtilsTest > 
testReadFullyOrFailWithPartialFileChannelReads STARTED

org.apache.kafka.common.utils.UtilsTest > 
testReadFullyOrFailWithPartialFileChannelReads PASSED

org.apache.kafka.common.utils.UtilsTest > toArrayDirectByteBuffer STARTED

org.apache.kafka.common.utils.UtilsTest > toArrayDirectByteBuffer PASSED

org.apache.kafka.common.utils.UtilsTest > testReadBytes STARTED

org.apache.kafka.common.utils.UtilsTest > testReadBytes PASSED

org.apache.kafka.common.utils.UtilsTest > testGetHost STARTED

org.apache.kafka.common.utils.UtilsTest > testGetHost PASSED

org.apache.kafka.common.utils.UtilsTest > testGetPort STARTED

org.apache.kafka.common.utils.UtilsTest > testGetPort PASSED

org.apache.kafka.common.utils.UtilsTest > 
testReadFullyWithPartialFileChannelReads STARTED

org.apache.kafka.common.utils.UtilsTest > 
testReadFullyWithPartialFileChannelReads PASSED

org.apache.kafka.common.utils.UtilsTest > testRecursiveDelete STARTED

org.apache.kafka.common.utils.UtilsTest > testRecursiveDelete PASSED

org.apache.kafka.common.utils.UtilsTest > testReadFullyOrFailWithRealFile 
STARTED

org.apache.kafka.common.utils.UtilsTest > testReadFullyOrFailWithRealFile PASSED

org.apache.kafka.common.utils.UtilsTest > writeToBuffer STARTED

org.apache.kafka.common.utils.UtilsTest > writeToBuffer PASSED

org.apache.kafka.common.utils.UtilsTest > testReadFullyIfEofIsReached STARTED

org.apache.kafka.common.utils.UtilsTest > testReadFullyIfEofIsReached PASSED

org.apache.kafka.common.utils.UtilsTest > testFormatBytes STARTED

org.apache.kafka.common.utils.UtilsTest > testFormatBytes PASSED

org.apache.kafka.common.utils.UtilsTest > utf8ByteArraySerde STARTED

org.apache.kafka.common.utils.UtilsTest > utf8ByteArraySerde PASSED

org.apache.kafka.common.utils.UtilsTest > testCloseAll STARTED

org.apache.kafka.common.utils.UtilsTest > testCloseAll PASSED

org.apache.kafka.common.utils.UtilsTest > testFormatAddress STARTED

org.apache.kafka.common.utils.UtilsTest > testFormatAddress PASSED

org.apache.kafka.common.utils.JavaTest > testLoadKerberosLoginModule STARTED

org.apache.kafka.common.utils.JavaTest > testLoadKerberosLoginModule PASSED

org.apache.kafka.common.utils.JavaTest > testIsIBMJdk STARTED

org.apache.kafka.common.utils.JavaTest > testIsIBMJdk PASSED

org.apache.kafka.common.utils.ChecksumsTest > testUpdateInt STARTED

org.apache.kafka.common.utils.ChecksumsTest > testUpdateInt PASSED

org.apache.kafka.common.utils.ChecksumsTest > testUpdateLong STARTED

org.apache.kafka.common.utils.ChecksumsTest > testUpdateLong PASSED

org.apache.kafka.common.utils.ChecksumsTest > 
testUpdateByteBufferWithOffsetPosition STARTED

org.apache.kafka.common.utils.ChecksumsTest > 
testUpdateByteBufferWithOffsetPosition PASSED

org.apache.kafka.common.utils.ChecksumsTest > testUpdateByteBuffer STARTED

org.apache.kafka.common.utils.ChecksumsTest > testUpdateByteBuffer PASSED

org.apache.kafka.common.utils.ByteUtilsTest > testInvalidVarlong STARTED

org.apache.kafka.common.utils.ByteUtilsTest > testInvalidVarlong PASSED


Re: [VOTE] KIP-182 - Reduce Streams DSL overloads and allow easier use of custom storage engines

2017-09-12 Thread Guozhang Wang
Hi Damian,

Why we are deprecating KTable.through while keeping KTable.to? Should we
either keep both of them or deprecate both of them in favor or
KTable.toStream if people agree that it is confusing to users?


Guozhang


On Tue, Sep 12, 2017 at 1:18 AM, Damian Guy  wrote:

> Hi All,
>
> A minor update to the KIP, i needed to add KTable.to(Produced) for
> consistency. KTable.through will be deprecated in favour of using
> KTable.toStream().through()
>
> Thanks,
> Damian
>
> On Thu, 7 Sep 2017 at 08:52 Damian Guy  wrote:
>
> > Thanks all. The vote is now closed and the KIP has been accepted with:
> > 2 non binding votes - bill and matthias
> > 3 binding  - Damian, Guozhang, Sriram
> >
> > Regards,
> > Damian
> >
> > On Tue, 5 Sep 2017 at 22:24 Sriram Subramanian  wrote:
> >
> >> +1
> >>
> >> On Tue, Sep 5, 2017 at 1:33 PM, Guozhang Wang 
> wrote:
> >>
> >> > +1
> >> >
> >> > On Fri, Sep 1, 2017 at 3:45 PM, Matthias J. Sax <
> matth...@confluent.io>
> >> > wrote:
> >> >
> >> > > +1
> >> > >
> >> > > On 9/1/17 2:53 PM, Bill Bejeck wrote:
> >> > > > +1
> >> > > >
> >> > > > On Thu, Aug 31, 2017 at 10:20 AM, Damian Guy <
> damian@gmail.com>
> >> > > wrote:
> >> > > >
> >> > > >> Thanks everyone for voting! Unfortunately i've had to make a bit
> >> of an
> >> > > >> update based on some issues found during implementation.
> >> > > >> The main changes are:
> >> > > >> BytesStoreSupplier -> StoreSupplier
> >> > > >> Addition of:
> >> > > >> WindowBytesStoreSupplier, KeyValueBytesStoreSupplier,
> >> > > >> SessionBytesStoreSupplier that will restrict store types to
> >> > > >> <Bytes, byte[]>
> >> > > >> 3 new overloads added to Materialized to enable developers to
> >> create a
> >> > > >> Materialized of the appropriate type, i..e, WindowStore etc
> >> > > >> Update DSL where Materialized is used such that the stores have
> >> > generic
> >> > > >> types of 
> >> > > >> Some minor changes to the arguments to
> Store#persistentWindowStore
> >> and
> >> > > >> Store#persistentSessionStore
> >> > > >>
> >> > > >> Please take a look and recast the votes.
> >> > > >>
> >> > > >> Thanks for your time,
> >> > > >> Damian
> >> > > >>
> >> > > >> On Fri, 25 Aug 2017 at 17:05 Matthias J. Sax <
> >> matth...@confluent.io>
> >> > > >> wrote:
> >> > > >>
> >> > > >>> Thanks Damian. Great KIP!
> >> > > >>>
> >> > > >>> +1
> >> > > >>>
> >> > > >>>
> >> > > >>> -Matthias
> >> > > >>>
> >> > > >>> On 8/25/17 6:45 AM, Damian Guy wrote:
> >> > >  Hi,
> >> > > 
> >> > >  I've just realised we need to add two methods to
> >> StateStoreBuilder
> >> > or
> >> > > >> it
> >> > >  isn't going to work:
> >> > > 
> >> > >  Map<String, String> logConfig();
> >> > >  boolean loggingEnabled();
> >> > > 
> >> > >  These are needed when we are building the topology and
> >> determining
> >> > >  changelog topic names and configs.
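
A minimal sketch (Python stand-ins, not the actual Kafka Streams Java API)
of why the builder must expose its logging settings: when the topology is
built, the changelog topic name and its per-topic configs are derived from
the store builder. The `<application.id>-<store>-changelog` naming follows
the convention Kafka Streams uses; the class and field names here are
illustrative.

```python
class StateStoreBuilder:
    """Stand-in for the proposed builder exposing its logging settings."""

    def __init__(self, name, logging_enabled=True, log_config=None):
        self.name = name
        self._logging_enabled = logging_enabled
        self._log_config = log_config or {}

    def logging_enabled(self):   # boolean loggingEnabled()
        return self._logging_enabled

    def log_config(self):        # Map<String, String> logConfig()
        return dict(self._log_config)


def changelog_topic_for(app_id, builder):
    """Return (topic_name, config), or None if change-logging is disabled."""
    if not builder.logging_enabled():
        return None
    return ("%s-%s-changelog" % (app_id, builder.name), builder.log_config())


builder = StateStoreBuilder("counts", log_config={"retention.ms": "86400000"})
print(changelog_topic_for("my-app", builder))
```

Without the two accessors, the topology builder has no way to decide
whether a changelog topic is needed at all, let alone how to configure it.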
> >> > > 
> >> > > 
> >> > >  I've also updated the KIP to add
> >> > > 
> >> > >  StreamsBuilder#stream(String topic)
> >> > > 
> >> > >  StreamsBuilder#stream(String topic, Consumed options)
> >> > > 
> >> > > 
> >> > >  Thanks
> >> > > 
> >> > > 
> >> > >  On Thu, 24 Aug 2017 at 22:11 Sriram Subramanian <
> >> r...@confluent.io>
> >> > > >>> wrote:
> >> > > 
> >> > > > +1
> >> > > >
> >> > > > On Thu, Aug 24, 2017 at 10:20 AM, Guozhang Wang <
> >> > wangg...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > >> +1. Thanks Damian!
> >> > > >>
> >> > > >> On Thu, Aug 24, 2017 at 9:47 AM, Bill Bejeck <
> >> bbej...@gmail.com>
> >> > > >>> wrote:
> >> > > >>
> >> > > >>> Thanks for the KIP!
> >> > > >>>
> >> > > >>> +1
> >> > > >>>
> >> > > >>> Thanks,
> >> > > >>> Bill
> >> > > >>>
> >> > > >>> On Thu, Aug 24, 2017 at 12:25 PM, Damian Guy <
> >> > damian@gmail.com
> >> > > >
> >> > > >> wrote:
> >> > > >>>
> >> > >  Hi,
> >> > > 
> >> > >  I'd like to kick off the voting thread for KIP-182:
> >> > >  https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> > >  182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+
> >> > >  use+of+custom+storage+engines
> >> > > 
> >> > >  Thanks,
> >> > >  Damian
> >> > > 
> >> > > >>>
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >> --
> >> > > >> -- Guozhang
> >> > > >>
> >> > > >
> >> > > 
> >> > > >>>
> >> > > >>>
> >> > > >>
> >> > > >
> >> > >
> >> > >
> >> >
> >> >
> >> > --
> >> > -- Guozhang
> >> >
> >>
> >
>



-- 
-- Guozhang


Re: [VOTE] KIP-195: AdminClient.createPartitions()

2017-09-12 Thread Jun Rao
Hi, Tom,

Thanks for the KIP.  +1. Just a couple of minor comments below.

1. The KIP has "INVALID_PARTITIONS (37) If the partition count was <= the
current partition count for the topic." We probably want to add one more
constraint: # of replicas in each new partition has to be the same as the
existing replication factor for the topic.


2. The KIP has the following.

   - REASSIGNMENT_IN_PROGRESS (new) If a partition reassignment is in
progress. It is necessary to prevent increasing partitions at the same
time so that we can be sure the partition has a meaningful replication
factor.


Currently, it's possible to increase the partition count while a
partition reassignment is in progress (adding partitions is much
cheaper than partition reassignment). On the other hand, if a topic is
being deleted, we will prevent adding new partitions. So, we probably
want to do the same in the KIP.

Jun
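
Jun's two constraints above can be sketched as a broker-side validation
function. This is illustrative, not actual broker code; the error names
follow the KIP, and the second one assumes Kafka's existing
INVALID_REPLICA_ASSIGNMENT error applies when assignments don't match the
topic's replication factor.

```python
def validate_create_partitions(current_count, replication_factor,
                               new_count, new_assignments=None):
    """Return None if the request is valid, else the error name.

    new_assignments, if given, lists the replica set for each NEW
    partition, e.g. [[0, 1], [1, 2]] for two new partitions with RF=2.
    """
    # Constraint 1: the new count must strictly exceed the current count.
    if new_count <= current_count:
        return "INVALID_PARTITIONS"
    # Constraint 2: each new partition's replica set must match the
    # topic's existing replication factor.
    if new_assignments is not None:
        if len(new_assignments) != new_count - current_count:
            return "INVALID_REPLICA_ASSIGNMENT"
        for replicas in new_assignments:
            if len(replicas) != replication_factor:
                return "INVALID_REPLICA_ASSIGNMENT"
    return None
```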




On Tue, Sep 12, 2017 at 9:35 AM, Tom Bentley  wrote:

> Following additional comments from Ismael I have updated the KIP slightly
> to:
>
> * No longer apply the CreateTopicPolicy (a future KIP will address applying
> a policy to topic changes)
> * Clarify that the request must be sent to the controller and the response
> will only be sent once the changes are reflected in the metadata cache.
>
> Cheers,
>
> Tom
>
> On 8 September 2017 at 17:42, Tom Bentley  wrote:
>
> > I would like to start the vote on KIP-195 which adds an AdminClient API
> > for increasing the number of partitions of a topic. The details are here:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-195%3A+AdminClient
> .
> > createPartitions
> >
> > Cheers,
> >
> > Tom
> >
>


[jira] [Created] (KAFKA-5876) IQ should throw different exceptions for different errors

2017-09-12 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-5876:
--

 Summary: IQ should throw different exceptions for different errors
 Key: KAFKA-5876
 URL: https://issues.apache.org/jira/browse/KAFKA-5876
 Project: Kafka
  Issue Type: Sub-task
Reporter: Matthias J. Sax


Currently, IQ only throws {{InvalidStateStoreException}} for all errors 
that occur. However, we have different types of errors and should throw 
different exceptions for those types.

For example, if a store was migrated, it must be rediscovered, while if a store 
cannot be queried yet because it is still being re-created after a rebalance, the 
user just needs to wait until store recreation is finished.

There might be other examples, too.
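
An illustrative sketch (the subclass names are hypothetical, not a
proposed API) of what distinct exception types would buy a caller: a
migrated store triggers rediscovery, while a store still restoring after a
rebalance just warrants a retry.

```python
class InvalidStateStoreException(Exception):
    pass

class StoreMigratedException(InvalidStateStoreException):
    pass

class StoreNotReadyException(InvalidStateStoreException):
    pass


def query_with_recovery(query, rediscover, retries=3):
    """Run `query`, reacting differently to the two failure modes.

    `rediscover` is a callback returning a fresh query bound to the
    store's new location.
    """
    for _ in range(retries):
        try:
            return query()
        except StoreMigratedException:
            query = rediscover()   # store moved: look up its new host
        except StoreNotReadyException:
            continue               # still restoring: wait and retry
    raise InvalidStateStoreException("store unavailable after retries")
```

With only the base exception, the caller can't tell which of these two
strategies applies.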



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] KIP-199: Add Kafka Connect Offset Tool

2017-09-12 Thread Randall Hauch
I'm actually going to withdraw this vote. Recent discussion has made it
apparent that there are a few issues still to be worked out, and the proposal is
not complete. Apologies.

Best regards,

Randall

On Mon, Sep 11, 2017 at 8:42 PM, Sriram Subramanian 
wrote:

> +1
>
> On Mon, Sep 11, 2017 at 2:56 PM, Gwen Shapira  wrote:
>
> > +1
> >
> > On Mon, Sep 11, 2017 at 1:33 PM Ted Yu  wrote:
> >
> > > +1
> > >
> > > On Mon, Sep 11, 2017 at 7:43 AM, Randall Hauch 
> wrote:
> > >
> > > > I'd like to start the vote on KIP-199 to add a command line tool that
> > > will
> > > > allow Connect operators to read, modify, and update source connector
> > > > offsets. Details are here:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 199%3A+Add+Kafka+Connect+offset+tool
> > > >
> > > > Thanks, and best regards.
> > > >
> > > > Randall
> > > >
> > >
> >
>


[GitHub] kafka pull request #3836: KAFKA-5872: Fix transient failure in SslSelectorTe...

2017-09-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3836


---


[jira] [Resolved] (KAFKA-5872) Transient failure in SslSelectorTest.testMuteOnOOM

2017-09-12 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-5872.
---
Resolution: Fixed

Issue resolved by pull request 3836
[https://github.com/apache/kafka/pull/3836]

> Transient failure in SslSelectorTest.testMuteOnOOM
> --
>
> Key: KAFKA-5872
> URL: https://issues.apache.org/jira/browse/KAFKA-5872
> Project: Kafka
>  Issue Type: Bug
>  Components: network
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 1.0.0
>
>
> There are a couple of issues:
> 1. `Selector.determineHandlingOrder()` currently doesn't clear selection keys 
> when keys are shuffled. This can result in select returning zero even when 
> there are ready keys, resulting in a tight loop of polls with no keys 
> processed.
> 2. The test expects `Selector.isOutOfMemory()` to be set in a poll that waits 
> only for 10ms. This is expecting two reads from two connections to be 
> processed within 10ms of each other, which may not always be the case.
> Error:
> {quote}
> org.apache.kafka.common.network.SslSelectorTest > testMuteOnOOM FAILED
> java.lang.AssertionError: could not initiate connections within timeout
> at org.junit.Assert.fail(Assert.java:88)
> at 
> org.apache.kafka.common.network.SslSelectorTest.testMuteOnOOM(SslSelectorTest.java:236)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] KIP-195: AdminClient.createPartitions()

2017-09-12 Thread Tom Bentley
Following additional comments from Ismael I have updated the KIP slightly
to:

* No longer apply the CreateTopicPolicy (a future KIP will address applying
a policy to topic changes)
* Clarify that the request must be sent to the controller and the response
will only be sent once the changes are reflected in the metadata cache.

Cheers,

Tom

On 8 September 2017 at 17:42, Tom Bentley  wrote:

> I would like to start the vote on KIP-195 which adds an AdminClient API
> for increasing the number of partitions of a topic. The details are here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-195%3A+AdminClient.
> createPartitions
>
> Cheers,
>
> Tom
>


[GitHub] kafka pull request #3837: KAFKA-5873: add materialized overloads to StreamsB...

2017-09-12 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/3837

KAFKA-5873: add materialized overloads to StreamsBuilder

Add overloads for `table` and `globalTable` that use `Materialized`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-5873

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3837.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3837


commit 51a1de8d7e38cb50eeb62678b18e7b93b9913873
Author: Damian Guy 
Date:   2017-09-12T12:52:17Z

materialized table and global table




---


[GitHub] kafka pull request #3829: KAFKA-5655: materialized count, aggregate, reduce ...

2017-09-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3829


---


[jira] [Resolved] (KAFKA-5655) Add new API methods to KGroupedTable

2017-09-12 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy resolved KAFKA-5655.
---
   Resolution: Fixed
Fix Version/s: 1.0.0

Issue resolved by pull request 3829
[https://github.com/apache/kafka/pull/3829]

> Add new API methods to KGroupedTable
> 
>
> Key: KAFKA-5655
> URL: https://issues.apache.org/jira/browse/KAFKA-5655
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 1.0.0
>
>
> Placeholder until API finalized



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-trunk-jdk7 #2750

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA:5653: add join overloads to KTable

--
[...truncated 2.51 MB...]
org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryStateWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryStateWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.FanoutIntegrationTest > 
shouldFanoutTheInput STARTED

org.apache.kafka.streams.integration.FanoutIntegrationTest > 
shouldFanoutTheInput PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToRunWithEosEnabled STARTED


Build failed in Jenkins: kafka-trunk-jdk8 #2010

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA:5653: add join overloads to KTable

--
[...truncated 2.52 MB...]
org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic STARTED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.cleanGlobal(ResetIntegrationTest.java:363)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:172)


Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

2017-09-12 Thread Tom Bentley
Hi Ismael,

Thanks for your comments.

Regarding the loophole issue, keep in mind that the alter topics
> authorization would still be required, so I don't think it's an issue
>

It could be an issue for people trying to provide a Kafka-as-a-Service
offering, couldn't it? I mean if the providers are using the policy to
enforce a maximum number of partitions rule on their customers, but they've
also given those customers Alter(Topic) so they can change topic configs,
then this API provides a loophole that wouldn't otherwise exist. Moreover,
the motivation in KIP-170 suggests to me that this isn't a hypothetical
issue.

update the KIP so that the request must
> be sent to the Controller and we wait until the data is propagated into the
> Controller's metadata cache
>

About whether we send the request to the controller and when we consider
the request to have completed successfully... Currently the kafka-topics.sh
tool considers the change successful if the change was written to ZK. I'm
happy to change it, but:

a) I wanted to highlight that it would be a small change in semantics
b) OTOH is there any benefit to clients to change the success criterion to
when the metadata cache gets updated? If not then shouldn't we stick to the
existing criterion of a successful write to ZK?

Thanks,

Tom
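
Tom's option (b) above, waiting until the metadata cache reflects the
change, can be sketched from a client's perspective: instead of trusting
the ZK write alone, poll until the new partition count is visible.
`fetch_partition_count` is a stand-in for a real describe-topics call.

```python
import time

def await_partition_count(fetch_partition_count, expected,
                          timeout_s=30.0, poll_interval_s=0.1):
    """Poll until the observed partition count reaches `expected`.

    Returns True once the metadata reflects the change, False on timeout.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if fetch_partition_count() >= expected:
            return True
        time.sleep(poll_interval_s)
    return False
```

Having the broker block the response until its cache is updated (as the
KIP now specifies) spares every client from writing this loop themselves.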



On 12 September 2017 at 16:30, Ismael Juma  wrote:

> Hi Tom,
>
> OK, I suggest not calling any policy then. We can do a separate KIP for
> overhauling topic policies so that they work with all operations for 1.1.0.
> Regarding the loophole issue, keep in mind that the alter topics
> authorization would still be required, so I don't think it's an issue.
> Users that really need a policy will have to wait until 1.1.0, but that's
> no worse than if KIP-195 doesn't make it into 1.0.0.
>
> If you remove the policy text and update the KIP so that the request must
> be sent to the Controller and we wait until the data is propagated into the
> Controller's metadata cache (similar to create topic), it's a +1 from me.
>
> Ismael
>
> On Tue, Sep 12, 2017 at 10:26 AM, Tom Bentley 
> wrote:
>
> > 2. About using the create topics policy, I'm not sure. Aside from the
> > > naming issue, there's also the problem that the policy doesn't know if
> a
> > > creation or update is taking place. This matters because one may not
> want
> > > to allow the number of partitions to be changed after creation as it
> > > affects the semantics if keys are used. One option is to introduce a
> new
> > > interface that can be used by create, alter and delete with a new
> config.
> > > And deprecate CreateTopicPolicy. I doubt many are using it. What do you
> > > think?
> > >
> >
> > I included the part about the create topics policy because I felt it was
> > better, in the short term, to prevent the loophole than to just ignore
> it.
> > The create topic policy is obviously not a good fit for applying to topic
> > modifications, but I think designing a good policy interface that covered
> > creation, modification and deletion could be the subject of its own KIP.
> > Note that modification would include the APIs proposed in KIP-195 and
> > KIP-179. KIP-170 is already proposing to change the creation policy and
> add
> > a deletion policy, so shouldn't the changes necessary for KIP-195 be
> > considered as part of that KIP?
> >
> > I'm happy to propose something in KIP-195 if you really want, though it
> > would put in doubt whether it could be part of Kafka 1.0.0.
> >
>


Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread Roger Hoover
Randall/Ewen,

I think the timing info is still useful even if it's measured since the
last rebalance.  How else do you know where time is being spent?

The use case for seeing the batch size is that you generally have two knobs
to configure - max batch size and max wait time.  The batch size metrics
would tell you how full your batches are based on your current linger time
so you can adjust the config.

Cheers,

Roger
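
The tuning signal Roger describes can be sketched as a single derived
number: given observed batch sizes and the configured maximum, the average
fill ratio tells you whether batches are closing on size (ratio near 1.0)
or on linger time (ratio well below 1.0), i.e. which of the two knobs to
adjust. The function is illustrative, not part of any proposed metric API.

```python
def batch_fill_ratio(batch_sizes, max_batch_size):
    """Average fraction of max_batch_size that observed batches filled."""
    if not batch_sizes:
        return 0.0
    return sum(batch_sizes) / (len(batch_sizes) * max_batch_size)
```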

On Mon, Sep 11, 2017 at 7:50 PM, Ewen Cheslack-Postava 
wrote:

> re: questions about additional metrics, I think we'll undoubtedly find more
> that people want in practice, but as I mentioned earlier I think it's
> better to add the ones we know we need and then fill out the rest as we
> figure it out. So, e.g., batch size metrics sound like they could be
> useful, but I'd probably wait until we have a clear use case. It seems
> likely that it could be useful in diagnosing slow connectors (e.g. the
> implementation just does something inefficient), but I'm not really sure
> about that yet.
>
> -Ewen
>
> On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch  wrote:
>
> > Based on Roger and Ewen's feedback, I removed the aggregate metrics as
> they
> > would be difficult to make use of without extra work. This simplified
> > things a great deal, and I took the opportunity to reorganize the groups
> of
> > metrics. Also, based upon Ewen's concerns regarding measuring
> > times/durations, I removed all time-related metrics except for the offset
> > commits and rebalances, which are infrequent enough to warrant the
> capture
> > of percentiles. Roger asked about capturing batch size metrics for source
> > and sink tasks, and offset lag metrics for sink tasks. Finally, Ewen
> > pointed out that all count/total metrics are only valid since the most
> > recent rebalance and are therefore less meaningful, so they were removed.
> >
> > On Mon, Sep 11, 2017 at 6:50 PM, Randall Hauch  wrote:
> >
> > > Thanks, Ewen. Comments inline below.
> > >
> > > On Mon, Sep 11, 2017 at 5:46 PM, Ewen Cheslack-Postava <
> > e...@confluent.io>
> > > wrote:
> > >
> > >> Randall,
> > >>
> > >> A couple of questions:
> > >>
> > >> * Some metrics don't seem to have unique names? e.g.
> > >> source-record-produce-rate and source-record-produce-total seem like
> > they
> > >> are duplicated. Looks like maybe just an oversight that the second
> ones
> > >> should be changed from "produce" to "write".
> > >>
> > >
> > > Nice catch. You are correct - should be "write" instead of "produce". I
> > > will correct.
> > >
> > >
> > >> * I think there's a stray extra character in a couple of
> > >> places: kafka.connect:type=source-task-metrics,name=source-record-
> > >> produce-total,worker=([-.\w]+)l,connector=([-.\w]+),task=([\d]+)
> > >> has an extra char after the worker name.
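
The fix Ewen points out can be checked mechanically: with the stray `l`
removed after the worker group, the pattern from the KIP matches a
well-formed MBean name. The sample worker/connector/task values below are
made up, and per the other comment in this thread the `produce` in the
name is expected to become `write`.

```python
import re

# The KIP's pattern with the stray 'l' after worker=([-.\w]+) removed.
PATTERN = re.compile(
    r"kafka\.connect:type=source-task-metrics,"
    r"name=source-record-produce-total,"
    r"worker=([-.\w]+),connector=([-.\w]+),task=([\d]+)"
)

sample = ("kafka.connect:type=source-task-metrics,"
          "name=source-record-produce-total,"
          "worker=worker-1,connector=my-source,task=0")
m = PATTERN.fullmatch(sample)
print(m.groups())  # ('worker-1', 'my-source', '0')
```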
> > >>
> > >
> > > Thanks. Removed in 2 places.
> > >
> > >
> > >> * Are the produce totals actually useful given rebalancing would
> cancel
> > >> them out anyway? Doesn't seem like you could do much with them.
> > >>
> > >
> > > Yes, the totals would be since the last rebalance. Maybe that isn't
> that
> > > useful. Might be better to capture the offsets and lag as Roger was
> > > suggestion. Thoughts?
> > >
> > >
> > >> * Why do transformations get their own metric but not converters? And
> > are
> > >> we concerned at all about the performance impact of getting such fine
> > >> grained info? Getting current time isn't free and we've seen before
> that
> > >> we
> > >> ended up w/ accidental performance regressions as we tried to check it
> > too
> > >> frequently to enforce timeouts fine grained in the producer (iirc).
> > >> Batching helps w/ this, but on the consumer side, a max.poll.records=1
> > >> setting could put you in a bad place, especially since transforms
> might
> > be
> > >> very lightweight (or nothing) and converters are expected to be
> > relatively
> > >> cheap as well.
> > >>
> > >
> > > We could remove the read, transform, and put time-based metrics for sink
> > > tasks, and the poll, transform, and write time-based metrics for source
> > > tasks. Can/should they be replaced with anything else?
> > >
> > >
> > >> * If we include the worker id everywhere and don't have metrics
> without
> > >> that included, isn't that a pain for users that dump this data into
> some
> > >> other system? They have to know which worker the connector/task is
> > >> currently on *or* need to do extra work to merge the metrics from
> across
> > >> machines. Including versions with the worker ID can make sense for
> > >> completeness and accuracy (e.g. technically there are still very slim
> > >> risks
> > >> of having a task running twice due to zombies), but it seems like bad
> > >> usability for the common case.
> > >>
> > >
> > > Part of the reason was also to help identify where each of the metrics
> > > came from, but per the next comment this may not be as useful, either.
> > > So remove the worker ID in all the task 

Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

2017-09-12 Thread Ismael Juma
Hi Tom,

OK, I suggest not calling any policy then. We can do a separate KIP for
overhauling topic policies so that they work with all operations for 1.1.0.
Regarding the loophole issue, keep in mind that the alter topics
authorization would still be required, so I don't think it's an issue.
Users that really need a policy will have to wait until 1.1.0, but that's
no worse than if KIP-195 doesn't make it into 1.0.0.

If you remove the policy text and update the KIP so that the request must
be sent to the Controller and we wait until the data is propagated into the
Controller's metadata cache (similar to create topic), it's a +1 from me.

Ismael

On Tue, Sep 12, 2017 at 10:26 AM, Tom Bentley  wrote:

> 2. About using the create topics policy, I'm not sure. Aside from the
> > naming issue, there's also the problem that the policy doesn't know if a
> > creation or update is taking place. This matters because one may not want
> > to allow the number of partitions to be changed after creation as it
> > affects the semantics if keys are used. One option is to introduce a new
> > interface that can be used by create, alter and delete with a new config.
> > And deprecate CreateTopicPolicy. I doubt many are using it. What do you
> > think?
> >
>
> I included the part about the create topics policy because I felt it was
> better, in the short term, to prevent the loophole than to just ignore it.
> The create topic policy is obviously not a good fit for applying to topic
> modifications, but I think designing a good policy interface that covered
> creation, modification and deletion could be the subject of its own KIP.
> Note that modification would include the APIs proposed in KIP-195 and
> KIP-179. KIP-170 is already proposing to change the creation policy and add
> a deletion policy, so shouldn't the changes necessary for KIP-195 be
> considered as part of that KIP?
>
> I'm happy to propose something in KIP-195 if you really want, though it
> would put in doubt whether it could be part of Kafka 1.0.0.
>


Reg: Read External API response Producer

2017-09-12 Thread Mohan, Prithiv
Hi,

I am investigating Kafka as a message bus service for my MANO Monitoring 
module. I have a Python producer that takes messages from a file and puts 
them on the message bus in JSON format. I have a consumer that reads the 
messages and performs some operations against the external OpenStack API. Is it 
possible for the producer to read the API response from OpenStack and put it 
back on the message bus?
Do I have to do some interfacing with the external API? I have a separate 
schema definition in JSON format, and I have to check the response message to see 
if all the required fields are present. Is it possible to do this with a Kafka 
producer? Please help.
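
One way to sketch the required-field check being asked about: the Kafka
producer itself doesn't validate payloads, so your code checks the API
response against the schema's required keys before producing it back to
the bus. The field names and sample response below are invented for
illustration.

```python
import json

def missing_required_fields(message_json, required_fields):
    """Return the required fields absent from a JSON message."""
    msg = json.loads(message_json)
    return [f for f in required_fields if f not in msg]

# Hypothetical OpenStack API response to validate before producing.
response = '{"resource_id": "vm-1", "status": "ACTIVE"}'
print(missing_required_fields(response, ["resource_id", "status", "metric"]))
# -> ['metric']
```

If the list is empty, the message satisfies the schema's required fields
and can be handed to the producer; otherwise it can be rejected or routed
to an error topic.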

Many thanks


Prithiv
--
Intel Research and Development Ireland Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263


This e-mail and any attachments may contain confidential material for the sole
use of the intended recipient(s). Any review or distribution by others is
strictly prohibited. If you are not the intended recipient, please contact the
sender and delete all copies.


[jira] [Created] (KAFKA-5875) Consumer group repeatedly fails to join, even across JVM restarts: BufferUnderFlowException reading the {{version}} field in the consumer protocol header

2017-09-12 Thread Evan Pollan (JIRA)
Evan Pollan created KAFKA-5875:
--

 Summary: Consumer group repeatedly fails to join, even across JVM 
restarts: BufferUnderFlowException reading the {{version}} field in the 
consumer protocol header
 Key: KAFKA-5875
 URL: https://issues.apache.org/jira/browse/KAFKA-5875
 Project: Kafka
  Issue Type: Bug
Reporter: Evan Pollan


I've seen this maybe once a month across our large cluster of Kubernetes-based 
Kafka consumers & producers.  Every once in a while a consumer in a Kubernetes 
"pod" gets this error trying to join a consumer group:

{code}
{"level":"INFO","@timestamp":"2017-09-12T13:45:42.173+","logger":"org.apache.kafka.common.utils.AppInfoParser","message":"Kafka
 version : 0.11.0.0","exception":""}
{"level":"INFO","@timestamp":"2017-09-12T13:45:42.173+","logger":"org.apache.kafka.common.utils.AppInfoParser","message":"Kafka
 commitId : cb8625948210849f","exception":""}
{"level":"INFO","@timestamp":"2017-09-12T13:45:42.178+","logger":"org.apache.kafka.clients.consumer.internals.ConsumerCoordinator","message":"Revoking
 previously assigned partitions [] for group 
conv-fetch-jobs-runner-for-internal","exception":""}
{"level":"INFO","@timestamp":"2017-09-12T13:45:42.178+","logger":"org.apache.kafka.clients.consumer.internals.AbstractCoordinator","message":"(Re-)joining
 group conv-fetch-jobs-runner-for-internal","exception":""}
{"level":"INFO","@timestamp":"2017-09-12T13:45:43.588+","logger":"org.apache.kafka.clients.consumer.internals.AbstractCoordinator","message":"Successfully
 joined group conv-fetch-jobs-runner-for-internal with generation 
17297","exception":""}
{"errorType":"Error reading field 'version': 
java.nio.BufferUnderflowException","level":"ERROR","message":"Died!","operation":"Died!","stacktrace":"org.apache.kafka.common.protocol.types.SchemaException:
 Error reading field 'version': java.nio.BufferUnderflowException\n\tat 
org.apache.kafka.common.protocol.types.Schema.read(Schema.java:75)\n\tat 
org.apache.kafka.clients.consumer.internals.ConsumerProtocol.deserializeAssignment(ConsumerProtocol.java:105)\n\tat
 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:220)\n\tat
 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:363)\n\tat
 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310)\n\tat
 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297)\n\tat
 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078)\n\tat
 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)\n\tat
 
com.spredfast.common.kafka.consumer.RunnableConsumer.pollOnce(RunnableConsumer.java:141)\n\tat
 
com.spredfast.common.kafka.consumer.RunnableConsumer.access$000(RunnableConsumer.java:28)\n\tat
 
com.spredfast.common.kafka.consumer.RunnableConsumer$Loop.iterate(RunnableConsumer.java:125)\n\tat
 
com.spredfast.common.kafka.consumer.RunnableConsumer.run(RunnableConsumer.java:78)\n\tat
 
java.lang.Thread.run(Thread.java:745)\n","trackingId":"dead-consumer","logger":"com.spredfast.common.kafka.consumer.RunnableConsumer","loggingVersion":"UNKNOWN","component":"MetricsFetch","pid":25,"host":"fetch-2420457278-sh4f5","@timestamp":"2017-09-12T13:45:43.613Z"}
{code}

Pardon the log format -- these get sucked into logstash, thus the JSON.

Here's the raw stacktrace: 
{code}
org.apache.kafka.common.protocol.types.SchemaException: Error reading field 
'version': java.nio.BufferUnderflowException
at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:75)
at 
org.apache.kafka.clients.consumer.internals.ConsumerProtocol.deserializeAssignment(ConsumerProtocol.java:105)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:220)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:363)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)
at 
com.spredfast.common.kafka.consumer.RunnableConsumer.pollOnce(RunnableConsumer.java:141)
at 
com.spredfast.common.kafka.consumer.RunnableConsumer.access$000(RunnableConsumer.java:28)
at 
com.spredfast.common.kafka.consumer.RunnableConsumer$Loop.iterate(RunnableConsumer.java:125)
at 
com.spredfast.common.kafka.consumer.RunnableConsumer.run(RunnableConsumer.java:78)
   

[GitHub] kafka pull request #3826: KAFKA-5653: add join overloads to KTable

2017-09-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3826


---


Jenkins build is back to normal : kafka-0.11.0-jdk7 #305

2017-09-12 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-5861) KStream close( withTimeout ) - does not work under load conditions in the multi-threaded KStream application

2017-09-12 Thread Seweryn Habdank-Wojewodzki (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Seweryn Habdank-Wojewodzki resolved KAFKA-5861.
---
Resolution: Workaround

> KStream close( withTimeout ) - does not work under load conditions in the 
> multi-threaded KStream application
> 
>
> Key: KAFKA-5861
> URL: https://issues.apache.org/jira/browse/KAFKA-5861
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Seweryn Habdank-Wojewodzki
>
> The recently implemented close( withTimeout ) for Streams does not work under 
> load conditions in a multi-threaded KStream application.
> When there are more consuming threads and many messages in the stream, 
> close( withTimeout ) does not work: 
> 1. the timeout is not respected at all, and
> 2. the application hangs in some streaming chaos. Theoretically the threads are 
> working - they are busy with themselves, so the app cannot end, but they are 
> not processing any further messages.
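
Since this was resolved as "Workaround", one way to enforce the deadline externally — an assumption on my part, not necessarily the reporter's actual workaround — is to run the blocking close on a helper thread and abandon it when it overruns. A sketch in Python, where `close_fn` stands in for a call like `KafkaStreams.close`:

```python
import threading
import time

def close_with_deadline(close_fn, timeout_s: float) -> bool:
    """Invoke a potentially-hanging close_fn on a daemon thread and wait at
    most timeout_s for it. Returns True if the close completed in time,
    False if we gave up waiting (the daemon thread is then abandoned)."""
    t = threading.Thread(target=close_fn, daemon=True)
    t.start()
    t.join(timeout_s)
    return not t.is_alive()

# A close that hangs past the deadline is abandoned rather than blocking
# the caller forever:
assert close_with_deadline(lambda: time.sleep(5), timeout_s=0.1) is False
```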



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-1834) No Response when handle LeaderAndIsrRequest some case

2017-09-12 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-1834.
--
Resolution: Fixed

This was fixed in newer Kafka versions.

> No Response when handle LeaderAndIsrRequest some case
> -
>
> Key: KAFKA-1834
> URL: https://issues.apache.org/jira/browse/KAFKA-1834
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: xiajun
>  Labels: easyfix
> Attachments: KAFKA-1834.patch
>
>
> When a replica becomes leader or follower, if the broker does not exist in the 
> assigned replicas, there is no response.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-1630) ConsumerFetcherThread locked in Tomcat

2017-09-12 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-1630.
--
Resolution: Cannot Reproduce

Closing inactive issue. Please reopen if you think the issue still exists.


> ConsumerFetcherThread locked in Tomcat
> --
>
> Key: KAFKA-1630
> URL: https://issues.apache.org/jira/browse/KAFKA-1630
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.8.0
> Environment: linux redhat
>Reporter: vijay
>Assignee: Neha Narkhede
>  Labels: performance
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> I am using the high-level consumer API for consuming messages from Kafka. 
> The ConsumerFetcherThread gets locked. Kindly look into the stack trace below:
> ConsumerFetcherThread-SocialTwitterStream6_172.31.240.136-1410398702143-61a247c3-0-1"
>  prio=10 tid=0x7f294001e800 nid=0x1677 runnable [0x7f297aae9000]
>java.lang.Thread.State: RUNNABLE
>   at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>   at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
>   at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>   - locked <0x7f2a7c38eb40> (a sun.nio.ch.Util$1)
>   - locked <0x7f2a7c38eb28> (a java.util.Collections$UnmodifiableSet)
>   - locked <0x7f2a7c326f20> (a sun.nio.ch.EPollSelectorImpl)
>   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>   at 
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:193)
>   - locked <0x7f2a7c2163c0> (a java.lang.Object)
>   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
>   - locked <0x7f2a7c229950> (a 
> sun.nio.ch.SocketAdaptor$SocketInputStream)
>   at 
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:200)
>   - locked <0x7f2a7c38ea50> (a java.lang.Object)
>   at kafka.utils.Utils$.read(Utils.scala:395)
>   at 
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>   at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
>   at 
> kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
>   at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
>   at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:73)
>   at 
> kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
>   - locked <0x7f2a7c38e9f0> (a java.lang.Object)
>   at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:110)
>   at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
>   at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>   at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:109)
>   at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
>   at 
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>   at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:108)
>   at 
> kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:94)
>   at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:86)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[RESULTS] [VOTE] Release Kafka version 0.11.0.1

2017-09-12 Thread Damian Guy
This vote passes with 6 +1 votes (3 binding) and no 0 or -1 votes.

+1 votes
PMC Members:
* Jun Rao
* Guozhang Wang
* Ismael Juma


Community:
* Manikumar Reddy
* Thomas Crayford

* Magnus Edenhill

0 votes
* No votes

-1 votes
* No votes

Vote threads (two, as some votes were on the user list only):
http://markmail.org/message/5trmaow3iy6louav
http://markmail.org/message/bpb4s2cdfgv2zccq

I'll continue with the release process and the release announcement
will follow in the next few days.


Thanks,

Damian


Re: [VOTE] 0.11.0.1 RC0

2017-09-12 Thread Damian Guy
Thanks all, the vote has now closed and the release has been accepted. I'll
post an announcement to the dev list shortly.

On Tue, 12 Sep 2017 at 13:14 Thomas Crayford 
wrote:

> Heroku has vetted this through our typical performance and regression
> testing, and everything looks good. +1 (non-binding) from us.
>
> On Tue, Sep 5, 2017 at 9:34 PM, Damian Guy  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the first candidate for release of Apache Kafka 0.11.0.1.
> >
> > This is a bug fix release and it includes fixes and improvements from 49
> > JIRAs (including a few critical bugs).
> >
> > Release notes for the 0.11.0.1 release:
> > http://home.apache.org/~damianguy/kafka-0.11.0.1-rc0/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Saturday, September 9, 9am PT
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > http://home.apache.org/~damianguy/kafka-0.11.0.1-rc0/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/
> >
> > * Javadoc:
> > http://home.apache.org/~damianguy/kafka-0.11.0.1-rc0/javadoc/
> >
> > * Tag to be voted upon (off 0.11.0 branch) is the 0.11.0.1 tag:
> > https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
> > a8aa61266aedcf62e45b3595a2cf68c819ca1a6c
> >
> >
> > * Documentation:
> > Note the documentation can't be pushed live due to changes that will not
> go
> > live until the release. You can manually verify by downloading
> > http://home.apache.org/~damianguy/kafka-0.11.0.1-rc0/
> > kafka_2.11-0.11.0.1-site-docs.tgz
> >
> >
> > * Protocol:
> > http://kafka.apache.org/0110/protocol.html
> >
> > * Successful Jenkins builds for the 0.11.0 branch:
> > Unit/integration tests: https://builds.apache.org/job/
> > kafka-0.11.0-jdk7/298
> >
> > System tests:
> > http://confluent-kafka-0-11-0-system-test-results.s3-us-
> > west-2.amazonaws.com/2017-09-05--001.1504612096--apache--0.
> > 11.0--7b6e5f9/report.html
> >
> > /**
> >
> > Thanks,
> > Damian
> >
>


Re: [VOTE] 0.11.0.1 RC0

2017-09-12 Thread Thomas Crayford
Heroku has vetted this through our typical performance and regression
testing, and everything looks good. +1 (non-binding) from us.

On Tue, Sep 5, 2017 at 9:34 PM, Damian Guy  wrote:

> [...]


Re: [VOTE] 0.11.0.1 RC0

2017-09-12 Thread Manikumar
+1 (non-binding). Verified artifacts, ran the quick start using the binary, and
ran tests on the source artifact.

Thanks for running the release.

On Tue, Sep 12, 2017 at 4:30 PM, Ismael Juma  wrote:

> Hi Damian,
>
> Thanks for managing the release. This may just be the first ever RC0 to be
> promoted to final. :) +1 (binding) from me. I verified some
> signatures/hashes, verified quickstart on binary Scala 2.12 artifact, ran
> tests and broker quickstart on the source artifact.
>
> Ismael
>
> On Tue, Sep 5, 2017 at 9:34 PM, Damian Guy  wrote:
>
> > [...]
>


[GitHub] kafka pull request #2903: MINOR: Fix needless GC + Result time unit in JMH

2017-09-12 Thread original-brownbear
GitHub user original-brownbear reopened a pull request:

https://github.com/apache/kafka/pull/2903

MINOR: Fix needless GC + Result time unit in JMH

Fixes two issues with the JMH benchmark example:
* Trivial: the output should be in `ops/ms` for readability (the result is in 
the millions of operations per second)
* Important: the benchmark is not actually measuring the LRU cache 
performance, since most of the time in each run is wasted on concatenating `key + 
counter` as well as `value + counter`. Fixed by pre-generating 10k K-V pairs 
(100x the cache capacity) and iterating over them. This brings the measured 
performance up by a factor of more than 5 on a standard 4-core i7 (from `~6k/ms` 
to `~35k/ms`).
  * Also made static what could be made static in the benchmark class, to 
lower the GC background noise
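
The pitfall in the second bullet — per-invocation data generation drowning out the code under test — can be sketched generically. The following Python analogue (not the actual JMH class from this PR; the `LruCache` here is a stand-in) pre-generates the 10k key/value pairs outside the timed region and only iterates over them inside it:

```python
import time
from collections import OrderedDict

class LruCache:
    """Tiny LRU cache standing in for the benchmark's cache under test."""
    def __init__(self, capacity: int):
        self.capacity, self.data = capacity, OrderedDict()

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)          # mark as most recently used
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict least recently used

# Pre-generate inputs OUTSIDE the measured loop (the PR's fix): 10k pairs,
# 100x a capacity of 100, mirroring the numbers in the description.
PAIRS = [("key" + str(i), "value" + str(i)) for i in range(10_000)]

def measured_put_loop(cache: LruCache) -> float:
    """Time only the cache operations, not the string concatenation."""
    start = time.perf_counter()
    for k, v in PAIRS:
        cache.put(k, v)
    return time.perf_counter() - start
```

The broken variant would build `"key" + str(counter)` inside the timed loop, so the measurement would be dominated by string construction rather than cache operations.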

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/original-brownbear/kafka fix-jmh-example

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2903.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2903


commit 8d30ef0f3e35f1fa03e035796f4e4d125bde4548
Author: Armin Braun 
Date:   2017-04-24T06:50:57Z

MINOR: Fix needless GC + Result time unit in JMH

commit 2dbbd4cf1d5d06bb154a3436aa8752d3bb89fc9e
Author: Armin Braun 
Date:   2017-04-24T09:10:21Z

MINOR: Fix needless GC + Result time unit in JMH

commit f5e46a7e2fbe3d8b4bc8f1817ce55cfbe821f067
Author: Armin Braun 
Date:   2017-04-24T09:13:44Z

MINOR: Fix needless GC + Result time unit in JMH




---


Build failed in Jenkins: kafka-trunk-jdk7 #2749

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] MINOR: refactor build method to extract methods from if statements

--
[...truncated 2.51 MB...]
org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingUncaughtExceptionHandlerNotInCreateState STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingUncaughtExceptionHandlerNotInCreateState PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingStateListenerNotInCreateState STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingStateListenerNotInCreateState PASSED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics STARTED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics PASSED

org.apache.kafka.streams.KafkaStreamsTest > shouldReturnThreadMetadata STARTED

org.apache.kafka.streams.KafkaStreamsTest > shouldReturnThreadMetadata PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateThreadClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateThreadClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateChanges STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateChanges PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfProducerEnableIdempotenceIsOverriddenIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfProducerEnableIdempotenceIsOverriddenIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldUseNewConfigsWhenPresent 
STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldUseNewConfigsWhenPresent 
PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldUseCorrectDefaultsWhenNoneSpecified STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldUseCorrectDefaultsWhenNoneSpecified PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
STARTED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerMaxInFlightRequestPerConnectionsWhenEosDisabled 
STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerMaxInFlightRequestPerConnectionsWhenEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowStreamsExceptionIfValueSerdeConfigFails STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowStreamsExceptionIfValueSerdeConfigFails PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerAutoCommitIsOverridden STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerAutoCommitIsOverridden PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowExceptionIfNotAtLestOnceOrExactlyOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > 

Re: [VOTE] 0.11.0.1 RC0

2017-09-12 Thread Ismael Juma
Hi Damian,

Thanks for managing the release. This may just be the first ever RC0 to be
promoted to final. :) +1 (binding) from me. I verified some
signatures/hashes, verified quickstart on binary Scala 2.12 artifact, ran
tests and broker quickstart on the source artifact.

Ismael

On Tue, Sep 5, 2017 at 9:34 PM, Damian Guy  wrote:

> [...]


[jira] [Created] (KAFKA-5874) Incorrect command line handling

2017-09-12 Thread Viliam Durina (JIRA)
Viliam Durina created KAFKA-5874:


 Summary: Incorrect command line handling
 Key: KAFKA-5874
 URL: https://issues.apache.org/jira/browse/KAFKA-5874
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 0.11.0.0, 0.10.2.0
 Environment: Probably unrelated, but Windows 10 in this case
Reporter: Viliam Durina
Priority: Minor


Extra parameters on the command line are silently ignored, which leads to 
confusion. For example, I ran this command:

`kafka-topics.bat --alter --topic my_topic partitions 3 --zookeeper localhost`

The seemingly innocuous command took about 3 seconds and produced no output. It 
took me a while to realize I had missed "--" before "partitions", after which I 
got a confirmation on output.

Erroneous command lines should produce error output.
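
The behavior being requested amounts to strict argument parsing: reject any unrecognized token instead of dropping it. Kafka's tools use a different option-parsing library, so the following is only a Python analogue of the principle, with option names copied from the example command:

```python
import argparse

def parse_strict(argv):
    """Parse known flags but fail loudly on leftover tokens, such as the
    stray 'partitions 3' in the report, instead of silently ignoring them."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--alter", action="store_true")
    parser.add_argument("--topic")
    parser.add_argument("--partitions", type=int)
    parser.add_argument("--zookeeper")
    opts, extras = parser.parse_known_args(argv)
    if extras:
        # The requested behavior: surface an error instead of silence.
        raise SystemExit(f"error: unrecognized arguments: {extras}")
    return opts

# The corrected command (with '--' before 'partitions') parses cleanly:
ok = parse_strict(["--alter", "--topic", "my_topic",
                   "--partitions", "3", "--zookeeper", "localhost"])
```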



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-trunk-jdk8 #2009

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] MINOR: refactor build method to extract methods from if statements

--
[...truncated 2.03 MB...]

org.apache.kafka.common.security.ssl.SslFactoryTest > testClientMode PASSED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryConfiguration STARTED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryConfiguration PASSED

org.apache.kafka.common.security.kerberos.KerberosNameTest > testParse STARTED

org.apache.kafka.common.security.kerberos.KerberosNameTest > testParse PASSED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testPrincipalNameCanContainSeparator STARTED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testPrincipalNameCanContainSeparator PASSED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testEqualsAndHashCode STARTED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testEqualsAndHashCode PASSED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithListenerNameOverride STARTED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithListenerNameOverride PASSED

org.apache.kafka.common.security.JaasContextTest > testMissingOptionValue 
STARTED

org.apache.kafka.common.security.JaasContextTest > testMissingOptionValue PASSED

org.apache.kafka.common.security.JaasContextTest > testSingleOption STARTED

org.apache.kafka.common.security.JaasContextTest > testSingleOption PASSED

org.apache.kafka.common.security.JaasContextTest > 
testNumericOptionWithoutQuotes STARTED

org.apache.kafka.common.security.JaasContextTest > 
testNumericOptionWithoutQuotes PASSED

org.apache.kafka.common.security.JaasContextTest > testConfigNoOptions STARTED

org.apache.kafka.common.security.JaasContextTest > testConfigNoOptions PASSED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithWrongListenerName STARTED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithWrongListenerName PASSED

org.apache.kafka.common.security.JaasContextTest > testNumericOptionWithQuotes 
STARTED

org.apache.kafka.common.security.JaasContextTest > testNumericOptionWithQuotes 
PASSED

org.apache.kafka.common.security.JaasContextTest > testQuotedOptionValue STARTED

org.apache.kafka.common.security.JaasContextTest > testQuotedOptionValue PASSED

org.apache.kafka.common.security.JaasContextTest > testMissingLoginModule 
STARTED

org.apache.kafka.common.security.JaasContextTest > testMissingLoginModule PASSED

org.apache.kafka.common.security.JaasContextTest > testMissingSemicolon STARTED

org.apache.kafka.common.security.JaasContextTest > testMissingSemicolon PASSED

org.apache.kafka.common.security.JaasContextTest > testMultipleOptions STARTED

org.apache.kafka.common.security.JaasContextTest > testMultipleOptions PASSED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForClientWithListenerName STARTED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForClientWithListenerName PASSED

org.apache.kafka.common.security.JaasContextTest > testMultipleLoginModules 
STARTED

org.apache.kafka.common.security.JaasContextTest > testMultipleLoginModules 
PASSED

org.apache.kafka.common.security.JaasContextTest > testMissingControlFlag 
STARTED

org.apache.kafka.common.security.JaasContextTest > testMissingControlFlag PASSED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithListenerNameAndFallback STARTED

org.apache.kafka.common.security.JaasContextTest > 
testLoadForServerWithListenerNameAndFallback PASSED

org.apache.kafka.common.security.JaasContextTest > testQuotedOptionName STARTED

org.apache.kafka.common.security.JaasContextTest > testQuotedOptionName PASSED

org.apache.kafka.common.security.JaasContextTest > testControlFlag STARTED

org.apache.kafka.common.security.JaasContextTest > testControlFlag PASSED

org.apache.kafka.common.security.plain.PlainSaslServerTest > 
noAuthorizationIdSpecified STARTED

org.apache.kafka.common.security.plain.PlainSaslServerTest > 
noAuthorizationIdSpecified PASSED

org.apache.kafka.common.security.plain.PlainSaslServerTest > 
authorizatonIdEqualsAuthenticationId STARTED

org.apache.kafka.common.security.plain.PlainSaslServerTest > 
authorizatonIdEqualsAuthenticationId PASSED

org.apache.kafka.common.security.plain.PlainSaslServerTest > 
authorizatonIdNotEqualsAuthenticationId STARTED

org.apache.kafka.common.security.plain.PlainSaslServerTest > 
authorizatonIdNotEqualsAuthenticationId PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMissingUsernameSaslPlain STARTED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testMissingUsernameSaslPlain PASSED

org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest > 
testValidSaslScramMechanisms STARTED


[jira] [Created] (KAFKA-5873) Add Materialized overloads to StreamBuilder

2017-09-12 Thread Damian Guy (JIRA)
Damian Guy created KAFKA-5873:
-

 Summary: Add Materialized overloads to StreamBuilder
 Key: KAFKA-5873
 URL: https://issues.apache.org/jira/browse/KAFKA-5873
 Project: Kafka
  Issue Type: Sub-task
  Components: streams
Reporter: Damian Guy
Assignee: Damian Guy


Add the overloads from KIP-182 that use {{Materialized}} to {{StreamsBuilder}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-trunk-jdk7 #2748

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] MINOR: update processor topology test driver

--
[...truncated 2.51 MB...]

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoinQueryable PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoinQueryable STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoinQueryable PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.FanoutIntegrationTest > 
shouldFanoutTheInput STARTED

org.apache.kafka.streams.integration.FanoutIntegrationTest > 
shouldFanoutTheInput PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToRunWithEosEnabled STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToRunWithEosEnabled PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToCommitToMultiplePartitions STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToCommitToMultiplePartitions PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldNotViolateEosIfOneTaskFails STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldNotViolateEosIfOneTaskFails PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToPerformMultipleTransactions STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToPerformMultipleTransactions PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToCommitMultiplePartitionOffsets STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToCommitMultiplePartitionOffsets PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToRunWithTwoSubtopologies STARTED


Build failed in Jenkins: kafka-trunk-jdk8 #2008

2017-09-12 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] MINOR: update processor topology test driver

--
[...truncated 2.52 MB...]
org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic STARTED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.cleanGlobal(ResetIntegrationTest.java:363)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:172)


Re: [DISCUSS] KIP-195: AdminClient.increasePartitions

2017-09-12 Thread Tom Bentley
2. About using the create topics policy, I'm not sure. Aside from the
> naming issue, there's also the problem that the policy doesn't know if a
> creation or update is taking place. This matters because one may not want
> to allow the number of partitions to be changed after creation as it
> affects the semantics if keys are used. One option is to introduce a new
> interface that can be used by create, alter and delete with a new config.
> And deprecate CreateTopicPolicy. I doubt many are using it. What do you
> think?
>

I included the part about the create topics policy because I felt it was
better, in the short term, to prevent the loophole than to just ignore it.
The create topic policy is obviously not a good fit for applying to topic
modifications, but I think designing a good policy interface that covered
creation, modification and deletion could be the subject of its own KIP.
Note that modification would include the APIs proposed in KIP-195 and
KIP-179. KIP-170 is already proposing to change the creation policy and add
a deletion policy, so shouldn't the changes necessary for KIP-195 be
considered as part of that KIP?

I'm happy to propose something in KIP-195 if you really want, though doing so
would cast doubt on whether it could be part of Kafka 1.0.0.
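If the existing create topics policy were replaced by an interface that is told whether it is validating a creation or a modification, an operator could, for instance, forbid partition-count changes after creation. Here is a minimal standalone sketch of that idea; every name in it (`TopicManagementPolicy`, `Action`, `validate`) is hypothetical and is not an API proposed by any KIP:

```java
// Hypothetical sketch only -- not the actual KIP-170/KIP-195 API.
// One policy interface sees creations, alterations and deletions, so a
// single implementation can allow creation but reject later changes.
interface TopicManagementPolicy {
    enum Action { CREATE, ALTER, DELETE }

    // Throws if the requested action should be rejected.
    void validate(Action action, String topic, int numPartitions);
}

public class PolicyDemo {
    public static void main(String[] args) {
        // Example policy: partition count may be set at creation, never changed.
        TopicManagementPolicy policy = (action, topic, numPartitions) -> {
            if (action == TopicManagementPolicy.Action.ALTER)
                throw new IllegalStateException(
                        "partition count is fixed after creation: " + topic);
        };

        policy.validate(TopicManagementPolicy.Action.CREATE, "orders", 12); // allowed
        try {
            policy.validate(TopicManagementPolicy.Action.ALTER, "orders", 24);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // ALTER is rejected
        }
    }
}
```

A real interface would of course take a richer request object (configs, replication factor, requesting principal) rather than bare arguments, but it shows why the policy needs to know which action it is validating.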


[GitHub] kafka pull request #3836: KAFKA-5872: Fix transient failure in SslSelectorTe...

2017-09-12 Thread rajinisivaram
GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/3836

KAFKA-5872: Fix transient failure in SslSelectorTest.testMuteOnOOM



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka 
KAFKA-5872-sslselectortest-failure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3836.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3836


commit 68d0b6383782d79780e84d1806b1718d0f791f41
Author: Rajini Sivaram 
Date:   2017-09-12T09:09:20Z

KAFKA-5872: Fix transient failure in SslSelectorTest.testMuteOnOOM




---


[jira] [Created] (KAFKA-5872) Transient failure in SslSelectorTest.testMuteOnOOM

2017-09-12 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-5872:
-

 Summary: Transient failure in SslSelectorTest.testMuteOnOOM
 Key: KAFKA-5872
 URL: https://issues.apache.org/jira/browse/KAFKA-5872
 Project: Kafka
  Issue Type: Bug
  Components: network
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 1.0.0


There are a couple of issues:
1. `Selector.determineHandlingOrder()` currently doesn't clear the selected keys 
when keys are shuffled. This can cause select to return zero even when there 
are ready keys, leading to a tight loop of polls in which no keys are processed.
2. The test expects `Selector.isOutOfMemory()` to be set in a poll that waits 
only for 10ms. This assumes that two reads from two connections are processed 
within 10ms of each other, which may not always be the case.
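The first issue can be reproduced with plain JDK NIO, independent of Kafka's own `Selector` wrapper: once a ready key sits in the selected-key set, later selects report zero updated keys until that set is cleared. A small self-contained sketch using a `Pipe`:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectedKeysDemo {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        pipe.source().register(selector, SelectionKey.OP_READ);

        // Make the source channel readable.
        pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));

        System.out.println(selector.selectNow()); // 1: key added to selectedKeys()
        // The key is still ready, but it already sits in selectedKeys() with
        // OP_READ set, so its ready-operation set is not "updated" again:
        System.out.println(selector.selectNow()); // 0, despite pending data
        selector.selectedKeys().clear();
        System.out.println(selector.selectNow()); // 1 again after clearing
    }
}
```

This is exactly the failure mode described above: if shuffling drops the keys without clearing the selected-key set, subsequent selects return zero even though data is pending.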




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3833: MINOR: refactor build method to extract methods fr...

2017-09-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3833


---


[GitHub] kafka pull request #3828: MINOR: update processor topology test driver

2017-09-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3828


---


Re: [VOTE] KIP-182 - Reduce Streams DSL overloads and allow easier use of custom storage engines

2017-09-12 Thread Damian Guy
Hi All,

A minor update to the KIP: I needed to add KTable.to(Produced) for
consistency. KTable.through will be deprecated in favour of using
KTable.toStream().through().

Thanks,
Damian

On Thu, 7 Sep 2017 at 08:52 Damian Guy  wrote:

> Thanks all. The vote is now closed and the KIP has been accepted with:
> 2 non binding votes - bill and matthias
> 3 binding  - Damian, Guozhang, Sriram
>
> Regards,
> Damian
>
> On Tue, 5 Sep 2017 at 22:24 Sriram Subramanian  wrote:
>
>> +1
>>
>> On Tue, Sep 5, 2017 at 1:33 PM, Guozhang Wang  wrote:
>>
>> > +1
>> >
>> > On Fri, Sep 1, 2017 at 3:45 PM, Matthias J. Sax 
>> > wrote:
>> >
>> > > +1
>> > >
>> > > On 9/1/17 2:53 PM, Bill Bejeck wrote:
>> > > > +1
>> > > >
>> > > > On Thu, Aug 31, 2017 at 10:20 AM, Damian Guy 
>> > > wrote:
>> > > >
>> > > >> Thanks everyone for voting! Unfortunately I've had to make a bit of an
>> > > >> update based on some issues found during implementation.
>> > > >> The main changes are:
>> > > >> BytesStoreSupplier -> StoreSupplier
>> > > >> Addition of:
>> > > >> WindowBytesStoreSupplier, KeyValueBytesStoreSupplier,
>> > > >> SessionBytesStoreSupplier that will restrict store types to <Bytes, byte[]>
>> > > >> 3 new overloads added to Materialized to enable developers to
>> create a
>> > > >> Materialized of the appropriate type, i.e., WindowStore etc
>> > > >> Update DSL where Materialized is used such that the stores have
>> > > >> generic types of <Bytes, byte[]>
>> > > >> Some minor changes to the arguments to Store#persistentWindowStore
>> and
>> > > >> Store#persistentSessionStore
>> > > >>
>> > > >> Please take a look and recast the votes.
>> > > >>
>> > > >> Thanks for your time,
>> > > >> Damian
>> > > >>
>> > > >> On Fri, 25 Aug 2017 at 17:05 Matthias J. Sax <
>> matth...@confluent.io>
>> > > >> wrote:
>> > > >>
>> > > >>> Thanks Damian. Great KIP!
>> > > >>>
>> > > >>> +1
>> > > >>>
>> > > >>>
>> > > >>> -Matthias
>> > > >>>
>> > > >>> On 8/25/17 6:45 AM, Damian Guy wrote:
>> > >  Hi,
>> > > 
>> > >  I've just realised we need to add two methods to
>> StateStoreBuilder
>> > or
>> > > >> it
>> > >  isn't going to work:
>> > > 
>> > >  Map<String, String> logConfig();
>> > >  boolean loggingEnabled();
>> > > 
>> > >  These are needed when we are building the topology and
>> determining
>> > >  changelog topic names and configs.
>> > > 
>> > > 
>> > >  I've also update the KIP to add
>> > > 
>> > >  StreamBuilder#stream(String topic)
>> > > 
>> > >  StreamBuilder#stream(String topic, Consumed options)
>> > > 
>> > > 
>> > >  Thanks
>> > > 
>> > > 
>> > >  On Thu, 24 Aug 2017 at 22:11 Sriram Subramanian <
>> r...@confluent.io>
>> > > >>> wrote:
>> > > 
>> > > > +1
>> > > >
>> > > > On Thu, Aug 24, 2017 at 10:20 AM, Guozhang Wang <
>> > wangg...@gmail.com>
>> > > > wrote:
>> > > >
>> > > >> +1. Thanks Damian!
>> > > >>
>> > > >> On Thu, Aug 24, 2017 at 9:47 AM, Bill Bejeck <
>> bbej...@gmail.com>
>> > > >>> wrote:
>> > > >>
>> > > >>> Thanks for the KIP!
>> > > >>>
>> > > >>> +1
>> > > >>>
>> > > >>> Thanks,
>> > > >>> Bill
>> > > >>>
>> > > >>> On Thu, Aug 24, 2017 at 12:25 PM, Damian Guy <
>> > damian@gmail.com
>> > > >
>> > > >> wrote:
>> > > >>>
>> > >  Hi,
>> > > 
>> > >  I'd like to kick off the voting thread for KIP-182:
>> > >  https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > >  182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+
>> > >  use+of+custom+storage+engines
>> > > 
>> > >  Thanks,
>> > >  Damian
>> > > 
>> > > >>>
>> > > >>
>> > > >>
>> > > >>
>> > > >> --
>> > > >> -- Guozhang
>> > > >>
>> > > >
>> > > 
>> > > >>>
>> > > >>>
>> > > >>
>> > > >
>> > >
>> > >
>> >
>> >
>> > --
>> > -- Guozhang
>> >
>>
>


[GitHub] kafka pull request #2889: KAFKA-4928: Add integration test for DumpLogSegmen...

2017-09-12 Thread original-brownbear
Github user original-brownbear closed the pull request at:

https://github.com/apache/kafka/pull/2889


---


[GitHub] kafka pull request #2903: MINOR: Fix needless GC + Result time unit in JMH

2017-09-12 Thread original-brownbear
Github user original-brownbear closed the pull request at:

https://github.com/apache/kafka/pull/2903


---


[GitHub] kafka pull request #2901: KAFKA-5018: LogCleaner tests to verify behaviour o...

2017-09-12 Thread original-brownbear
Github user original-brownbear closed the pull request at:

https://github.com/apache/kafka/pull/2901


---


Re: [DISCUSS] KIP-196: Add metrics to Kafka Connect framework

2017-09-12 Thread James Cheng
Thanks for the KIP, Randall.

The KIP has one MBean per metric name. Can I suggest an alternate grouping?

kafka.connect:type=connector-metrics,connector=([-.\w]+)
connector-type
connector-class
connector-version
status

kafka.connect:type=task-metrics,connector=([-.\w]+),task=([\d]+)
status
pause-ratio
offset-commit-success-percentage
offset-commit-failure-percentage
offset-commit-max-time
offset-commit-99p-time
offset-commit-95p-time
offset-commit-90p-time
offset-commit-75p-time
offset-commit-50p-time
batch-size-max
batch-size-avg

kafka.connect:type=source-task-metrics,connector=([-.\w]+),task=([\d]+)
source-record-poll-rate
source-record-write-rate

kafka.connect:type=sink-task-metrics,connector=([-.\w]+),task=([\d]+)
sink-record-read-rate
sink-record-send-rate
sink-record-lag-max
partition-count
offset-commit-95p-time
offset-commit-90p-time
offset-commit-75p-time
offset-commit-50p-time
batch-size-max
batch-size-avg

kafka.connect:type=sink-task-metrics,connector=([-.\w]+),task=([\d]+),topic=([-.\w]+),partition=([\d]+)
sink-record-lag
sink-record-lag-avg
sink-record-lag-max

kafka.connect:type=connect-coordinator-metrics
task-count
connector-count
leader-name
state
rest-request-rate

kafka.connect:type=connect-coordinator-metrics,name=assigned-tasks
assigned-tasks (existing metric, so can't merge in above without breaking compatibility)

kafka.connect:type=connect-coordinator-metrics,name=assigned-connectors
assigned-connectors (existing metric, so can't merge in above without breaking compatibility)

kafka.connect:type=connect-worker-rebalance-metrics
rebalance-success-total
rebalance-success-percentage
rebalance-failure-total
rebalance-failure-percentage
rebalance-max-time
rebalance-99p-time
rebalance-95p-time
rebalance-90p-time
rebalance-75p-time
rebalance-50p-time
time-since-last-rebalance
task-failure-rate

This lets you use a single MBean selector to select multiple related attributes 
all at once. You can use JMX's wildcards to target which connectors or tasks or 
topics or partitions you care about.

Also notice that for the topic and partition level metrics, the attributes are 
named identically ("sink-record-lag-avg" instead of 
"sink-record-{topic}-{partition}.records-lag-avg"), so monitoring systems have 
a consistent string they can use, instead of needing to do prefix-and-suffix 
matching against the attribute name. And TBH, it integrates better with the 
work I'm doing in https://issues.apache.org/jira/browse/KAFKA-3480
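As an illustration of the wildcard selection this grouping enables, an `ObjectName` property-value pattern matches a whole family of MBeans at once. This standalone sketch uses only JDK classes; the connector name `my-sink` is made up:

```java
import javax.management.ObjectName;

public class JmxPatternDemo {
    public static void main(String[] args) throws Exception {
        // One pattern covers every task of every connector.
        ObjectName pattern = new ObjectName(
                "kafka.connect:type=task-metrics,connector=*,task=*");

        ObjectName task = new ObjectName(
                "kafka.connect:type=task-metrics,connector=my-sink,task=0");
        ObjectName connector = new ObjectName(
                "kafka.connect:type=connector-metrics,connector=my-sink");

        // apply() tests whether a concrete name matches the pattern.
        System.out.println(pattern.apply(task));      // true
        System.out.println(pattern.apply(connector)); // false: different type
    }
}
```

Against a live server the same pattern would be passed to `MBeanServerConnection.queryNames(pattern, null)` to enumerate every matching task MBean in one call.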

-James

> On Sep 7, 2017, at 4:50 PM, Randall Hauch  wrote:
> 
> Hi everyone.
> 
> I've created a new KIP to add metrics to the Kafka Connect framework:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-196%3A+Add+metrics+to+Kafka+Connect+framework
> 
> The KIP approval deadline is looming, so if you're interested in Kafka
> Connect metrics please review and provide feedback as soon as possible. I'm
> interested not only in whether the metrics are sufficient and appropriate,
> but also in whether the MBean naming conventions are okay.
> 
> Best regards,
> 
> Randall



[GitHub] kafka pull request #3835: MINOR: update operations doc on topic deletion

2017-09-12 Thread omkreddy
GitHub user omkreddy opened a pull request:

https://github.com/apache/kafka/pull/3835

MINOR: update operations doc on topic deletion



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/omkreddy/kafka update-delete-topic-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3835.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3835


commit 2b899bc0db66bc2251c45856b7cbb233aca4671c
Author: Manikumar Reddy 
Date:   2017-09-12T06:24:21Z

MINOR: update ops doc on topic deletion




---