[jira] [Assigned] (KAFKA-5011) Replica fetchers may need to down-convert messages during a selective message format upgrade

2017-04-04 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin reassigned KAFKA-5011:
---

Assignee: Jiangjie Qin

> Replica fetchers may need to down-convert messages during a selective message 
> format upgrade
> 
>
> Key: KAFKA-5011
> URL: https://issues.apache.org/jira/browse/KAFKA-5011
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Jiangjie Qin
> Fix For: 0.11.0.0
>
>






[GitHub] kafka pull request #2803: MINOR: improve MinTimestampTrackerTest and fix NPE...

2017-04-04 Thread enothereska
Github user enothereska closed the pull request at:

https://github.com/apache/kafka/pull/2803




[GitHub] kafka pull request #2804: MINOR: don't throw CommitFailedException during su...

2017-04-04 Thread enothereska
Github user enothereska closed the pull request at:

https://github.com/apache/kafka/pull/2804




[jira] [Comment Edited] (KAFKA-4837) Config validation in Connector plugins need to compare against both canonical and simple class names

2017-04-04 Thread Felix (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956298#comment-15956298
 ] 

Felix edited comment on KAFKA-4837 at 4/5/17 4:49 AM:
--

thanks for fixing this! It was already driving me nuts!

Note: Just remove the "quote" in your fixing example


was (Author: felix41382):
thanks for fixing this! It was already driving me nuts!

> Config validation in Connector plugins need to compare against both canonical 
> and simple class names
> 
>
> Key: KAFKA-4837
> URL: https://issues.apache.org/jira/browse/KAFKA-4837
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
> Fix For: 0.11.0.0, 0.10.2.1
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> A validation check in Connect's REST API that was added to validate that the 
> connector class name in the config matches the connector class name in the 
> request's URL is too strict by not considering both the simple and the 
> canonical name of the connector class. For instance, the following example 
> request: 
> {code}
> PUT /connector-plugins/FileStreamSinkConnector/config/validate/ HTTP/1.1
> Host: connect.example.com
> Accept: application/json
> {
> "connector.class": 
> "org.apache.kafka.connect.file.FileStreamSinkConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> }
> {code}
> returns a "Bad Request" response with error code "400".
> Currently the reasonable workaround is to exactly match the connector class 
> name in both places. The following will work: 
> {code}
> PUT 
> /connector-plugins/"org.apache.kafka.connect.file.FileStreamSinkConnector/config/validate/
>  HTTP/1.1
> Host: connect.example.com
> Accept: application/json
> {
> "connector.class": 
> "org.apache.kafka.connect.file.FileStreamSinkConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> }
> {code}
> However, this is not flexible enough and also breaks several examples in 
> documentation. Validation should take into account both simple and canonical 
> class names. 
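For illustration, a minimal sketch of the relaxed comparison the description asks for. The method name and surrounding shape are assumptions for this example, not the actual Connect REST resource code:

{code}
// Accept either the canonical (fully-qualified) or the simple class name.
static boolean matchesConnectorClass(String nameFromUrl, Class<?> connectorClass) {
    return nameFromUrl.equals(connectorClass.getName())        // e.g. org.apache.kafka.connect.file.FileStreamSinkConnector
        || nameFromUrl.equals(connectorClass.getSimpleName()); // e.g. FileStreamSinkConnector
}
{code}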





[jira] [Commented] (KAFKA-4837) Config validation in Connector plugins need to compare against both canonical and simple class names

2017-04-04 Thread Felix (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956298#comment-15956298
 ] 

Felix commented on KAFKA-4837:
--

thanks for fixing this! It was already driving me nuts!

> Config validation in Connector plugins need to compare against both canonical 
> and simple class names
> 
>
> Key: KAFKA-4837
> URL: https://issues.apache.org/jira/browse/KAFKA-4837
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
> Fix For: 0.11.0.0, 0.10.2.1
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> A validation check in Connect's REST API that was added to validate that the 
> connector class name in the config matches the connector class name in the 
> request's URL is too strict by not considering both the simple and the 
> canonical name of the connector class. For instance, the following example 
> request: 
> {code}
> PUT /connector-plugins/FileStreamSinkConnector/config/validate/ HTTP/1.1
> Host: connect.example.com
> Accept: application/json
> {
> "connector.class": 
> "org.apache.kafka.connect.file.FileStreamSinkConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> }
> {code}
> returns a "Bad Request" response with error code "400".
> Currently the reasonable workaround is to exactly match the connector class 
> name in both places. The following will work: 
> {code}
> PUT 
> /connector-plugins/"org.apache.kafka.connect.file.FileStreamSinkConnector/config/validate/
>  HTTP/1.1
> Host: connect.example.com
> Accept: application/json
> {
> "connector.class": 
> "org.apache.kafka.connect.file.FileStreamSinkConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> }
> {code}
> However, this is not flexible enough and also breaks several examples in 
> documentation. Validation should take into account both simple and canonical 
> class names. 





Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-04 Thread Jay Kreps
Hey guys,

One thing I've always found super important for this kind of design work is
to do a really good job of cataloging the landscape of use cases and how
prevalent each one is. By that I mean not just listing lots of uses, but
also grouping them into categories that functionally need the same thing.
In the absence of this it is very hard to reason about design proposals.
From the proposals so far I think we have a lot of discussion around
possible apis, but less around what the user needs for different use cases
and how they would implement that using the api.

Here is an example:
You aggregate click and impression data for a Reddit-like site. Every ten
minutes you want to output a ranked list of the top 10 articles ranked by
clicks/impressions for each geographical area. I want to be able to run this
in steady state as well as rerun it to regenerate results (or catch up if it
crashes).

There are a couple of tricky things that seem to make this hard with either
of the options proposed:
1. If I emit this data using event time I have the problem described where
a geographical region with no new clicks or impressions will fail to output
results.
2. If I emit this data using system time I have the problem that when
reprocessing data my window may not be ten minutes but 10 hours if my
processing is very fast so it dramatically changes the output.

Maybe a hybrid solution works: I window by event time but trigger results
by system time for windows that have updated? I'm not really sure of the
details of making that work. Does that work? Are there concrete examples where
you actually want the current behavior?

-Jay


On Tue, Apr 4, 2017 at 8:32 PM, Arun Mathew  wrote:

> Hi All,
>
> Thanks for the KIP. We were also in need of a mechanism to trigger
> punctuate in the absence of events.
>
> As I described in [
> https://issues.apache.org/jira/browse/KAFKA-3514?
> focusedCommentId=15926036=com.atlassian.jira.
> plugin.system.issuetabpanels:comment-tabpanel#comment-15926036
> ],
>
>- Our approach involved using the event time by default.
>- The method to check if there is any punctuate ready in the
>PunctuationQueue is triggered by any event received by the stream
>thread, or at the polling intervals in the absence of any events.
>- When we create Punctuate objects (which contains the next event time
>for punctuation and interval), we also record the creation time (system
>time).
>- While checking for maturity of Punctuate Schedule by mayBePunctuate
>method, we also check if the system clock has elapsed the punctuate
>interval since the schedule creation time.
>- In the absence of any event, or in the absence of any event for one
>topic in the partition group assigned to the stream task, the system
> time
>will elapse the interval and we trigger a punctuate using the expected
>punctuation event time.
>- we then create the next punctuation schedule as punctuation event time
>+ punctuation interval, [again recording the system time of creation of
> the
>schedule].
>
> We call this a Hybrid Punctuate. Of course, this approach has pros and
> cons.
> Pros
>
>- Punctuates will happen within the punctuate interval at most, in
>terms of system time.
>- The semantics as a whole continues to revolve around event time.
>- We can use the old data [old timestamps] to rerun any experiments or
>tests.
>
> Cons
>
>- In case the punctuate interval is not a time duration [say logical
>time/event count], then the approach might not be meaningful.
>- In case there is a case where we have to wait for an actual event from
>a low event rate partition in the partition group, this approach will
> jump
>the gun.
>- In case the event processing cannot catch up with the event rate and
>the expected timestamp events get queued for a long time, this approach
>might jump the gun.
>
> I believe the above approach and discussion goes close to the approach A.
>
> ---
>
> I like the idea of having an event count based punctuate.
>
> ---
>
> I agree with the discussion around approach C, that we should provide the
> user with the option to choose system time or event time based punctuates.
> But I believe that the user predominantly wants to use event time while not
> missing out on regular punctuates due to event delays or event absences.
> Hence a complex punctuate option as Matthias mentioned (quoted below) would
> be most apt.
>
> "- We might want to add "complex" schedules later on (like, punctuate on
> every 10 seconds event-time or 60 seconds system-time whatever comes
> first)."
>
> ---
>
> I think I read somewhere that Kafka Streams started with System Time as the
> punctuation standard, but was later changed to Event Time. I guess there
> would be some good reason behind it. As Kafka Streams wants to evolve more
> on the Stream Processing front, I believe the emphasis on event time would
> remain quite strong.

Re: [VOTE] KIP-120: Cleanup Kafka Streams builder API

2017-04-04 Thread Ewen Cheslack-Postava
+1 (binding)

-Ewen

On Thu, Mar 30, 2017 at 4:03 PM, Guozhang Wang  wrote:

> +1.
>
>
> On Thu, Mar 30, 2017 at 1:18 AM, Damian Guy  wrote:
>
> > Thanks Matthias.
> >
> > +1
> >
> > On Thu, 23 Mar 2017 at 22:40 Matthias J. Sax 
> > wrote:
> >
> > > Hi,
> > >
> > > I would like to start the VOTE on KIP-120:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-120%
> > 3A+Cleanup+Kafka+Streams+builder+API
> > >
> > > If you have further comments, please reply to the DISCUSS thread.
> > >
> > > Thanks a lot!
> > >
> > >
> > > -Matthias
> > >
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-04 Thread Arun Mathew
Hi All,

Thanks for the KIP. We were also in need of a mechanism to trigger
punctuate in the absence of events.

As I described in [
https://issues.apache.org/jira/browse/KAFKA-3514?focusedCommentId=15926036=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15926036
],

   - Our approach involved using the event time by default.
   - The method to check if there is any punctuate ready in the
   PunctuationQueue is triggered by any event received by the stream
   thread, or at the polling intervals in the absence of any events.
   - When we create Punctuate objects (which contains the next event time
   for punctuation and interval), we also record the creation time (system
   time).
   - While checking for maturity of Punctuate Schedule by mayBePunctuate
   method, we also check if the system clock has elapsed the punctuate
   interval since the schedule creation time.
   - In the absence of any event, or in the absence of any event for one
   topic in the partition group assigned to the stream task, the system time
   will elapse the interval and we trigger a punctuate using the expected
   punctuation event time.
   - we then create the next punctuation schedule as punctuation event time
   + punctuation interval, [again recording the system time of creation of the
   schedule].

We call this a Hybrid Punctuate. Of course, this approach has pros and cons.
Pros

   - Punctuates will happen within the punctuate interval at most, in
   terms of system time.
   - The semantics as a whole continues to revolve around event time.
   - We can use the old data [old timestamps] to rerun any experiments or
   tests.

Cons

   - In case the punctuate interval is not a time duration [say logical
   time/event count], then the approach might not be meaningful.
   - In case there is a case where we have to wait for an actual event from
   a low event rate partition in the partition group, this approach will jump
   the gun.
   - In case the event processing cannot catch up with the event rate and
   the expected timestamp events get queued for a long time, this approach
   might jump the gun.

I believe the above approach and discussion goes close to the approach A.
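
Below is a minimal, self-contained sketch of the hybrid check described above
(names like PunctuationSchedule and maybePunctuate are illustrative only, not
the actual Kafka Streams internals):

class PunctuationSchedule {
    final long intervalMs;
    long nextPunctuationEventTimeMs;  // next event time at which to punctuate
    long scheduleCreatedSystemTimeMs; // system time recorded when the schedule was (re)created

    PunctuationSchedule(long intervalMs, long firstEventTimeMs, long nowSystemMs) {
        this.intervalMs = intervalMs;
        this.nextPunctuationEventTimeMs = firstEventTimeMs;
        this.scheduleCreatedSystemTimeMs = nowSystemMs;
    }

    // Fire when stream (event) time has matured, or when the interval has elapsed
    // on the system clock since the schedule was created.
    boolean maybePunctuate(long streamTimeMs, long nowSystemMs) {
        boolean mature = streamTimeMs >= nextPunctuationEventTimeMs
                || nowSystemMs - scheduleCreatedSystemTimeMs >= intervalMs;
        if (mature) {
            // the processor's punctuate(nextPunctuationEventTimeMs) would run here
            nextPunctuationEventTimeMs += intervalMs;  // schedule the next punctuation by event time
            scheduleCreatedSystemTimeMs = nowSystemMs; // record the system time of the new schedule
        }
        return mature;
    }
}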

---

I like the idea of having an event count based punctuate.

---

I agree with the discussion around approach C, that we should provide the
user with the option to choose system time or event time based punctuates.
But I believe that the user predominantly wants to use event time while not
missing out on regular punctuates due to event delays or event absences.
Hence a complex punctuate option as Matthias mentioned (quoted below) would
be most apt.

"- We might want to add "complex" schedules later on (like, punctuate on
every 10 seconds event-time or 60 seconds system-time whatever comes
first)."

---

I think I read somewhere that Kafka Streams started with System Time as the
punctuation standard, but was later changed to Event Time. I guess there
would be some good reason behind it. As Kafka Streams wants to evolve more
on the Stream Processing front, I believe the emphasis on event time would
remain quite strong.


With Regards,

Arun Mathew
Yahoo! JAPAN Corporation, Tokyo


On Wed, Apr 5, 2017 at 3:53 AM, Thomas Becker  wrote:

> Yeah I like PuncutationType much better; I just threw Time out there
> more as a strawman than an actual suggestion ;) I still think it's
> worth considering what this buys us over an additional callback. I
> foresee a number of punctuate implementations following this pattern:
>
> public void punctuate(PunctuationType type) {
> switch (type) {
> case EVENT_TIME:
> methodA();
> break;
> case SYSTEM_TIME:
> methodB();
> break;
> }
> }
>
> I guess one advantage of this approach is we could add additional
> punctuation types later in a backwards compatible way (like event count
> as you mentioned).
>
> -Tommy
>
>
> On Tue, 2017-04-04 at 11:10 -0700, Matthias J. Sax wrote:
> > That sounds promising.
> >
> > I am just wondering if `Time` is the best name. Maybe we want to add
> > other non-time based punctuations at some point later. I would
> > suggest
> >
> > enum PunctuationType {
> >   EVENT_TIME,
> >   SYSTEM_TIME,
> > }
> >
> > or similar. Just to keep the door open -- it's easier to add new
> > stuff
> > if the name is more generic.
> >
> >
> > -Matthias
> >
> >
> > On 4/4/17 5:30 AM, Thomas Becker wrote:
> > >
> > > I agree that the framework providing and managing the notion of
> > > stream
> > > time is valuable and not something we would want to delegate to the
> > > tasks. I'm not entirely convinced that a separate callback (option
> > > C)
> > > is that messy (it could just be a default method with an empty
> > > implementation), but if we wanted a single API to handle both
> > > cases,
> > > how about something like the following?
> > >
> > > enum Time {
> > >STREAM,
> > >CLOCK
> > > }
> > >
> > > 

Build failed in Jenkins: kafka-trunk-jdk7 #2072

2017-04-04 Thread Apache Jenkins Server
See 


Changes:

[me] KAFKA-4916: test streams with brokers failing

--
[...truncated 341.75 KB...]

kafka.server.DynamicConfigTest > shouldFailLeaderConfigsWithInvalidValues 
STARTED

kafka.server.DynamicConfigTest > shouldFailLeaderConfigsWithInvalidValues PASSED

kafka.server.DynamicConfigTest > shouldFailWhenChangingClientIdUnknownConfig 
STARTED

kafka.server.DynamicConfigTest > shouldFailWhenChangingClientIdUnknownConfig 
PASSED

kafka.server.DynamicConfigTest > shouldFailWhenChangingBrokerUnknownConfig 
STARTED

kafka.server.DynamicConfigTest > shouldFailWhenChangingBrokerUnknownConfig 
PASSED

kafka.server.DynamicConfigChangeTest > testProcessNotification STARTED

kafka.server.DynamicConfigChangeTest > testProcessNotification PASSED

kafka.server.DynamicConfigChangeTest > 
shouldParseWildcardReplicationQuotaProperties STARTED

kafka.server.DynamicConfigChangeTest > 
shouldParseWildcardReplicationQuotaProperties PASSED

kafka.server.DynamicConfigChangeTest > testDefaultClientIdQuotaConfigChange 
STARTED

kafka.server.DynamicConfigChangeTest > testDefaultClientIdQuotaConfigChange 
PASSED

kafka.server.DynamicConfigChangeTest > testQuotaInitialization STARTED

kafka.server.DynamicConfigChangeTest > testQuotaInitialization PASSED

kafka.server.DynamicConfigChangeTest > testUserQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testUserQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > testClientIdQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testClientIdQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > testUserClientIdQuotaChange STARTED

kafka.server.DynamicConfigChangeTest > testUserClientIdQuotaChange PASSED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaProperties 
STARTED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaProperties 
PASSED

kafka.server.DynamicConfigChangeTest > 
shouldParseRegardlessOfWhitespaceAroundValues STARTED

kafka.server.DynamicConfigChangeTest > 
shouldParseRegardlessOfWhitespaceAroundValues PASSED

kafka.server.DynamicConfigChangeTest > testDefaultUserQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testDefaultUserQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaReset STARTED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaReset PASSED

kafka.server.DynamicConfigChangeTest > testDefaultUserClientIdQuotaConfigChange 
STARTED

kafka.server.DynamicConfigChangeTest > testDefaultUserClientIdQuotaConfigChange 
PASSED

kafka.server.DynamicConfigChangeTest > testConfigChangeOnNonExistingTopic 
STARTED

kafka.server.DynamicConfigChangeTest > testConfigChangeOnNonExistingTopic PASSED

kafka.server.DynamicConfigChangeTest > testConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testConfigChange PASSED

kafka.server.ServerGenerateBrokerIdTest > testGetSequenceIdMethod STARTED

kafka.server.ServerGenerateBrokerIdTest > testGetSequenceIdMethod PASSED

kafka.server.ServerGenerateBrokerIdTest > testBrokerMetadataOnIdCollision 
STARTED

kafka.server.ServerGenerateBrokerIdTest > testBrokerMetadataOnIdCollision PASSED

kafka.server.ServerGenerateBrokerIdTest > testAutoGenerateBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testAutoGenerateBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testMultipleLogDirsMetaProps STARTED

kafka.server.ServerGenerateBrokerIdTest > testMultipleLogDirsMetaProps PASSED

kafka.server.ServerGenerateBrokerIdTest > testDisableGeneratedBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testDisableGeneratedBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testUserConfigAndGeneratedBrokerId 
STARTED

kafka.server.ServerGenerateBrokerIdTest > testUserConfigAndGeneratedBrokerId 
PASSED

kafka.server.ServerGenerateBrokerIdTest > 
testConsistentBrokerIdFromUserConfigAndMetaProps STARTED

kafka.server.ServerGenerateBrokerIdTest > 
testConsistentBrokerIdFromUserConfigAndMetaProps PASSED

kafka.server.AdvertiseBrokerTest > testBrokerAdvertiseHostNameAndPortToZK 
STARTED

kafka.server.AdvertiseBrokerTest > testBrokerAdvertiseHostNameAndPortToZK PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
PASSED

kafka.server.SaslPlaintextReplicaFetchTest > testReplicaFetcherThread STARTED

kafka.server.SaslPlaintextReplicaFetchTest > testReplicaFetcherThread PASSED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestWithUnsupportedVersion STARTED

kafka.server.SaslApiVersionsRequestTest > 

[jira] [Commented] (KAFKA-4971) Why is there no difference between kafka benchmark tests on SSD and HDD?

2017-04-04 Thread Dasol Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956205#comment-15956205
 ] 

Dasol Kim commented on KAFKA-4971:
--

It might be a silly question, but if Kafka uses the OS page cache and the OS 
becomes the bottleneck: the OS is installed on the SSD of a server that has 
both an SSD and an HDD, and Kafka is installed separately on the SSD and on the 
HDD. Isn't that an accurate experiment? I experimented with the 9 physical 
servers I had before experimenting with VMs, but the results for SSDs and HDDs 
were similar, so I split the VMs between HDDs and SSDs and experimented.

> Why is there no difference between kafka benchmark tests on SSD and HDD? 
> -
>
> Key: KAFKA-4971
> URL: https://issues.apache.org/jira/browse/KAFKA-4971
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 0.10.0.0
> Environment: Oracle VM VirtualBox
> OS : CentOs 7
> Memory : 1G
> Disk : 8GB
>Reporter: Dasol Kim
>
> I installed the OS and Kafka on two SSDs and two HDDs to perform a Kafka 
> benchmark test based on the disk difference. As expected, the SSD should show 
> faster results, but according to my experimental results there is no big 
> difference between SSD and HDD. Why? Other settings have been left at their 
> defaults.
> *test settings
> zookeeper node : 1, producer node : 2, broker node : 2 (SSD 1, HDD 1)
> test scenario : Two producers send messages to the brokers and compare the 
> throughput per second of Kafka installed on SSD and Kafka on HDD
> command : ./bin/kafka-producer-perf-test.sh --num-records 100 
> --record-size 2000 --topic test --throughput 10 --producer-props 
> bootstrap.servers=SN02:9092
>  





[jira] [Updated] (KAFKA-4916) Add streams tests with brokers failing

2017-04-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4916:
-
Reviewer: Ewen Cheslack-Postava
  Status: Patch Available  (was: Reopened)

> Add streams tests with brokers failing
> --
>
> Key: KAFKA-4916
> URL: https://issues.apache.org/jira/browse/KAFKA-4916
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>Priority: Blocker
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> We need to add either integration or system tests with streams and have Kafka 
> brokers fail and come back up. A combination of transient and permanent 
> broker failures.
> As part of adding test, fix any critical bugs that arise.





[jira] [Reopened] (KAFKA-4916) Add streams tests with brokers failing

2017-04-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava reopened KAFKA-4916:
--

Reopening since we've only got the trunk version committed and we want this as 
a blocker for 0.10.2.1. It just needs an updated PR for the 0.10.2 branch.

> Add streams tests with brokers failing
> --
>
> Key: KAFKA-4916
> URL: https://issues.apache.org/jira/browse/KAFKA-4916
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>Priority: Blocker
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> We need to add either integration or system tests with streams and have Kafka 
> brokers fail and come back up. A combination of transient and permanent 
> broker failures.
> As part of adding test, fix any critical bugs that arise.





[jira] [Commented] (KAFKA-4916) Add streams tests with brokers failing

2017-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956168#comment-15956168
 ] 

ASF GitHub Bot commented on KAFKA-4916:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2719


> Add streams tests with brokers failing
> --
>
> Key: KAFKA-4916
> URL: https://issues.apache.org/jira/browse/KAFKA-4916
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>Priority: Blocker
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> We need to add either integration or system tests with streams and have Kafka 
> brokers fail and come back up. A combination of transient and permanent 
> broker failures.
> As part of adding test, fix any critical bugs that arise.





[GitHub] kafka pull request #2719: KAFKA-4916: test streams with brokers failing

2017-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2719




[jira] [Resolved] (KAFKA-4916) Add streams tests with brokers failing

2017-04-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4916.
--
   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 2719
[https://github.com/apache/kafka/pull/2719]

> Add streams tests with brokers failing
> --
>
> Key: KAFKA-4916
> URL: https://issues.apache.org/jira/browse/KAFKA-4916
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>Priority: Blocker
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> We need to add either integration or system tests with streams and have Kafka 
> brokers fail and come back up. A combination of transient and permanent 
> broker failures.
> As part of adding test, fix any critical bugs that arise.





Build failed in Jenkins: kafka-0.10.2-jdk7 #120

2017-04-04 Thread Apache Jenkins Server
See 


Changes:

[me] MINOR: don't throw CommitFailedException during suspendTasksAndState

[me] MINOR: improve MinTimestampTrackerTest and fix NPE when null element

--
[...truncated 322.51 KB...]

kafka.server.KafkaConfigTest > testUncleanLeaderElectionDefault STARTED

kafka.server.KafkaConfigTest > testUncleanLeaderElectionDefault PASSED

kafka.server.KafkaConfigTest > testInvalidAdvertisedListenersProtocol STARTED

kafka.server.KafkaConfigTest > testInvalidAdvertisedListenersProtocol PASSED

kafka.server.KafkaConfigTest > testUncleanElectionEnabled STARTED

kafka.server.KafkaConfigTest > testUncleanElectionEnabled PASSED

kafka.server.KafkaConfigTest > testAdvertisePortDefault STARTED

kafka.server.KafkaConfigTest > testAdvertisePortDefault PASSED

kafka.server.KafkaConfigTest > testVersionConfiguration STARTED

kafka.server.KafkaConfigTest > testVersionConfiguration PASSED

kafka.server.KafkaConfigTest > testEqualAdvertisedListenersProtocol STARTED

kafka.server.KafkaConfigTest > testEqualAdvertisedListenersProtocol PASSED

kafka.server.SslReplicaFetchTest > testReplicaFetcherThread STARTED

kafka.server.SslReplicaFetchTest > testReplicaFetcherThread PASSED

kafka.server.IsrExpirationTest > testIsrExpirationForSlowFollowers STARTED

kafka.server.IsrExpirationTest > testIsrExpirationForSlowFollowers PASSED

kafka.server.IsrExpirationTest > testIsrExpirationForStuckFollowers STARTED

kafka.server.IsrExpirationTest > testIsrExpirationForStuckFollowers PASSED

kafka.server.IsrExpirationTest > testIsrExpirationIfNoFetchRequestMade STARTED

kafka.server.IsrExpirationTest > testIsrExpirationIfNoFetchRequestMade PASSED

kafka.server.SaslPlaintextReplicaFetchTest > testReplicaFetcherThread STARTED

kafka.server.SaslPlaintextReplicaFetchTest > testReplicaFetcherThread PASSED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithLeaderThrottle STARTED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithLeaderThrottle PASSED

kafka.server.ReplicationQuotasTest > shouldThrottleOldSegments STARTED

kafka.server.ReplicationQuotasTest > shouldThrottleOldSegments PASSED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithFollowerThrottle STARTED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithFollowerThrottle PASSED

kafka.server.ServerStartupTest > testBrokerStateRunningAfterZK STARTED

kafka.server.ServerStartupTest > testBrokerStateRunningAfterZK PASSED

kafka.server.ServerStartupTest > testBrokerCreatesZKChroot STARTED

kafka.server.ServerStartupTest > testBrokerCreatesZKChroot PASSED

kafka.server.ServerStartupTest > testConflictBrokerStartupWithSamePort STARTED

kafka.server.ServerStartupTest > testConflictBrokerStartupWithSamePort PASSED

kafka.server.ServerStartupTest > testConflictBrokerRegistration STARTED

kafka.server.ServerStartupTest > testConflictBrokerRegistration PASSED

kafka.server.ServerStartupTest > testBrokerSelfAware STARTED

kafka.server.ServerStartupTest > testBrokerSelfAware PASSED

kafka.server.ProduceRequestTest > testSimpleProduceRequest STARTED

kafka.server.ProduceRequestTest > testSimpleProduceRequest PASSED

kafka.server.ProduceRequestTest > testCorruptLz4ProduceRequest STARTED

kafka.server.ProduceRequestTest > testCorruptLz4ProduceRequest PASSED

kafka.server.ReplicaManagerTest > testHighWaterMarkDirectoryMapping STARTED

kafka.server.ReplicaManagerTest > testHighWaterMarkDirectoryMapping PASSED

kafka.server.ReplicaManagerTest > 
testFetchBeyondHighWatermarkReturnEmptyResponse STARTED

kafka.server.ReplicaManagerTest > 
testFetchBeyondHighWatermarkReturnEmptyResponse PASSED

kafka.server.ReplicaManagerTest > testIllegalRequiredAcks STARTED

kafka.server.ReplicaManagerTest > testIllegalRequiredAcks PASSED

kafka.server.ReplicaManagerTest > testClearPurgatoryOnBecomingFollower STARTED

kafka.server.ReplicaManagerTest > testClearPurgatoryOnBecomingFollower PASSED

kafka.server.ReplicaManagerTest > testHighwaterMarkRelativeDirectoryMapping 
STARTED

kafka.server.ReplicaManagerTest > testHighwaterMarkRelativeDirectoryMapping 
PASSED

kafka.server.KafkaMetricReporterClusterIdTest > testClusterIdPresent STARTED

kafka.server.KafkaMetricReporterClusterIdTest > testClusterIdPresent PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
PASSED

kafka.server.OffsetCommitTest > testUpdateOffsets STARTED

kafka.server.OffsetCommitTest > testUpdateOffsets PASSED

kafka.server.OffsetCommitTest > testLargeMetadataPayload STARTED

kafka.server.OffsetCommitTest > testLargeMetadataPayload PASSED


Jenkins build is back to normal : kafka-trunk-jdk8 #1401

2017-04-04 Thread Apache Jenkins Server
See 




[jira] [Commented] (KAFKA-4730) Streams does not have an in-memory windowed store

2017-04-04 Thread Nikki Thean (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956097#comment-15956097
 ] 

Nikki Thean commented on KAFKA-4730:


Hi [~enothereska], I'm interested in this ticket. What kind of data structure 
were you thinking of / would you suggest for the implementation of in-memory 
windowed stores? Composite keys for the Map the general implementation uses, 
separate key-value stores for each time segment, or something else?
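
For illustration, one possible shape of the "separate key-value store per time 
segment" option, as a rough, self-contained sketch rather than the actual 
Streams StateStore/WindowStore interfaces:

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Each window segment is its own map, keyed by window start time, so expired
// segments can be dropped wholesale.
class InMemoryWindowStoreSketch<K, V> {
    private final NavigableMap<Long, Map<K, V>> segments = new TreeMap<>();

    void put(long windowStartMs, K key, V value) {
        segments.computeIfAbsent(windowStartMs, t -> new HashMap<>()).put(key, value);
    }

    V fetch(long windowStartMs, K key) {
        Map<K, V> segment = segments.get(windowStartMs);
        return segment == null ? null : segment.get(key);
    }

    // Drop every window segment that starts before the retention cutoff.
    void expireBefore(long cutoffMs) {
        segments.headMap(cutoffMs, false).clear();
    }
}
{code}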

> Streams does not have an in-memory windowed store
> -
>
> Key: KAFKA-4730
> URL: https://issues.apache.org/jira/browse/KAFKA-4730
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
> Fix For: 0.11.0.0
>
>
> Streams has windowed persistent stores (e.g., see PersistentKeyValueFactory 
> interface with "windowed" method), however it does not allow for windowed 
> in-memory stores (e.g., see InMemoryKeyValueFactory interface). 
> In addition to the interface not allowing it, streams does not actually have 
> an implementation of an in-memory windowed store.
> The implications are that operations that require windowing cannot use 
> in-memory stores. 





Build failed in Jenkins: kafka-0.10.2-jdk7 #119

2017-04-04 Thread Apache Jenkins Server
See 


Changes:

[me] KAFKA-4837: Fix class name comparison in connector-plugins REST endpoint

--
[...truncated 843.10 KB...]

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndMapToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndMapToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > stringToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > stringToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > timestampToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > timestampToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
timestampToConnectWithDefaultValue STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
timestampToConnectWithDefaultValue PASSED

org.apache.kafka.connect.json.JsonConverterTest > timeToConnectOptional STARTED

org.apache.kafka.connect.json.JsonConverterTest > timeToConnectOptional PASSED

org.apache.kafka.connect.json.JsonConverterTest > dateToConnectWithDefaultValue 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > dateToConnectWithDefaultValue 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > decimalToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > decimalToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonNonStringKeys STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.connect.json.JsonConverterTest > longToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > mismatchSchemaJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToConnectNonStringKeys 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToConnectNonStringKeys 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.connect.json.JsonConverterTest > bytesToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > shortToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > shortToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > intToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnectOptional 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnectOptional 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > structToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > structToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > stringToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndArrayToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > byteToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > byteToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaPrimitiveToConnect 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaPrimitiveToConnect 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > byteToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > intToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > intToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > dateToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
decimalToConnectOptionalWithDefaultValue STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
decimalToConnectOptionalWithDefaultValue PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonStringKeys STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonStringKeys PASSED


Jenkins build is back to normal : kafka-trunk-jdk7 #2070

2017-04-04 Thread Apache Jenkins Server
See 




[GitHub] kafka pull request #2743: KIP-101: Alter Replication Protocol to use Leader ...

2017-04-04 Thread benstopford
Github user benstopford closed the pull request at:

https://github.com/apache/kafka/pull/2743




[jira] [Updated] (KAFKA-4923) Add Exactly-Once Semantics to Streams

2017-04-04 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-4923:
---
Summary: Add Exactly-Once Semantics to Streams  (was: Add Exactly-Once 
Semantics)

> Add Exactly-Once Semantics to Streams
> -
>
> Key: KAFKA-4923
> URL: https://issues.apache.org/jira/browse/KAFKA-4923
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>  Labels: kip
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-129%3A+Streams+Exactly-Once+Semantics





[jira] [Commented] (KAFKA-4477) Node reduces its ISR to itself, and doesn't recover. Other nodes do not take leadership, cluster remains sick until node is restarted.

2017-04-04 Thread Joseph Aliase (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956038#comment-15956038
 ] 

Joseph Aliase commented on KAFKA-4477:
--

We have started seeing the open file descriptor issue in version 0.10.1.1.

> Node reduces its ISR to itself, and doesn't recover. Other nodes do not take 
> leadership, cluster remains sick until node is restarted.
> --
>
> Key: KAFKA-4477
> URL: https://issues.apache.org/jira/browse/KAFKA-4477
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.0
> Environment: RHEL7
> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
>Reporter: Michael Andre Pearce (IG)
>Assignee: Apurva Mehta
>Priority: Critical
>  Labels: reliability
> Fix For: 0.10.1.1
>
> Attachments: 2016_12_15.zip, issue_node_1001_ext.log, 
> issue_node_1001.log, issue_node_1002_ext.log, issue_node_1002.log, 
> issue_node_1003_ext.log, issue_node_1003.log, kafka.jstack, 
> state_change_controller.tar.gz
>
>
> We have encountered a critical issue that has re-occurred in different 
> physical environments. We haven't worked out what is going on. We do, though, 
> have a nasty workaround to keep the service alive. 
> We have not had this issue on clusters still running 0.9.0.1.
> We have noticed a node randomly shrinking the ISRs for the partitions it owns 
> down to itself; moments later we see other nodes having disconnects, followed 
> finally by app issues, where producing to these partitions is blocked.
> It seems that only restarting the Kafka Java process resolves the issue.
> We have had this occur multiple times, and from all network and machine 
> monitoring the machine never left the network or had any other glitches.
> Below are logs seen during the issue.
> Node 7:
> [2016-12-01 07:01:28,112] INFO Partition 
> [com_ig_trade_v1_position_event--demo--compacted,10] on broker 7: Shrinking 
> ISR for partition [com_ig_trade_v1_position_event--demo--compacted,10] from 
> 1,2,7 to 7 (kafka.cluster.Partition)
> All other nodes:
> [2016-12-01 07:01:38,172] WARN [ReplicaFetcherThread-0-7], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@5aae6d42 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 7 was disconnected before the response was 
> read
> All clients:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NetworkException: The server disconnected 
> before a response was received.
> After this occurs, we then suddenly see on the sick machine an increasing 
> number of close_waits and open file descriptors.
> As a workaround to keep the service alive, we are currently putting in an 
> automated process that tails the logs and regexes for the line below, and 
> where new_partitions is just the node itself we restart the node:
> "\[(?P<date>.+)\] INFO Partition \[.*\] on broker .* Shrinking ISR for 
> partition \[.*\] from (?P<old_partitions>.+) to (?P<new_partitions>.+) 
> \(kafka.cluster.Partition\)"
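
For illustration, a sketch of the log check that the workaround performs. The 
class and group names are hypothetical, and it uses Java's (?<name>...) 
named-group syntax instead of the (?P<...>) form above:

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IsrShrinkWatcher {
    private static final Pattern SHRINK = Pattern.compile(
        "\\[(?<date>.+)\\] INFO Partition \\[.*\\] on broker \\d+: Shrinking ISR for "
            + "partition \\[.*\\] from (?<oldIsr>.+) to (?<newIsr>.+) \\(kafka\\.cluster\\.Partition\\)");

    // True when the ISR has shrunk to a single replica, i.e. the broker itself.
    static boolean shrunkToSelf(String logLine) {
        Matcher m = SHRINK.matcher(logLine);
        return m.find() && !m.group("newIsr").contains(",");
    }
}
{code}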





[jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak

2017-04-04 Thread Joseph Aliase (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956036#comment-15956036
 ] 

Joseph Aliase commented on KAFKA-5007:
--

[~ijuma] Can you take a look into this issue?

> Kafka Replica Fetcher Thread- Resource Leak
> ---
>
> Key: KAFKA-5007
> URL: https://issues.apache.org/jira/browse/KAFKA-5007
> Project: Kafka
>  Issue Type: Bug
>  Components: core, network
>Affects Versions: 0.10.1.1
> Environment: Centos 7
> Jave 8
>Reporter: Joseph Aliase
>
> Kafka is running out of open file descriptors when the system network 
> interface is down.
> Issue description:
> We have a Kafka cluster of 5 nodes running version 0.10.1.1. The open file 
> descriptor limit for the account running Kafka is set to 10.
> During an upgrade, the network interface went down. The outage continued for 
> 12 hours; eventually all the brokers crashed with a java.io.IOException: Too 
> many open files error.
> We repeated the test in a lower environment and observed that the open socket 
> count keeps increasing while the NIC is down.
> We have around 13 topics with a max partition count of 120, and the number of 
> replica fetcher threads is set to 8.
> Using an internal monitoring tool we observed that the open socket descriptor 
> count for the broker pid continued to increase although the NIC was down, 
> leading to the open file descriptor error.





[GitHub] kafka pull request #2808: KIP-101: Alter Replication Protocol to use Leader ...

2017-04-04 Thread benstopford
GitHub user benstopford opened a pull request:

https://github.com/apache/kafka/pull/2808

KIP-101: Alter Replication Protocol to use Leader Epoch rather than High 
Watermark for Truncation

This PR describes the addition of Partition Level Leader Epochs to messages 
in Kafka as a mechanism for fixing some known issues in the replication 
protocol. Full details can be found here:

[KIP-101 
Reference](https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Epoch+rather+than+High+Watermark+for+Truncation)

*The key elements are*:
- Epochs are stamped on messages as they enter the leader.
- Epochs are tracked in both leader and follower in a new checkpoint file. 
- A new API allows followers to retrieve the leader's latest offset for a 
particular epoch. 
- The logic for truncating the log, when a replica becomes a follower, has 
been moved from Partition into the ReplicaFetcherThread
- When partitions are added to the ReplicaFetcherThread they are added in 
an initialising state. Initialising partitions request leader epochs and then 
truncate their logs appropriately. 

This test provides a good overview of the workflow 
`EpochDrivenReplicationProtocolAcceptanceTest.shouldFollowLeaderEpochBasicWorkflow()`

The corrupted log use case is covered by the test  
`EpochDrivenReplicationProtocolAcceptanceTest.offsetsShouldNotGoBackwards()`

Remaining work: The test 
`EpochDrivenReplicationProtocolAcceptanceTest.shouldSurviveFastLeaderChange()` 
doesn't correctly reproduce the underlying issue. This will be altered later to 
properly support this use case. 
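
For orientation, a rough, self-contained sketch of the follower-side truncation 
decision described in the bullet points above. The names are hypothetical; this 
is not the actual ReplicaFetcherThread code:

public final class LeaderEpochTruncationSketch {
    // Returned by the leader during the transition phase, when it cannot answer
    // the OffsetsForLeaderEpoch request for the given epoch.
    public static final long UNSUPPORTED_EPOCH_OFFSET = -1L;

    /**
     * @param leaderOffsetForEpoch the leader's last offset for the follower's latest
     *                             local epoch, fetched via the new API
     * @param localLogEndOffset    the follower's log end offset
     * @param highWatermark        the follower's high watermark (the old truncation point)
     * @return the offset the follower truncates its log to before fetching
     */
    public static long truncationOffset(long leaderOffsetForEpoch,
                                        long localLogEndOffset,
                                        long highWatermark) {
        if (leaderOffsetForEpoch == UNSUPPORTED_EPOCH_OFFSET)
            return highWatermark;                         // fall back to the pre-KIP-101 behaviour
        return Math.min(leaderOffsetForEpoch, localLogEndOffset);
    }
}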

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/confluentinc/kafka kip-101-v2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2808.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2808


commit a96a8bbee2435bd46cd19746f61b73eeb2f94088
Author: Ben Stopford 
Date:   2017-03-27T16:16:16Z

All work to date squashed (18 committs)
KIP-101: Push after merge.

KIP-101: Fixes for checksytle breaks

KIP-101: Remove TestSuite class

KIP-101: Comments

KIP-101: Comments

KIP-101: Altered logic in ReplicaFetcherThread:
- On NoLeaderForPartition continue to poll for epochs indefinitely
- Add synchronisation around log trucation to ensure we cannot truncate the 
log of a leader (light testing, more to follow, noted in TODO)

KIP-101: Rename Epoch -> PartitionLeaderEpoch

KIP-101:  First commit based on feedback from Jun/Jason

KIP-101:  Second commit based on feedback from Jun/Jason

KIP-101:  Third commit based on feedback from Jun/Jason

KIP-101:  Fourth commit based on feedback from Jun/Jason
- removed retainMatchingOffset parameter from clearOldest as not used

KIP-101:  tidy only

KIP-101:  Return Log End Offset If Undefined Epoch Requested (this covers 
the case of a bootstrapping broker)

KIP-101:  Altered log truncation to always be inclusive, so we always 
delete epochs inclusive of the passed offset, whether clearing earliest or 
latest entries.

KIP-101:  Add optimisation back in for previous commit.

KIP-101:  If epochOffset.endOffset() is UNSUPPORTED_EPOCH_OFFSET, which can 
happen during the transition phase, we should fall back to HW.
Improved fuglyness too.

KIP-101:  Small tidy

KIP-101:  Refactored threading model in Abstract/ReplicaFetcherThread. 
Functionally identical but now the logic sits largely in the abstract class.

KIP-101:  Moved OffsetsForLeaderEpoch.getResponseFor() into ReplicaManager

KIP-101:  (1) Altered ReplicaFetcherThread to poll continuously on errors. 
(2) Only send epoch requests if version >= 11

KIP-101:  As segments are recovered, truncate the epoch cache with the 
appropriate segment

KIP-101:  Fix bug in DummyFetcherThread which was defaulting to requiring 
initialisation. Caused AbstractFetherThread test to hang.

KIP-101:  Fix bug in ReplicaManager imports

KIP-101:  Fix bug in ReplicaManager imports by making all imports explicit. 
Also remove OffsetCheckpointFile which appears to still be in the remote 
repostiory. This was causing a compilation issue.

KIP-101:  Remove override of OffsetsTopicPartitionsProp (to 5) in 
PlaintexConsumerTest as it causes a test in BaseConsumerTest to fail. Will fix 
this issue in separate PR

KIP-101:  Rename only (OffsesForLeaderEpochRequest)

KIP-101:  Fix merge error

KIP-101:  Fix couple more merge errors

KIP-101:  Re-enable test_zk_security_upgrade on Ismael's request


[GitHub] kafka pull request #2798: KAFKA-4837: Fix class name comparison in connector...

2017-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2798




[jira] [Commented] (KAFKA-4837) Config validation in Connector plugins need to compare against both canonical and simple class names

2017-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955952#comment-15955952
 ] 

ASF GitHub Bot commented on KAFKA-4837:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2798


> Config validation in Connector plugins need to compare against both canonical 
> and simple class names
> 
>
> Key: KAFKA-4837
> URL: https://issues.apache.org/jira/browse/KAFKA-4837
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
> Fix For: 0.11.0.0, 0.10.2.1
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> A validation check in Connect's REST API that was added to validate that the 
> connector class name in the config matches the connector class name in the 
> request's URL is too strict by not considering both the simple and the 
> canonical name of the connector class. For instance, the following example 
> request: 
> {code}
> PUT /connector-plugins/FileStreamSinkConnector/config/validate/ HTTP/1.1
> Host: connect.example.com
> Accept: application/json
> {
> "connector.class": 
> "org.apache.kafka.connect.file.FileStreamSinkConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> }
> {code}
> returns a "Bad Request" response with error code "400".
> Currently the reasonable workaround is to exactly match the connector class 
> name in both places. The following will work: 
> {code}
> PUT 
> /connector-plugins/"org.apache.kafka.connect.file.FileStreamSinkConnector/config/validate/
>  HTTP/1.1
> Host: connect.example.com
> Accept: application/json
> {
> "connector.class": 
> "org.apache.kafka.connect.file.FileStreamSinkConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> }
> {code}
> However, this is not flexible enough and also breaks several examples in 
> documentation. Validation should take into account both simple and canonical 
> class names. 





[jira] [Updated] (KAFKA-4837) Config validation in Connector plugins need to compare against both canonical and simple class names

2017-04-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4837:
-
   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2798
[https://github.com/apache/kafka/pull/2798]

> Config validation in Connector plugins need to compare against both canonical 
> and simple class names
> 
>
> Key: KAFKA-4837
> URL: https://issues.apache.org/jira/browse/KAFKA-4837
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
> Fix For: 0.11.0.0, 0.10.2.1
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> A validation check in Connect's REST API that was added to validate that the 
> connector class name in the config matches the connector class name in the 
> request's URL is too strict by not considering both the simple and the 
> canonical name of the connector class. For instance, the following example 
> request: 
> {code}
> PUT /connector-plugins/FileStreamSinkConnector/config/validate/ HTTP/1.1
> Host: connect.example.com
> Accept: application/json
> {
> "connector.class": 
> "org.apache.kafka.connect.file.FileStreamSinkConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> }
> {code}
> returns a "Bad Request" response with error code "400".
> Currently the reasonable workaround is to exactly match the connector class 
> name in both places. The following will work: 
> {code}
> PUT 
> /connector-plugins/"org.apache.kafka.connect.file.FileStreamSinkConnector/config/validate/
>  HTTP/1.1
> Host: connect.example.com
> Accept: application/json
> {
> "connector.class": 
> "org.apache.kafka.connect.file.FileStreamSinkConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> }
> {code}
> However, this is not flexible enough and also breaks several examples in 
> documentation. Validation should take into account both simple and canonical 
> class names. 





[jira] [Commented] (KAFKA-4810) SchemaBuilder should be more lax about checking that fields are unset if they are being set to the same value

2017-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955937#comment-15955937
 ] 

ASF GitHub Bot commented on KAFKA-4810:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2806


> SchemaBuilder should be more lax about checking that fields are unset if they 
> are being set to the same value
> -
>
> Key: KAFKA-4810
> URL: https://issues.apache.org/jira/browse/KAFKA-4810
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Vitaly Pushkar
>  Labels: newbie
> Fix For: 0.11.0.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently SchemaBuilder is strict when checking that certain fields have not 
> been set yet (e.g. version, name, doc). It just checks that the field is 
> null. This is intended to protect the user from buggy code that overwrites a 
> field with different values, but it's a bit too strict currently. In generic 
> code for converting schemas (e.g. Converters) you will sometimes initialize a 
> builder with these values (e.g. because you get a SchemaBuilder for a logical 
> type, which sets name & version), but then have generic code for setting name 
> & version from the source schema.
> We saw this bug in practice with Confluent's AvroConverter, so it's likely it 
> could trip up others as well. You can work around the issue, but it would be 
> nice if exceptions were only thrown if you try to overwrite an existing value 
> with a different value.
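For illustration, a rough sketch of the relaxed check being requested -- not the actual SchemaBuilder internals; {{checkCanSet}} and the single {{name}} field are example names only:

{code}
import java.util.Objects;

public class LaxBuilder {
    private String name;

    // Throw only when the field already holds a *different* value;
    // re-setting the same value is treated as a no-op.
    private static void checkCanSet(String field, Object current, Object proposed) {
        if (current != null && !Objects.equals(current, proposed)) {
            throw new IllegalStateException(field + " has already been set to a different value");
        }
    }

    public LaxBuilder name(String name) {
        checkCanSet("name", this.name, name);
        this.name = name;
        return this;
    }

    public static void main(String[] args) {
        new LaxBuilder().name("int8").name("int8");   // same value twice: allowed
        new LaxBuilder().name("int8").name("int16");  // different value: throws IllegalStateException
    }
}
{code}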



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2806: KAFKA-4810: Kafka Connect SchemaBuilder unset fiel...

2017-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2806


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-4810) SchemaBuilder should be more lax about checking that fields are unset if they are being set to the same value

2017-04-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4810.
--
   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 2806
[https://github.com/apache/kafka/pull/2806]

> SchemaBuilder should be more lax about checking that fields are unset if they 
> are being set to the same value
> -
>
> Key: KAFKA-4810
> URL: https://issues.apache.org/jira/browse/KAFKA-4810
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Vitaly Pushkar
>  Labels: newbie
> Fix For: 0.11.0.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently SchemaBuilder is strict when checking that certain fields have not 
> been set yet (e.g. version, name, doc). It just checks that the field is 
> null. This is intended to protect the user from buggy code that overwrites a 
> field with different values, but it's a bit too strict currently. In generic 
> code for converting schemas (e.g. Converters) you will sometimes initialize a 
> builder with these values (e.g. because you get a SchemaBuilder for a logical 
> type, which sets name & version), but then have generic code for setting name 
> & version from the source schema.
> We saw this bug in practice with Confluent's AvroConverter, so it's likely it 
> could trip up others as well. You can work around the issue, but it would be 
> nice if exceptions were only thrown if you try to overwrite an existing value 
> with a different value.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5008) Kafka-Clients not OSGi ready

2017-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955819#comment-15955819
 ] 

ASF GitHub Bot commented on KAFKA-5008:
---

GitHub user lostiniceland opened a pull request:

https://github.com/apache/kafka/pull/2807

KAFKA-5008: Provide OSGi metadata for Kafka-Clients

This change uses the bnd-gradle-plugin for the kafka-clients module in 
order to generate OSGi metadata.
The bnd.bnd file is used by the plugin for instructions. For now, only 
org.apache.kafka.clients.* is exported.

Signed-off-by: Marc Schlegel 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lostiniceland/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2807.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2807


commit e7d9e95724d669d4a9897b379b36e37a621bc65b
Author: Marc Schlegel 
Date:   2017-04-04T21:07:39Z

KAFKA-5008: Provide OSGi metadata for Kafka-Clients

This change uses the bnd-gradle-plugin for the kafka-clients module in 
order to generate OSGi metadata.
The bnd.bnd file is used by the plugin for instructions. For now, only 
org.apache.kafka.clients.* is exported.

Signed-off-by: Marc Schlegel 




> Kafka-Clients not OSGi ready
> 
>
> Key: KAFKA-5008
> URL: https://issues.apache.org/jira/browse/KAFKA-5008
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0
>Reporter: Marc
>Priority: Minor
>
> The kafka-clients artifact does not provide OSGi metadata. This adds an 
> additional barrier for OSGi developers to use the artifact since it has to be 
> [wrapped|http://bnd.bndtools.org/chapters/390-wrapping.html].
> The metadata can automatically be created using bnd.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2807: KAFKA-5008: Provide OSGi metadata for Kafka-Client...

2017-04-04 Thread lostiniceland
GitHub user lostiniceland opened a pull request:

https://github.com/apache/kafka/pull/2807

KAFKA-5008: Provide OSGi metadata for Kafka-Clients

This change uses the bnd-gradle-plugin for the kafka-clients module in 
order to generate OSGi metadata.
The bnd.bnd file is used by the plugin for instructions. For now, only 
org.apache.kafka.clients.* is exported.

Signed-off-by: Marc Schlegel 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lostiniceland/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2807.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2807


commit e7d9e95724d669d4a9897b379b36e37a621bc65b
Author: Marc Schlegel 
Date:   2017-04-04T21:07:39Z

KAFKA-5008: Provide OSGi metadata for Kafka-Clients

This change uses the bnd-gradle-plugin for the kafka-clients module in 
order to generate OSGi metadata.
The bnd.bnd file is used by the plugin for instructions. For now, only 
org.apache.kafka.clients.* is exported.

Signed-off-by: Marc Schlegel 




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4222) Transient failure in QueryableStateIntegrationTest.queryOnRebalance

2017-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955785#comment-15955785
 ] 

ASF GitHub Bot commented on KAFKA-4222:
---

Github user mjsax closed the pull request at:

https://github.com/apache/kafka/pull/2358


> Transient failure in QueryableStateIntegrationTest.queryOnRebalance
> ---
>
> Key: KAFKA-4222
> URL: https://issues.apache.org/jira/browse/KAFKA-4222
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams, unit tests
>Reporter: Jason Gustafson
>Assignee: Matthias J. Sax
> Fix For: 0.10.1.0
>
>
> Seen here: https://builds.apache.org/job/kafka-trunk-jdk8/915/console
> {code}
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILED
> java.lang.AssertionError: Condition not met within timeout 3. waiting 
> for metadata, store and value to be non null
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:276)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.verifyAllKVKeys(QueryableStateIntegrationTest.java:263)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:342)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2358: WIP -- DO NOT MERGE -- KAFKA-4222: Transient failu...

2017-04-04 Thread mjsax
Github user mjsax closed the pull request at:

https://github.com/apache/kafka/pull/2358


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3455) Connect custom processors with the streams DSL

2017-04-04 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955739#comment-15955739
 ] 

Matthias J. Sax commented on KAFKA-3455:


[~nthean] "I have done exactly this" -> you mean you did use 
{{stream.transform(...).to(...)}} ?
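For readers who have not used that pattern, a minimal sketch of {{stream.transform(...).to(...)}} -- assuming the 0.10.x-era API, default String serdes, and made-up topic/class names:

{code}
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

public class TransformExample {

    // A custom processing step plugged into the DSL via transform().
    static class UppercaseTransformer implements Transformer<String, String, KeyValue<String, String>> {
        private ProcessorContext context;

        @Override
        public void init(ProcessorContext context) {
            this.context = context; // gives access to state stores, scheduling, etc.
        }

        @Override
        public KeyValue<String, String> transform(String key, String value) {
            return KeyValue.pair(key, value == null ? null : value.toUpperCase());
        }

        @Override
        public KeyValue<String, String> punctuate(long timestamp) {
            return null; // nothing emitted on punctuation in this sketch
        }

        @Override
        public void close() {}
    }

    public static void main(String[] args) {
        KStreamBuilder builder = new KStreamBuilder();
        KStream<String, String> input = builder.stream("input-topic");
        KStream<String, String> transformed = input.transform(() -> new UppercaseTransformer());
        transformed.to("output-topic"); // sink the custom processor's output back through the DSL
        // builder would then be passed to new KafkaStreams(builder, streamsConfig) and started.
    }
}
{code}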

> Connect custom processors with the streams DSL
> --
>
> Key: KAFKA-3455
> URL: https://issues.apache.org/jira/browse/KAFKA-3455
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.0.1
>Reporter: Jonathan Bender
>  Labels: user-experience
> Fix For: 0.11.0.0
>
>
> From the kafka users email thread, we discussed the idea of connecting custom 
> processors with topologies defined from the Streams DSL (and being able to 
> sink data from the processor).  Possibly this could involve exposing the 
> underlying processor's name in the streams DSL so it can be connected with 
> the standard processor API.
> {quote}
> Thanks for the feedback. This is definitely something we wanted to support
> in the Streams DSL.
> One tricky thing, though, is that some operations do not translate to a
> single processor, but a sub-graph of processors (think of a stream-stream
> join, which is translated to actually 5 processors for windowing / state
> queries / merging, each with a different internal name). So how to define
> the API to return the processor name needs some more thinking.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4810) SchemaBuilder should be more lax about checking that fields are unset if they are being set to the same value

2017-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955732#comment-15955732
 ] 

ASF GitHub Bot commented on KAFKA-4810:
---

GitHub user vitaly-pushkar opened a pull request:

https://github.com/apache/kafka/pull/2806

KAFKA-4810: Kafka Connect SchemaBuilder unset fields validation fix

https://issues.apache.org/jira/browse/KAFKA-4810
 
> Currently SchemaBuilder is strict when checking that certain fields have 
not been set yet (e.g. version, name, doc). It just checks that the field is 
null. This is intended to protect the user from buggy code that overwrites a 
field with different values, but it's a bit too strict currently. In generic 
code for converting schemas (e.g. Converters) you will sometimes initialize a 
builder with these values (e.g. because you get a SchemaBuilder for a logical 
type, which sets name & version), but then have generic code for setting name & 
version from the source schema.

Changed the validation method to not only check if a field is null but also 
to check if the new value that is being set is the same as the current value of 
the field.
@ewencp


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vitaly-pushkar/kafka 
KAFKA-4810-schema-builder-default-fields-validation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2806.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2806


commit 7ed22f0b54ab09aebe6c2f6e9404974ba3b21ed1
Author: Vitaly Pushkar 
Date:   2017-04-04T20:00:46Z

Change field null validation to accept the same value overwrite




> SchemaBuilder should be more lax about checking that fields are unset if they 
> are being set to the same value
> -
>
> Key: KAFKA-4810
> URL: https://issues.apache.org/jira/browse/KAFKA-4810
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Vitaly Pushkar
>  Labels: newbie
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently SchemaBuilder is strict when checking that certain fields have not 
> been set yet (e.g. version, name, doc). It just checks that the field is 
> null. This is intended to protect the user from buggy code that overwrites a 
> field with different values, but it's a bit too strict currently. In generic 
> code for converting schemas (e.g. Converters) you will sometimes initialize a 
> builder with these values (e.g. because you get a SchemaBuilder for a logical 
> type, which sets name & version), but then have generic code for setting name 
> & version from the source schema.
> We saw this bug in practice with Confluent's AvroConverter, so it's likely it 
> could trip up others as well. You can work around the issue, but it would be 
> nice if exceptions were only thrown if you try to overwrite an existing value 
> with a different value.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2806: KAFKA-4810: Kafka Connect SchemaBuilder unset fiel...

2017-04-04 Thread vitaly-pushkar
GitHub user vitaly-pushkar opened a pull request:

https://github.com/apache/kafka/pull/2806

KAFKA-4810: Kafka Connect SchemaBuilder unset fields validation fix

https://issues.apache.org/jira/browse/KAFKA-4810
 
> Currently SchemaBuilder is strict when checking that certain fields have 
not been set yet (e.g. version, name, doc). It just checks that the field is 
null. This is intended to protect the user from buggy code that overwrites a 
field with different values, but it's a bit too strict currently. In generic 
code for converting schemas (e.g. Converters) you will sometimes initialize a 
builder with these values (e.g. because you get a SchemaBuilder for a logical 
type, which sets name & version), but then have generic code for setting name & 
version from the source schema.

Changed the validation method to not only check if a field is null but also 
to check if the new value that is being set is the same as the current value of 
the field.
@ewencp


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vitaly-pushkar/kafka 
KAFKA-4810-schema-builder-default-fields-validation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2806.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2806


commit 7ed22f0b54ab09aebe6c2f6e9404974ba3b21ed1
Author: Vitaly Pushkar 
Date:   2017-04-04T20:00:46Z

Change field null validation to accept the same value overwrite




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4810) SchemaBuilder should be more lax about checking that fields are unset if they are being set to the same value

2017-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955707#comment-15955707
 ] 

ASF GitHub Bot commented on KAFKA-4810:
---

Github user vitaly-pushkar closed the pull request at:

https://github.com/apache/kafka/pull/2787


> SchemaBuilder should be more lax about checking that fields are unset if they 
> are being set to the same value
> -
>
> Key: KAFKA-4810
> URL: https://issues.apache.org/jira/browse/KAFKA-4810
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Vitaly Pushkar
>  Labels: newbie
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently SchemaBuilder is strict when checking that certain fields have not 
> been set yet (e.g. version, name, doc). It just checks that the field is 
> null. This is intended to protect the user from buggy code that overwrites a 
> field with different values, but it's a bit too strict currently. In generic 
> code for converting schemas (e.g. Converters) you will sometimes initialize a 
> builder with these values (e.g. because you get a SchemaBuilder for a logical 
> type, which sets name & version), but then have generic code for setting name 
> & version from the source schema.
> We saw this bug in practice with Confluent's AvroConverter, so it's likely it 
> could trip up others as well. You can work around the issue, but it would be 
> nice if exceptions were only thrown if you try to overwrite an existing value 
> with a different value.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2787: KAFKA-4810: Kafka Connect SchemaBuilder unset fiel...

2017-04-04 Thread vitaly-pushkar
Github user vitaly-pushkar closed the pull request at:

https://github.com/apache/kafka/pull/2787


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3455) Connect custom processors with the streams DSL

2017-04-04 Thread Nikki Thean (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955651#comment-15955651
 ] 

Nikki Thean commented on KAFKA-3455:


If it helps, as a user, I have done exactly this and found it pretty intuitive. 
My app was built last December.

> Connect custom processors with the streams DSL
> --
>
> Key: KAFKA-3455
> URL: https://issues.apache.org/jira/browse/KAFKA-3455
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.0.1
>Reporter: Jonathan Bender
>  Labels: user-experience
> Fix For: 0.11.0.0
>
>
> From the kafka users email thread, we discussed the idea of connecting custom 
> processors with topologies defined from the Streams DSL (and being able to 
> sink data from the processor).  Possibly this could involve exposing the 
> underlying processor's name in the streams DSL so it can be connected with 
> the standard processor API.
> {quote}
> Thanks for the feedback. This is definitely something we wanted to support
> in the Streams DSL.
> One tricky thing, though, is that some operations do not translate to a
> single processor, but a sub-graph of processors (think of a stream-stream
> join, which is translated to actually 5 processors for windowing / state
> queries / merging, each with a different internal name). So how to define
> the API to return the processor name needs some more thinking.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-04 Thread Thomas Becker
Yeah I like PunctuationType much better; I just threw Time out there
more as a strawman than an actual suggestion ;) I still think it's
worth considering what this buys us over an additional callback. I
foresee a number of punctuate implementations following this pattern:

public void punctuate(PunctuationType type) {
switch (type) {
case EVENT_TIME:
methodA();
break;
case SYSTEM_TIME:
methodB();
break;
}
}

I guess one advantage of this approach is we could add additional
punctuation types later in a backwards compatible way (like event count
as you mentioned).

-Tommy


On Tue, 2017-04-04 at 11:10 -0700, Matthias J. Sax wrote:
> That sounds promising.
>
> I am just wondering if `Time` is the best name. Maybe we want to add
> other non-time based punctuations at some point later. I would
> suggest
>
> enum PunctuationType {
>   EVENT_TIME,
>   SYSTEM_TIME,
> }
>
> or similar. Just to keep the door open -- it's easier to add new
> stuff
> if the name is more generic.
>
>
> -Matthias
>
>
> On 4/4/17 5:30 AM, Thomas Becker wrote:
> >
> > I agree that the framework providing and managing the notion of
> > stream
> > time is valuable and not something we would want to delegate to the
> > tasks. I'm not entirely convinced that a separate callback (option
> > C)
> > is that messy (it could just be a default method with an empty
> > implementation), but if we wanted a single API to handle both
> > cases,
> > how about something like the following?
> >
> > enum Time {
> >STREAM,
> >CLOCK
> > }
> >
> > Then on ProcessorContext:
> > context.schedule(Time time, long interval)  // We could allow this
> > to
> > be called once for each value of time to mix approaches.
> >
> > Then the Processor API becomes:
> > punctuate(Time time) // time here denotes which schedule resulted
> > in
> > this call.
> >
> > Thoughts?
> >
> >
> > On Mon, 2017-04-03 at 22:44 -0700, Matthias J. Sax wrote:
> > >
> > > Thanks a lot for the KIP Michal,
> > >
> > > I was thinking about the four options you proposed in more
> > > detail, and these are my thoughts:
> > >
> > > (A) You argue, that users can still "punctuate" on event-time via
> > > process(), but I am not sure if this is possible. Note, that
> > > users
> > > only
> > > get record timestamps via context.timestamp(). Thus, users would
> > > need
> > > to
> > > track the time progress per partition (based on the partitions
> > > they observe via context.partition()). (This alone puts a huge
> > > burden on the user by itself.) However, users are not notified at
> > > startup what partitions are assigned, and users are not notified when
> > > partitions
> > > get
> > > revoked. Because this information is not available, it's not
> > > possible
> > > to
> > > "manually advance" stream-time, and thus event-time punctuation
> > > within
> > > process() seems not to be possible -- or do you see a way to get
> > > it
> > > done? And even if it were, it might still be too clumsy to use.
> > >
> > > (B) This does not allow mixing both approaches, thus limiting
> > > what
> > > users
> > > can do.
> > >
> > > (C) This should give all flexibility we need. However, just
> > > adding
> > > one
> > > more method seems to be a solution that is too simple (cf my
> > > comments
> > > below).
> > >
> > > (D) This might be hard to use. Also, I am not sure how a user
> > > could
> > > enable system-time and event-time punctuation in parallel.
> > >
> > >
> > >
> > > Overall options (C) seems to be the most promising approach to
> > > me.
> > > Because I also favor a clean API, we might keep current
> > > punctuate()
> > > as-is, but deprecate it -- so we can remove it at some later
> > > point
> > > when
> > > people use the "new punctuate API".
> > >
> > >
> > > Couple of follow up questions:
> > >
> > > - I am wondering, if we should have two callback methods or just
> > > one
> > > (ie, a unified for system and event time punctuation or one for
> > > each?).
> > >
> > > - If we have one, how can the user figure out, which condition
> > > did
> > > trigger?
> > >
> > > - How would the API look like, for registering different
> > > punctuate
> > > schedules? The "type" must be somehow defined?
> > >
> > > - We might want to add "complex" schedules later on (like,
> > > punctuate
> > > on
> > > every 10 seconds event-time or 60 seconds system-time whatever
> > > comes
> > > first). I don't say we should add this right away, but we might
> > > want
> > > to
> > > define the API in a way, that it allows extensions like this
> > > later
> > > on,
> > > without redesigning the API (ie, the API should be designed
> > > extensible)
> > >
> > > - Did you ever consider count-based punctuation?
> > >
> > >
> > > I understand, that you would like to solve a simple problem, but
> > > we
> > > learned from the past, that just "adding some API" quickly leads
> > > to a
> > > not very well defined API that needs time 

[jira] [Commented] (KAFKA-5013) Fail the build when findbugs fails

2017-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955607#comment-15955607
 ] 

ASF GitHub Bot commented on KAFKA-5013:
---

GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2805

KAFKA-5013. Fail the build when findbugs fails



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-5013

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2805.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2805


commit 9ed55e9076c0a75252a76f2edf2c9fe43a145e98
Author: Colin P. Mccabe 
Date:   2017-04-04T18:43:07Z

KAFKA-5013. Fail the build when findbugs fails




> Fail the build when findbugs fails
> --
>
> Key: KAFKA-5013
> URL: https://issues.apache.org/jira/browse/KAFKA-5013
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> We should fail the build when findbugs fails, so that new findbugs warnings 
> do not creep in.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2805: KAFKA-5013. Fail the build when findbugs fails

2017-04-04 Thread cmccabe
GitHub user cmccabe opened a pull request:

https://github.com/apache/kafka/pull/2805

KAFKA-5013. Fail the build when findbugs fails



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cmccabe/kafka KAFKA-5013

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2805.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2805


commit 9ed55e9076c0a75252a76f2edf2c9fe43a145e98
Author: Colin P. Mccabe 
Date:   2017-04-04T18:43:07Z

KAFKA-5013. Fail the build when findbugs fails




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5013) Fail the build when findbugs fails

2017-04-04 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955602#comment-15955602
 ] 

Colin P. McCabe commented on KAFKA-5013:


For now, let's exclude tests.  I'll post a patch which can be applied once the 
other findbugs patches have landed.

> Fail the build when findbugs fails
> --
>
> Key: KAFKA-5013
> URL: https://issues.apache.org/jira/browse/KAFKA-5013
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> We should fail the build when findbugs fails, so that new findbugs warnings 
> do not creep in.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5013) Fail the build when findbugs fails

2017-04-04 Thread Colin P. McCabe (JIRA)
Colin P. McCabe created KAFKA-5013:
--

 Summary: Fail the build when findbugs fails
 Key: KAFKA-5013
 URL: https://issues.apache.org/jira/browse/KAFKA-5013
 Project: Kafka
  Issue Type: Bug
Reporter: Colin P. McCabe
Assignee: Colin P. McCabe


We should fail the build when findbugs fails, so that new findbugs warnings do 
not creep in.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5009) Globally Unique internal topic names when using Joins

2017-04-04 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955544#comment-15955544
 ] 

Matthias J. Sax edited comment on KAFKA-5009 at 4/4/17 6:16 PM:


All topics generated by Kafka Streams are prefixed with {{application.id}} -- 
thus, multiple Kafka Streams instances should not conflict on topic names -- 
topic names should be globally unique already. Can you explain in more detail 
what conflicts you observed? I am also not sure how this relates to the 
Confluent Schema Registry.


was (Author: mjsax):
All topics generated by Kafka Streams are prefixed with {{application.id}} -- 
thus, multiple Kafka Streams instances should not conflict on topic names. Can 
you explain in more detail what conflicts you observed? I am also not sure 
how this relates to the Confluent Schema Registry.

> Globally Unique internal topic names when using Joins 
> --
>
> Key: KAFKA-5009
> URL: https://issues.apache.org/jira/browse/KAFKA-5009
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Mark Tranter
>
> We are using multiple different Kafka Streams applications on the back of a 
> single kafka cluster. This allows each micro-service in our enterprise to 
> consume & process kafka data to suit its own needs.
> Currently when joining streams, an internal topic is created & named using 
> KStreamsBuilder.newName(prefix);
> ```
> String thisWindowStreamName = topology.newName(WINDOWED_NAME);
> String otherWindowStreamName = topology.newName(WINDOWED_NAME);
> String joinThisName = rightOuter ? 
> topology.newName(OUTERTHIS_NAME) : topology.newName(JOINTHIS_NAME);
> String joinOtherName = leftOuter ? 
> topology.newName(OUTEROTHER_NAME) : topology.newName(JOINOTHER_NAME);
> String joinMergeName = topology.newName(MERGE_NAME);
> ```
> This prefix is a constant, and internally an incrementing counter is used to 
> generate a unique (per KStreamBuilder instance) name for the topic.
> In situations where multiple KStreamBuilders are in use (for example, 
> multiple different Kafka Streams applications) we are seeing collisions in 
> topic names.
> Perhaps the join() methods should take a topic prefix overload to allow 
> developers to provide unique names for these topics. Similar to the way other 
> stateful processors work when having to provide a store name.
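As an aside, the collision concern boils down to each builder instance keeping its own counter. A toy sketch of that scheme (not the real KStreamBuilder code; the prefix value is only an example):

{code}
import java.util.concurrent.atomic.AtomicInteger;

class NameGenerator {
    private final AtomicInteger index = new AtomicInteger(0);

    // Constant prefix plus a per-instance counter, similar in spirit to newName(prefix).
    String newName(String prefix) {
        return String.format("%s%010d", prefix, index.incrementAndGet());
    }
}

class CollisionDemo {
    public static void main(String[] args) {
        NameGenerator appA = new NameGenerator();
        NameGenerator appB = new NameGenerator();
        // Both builders print KSTREAM-JOINTHIS-0000000001 for their first join.
        System.out.println(appA.newName("KSTREAM-JOINTHIS-"));
        System.out.println(appB.newName("KSTREAM-JOINTHIS-"));
    }
}
{code}

Whether this produces an actual clash depends on the per-application prefix applied when the internal topics are created, which is what the comment above is pointing out.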



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5009) Globally Unique internal topic names when using Joins

2017-04-04 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955544#comment-15955544
 ] 

Matthias J. Sax commented on KAFKA-5009:


All topics generated by Kafka Streams are prefixed with {{application.id}} -- 
thus, multiple Kafka Streams instances should not conflict on topic names. Can 
you explain in more detail what conflicts you observed? I am also not sure 
how this relates to the Confluent Schema Registry.

> Globally Unique internal topic names when using Joins 
> --
>
> Key: KAFKA-5009
> URL: https://issues.apache.org/jira/browse/KAFKA-5009
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Mark Tranter
>
> We are using multiple different Kafka Streams applications on the back of a 
> single kafka cluster. This allows each micro-service in our enterprise to 
> consume & process kafka data to suit its own needs.
> Currently when joining streams, an internal topic is created & named using 
> KStreamsBuilder.newName(prefix);
> ```
> String thisWindowStreamName = topology.newName(WINDOWED_NAME);
> String otherWindowStreamName = topology.newName(WINDOWED_NAME);
> String joinThisName = rightOuter ? 
> topology.newName(OUTERTHIS_NAME) : topology.newName(JOINTHIS_NAME);
> String joinOtherName = leftOuter ? 
> topology.newName(OUTEROTHER_NAME) : topology.newName(JOINOTHER_NAME);
> String joinMergeName = topology.newName(MERGE_NAME);
> ```
> This prefix is a constant, and internally an incrementing counter is used to 
> generate a unique (per KStreamBuilder instance) name for the topic.
> In situations where multiple KStreamBuilders are in use (for example, 
> multiple different Kafka Streams applications) we are seeing collisions in 
> topic names.
> Perhaps the join() methods should take a topic prefix overload to allow 
> developers to provide unique names for these topics. Similar to the way other 
> stateful processors work when having to provide a store name.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-04 Thread Matthias J. Sax
That sounds promising.

I am just wondering if `Time` is the best name. Maybe we want to add
other non-time based punctuations at some point later. I would suggest

enum PunctuationType {
  EVENT_TIME,
  SYSTEM_TIME,
}

or similar. Just to keep the door open -- it's easier to add new stuff
if the name is more generic.


-Matthias


On 4/4/17 5:30 AM, Thomas Becker wrote:
> I agree that the framework providing and managing the notion of stream
> time is valuable and not something we would want to delegate to the
> tasks. I'm not entirely convinced that a separate callback (option C)
> is that messy (it could just be a default method with an empty
> implementation), but if we wanted a single API to handle both cases,
> how about something like the following?
> 
> enum Time {
>STREAM,
>CLOCK
> }
> 
> Then on ProcessorContext:
> context.schedule(Time time, long interval)  // We could allow this to
> be called once for each value of time to mix approaches.
> 
> Then the Processor API becomes:
> punctuate(Time time) // time here denotes which schedule resulted in
> this call.
> 
> Thoughts?
> 
> 
> On Mon, 2017-04-03 at 22:44 -0700, Matthias J. Sax wrote:
>> Thanks a lot for the KIP Michal,
>>
>> I was thinking about the four options you proposed in more detail,
>> and these are my thoughts:
>>
>> (A) You argue, that users can still "punctuate" on event-time via
>> process(), but I am not sure if this is possible. Note, that users
>> only
>> get record timestamps via context.timestamp(). Thus, users would need
>> to
>> track the time progress per partition (based on the partitions they
>> observe via context.partition()). (This alone puts a huge burden on
>> the user by itself.) However, users are not notified at startup what
>> partitions are assigned, and users are not notified when partitions
>> get
>> revoked. Because this information is not available, it's not possible
>> to
>> "manually advance" stream-time, and thus event-time punctuation
>> within
>> process() seems not to be possible -- or do you see a way to get it
>> done? And even if it were, it might still be too clumsy to use.
>>
>> (B) This does not allow mixing both approaches, thus limiting what
>> users
>> can do.
>>
>> (C) This should give all flexibility we need. However, just adding
>> one
>> more method seems to be a solution that is too simple (cf my comments
>> below).
>>
>> (D) This might be hard to use. Also, I am not sure how a user could
>> enable system-time and event-time punctuation in parallel.
>>
>>
>>
>> Overall options (C) seems to be the most promising approach to me.
>> Because I also favor a clean API, we might keep current punctuate()
>> as-is, but deprecate it -- so we can remove it at some later point
>> when
>> people use the "new punctuate API".
>>
>>
>> Couple of follow up questions:
>>
>> - I am wondering, if we should have two callback methods or just one
>> (ie, a unified for system and event time punctuation or one for
>> each?).
>>
>> - If we have one, how can the user figure out, which condition did
>> trigger?
>>
>> - How would the API look like, for registering different punctuate
>> schedules? The "type" must be somehow defined?
>>
>> - We might want to add "complex" schedules later on (like, punctuate
>> on
>> every 10 seconds event-time or 60 seconds system-time whatever comes
>> first). I don't say we should add this right away, but we might want
>> to
>> define the API in a way, that it allows extensions like this later
>> on,
>> without redesigning the API (ie, the API should be designed
>> extensible)
>>
>> - Did you ever consider count-based punctuation?
>>
>>
>> I understand, that you would like to solve a simple problem, but we
>> learned from the past, that just "adding some API" quickly leads to a
>> not very well defined API that needs time consuming clean up later on
>> via other KIPs. Thus, I would prefer to get a holistic punctuation
>> KIP
>> with this from the beginning on to avoid later painful redesign.
>>
>>
>>
>> -Matthias
>>
>>
>>
>> On 4/3/17 11:58 AM, Michal Borowiecki wrote:
>>>
>>> Thanks Thomas,
>>>
>>> I'm also wary of changing the existing semantics of punctuate, for
>>> backward compatibility reasons, although I like the conceptual
>>> simplicity of that option.
>>>
>>> Adding a new method to me feels safer but, in a way, uglier. I
>>> added
>>> this to the KIP now as option (C).
>>>
>>> The TimestampExtractor mechanism is actually more flexible, as it
>>> allows
>>> you to return any value, you're not limited to event time or system
>>> time
>>> (although I don't see an actual use case where you might need
>>> anything
>>> else than those two). Hence I also proposed the option to allow
>>> users
>>> to, effectively, decide what "stream time" is for them given the
>>> presence or absence of messages, much like they can decide what msg
>>> time
>>> means for them using the TimestampExtractor. What do you think
>>> about
>>> that? This is probably most flexible but also most 

[jira] [Updated] (KAFKA-5012) Refactor plugin discovery in Connect to offer a global index for plugins and transformations

2017-04-04 Thread Konstantine Karantasis (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-5012:
--
Description: 
Currently {{PluginDiscovery}} returns a list of plugins and a list of 
transformations. Consequently, the code to match and discover plugins 
(connectors) is a bit scattered. 

This refactoring should target to index the plugins based on simple name, 
alias, or fully qualified name. This index should also be able to resolve 
conflicts between plugins with the same simple name or alias as well as tighten 
the checks with respect to URL validation (e.g. check that the requested plugin 
exists). 

Possibly a Map of plugins (and maybe for symmetry a Map of transformations) 
will be returned by the methods in {{PluginDiscovery}} instead of a List. 

This is a generalized fix related to the incremental fix implemented in: 
https://issues.apache.org/jira/browse/KAFKA-4837

  was:

Currently {{PluginDiscovery}} returns a list of plugins and a list of 
transformations. Consequently, the code to match and discover plugins 
(connectors) is a bit scattered. 

This refactor should target to index the plugins based on simple name, alias, 
or fully qualified name. This index should also be able to resolve conflicts 
between plugins with the same simple name or alias. 

Possibly a Map of plugins (and maybe for symmetry transformations) will be 
returned by the methods in {{PluginDiscovery}} 

This is a generalized fix related to the incremental fix implemented in: 
https://issues.apache.org/jira/browse/KAFKA-4837


> Refactor plugin discovery in Connect to offer a global index for plugins and 
> transformations
> 
>
> Key: KAFKA-5012
> URL: https://issues.apache.org/jira/browse/KAFKA-5012
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Minor
>
> Currently {{PluginDiscovery}} returns a list of plugins and a list of 
> transformations. Consequently, the code to match and discover plugins 
> (connectors) is a bit scattered. 
> This refactoring should target to index the plugins based on simple name, 
> alias, or fully qualified name. This index should also be able to resolve 
> conflicts between plugins with the same simple name or alias as well as 
> tighten the checks with respect to URL validation (e.g. check that the 
> requested plugin exists). 
> Possibly a Map of plugins (and maybe for symmetry a Map of transformations) 
> will be returned by the methods in {{PluginDiscovery}} instead of a List. 
> This is a generalized fix related to the incremental fix implemented in: 
> https://issues.apache.org/jira/browse/KAFKA-4837
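As a rough sketch of the kind of index being proposed -- class and method names here are invented and do not reflect the eventual {{PluginDiscovery}} code:

{code}
import java.util.HashMap;
import java.util.Map;

public class PluginIndex {
    private final Map<String, Class<?>> byName = new HashMap<>();

    // Register a plugin under its fully qualified name, its simple name, and an optional alias.
    public void add(Class<?> pluginClass, String alias) {
        register(pluginClass.getName(), pluginClass);
        register(pluginClass.getSimpleName(), pluginClass);
        if (alias != null)
            register(alias, pluginClass);
    }

    private void register(String key, Class<?> pluginClass) {
        Class<?> previous = byName.putIfAbsent(key, pluginClass);
        // Surface conflicts (two plugins sharing a simple name or alias) instead of silently overwriting.
        if (previous != null && !previous.equals(pluginClass))
            throw new IllegalArgumentException("Plugin name '" + key + "' is ambiguous: "
                    + previous.getName() + " vs " + pluginClass.getName());
    }

    // A null result means the requested plugin does not exist, which callers can turn into a clear error.
    public Class<?> lookup(String name) {
        return byName.get(name);
    }
}
{code}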



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5012) Refactor plugin discovery in Connect to offer a global index for plugins and transformations

2017-04-04 Thread Konstantine Karantasis (JIRA)
Konstantine Karantasis created KAFKA-5012:
-

 Summary: Refactor plugin discovery in Connect to offer a global 
index for plugins and transformations
 Key: KAFKA-5012
 URL: https://issues.apache.org/jira/browse/KAFKA-5012
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Affects Versions: 0.10.2.0
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis
Priority: Minor



Currently {{PluginDiscovery}} returns a list of plugins and a list of 
transformations. Consequently, the code to match and discover plugins 
(connectors) is a bit scattered. 

This refactor should target to index the plugins based on simple name, alias, 
or fully qualified name. This index should also be able to resolve conflicts 
between plugins with the same simple name or alias. 

Possibly a Map of plugins (and maybe for symmetry transformations) will be 
returned by the methods in {{PluginDiscovery}} 

This is a generalized fix related to the incremental fix implemented in: 
https://issues.apache.org/jira/browse/KAFKA-4837



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5011) Replica fetchers may need to down-convert messages during a selective message format upgrade

2017-04-04 Thread Joel Koshy (JIRA)
Joel Koshy created KAFKA-5011:
-

 Summary: Replica fetchers may need to down-convert messages during 
a selective message format upgrade
 Key: KAFKA-5011
 URL: https://issues.apache.org/jira/browse/KAFKA-5011
 Project: Kafka
  Issue Type: Bug
Reporter: Joel Koshy
 Fix For: 0.11.0.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4971) Why is there no difference between kafka benchmark tests on SSD and HDD?

2017-04-04 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955260#comment-15955260
 ] 

Michal Borowiecki commented on KAFKA-4971:
--

I'd venture a guess that you are limited by something other than your HDD/SSD 
performance.
Is 1G your total memory in the VM? How much of it is allocated to the Kafka JVM 
process?
Some things I can think of:
Is there a lot of activity in the gc.log?
Is the OS not swapping ferociously due to over-allocation of memory by any 
chance?

Hope that helps.

> Why is there no difference between kafka benchmark tests on SSD and HDD? 
> -
>
> Key: KAFKA-4971
> URL: https://issues.apache.org/jira/browse/KAFKA-4971
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 0.10.0.0
> Environment: Oracle VM VirtualBox
> OS : CentOs 7
> Memory : 1G
> Disk : 8GB
>Reporter: Dasol Kim
>
> I installed the OS and Kafka on two SSDs and two HDDs to perform the Kafka 
> benchmark test based on the disk difference. As expected, the SSD should show 
> faster results, but according to my experimental results, there is no big 
> difference between SSD and HDD. Why? Other settings have been left at the defaults.
> *test settings
> zookeeper node  : 1, producer node : 2, broker node : 2(SSD 1, HDD 1)
> test scenario : Two producers send messages to the broker and compare the 
> throughput per second of Kafka installed on SSD and Kafka on HDD
> command : ./bin/kafka-producer-perf-test.sh --num-records 100 
> --record-size 2000 --topic test --throughput 10 --producer-props 
> bootstrap.servers=SN02:9092
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4971) Why is there no difference between kafka benchmark tests on SSD and HDD?

2017-04-04 Thread Dasol Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955219#comment-15955219
 ] 

Dasol Kim commented on KAFKA-4971:
--

I was measuring throughput per second with a Kafka producer installed on an SSD, 
sending messages to a broker installed on another SSD server. The same experiment 
was then run against Kafka installed on an HDD server. Based on what is generally 
known, the SSD runs should show higher throughput per second.

test settings
OS : CentOs 7, Memory : 1G, Disk : 8G

it's my test result and command
VM - SSD(producer 1, broker 1, partition 1)

./bin/kafka-producer-perf-test.sh --num-records 100 --record-size 2000 
--topic test --throughput 10 --producer-props bootstrap.servers=SN02:9092
9169 records sent, 1833.8 records/sec (3.50 MB/sec), 2393.7 ms avg latency, 
3621.0 max latency.
17840 records sent, 3566.6 records/sec (6.80 MB/sec), 4412.3 ms avg latency, 
5282.0 max latency.
23568 records sent, 4704.2 records/sec (8.97 MB/sec), 4391.3 ms avg latency, 
5132.0 max latency.
32872 records sent, 6574.4 records/sec (12.54 MB/sec), 2612.5 ms avg latency, 
3055.0 max latency.
39352 records sent, 7870.4 records/sec (15.01 MB/sec), 2001.6 ms avg latency, 
2644.0 max latency.
41168 records sent, 8228.7 records/sec (15.69 MB/sec), 2003.0 ms avg latency, 
2585.0 max latency.
40568 records sent, 8105.5 records/sec (15.46 MB/sec), 2113.5 ms avg latency, 
2374.0 max latency.
65528 records sent, 13105.6 records/sec (25.00 MB/sec), 1351.3 ms avg latency, 
1778.0 max latency.
108096 records sent, 21619.2 records/sec (41.24 MB/sec), 780.2 ms avg latency, 
1026.0 max latency.
79992 records sent, 15988.8 records/sec (30.50 MB/sec), 855.3 ms avg latency, 
2238.0 max latency.
31152 records sent, 6230.4 records/sec (11.88 MB/sec), 2651.9 ms avg latency, 
3180.0 max latency.
39520 records sent, 7899.3 records/sec (15.07 MB/sec), 1820.1 ms avg latency, 
2536.0 max latency.
52824 records sent, 10564.8 records/sec (20.15 MB/sec), 1942.5 ms avg latency, 
3243.0 max latency.
68912 records sent, 13760.4 records/sec (26.25 MB/sec), 1122.5 ms avg latency, 
1678.0 max latency.
93024 records sent, 18601.1 records/sec (35.48 MB/sec), 954.2 ms avg latency, 
1624.0 max latency.
80200 records sent, 16040.0 records/sec (30.59 MB/sec), 981.6 ms avg latency, 
1417.0 max latency.
101720 records sent, 20344.0 records/sec (38.80 MB/sec), 829.8 ms avg latency, 
1201.0 max latency.
73272 records sent, 14654.4 records/sec (27.95 MB/sec), 1076.5 ms avg latency, 
1487.0 max latency.
100 records sent, 11094.223238 records/sec (21.16 MB/sec), 1444.94 ms avg 
latency, 5282.00 ms max latency, 1181 ms 50th, 3116 ms 95th, 4794 ms 99th, 5158 
ms 99.9th.

VM - HDD(producer 1, broker 1, partition 1)

./bin/kafka-producer-perf-test.sh --num-records 100 --record-size 2000 
--topic test --throughput 10 --producer-props bootstrap.servers=SN03:9092
11145 records sent, 2228.6 records/sec (4.25 MB/sec), 2209.9 ms avg latency, 
3442.0 max latency.
19592 records sent, 3912.9 records/sec (7.46 MB/sec), 4165.9 ms avg latency, 
4661.0 max latency.
18472 records sent, 3694.4 records/sec (7.05 MB/sec), 4416.4 ms avg latency, 
4598.0 max latency.
33312 records sent, 6662.4 records/sec (12.71 MB/sec), 2862.6 ms avg latency, 
4366.0 max latency.
49000 records sent, 9782.4 records/sec (18.66 MB/sec), 1923.6 ms avg latency, 
2673.0 max latency.
41856 records sent, 8357.8 records/sec (15.94 MB/sec), 1760.9 ms avg latency, 
2241.0 max latency.
48032 records sent, 9602.6 records/sec (18.32 MB/sec), 1863.5 ms avg latency, 
2283.0 max latency.
78032 records sent, 15606.4 records/sec (29.77 MB/sec), 1096.7 ms avg latency, 
1364.0 max latency.
93440 records sent, 18688.0 records/sec (35.64 MB/sec), 833.6 ms avg latency, 
1299.0 max latency.
72184 records sent, 14436.8 records/sec (27.54 MB/sec), 1185.9 ms avg latency, 
1421.0 max latency.
80352 records sent, 16070.4 records/sec (30.65 MB/sec), 955.9 ms avg latency, 
1896.0 max latency.
64200 records sent, 12840.0 records/sec (24.49 MB/sec), 1319.0 ms avg latency, 
1652.0 max latency.
86400 records sent, 17280.0 records/sec (32.96 MB/sec), 972.3 ms avg latency, 
1292.0 max latency.
74472 records sent, 14894.4 records/sec (28.41 MB/sec), 1073.2 ms avg latency, 
1224.0 max latency.
75912 records sent, 15182.4 records/sec (28.96 MB/sec), 961.6 ms avg latency, 
1901.0 max latency.
30088 records sent, 6017.6 records/sec (11.48 MB/sec), 2746.6 ms avg latency, 
3482.0 max latency.
70368 records sent, 14073.6 records/sec (26.84 MB/sec), 1257.5 ms avg latency, 
2300.0 max latency.
100 records sent, 11269.129347 records/sec (21.49 MB/sec), 1423.68 ms avg 
latency, 4661.00 ms max latency, 1179 ms 50th, 3467 ms 95th, 4460 ms 99th, 4597 
ms 99.9th.


Comparing the two experimental runs, we can see that the results show little 
difference. I do not know what 

[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2017-04-04 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955113#comment-15955113
 ] 

Ismael Juma commented on KAFKA-3174:


We switched message format V2 to use CRC32C (see KAFKA-1449). It seems that the 
safest thing is not to change what we do for V0 and V1.

> Re-evaluate the CRC32 class performance.
> 
>
> Key: KAFKA-3174
> URL: https://issues.apache.org/jira/browse/KAFKA-3174
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>
> We used org.apache.kafka.common.utils.CRC32 in clients because it has better 
> performance than java.util.zip.CRC32 in Java 1.6.
> In a recent test I ran, it looks like in Java 1.8 the CRC32 class is 2x as fast as 
> the Crc32 class we are using now. We may want to re-evaluate the performance 
> of the Crc32 class and see if it makes sense to simply use the Java CRC32 instead.
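A crude, single-shot timing sketch of the kind of comparison being discussed -- not a rigorous benchmark (a real re-evaluation would use JMH and also exercise Kafka's own Crc32 implementation):

{code}
import java.util.Random;
import java.util.zip.CRC32;

public class Crc32Timing {
    public static void main(String[] args) {
        byte[] payload = new byte[1024];
        new Random(42).nextBytes(payload);

        long mix = 0;
        long start = System.nanoTime();
        for (int i = 0; i < 1_000_000; i++) {
            CRC32 crc = new CRC32();
            crc.update(payload, 0, payload.length);
            mix ^= crc.getValue(); // keep a data dependency so the loop is not optimized away
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("java.util.zip.CRC32: " + elapsedMs + " ms (mix " + mix + ")");
    }
}
{code}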



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2804: MINOR: don't throw CommitFailedException during su...

2017-04-04 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/2804

MINOR: don't throw CommitFailedException during suspendTasksAndState

Cherrypicked from trunk https://github.com/apache/kafka/pull/2535

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka CommitFailedException-0.10.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2804.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2804


commit eda74ef1ce3796734e4c002724acadf38145647a
Author: Eno Thereska 
Date:   2017-04-04T12:54:48Z

Cherry pick from trunk




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2803: MINOR: improve MinTimestampTrackerTest and fix NPE...

2017-04-04 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/2803

MINOR: improve MinTimestampTrackerTest and fix NPE when null element

Cherry picked from trunk fix https://github.com/apache/kafka/pull/2611

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka fix_NPE_0.10.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2803.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2803


commit ef82cbf580ab0400616ef25800de14fc5fb4dc70
Author: Eno Thereska 
Date:   2017-04-04T12:50:01Z

Cherrypick from trunk




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5010) Log cleaner crashed with BufferOverflowException when writing to the writeBuffer

2017-04-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5010:
---
Labels: reliability  (was: )

> Log cleaner crashed with BufferOverflowException when writing to the 
> writeBuffer
> 
>
> Key: KAFKA-5010
> URL: https://issues.apache.org/jira/browse/KAFKA-5010
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Shuai Lin
>  Labels: reliability
> Fix For: 0.11.0.0
>
>
> After upgrading from 0.10.0.1 to 0.10.2.0 the log cleaner thread crashed with 
> BufferOverflowException when writing the filtered records into the 
> writeBuffer:
> {code}
> [2017-03-24 10:41:03,926] INFO [kafka-log-cleaner-thread-0], Starting  
> (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Beginning cleaning of log 
> app-topic-20170317-20. (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Building offset map for 
> app-topic-20170317-20... (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,387] INFO Cleaner 0: Building offset map for log 
> app-topic-20170317-20 for 1 segments in offset range [9737795, 9887707). 
> (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,101] INFO Cleaner 0: Offset map for log 
> app-topic-20170317-20 complete. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,106] INFO Cleaner 0: Cleaning log app-topic-20170317-20 
> (cleaning prior to Fri Mar 24 10:36:06 GMT 2017, discarding tombstones prior 
> to Thu Mar 23 10:18:02 GMT 2017)... (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,110] INFO Cleaner 0: Cleaning segment 0 in log 
> app-topic-20170317-20 (largest timestamp Fri Mar 24 09:58:25 GMT 2017) into 
> 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,372] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> java.nio.BufferOverflowException
> at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
> at org.apache.kafka.common.record.LogEntry.writeTo(LogEntry.java:98)
> at 
> org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:158)
> at 
> org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:111)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
> at kafka.log.Cleaner.$anonfun$cleanSegments$1(LogCleaner.scala:405)
> at 
> kafka.log.Cleaner.$anonfun$cleanSegments$1$adapted(LogCleaner.scala:401)
> at scala.collection.immutable.List.foreach(List.scala:378)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
> at kafka.log.Cleaner.$anonfun$clean$6(LogCleaner.scala:363)
> at kafka.log.Cleaner.$anonfun$clean$6$adapted(LogCleaner.scala:362)
> at scala.collection.immutable.List.foreach(List.scala:378)
> at kafka.log.Cleaner.clean(LogCleaner.scala:362)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-03-24 10:41:07,375] INFO [kafka-log-cleaner-thread-0], Stopped  
> (kafka.log.LogCleaner)
> {code}
> I tried different values of log.cleaner.buffer.size, from 512K to 2M to 10M 
> to 128M, all with no luck: The log cleaner thread crashed immediately after 
> the broker got restarted. But setting it to 256MB fixed the problem!
> Here are the settings for the cluster:
> {code}
> - log.message.format.version = 0.9.0.0 (we use 0.9 format because we have old 
> consumers)
> - log.cleaner.enable = 'true'
> - log.cleaner.min.cleanable.ratio = '0.1'
> - log.cleaner.threads = '1'
> - log.cleaner.io.buffer.load.factor = '0.98'
> - log.roll.hours = '24'
> - log.cleaner.dedupe.buffer.size = 2GB 
> - log.segment.bytes = 256MB (global is 512MB, but we have been using 256MB 
> for this topic)
> - message.max.bytes = 10MB
> {code}
> Given that the size of readBuffer and writeBuffer are exactly the same (half 
> of log.cleaner.io.buffer.size), why would the cleaner throw a 
> BufferOverflowException when writing the filtered records into the 
> writeBuffer? IIUC that should never happen because the size of the filtered 
> records should be no greater than the size of the readBuffer, thus no greater 
> than the size of the writeBuffer.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5010) Log cleaner crashed with BufferOverflowException when writing to the writeBuffer

2017-04-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5010:
---
Priority: Critical  (was: Major)

> Log cleaner crashed with BufferOverflowException when writing to the 
> writeBuffer
> 
>
> Key: KAFKA-5010
> URL: https://issues.apache.org/jira/browse/KAFKA-5010
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Shuai Lin
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.0
>
>
> After upgrading from 0.10.0.1 to 0.10.2.0 the log cleaner thread crashed with 
> BufferOverflowException when writing the filtered records into the 
> writeBuffer:
> {code}
> [2017-03-24 10:41:03,926] INFO [kafka-log-cleaner-thread-0], Starting  
> (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Beginning cleaning of log 
> app-topic-20170317-20. (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Building offset map for 
> app-topic-20170317-20... (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,387] INFO Cleaner 0: Building offset map for log 
> app-topic-20170317-20 for 1 segments in offset range [9737795, 9887707). 
> (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,101] INFO Cleaner 0: Offset map for log 
> app-topic-20170317-20 complete. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,106] INFO Cleaner 0: Cleaning log app-topic-20170317-20 
> (cleaning prior to Fri Mar 24 10:36:06 GMT 2017, discarding tombstones prior 
> to Thu Mar 23 10:18:02 GMT 2017)... (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,110] INFO Cleaner 0: Cleaning segment 0 in log 
> app-topic-20170317-20 (largest timestamp Fri Mar 24 09:58:25 GMT 2017) into 
> 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,372] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> java.nio.BufferOverflowException
> at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
> at org.apache.kafka.common.record.LogEntry.writeTo(LogEntry.java:98)
> at 
> org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:158)
> at 
> org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:111)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
> at kafka.log.Cleaner.$anonfun$cleanSegments$1(LogCleaner.scala:405)
> at 
> kafka.log.Cleaner.$anonfun$cleanSegments$1$adapted(LogCleaner.scala:401)
> at scala.collection.immutable.List.foreach(List.scala:378)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
> at kafka.log.Cleaner.$anonfun$clean$6(LogCleaner.scala:363)
> at kafka.log.Cleaner.$anonfun$clean$6$adapted(LogCleaner.scala:362)
> at scala.collection.immutable.List.foreach(List.scala:378)
> at kafka.log.Cleaner.clean(LogCleaner.scala:362)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-03-24 10:41:07,375] INFO [kafka-log-cleaner-thread-0], Stopped  
> (kafka.log.LogCleaner)
> {code}
> I tried different values of log.cleaner.buffer.size, from 512K to 2M to 10M 
> to 128M, all with no luck: The log cleaner thread crashed immediately after 
> the broker got restarted. But setting it to 256MB fixed the problem!
> Here are the settings for the cluster:
> {code}
> - log.message.format.version = 0.9.0.0 (we use 0.9 format because we have old 
> consumers)
> - log.cleaner.enable = 'true'
> - log.cleaner.min.cleanable.ratio = '0.1'
> - log.cleaner.threads = '1'
> - log.cleaner.io.buffer.load.factor = '0.98'
> - log.roll.hours = '24'
> - log.cleaner.dedupe.buffer.size = 2GB 
> - log.segment.bytes = 256MB (global is 512MB, but we have been using 256MB 
> for this topic)
> - message.max.bytes = 10MB
> {code}
> Given that the size of readBuffer and writeBuffer are exactly the same (half 
> of log.cleaner.io.buffer.size), why would the cleaner throw a 
> BufferOverflowException when writing the filtered records into the 
> writeBuffer? IIUC that should never happen because the size of the filtered 
> records should be no greater than the size of the readBuffer, thus no greater 
> than the size of the writeBuffer.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5010) Log cleaner crashed with BufferOverflowException when writing to the writeBuffer

2017-04-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5010:
---
Fix Version/s: 0.11.0.0

> Log cleaner crashed with BufferOverflowException when writing to the 
> writeBuffer
> 
>
> Key: KAFKA-5010
> URL: https://issues.apache.org/jira/browse/KAFKA-5010
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Shuai Lin
> Fix For: 0.11.0.0
>
>
> After upgrading from 0.10.0.1 to 0.10.2.0 the log cleaner thread crashed with 
> BufferOverflowException when writing the filtered records into the 
> writeBuffer:
> {code}
> [2017-03-24 10:41:03,926] INFO [kafka-log-cleaner-thread-0], Starting  
> (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Beginning cleaning of log 
> app-topic-20170317-20. (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,177] INFO Cleaner 0: Building offset map for 
> app-topic-20170317-20... (kafka.log.LogCleaner)
> [2017-03-24 10:41:04,387] INFO Cleaner 0: Building offset map for log 
> app-topic-20170317-20 for 1 segments in offset range [9737795, 9887707). 
> (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,101] INFO Cleaner 0: Offset map for log 
> app-topic-20170317-20 complete. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,106] INFO Cleaner 0: Cleaning log app-topic-20170317-20 
> (cleaning prior to Fri Mar 24 10:36:06 GMT 2017, discarding tombstones prior 
> to Thu Mar 23 10:18:02 GMT 2017)... (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,110] INFO Cleaner 0: Cleaning segment 0 in log 
> app-topic-20170317-20 (largest timestamp Fri Mar 24 09:58:25 GMT 2017) into 
> 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-03-24 10:41:07,372] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> java.nio.BufferOverflowException
> at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
> at org.apache.kafka.common.record.LogEntry.writeTo(LogEntry.java:98)
> at 
> org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:158)
> at 
> org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:111)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
> at kafka.log.Cleaner.$anonfun$cleanSegments$1(LogCleaner.scala:405)
> at 
> kafka.log.Cleaner.$anonfun$cleanSegments$1$adapted(LogCleaner.scala:401)
> at scala.collection.immutable.List.foreach(List.scala:378)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
> at kafka.log.Cleaner.$anonfun$clean$6(LogCleaner.scala:363)
> at kafka.log.Cleaner.$anonfun$clean$6$adapted(LogCleaner.scala:362)
> at scala.collection.immutable.List.foreach(List.scala:378)
> at kafka.log.Cleaner.clean(LogCleaner.scala:362)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-03-24 10:41:07,375] INFO [kafka-log-cleaner-thread-0], Stopped  
> (kafka.log.LogCleaner)
> {code}
> I tried different values of log.cleaner.buffer.size, from 512K to 2M to 10M 
> to 128M, all with no luck: The log cleaner thread crashed immediately after 
> the broker got restarted. But setting it to 256MB fixed the problem!
> Here are the settings for the cluster:
> {code}
> - log.message.format.version = 0.9.0.0 (we use 0.9 format because we have old 
> consumers)
> - log.cleaner.enable = 'true'
> - log.cleaner.min.cleanable.ratio = '0.1'
> - log.cleaner.threads = '1'
> - log.cleaner.io.buffer.load.factor = '0.98'
> - log.roll.hours = '24'
> - log.cleaner.dedupe.buffer.size = 2GB 
> - log.segment.bytes = 256MB (global is 512MB, but we have been using 256MB 
> for this topic)
> - message.max.bytes = 10MB
> {code}
> Given that the size of readBuffer and writeBuffer are exactly the same (half 
> of log.cleaner.io.buffer.size), why would the cleaner throw a 
> BufferOverflowException when writing the filtered records into the 
> writeBuffer? IIUC that should never happen because the size of the filtered 
> records should be no greater than the size of the readBuffer, thus no greater 
> than the size of the writeBuffer.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-04 Thread Thomas Becker
I agree that the framework providing and managing the notion of stream
time is valuable and not something we would want to delegate to the
tasks. I'm not entirely convinced that a separate callback (option C)
is that messy (it could just be a default method with an empty
implementation), but if we wanted a single API to handle both cases,
how about something like the following?

enum Time {
   STREAM,
   CLOCK
}

Then on ProcessorContext:
context.schedule(Time time, long interval)  // We could allow this to
be called once for each value of time to mix approaches.

Then the Processor API becomes:
punctuate(Time time) // time here denotes which schedule resulted in
this call.
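
For concreteness, a rough sketch of what a Processor might look like under this
proposal. The Time enum, the schedule(Time, long) overload and punctuate(Time)
are the hypothetical additions described above, so this would not compile
against the current API; it only illustrates the shape of the change:

public class MyProcessor implements Processor<String, String> {
    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
        context.schedule(Time.STREAM, 60 * 1000L); // punctuate driven by stream time
        context.schedule(Time.CLOCK, 10 * 1000L);  // punctuate driven by wall-clock time
    }

    @Override
    public void process(String key, String value) {
        // normal per-record processing
    }

    @Override
    public void punctuate(Time time) {
        if (time == Time.STREAM) {
            // emit results based on event-time progress
        } else {
            // periodic housekeeping based on system time
        }
    }

    @Override
    public void close() {}
}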

Thoughts?


On Mon, 2017-04-03 at 22:44 -0700, Matthias J. Sax wrote:
> Thanks a lot for the KIP Michal,
>
> I was thinking about the four options you proposed in more details
> and
> this are my thoughts:
>
> (A) You argue, that users can still "punctuate" on event-time via
> process(), but I am not sure if this is possible. Note, that users
> only
> get record timestamps via context.timestamp(). Thus, users would need
> to
> track the time progress per partition (based on the partitions they
> observe via context.partition()). (This alone puts a huge burden on
> the
> user by itself.) However, users are not notified at startup what
> partitions are assigned, and users are not notified when partitions
> get
> revoked. Because this information is not available, it's not possible
> to
> "manually advance" stream-time, and thus event-time punctuation
> within
> process() seems not to be possible -- or do you see a way to get it
> done? And even if, it might still be too clumsy to use.
>
> (B) This does not allow to mix both approaches, thus limiting what
> users
> can do.
>
> (C) This should give all flexibility we need. However, just adding
> one
> more method seems to be a solution that is too simple (cf my comments
> below).
>
> (D) This might be hard to use. Also, I am not sure how a user could
> enable system-time and event-time punctuation in parallel.
>
>
>
> Overall options (C) seems to be the most promising approach to me.
> Because I also favor a clean API, we might keep current punctuate()
> as-is, but deprecate it -- so we can remove it at some later point
> when
> people use the "new punctuate API".
>
>
> Couple of follow up questions:
>
> - I am wondering if we should have two callback methods or just one
> (ie, a unified one for system and event time punctuation, or one for
> each?).
>
> - If we have one, how can the user figure out which condition
> triggered the call?
>
> - How would the API look like, for registering different punctuate
> schedules? The "type" must be somehow defined?
>
> - We might want to add "complex" schedules later on (like, punctuate
> on
> every 10 seconds event-time or 60 seconds system-time whatever comes
> first). I don't say we should add this right away, but we might want
> to
> define the API in a way, that it allows extensions like this later
> on,
> without redesigning the API (ie, the API should be designed
> extensible)
>
> - Did you ever consider count-based punctuation?
>
>
> I understand, that you would like to solve a simple problem, but we
> learned from the past, that just "adding some API" quickly leads to a
> not very well defined API that needs time consuming clean up later on
> via other KIPs. Thus, I would prefer to get a holistic punctuation
> KIP
> with this from the beginning on to avoid later painful redesign.
>
>
>
> -Matthias
>
>
>
> On 4/3/17 11:58 AM, Michal Borowiecki wrote:
> >
> > Thanks Thomas,
> >
> > I'm also wary of changing the existing semantics of punctuate, for
> > backward compatibility reasons, although I like the conceptual
> > simplicity of that option.
> >
> > Adding a new method to me feels safer but, in a way, uglier. I
> > added
> > this to the KIP now as option (C).
> >
> > The TimestampExtractor mechanism is actually more flexible, as it
> > allows
> > you to return any value, you're not limited to event time or system
> > time
> > (although I don't see an actual use case where you might need
> > anything
> > else than those two). Hence I also proposed the option to allow
> > users
> > to, effectively, decide what "stream time" is for them given the
> > presence or absence of messages, much like they can decide what msg
> > time
> > means for them using the TimestampExtractor. What do you think
> > about
> > that? This is probably most flexible but also most complicated.
> >
> > All comments appreciated.
> >
> > Cheers,
> >
> > Michal
> >
> >
> > On 03/04/17 19:23, Thomas Becker wrote:
> > >
> > > Although I fully agree we need a way to trigger periodic
> > > processing
> > > that is independent from whether and when messages arrive, I'm
> > > not sure
> > > I like the idea of changing the existing semantics across the
> > > board.
> > > What if we added an additional callback to Processor that can be
> > > scheduled similarly to punctuate() but was always 

[jira] [Resolved] (KAFKA-5009) Globally Unique internal topic names when using Joins

2017-04-04 Thread Mark Tranter (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Tranter resolved KAFKA-5009.
-
Resolution: Invalid

This is only an issue when using Confluent Schema Registry related Serdes.

> Globally Unique internal topic names when using Joins 
> --
>
> Key: KAFKA-5009
> URL: https://issues.apache.org/jira/browse/KAFKA-5009
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Mark Tranter
>
> We are using multiple different Kafka Streams applications on the back of a 
> single kafka cluster. This allows each micro-service in our enterprise to 
> consume & process kafka data to suit its own needs.
> Currently when joining streams, an internal topic is created & named using 
> KStreamsBuilder.newName(prefix);
> ```
> String thisWindowStreamName = topology.newName(WINDOWED_NAME);
> String otherWindowStreamName = topology.newName(WINDOWED_NAME);
> String joinThisName = rightOuter ? 
> topology.newName(OUTERTHIS_NAME) : topology.newName(JOINTHIS_NAME);
> String joinOtherName = leftOuter ? 
> topology.newName(OUTEROTHER_NAME) : topology.newName(JOINOTHER_NAME);
> String joinMergeName = topology.newName(MERGE_NAME);
> ```
> This prefix is a constant, and internally an incrementor is used to generate 
> a  unique (per KStreamBuilder instance) topic name for the topic.
> In situations where multiple KStreamBuilders are in use (for example, 
> multiple different Kafka Streams applications) we are seeing collisions in 
> topic names.
> Perhaps the join() methods should take a topic prefix overload to allow 
> developers to provide unique names for these topics. Similar to the way other 
> stateful processors work when having to provide a store name.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3502) Build is killed during kafka streams tests due to `pure virtual method called` error

2017-04-04 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955024#comment-15955024
 ] 

Eno Thereska commented on KAFKA-3502:
-

Seeing this again.

> Build is killed during kafka streams tests due to `pure virtual method 
> called` error
> 
>
> Key: KAFKA-3502
> URL: https://issues.apache.org/jira/browse/KAFKA-3502
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ashish Singh
>Assignee: Guozhang Wang
>  Labels: transient-unit-test-failure
> Fix For: 0.10.2.0
>
>
> Build failed due to failure in streams' test. Not clear which test led to 
> this.
> Jenkins console: 
> https://builds.apache.org/job/kafka-trunk-git-pr-jdk7/3210/console
> {code}
> org.apache.kafka.streams.kstream.internals.KTableFilterTest > testValueGetter 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
> PASSED
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
> testFlatMapValues PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testMerge PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testFrom PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testNewName PASSED
> pure virtual method called
> terminate called without an active exception
> :streams:test FAILED
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':streams:test'.
> > Process 'Gradle Test Executor 4' finished with non-zero exit value 134
> {code}
> Tried reproducing the issue locally, but could not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (KAFKA-3502) Build is killed during kafka streams tests due to `pure virtual method called` error

2017-04-04 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska reopened KAFKA-3502:
-

> Build is killed during kafka streams tests due to `pure virtual method 
> called` error
> 
>
> Key: KAFKA-3502
> URL: https://issues.apache.org/jira/browse/KAFKA-3502
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ashish Singh
>Assignee: Guozhang Wang
>  Labels: transient-unit-test-failure
> Fix For: 0.10.2.0
>
>
> Build failed due to failure in streams' test. Not clear which test led to 
> this.
> Jenkins console: 
> https://builds.apache.org/job/kafka-trunk-git-pr-jdk7/3210/console
> {code}
> org.apache.kafka.streams.kstream.internals.KTableFilterTest > testValueGetter 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
> PASSED
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
> testFlatMapValues PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testMerge PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testFrom PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testNewName PASSED
> pure virtual method called
> terminate called without an active exception
> :streams:test FAILED
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':streams:test'.
> > Process 'Gradle Test Executor 4' finished with non-zero exit value 134
> {code}
> Tried reproducing the issue locally, but could not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: log cleaner thread crashes with different log.cleaner.io.buffer.size

2017-04-04 Thread Shuai Lin
Filed a JIRA for this. https://issues.apache.org/jira/browse/KAFKA-5010

(also forwarded to the dev list.)

On Tue, Apr 4, 2017 at 7:16 PM, Shuai Lin  wrote:

> After digging a bit into the source code of the LogCleaner class, I realized
> maybe I should not try to fix the crash by increasing the
> log.cleaner.io.buffer.size: the cleaner would automatically grow the buffer
> so that it can hold at least one record.
>
> Also for the last error I pasted above, i.e. "largest offset in message
> set can not be safely converted to relative offset", it should be due to
> the fact that I set too large an I/O buffer size (256MB) which can hold too
> many records, among which the value of (largest offset - base offset of the
> new cleaned segment) is larger than Integer.MAX_VALUE, so it failed the
> assertion
> 
> in LogSegment.append.
>
> From the source code, I see the workflow of cleaning a segment is like
> this:
>
> - read as many records as possible into the readBuffer
> - construct a MemoryRecords object and filter the records using the
> OffsetMap
> - write the filtered records into the writeBuffer
> - repeat, until the whole segment is processed
>
> So the question is clear: Given that the size of readBuffer and
> writeBuffer are exactly the same
> 
> (half of log.cleaner.io.buffer.size), why would the cleaner throw a
> BufferOverflowException when writing the filtered records into the
> writeBuffer?  That should never happen because the size of the filtered
> records should be no greater than the size of the readBuffer, thus no
> greater than the size of the writeBuffer.
>
> [2017-03-24 10:41:07,372] ERROR [kafka-log-cleaner-thread-0], Error due to
>>  (kafka.log.LogCleaner)
>> java.nio.BufferOverflowException
>> at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
>> at org.apache.kafka.common.record.LogEntry.writeTo(LogEntry.jav
>> a:98)
>> at org.apache.kafka.common.record.MemoryRecords.filterTo(Memory
>> Records.java:158)
>> at org.apache.kafka.common.record.MemoryRecords.filterTo(Memory
>> Records.java:111)
>> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
>
>
> Also worth mentioning that we are still using log.message.format.version =
> 0.9.0.0 because there are still some old consumers. Could this be related
> to the problem?
>
>
> On Sun, Apr 2, 2017 at 11:46 PM, Shuai Lin  wrote:
>
>> Hi,
>>
>> Recently we updated from kafka 0.10.0.1 to 0.10.2.1, and found the log
>> cleaner thread crashed with this error:
>>
>> [2017-03-24 10:41:03,926] INFO [kafka-log-cleaner-thread-0], Starting
>>>  (kafka.log.LogCleaner)
>>> [2017-03-24 10:41:04,177] INFO Cleaner 0: Beginning cleaning of log
>>> app-topic-20170317-20. (kafka.log.LogCleaner)
>>> [2017-03-24 10:41:04,177] INFO Cleaner 0: Building offset map for
>>> app-topic-20170317-20... (kafka.log.LogCleaner)
>>> [2017-03-24 10:41:04,387] INFO Cleaner 0: Building offset map for log
>>> app-topic-20170317-20 for 1 segments in offset range [9737795, 9887707).
>>> (kafka.log.LogCleaner)
>>> [2017-03-24 10:41:07,101] INFO Cleaner 0: Offset map for log
>>> app-topic-20170317-20 complete. (kafka.log.LogCleaner)
>>> [2017-03-24 10:41:07,106] INFO Cleaner 0: Cleaning log
>>> app-topic-20170317-20 (cleaning prior to Fri Mar 24 10:36:06 GMT 2017,
>>> discarding tombstones prior to Thu Mar 23 10:18:02 GMT 2017)...
>>> (kafka.log.LogCleaner)
>>> [2017-03-24 10:41:07,110] INFO Cleaner 0: Cleaning segment 0 in log
>>> app-topic-20170317-20 (largest timestamp Fri Mar 24 09:58:25 GMT 2017) into
>>> 0, retaining deletes. (kafka.log.LogCleaner)
>>> [2017-03-24 10:41:07,372] ERROR [kafka-log-cleaner-thread-0], Error due
>>> to  (kafka.log.LogCleaner)
>>> java.nio.BufferOverflowException
>>> at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
>>> at org.apache.kafka.common.record.LogEntry.writeTo(LogEntry.jav
>>> a:98)
>>> at org.apache.kafka.common.record.MemoryRecords.filterTo(Memory
>>> Records.java:158)
>>> at org.apache.kafka.common.record.MemoryRecords.filterTo(Memory
>>> Records.java:111)
>>> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
>>> at kafka.log.Cleaner.$anonfun$cleanSegments$1(LogCleaner.scala:
>>> 405)
>>> at kafka.log.Cleaner.$anonfun$cleanSegments$1$adapted(LogCleane
>>> r.scala:401)
>>> at scala.collection.immutable.List.foreach(List.scala:378)
>>> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
>>> at kafka.log.Cleaner.$anonfun$clean$6(LogCleaner.scala:363)
>>> at kafka.log.Cleaner.$anonfun$clean$6$adapted(LogCleaner.scala:
>>> 362)
>>> at scala.collection.immutable.List.foreach(List.scala:378)
>>> at kafka.log.Cleaner.clean(LogCleaner.scala:362)

[jira] [Created] (KAFKA-5010) Log cleaner crashed with BufferOverflowException when writing to the writeBuffer

2017-04-04 Thread Shuai Lin (JIRA)
Shuai Lin created KAFKA-5010:


 Summary: Log cleaner crashed with BufferOverflowException when 
writing to the writeBuffer
 Key: KAFKA-5010
 URL: https://issues.apache.org/jira/browse/KAFKA-5010
 Project: Kafka
  Issue Type: Bug
  Components: log
Affects Versions: 0.10.2.0
Reporter: Shuai Lin


After upgrading from 0.10.0.1 to 0.10.2.0 the log cleaner thread crashed with 
BufferOverflowException when writing the filtered records into the writeBuffer:

{code}
[2017-03-24 10:41:03,926] INFO [kafka-log-cleaner-thread-0], Starting  
(kafka.log.LogCleaner)
[2017-03-24 10:41:04,177] INFO Cleaner 0: Beginning cleaning of log 
app-topic-20170317-20. (kafka.log.LogCleaner)
[2017-03-24 10:41:04,177] INFO Cleaner 0: Building offset map for 
app-topic-20170317-20... (kafka.log.LogCleaner)
[2017-03-24 10:41:04,387] INFO Cleaner 0: Building offset map for log 
app-topic-20170317-20 for 1 segments in offset range [9737795, 9887707). 
(kafka.log.LogCleaner)
[2017-03-24 10:41:07,101] INFO Cleaner 0: Offset map for log 
app-topic-20170317-20 complete. (kafka.log.LogCleaner)
[2017-03-24 10:41:07,106] INFO Cleaner 0: Cleaning log app-topic-20170317-20 
(cleaning prior to Fri Mar 24 10:36:06 GMT 2017, discarding tombstones prior to 
Thu Mar 23 10:18:02 GMT 2017)... (kafka.log.LogCleaner)
[2017-03-24 10:41:07,110] INFO Cleaner 0: Cleaning segment 0 in log 
app-topic-20170317-20 (largest timestamp Fri Mar 24 09:58:25 GMT 2017) into 0, 
retaining deletes. (kafka.log.LogCleaner)
[2017-03-24 10:41:07,372] ERROR [kafka-log-cleaner-thread-0], Error due to  
(kafka.log.LogCleaner)
java.nio.BufferOverflowException
at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:206)
at org.apache.kafka.common.record.LogEntry.writeTo(LogEntry.java:98)
at 
org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:158)
at 
org.apache.kafka.common.record.MemoryRecords.filterTo(MemoryRecords.java:111)
at kafka.log.Cleaner.cleanInto(LogCleaner.scala:468)
at kafka.log.Cleaner.$anonfun$cleanSegments$1(LogCleaner.scala:405)
at 
kafka.log.Cleaner.$anonfun$cleanSegments$1$adapted(LogCleaner.scala:401)
at scala.collection.immutable.List.foreach(List.scala:378)
at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
at kafka.log.Cleaner.$anonfun$clean$6(LogCleaner.scala:363)
at kafka.log.Cleaner.$anonfun$clean$6$adapted(LogCleaner.scala:362)
at scala.collection.immutable.List.foreach(List.scala:378)
at kafka.log.Cleaner.clean(LogCleaner.scala:362)
at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
[2017-03-24 10:41:07,375] INFO [kafka-log-cleaner-thread-0], Stopped  
(kafka.log.LogCleaner)
{code}

I tried different values of log.cleaner.buffer.size, from 512K to 2M to 10M to 
128M, all with no luck: The log cleaner thread crashed immediately after the 
broker got restarted. But setting it to 256MB fixed the problem!

Here are the settings for the cluster:
{code}
- log.message.format.version = 0.9.0.0 (we use 0.9 format because we have old 
consumers)
- log.cleaner.enable = 'true'
- log.cleaner.min.cleanable.ratio = '0.1'
- log.cleaner.threads = '1'
- log.cleaner.io.buffer.load.factor = '0.98'
- log.roll.hours = '24'
- log.cleaner.dedupe.buffer.size = 2GB 
- log.segment.bytes = 256MB (global is 512MB, but we have been using 256MB for 
this topic)
- message.max.bytes = 10MB
{code}

Given that the size of readBuffer and writeBuffer are exactly the same (half of 
log.cleaner.io.buffer.size), why would the cleaner throw a 
BufferOverflowException when writing the filtered records into the writeBuffer? 
IIUC that should never happen because the size of the filtered records should 
be no greater than the size of the readBuffer, thus no greater than the size of 
the writeBuffer.
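
For reference, a minimal, self-contained sketch of the invariant described above.
This is not the LogCleaner code; the class and method names are made up for
illustration. As long as the filtered output is copied byte-for-byte from the read
buffer, a write buffer of the same capacity cannot overflow, so an overflow suggests
the retained records ended up larger in the write buffer than they were in the read
buffer:

{code}
import java.nio.ByteBuffer;

public class BufferInvariantSketch {

    // Copies the retained {offset, length} slices of readBuffer into writeBuffer.
    // If the output is a byte-for-byte subset of the input, a writeBuffer with
    // the same capacity as readBuffer can never overflow.
    static void filterByCopy(ByteBuffer readBuffer, ByteBuffer writeBuffer,
                             int[][] retainedSlices) {
        for (int[] slice : retainedSlices) {
            ByteBuffer view = readBuffer.duplicate();
            view.position(slice[0]);
            view.limit(slice[0] + slice[1]);
            writeBuffer.put(view);   // total written <= readBuffer capacity
        }
    }

    public static void main(String[] args) {
        ByteBuffer read = ByteBuffer.allocate(1024);
        ByteBuffer write = ByteBuffer.allocate(1024);
        read.put(new byte[1024]);
        read.flip();
        filterByCopy(read, write, new int[][] {{0, 100}, {500, 200}});
        System.out.println("bytes written: " + write.position());   // prints 300
    }
}
{code}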





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Enable Kafka sink connector to insert data from topics to tables as and when sink is up

2017-04-04 Thread Nilkanth Patel
I have developed a Kafka sink connector (using confluent-oss-3.2.0-2.11 and the
Connect framework) for my data store (Amppol ADS), which stores data from a
Kafka topic into the corresponding table in my store.
Everything works as expected as long as the Kafka servers and ADS servers
are up and running.
I need help/suggestions for a specific use case where events keep getting
ingested into Kafka topics while the underlying sink component (ADS) is down.
The expectation is that whenever the sink servers come back up, the records that
were ingested earlier into the Kafka topics should be inserted into the tables.

Kindly advise how to handle such a case.

Is there any support available in the Connect framework for this? At least
some references would be a great help.

Nilkanth Patel.
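
One common pattern for this (a sketch under assumptions, not tested against ADS)
is to let the Connect framework drive the retries: throw RetriableException from
SinkTask.put() while the sink store is unreachable. Connect then redelivers the
same batch on a later put() and only commits offsets for records that were
successfully flushed, so data ingested while ADS is down should still end up in
the tables once it comes back. The AdsClient, AdsUnavailableException and the
"ads.endpoint" config key below are hypothetical placeholders for your store's API.

import java.util.Collection;
import java.util.Map;
import org.apache.kafka.connect.errors.RetriableException;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

public class AdsSinkTask extends SinkTask {
    private AdsClient client;  // hypothetical client for the ADS store

    @Override
    public void start(Map<String, String> props) {
        client = new AdsClient(props.get("ads.endpoint"));  // hypothetical config key
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        try {
            client.insert(records);                    // hypothetical bulk insert
        } catch (AdsUnavailableException e) {          // hypothetical exception type
            context.timeout(5000);                     // optional: back off 5s before the retry
            // Signals a transient failure: Connect redelivers the same records on a
            // later put() and does not commit their offsets, so nothing is lost
            // while the store is down.
            throw new RetriableException(e);
        }
    }

    @Override
    public void stop() {
        client.close();
    }

    @Override
    public String version() {
        return "0.1";
    }
}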


Re: [VOTE] KIP-134: Delay initial consumer group rebalance

2017-04-04 Thread Damian Guy
Hi Onur,

Thanks for the update. I misunderstood what you said before. I believe what
you are suggesting sounds ok, though I don't think it addresses the point
Becket made earlier in the discussion thread. See below.

Thanks,
Damian



1. Better rebalance timing. We will try to rebalance only when all the
consumers in a group have joined. The challenge would be someone has to
define what does ALL consumers mean, it could either be a time or number of
consumers, etc.

2. Avoid frequent rebalance. For example, if there are 100 consumers in a
group, today, in the worst case, we may end up with 100 rebalances even if
all the consumers joined the group in a reasonably small amount of time.
Frequent rebalance is also a bad thing for brokers.

Having a client side configuration may solve problem 1 better because each
consumer group can potentially configure their own timing. However, it does
not really prevent frequent rebalance in general because some of the
consumers can be misconfigured. (This may have something to do with KIP-124
as well. But if quota is applied on the JoinGroup/SyncGroup request it may
cause some unwanted cascading effects.)

Having a broker side configuration may result in less flexibility for each
consumer group, but it can prevent frequent rebalance better. I think with
some reasonable design, the rebalance timing issue can be resolved on the
broker side as well. Matthias had a good point on extending the delay when
a new consumer joins a group (we actually did something similar to batch
ISR change propagation). For example, let's say on the broker side, we will
always delay 2 seconds each time we see a new consumer joining a consumer
group. This would probably work for most of the consumer groups and will
also limit the rebalance frequency to protect the brokers.

I am not sure about the streams use case here, but if something like 2
seconds of delay is acceptable for streams, I would prefer adding the
configuration to the broker so that we can address both problems.

On Mon, 3 Apr 2017 at 21:41 Onur Karaman 
wrote:

Delaying the SyncGroupRequest is not what I had in mind.

What I was thinking was essentially a client-side stabilization window
where the client does nothing other than participate in the group
membership protocol and wait a bit for the group to stabilize.

During this window, several rounds of rebalance can take place, clients
would participate in these rebalances (they'd get notified of the rebalance
from the heartbeats they've been sending during this stabilization window),
but they would effectively not run any
ConsumerRebalanceListener.onPartitionsAssigned or process messages until
the window has closed or rebalance finishes if the window ends during a
rebalance.

So something like:
T0: client A is processing messages
T1: new client B joins
T2: client A gets notified and rejoins the group.
T3: rebalance finishes with the group consisting of A and B. This is where
the stabilization window begins for both A and B. Stabilization window
duration is W.
T4: new client C joins.
T5: clients A and B get notified and they rejoin the group.
T6: rebalance finishes with the group consisting of A, B, and C.
T3+W: clients A, B, and C finally run their
ConsumerRebalanceListener.onPartitionsAssigned and begin processing
messages.

If T3+W is during the middle of a rebalance, then we wait until that
rebalance round finishes. Otherwise, we just run the
ConsumerRebalanceListener.onPartitionsAssigned and begin processing
messages.
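
To make the idea a bit more concrete, here is a rough application-level sketch of
such a stabilization window (this is not an existing consumer feature; imports and
setup are omitted, and it assumes an already-configured KafkaConsumer named
consumer, a topics list, and a process() method):

final long W = 3000L;  // stabilization window in ms, assumed value
final AtomicLong windowEnd = new AtomicLong(0L);

consumer.subscribe(topics, new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {}

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // every completed rebalance re-opens the window
        windowEnd.set(System.currentTimeMillis() + W);
        // hold back fetching, but keep heartbeating / rejoining via poll()
        consumer.pause(partitions);
    }
});

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    if (System.currentTimeMillis() >= windowEnd.get()) {
        consumer.resume(consumer.assignment());  // no-op if nothing is paused
    }
    process(records);  // records are empty while the partitions are paused
}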

On Mon, Apr 3, 2017 at 11:40 AM, Becket Qin  wrote:

> Hey Onur,
>
> Are you suggesting letting the consumers hold back on sending
> SyncGroupRequest on the first rebalance? I am not sure how exactly that
> works. But it looks like having the group coordinator control the
> rebalance progress would be clearer and probably safer than letting the
> group members to guess the state of a group. Can you elaborate a little
bit
> on your idea?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Apr 3, 2017 at 8:16 AM, Onur Karaman  >
> wrote:
>
> > Hi Damian.
> >
> > After reading the discussion thread again, it still doesn't seem like
the
> > thread discussed the option I mentioned earlier.
> >
> > From what I had understood from the broker-side vs. client-side config
> > debate was that the client-side config from the discussion would cause a
> > wire format change, while the client-side config change that I had
> > suggested would not.
> >
> > I just want to make sure we don't accidentally skip past it due to a
> > potential misunderstanding.
> >
> > On Mon, Apr 3, 2017 at 8:10 AM, Bill Bejeck  wrote:
> >
> > > +1 (non-binding)
> > >
> > > On Mon, Apr 3, 2017 at 9:53 AM, Mathieu Fenniak <
> > > mathieu.fenn...@replicon.com> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > This will be very helpful 

[GitHub] kafka pull request #2802: MINOR: Log append validation improvements

2017-04-04 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2802

MINOR: Log append validation improvements



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka validate-base-offset

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2802.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2802


commit be11ac949a7b3ff4bd261c437b8da61a190d82a0
Author: Ismael Juma 
Date:   2017-04-04T10:15:28Z

Validate baseOffset and minor tweaks to error messages and log entries

commit c19075fe3bc9c354186cd9e848f8820d082cd36d
Author: Ismael Juma 
Date:   2017-04-04T10:16:55Z

Make validation of batch and records consistent in `LogValidator`

commit 03de0188f00eca2c2a12cf67afa29288313f6a8e
Author: Ismael Juma 
Date:   2017-04-04T10:17:43Z

Fix typo




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4913) creating a window store with one segment throws division by zero error

2017-04-04 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska updated KAFKA-4913:

Priority: Blocker  (was: Major)

> creating a window store with one segment throws division by zero error
> --
>
> Key: KAFKA-4913
> URL: https://issues.apache.org/jira/browse/KAFKA-4913
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Xavier Léauté
>Assignee: Damian Guy
>Priority: Blocker
> Fix For: 0.10.2.1
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4916) Add streams tests with brokers failing

2017-04-04 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska updated KAFKA-4916:

Issue Type: Bug  (was: Improvement)

> Add streams tests with brokers failing
> --
>
> Key: KAFKA-4916
> URL: https://issues.apache.org/jira/browse/KAFKA-4916
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>Priority: Blocker
> Fix For: 0.10.2.1
>
>
> We need to add either integration or system tests with streams and have Kafka 
> brokers fail and come back up. A combination of transient and permanent 
> broker failures.
> As part of adding test, fix any critical bugs that arise.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5009) Globally Unique internal topic names when using Joins

2017-04-04 Thread Mark Tranter (JIRA)
Mark Tranter created KAFKA-5009:
---

 Summary: Globally Unique internal topic names when using Joins 
 Key: KAFKA-5009
 URL: https://issues.apache.org/jira/browse/KAFKA-5009
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Mark Tranter


We are using multiple different Kafka Streams applications on the back of a 
single kafka cluster. This allows each micro-service in our enterprise to 
consume & process kafka data to suit its own needs.

Currently when joining streams, an internal topic is created & named using 
KStreamsBuilder.newName(prefix);

```

String thisWindowStreamName = topology.newName(WINDOWED_NAME);
String otherWindowStreamName = topology.newName(WINDOWED_NAME);
String joinThisName = rightOuter ? topology.newName(OUTERTHIS_NAME) 
: topology.newName(JOINTHIS_NAME);
String joinOtherName = leftOuter ? 
topology.newName(OUTEROTHER_NAME) : topology.newName(JOINOTHER_NAME);
String joinMergeName = topology.newName(MERGE_NAME);
```

This prefix is a constant, and internally an incrementor is used to generate a  
unique (per KStreamBuilder instance) topic name for the topic.

In situations where multiple KStreamBuilders are in use (for example, multiple 
different Kafka Streams applications) we are seeing collisions in topic names.

Perhaps the join() methods should take a topic prefix overload to allow 
developers to provide unique names for these topics. Similar to the way other 
stateful processors work when having to provide a store name.
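
For illustration, a tiny standalone program showing the collision mode described
above. This is not the actual Kafka Streams code, just the counter-per-builder
idea; the prefix and number format are only indicative:

```
public class InternalNameCollision {
    // Stands in for one KStreamBuilder instance: a constant prefix plus a
    // counter that is only unique within this instance.
    static class NameGenerator {
        private int index = 0;
        String newName(String prefix) {
            return prefix + String.format("%010d", index++);
        }
    }

    public static void main(String[] args) {
        NameGenerator appA = new NameGenerator();  // builder in application A
        NameGenerator appB = new NameGenerator();  // builder in application B
        // Two independent applications derive the same "internal" name:
        System.out.println(appA.newName("KSTREAM-JOINTHIS-"));  // KSTREAM-JOINTHIS-0000000000
        System.out.println(appB.newName("KSTREAM-JOINTHIS-"));  // KSTREAM-JOINTHIS-0000000000
    }
}
```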



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4943) SCRAM secret's should be better protected with Zookeeper ACLs

2017-04-04 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-4943:
--
Status: Patch Available  (was: Open)

> SCRAM secret's should be better protected with Zookeeper ACLs
> -
>
> Key: KAFKA-4943
> URL: https://issues.apache.org/jira/browse/KAFKA-4943
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.10.2.0
>Reporter: Johan Ström
>Assignee: Rajini Sivaram
> Fix For: 0.10.2.1
>
>
> With the new SCRAM authenticator the secrets are stored in Zookeeper:
> {code}
> get /kafka/config/users/alice
> {"version":1,"config":{"SCRAM-SHA-512":"salt=ODhnZjNkdWZibTV1cG1zdnV6bmh6djF3Mg==,stored_key=BAbHWHuGEb4m5+U+p0M9oFQmOPhU6M7q5jtZY8deDDoZCvxaqVNLz41yPzdgcp1WpiEBmfwYOuFlo9hMFKM7mA==,server_key=JW3KhpMeyUgh0OAC0kejuFUvUSlXBv/Z68tlfOWcMw5f5jrBwyBnjNQ9VZsSYz1AcI9IYaQ5S6H3yN39SieNiA==,iterations=4096"}}
> {code}
> These are stored without any ACL, and zookeeper-security-migration.sh does 
> not seem to change that either:
> {code}
> getAcl /kafka/config/users/alice
> 'world,'anyone
> : cdrwa
> getAcl /kafka/config/users
> 'world,'anyone
> : cdrwa
> getAcl /kafka
> 'world,'anyone
> : r
> 'sasl,'bob
> : cdrwa
> getAcl /kafka/config/changes
> 'world,'anyone
> : r
> 'sasl,'bob
> : cdrwa
> {code}
> The above output is after running the security migrator; for some reason 
> /kafka/config/users is ignored, but the others are fixed.
> Even if these were to be stored with the secure ZkUtils#DefaultAcls, they would 
> be world-readable.
> From my (limited) point of view, they should be readable by Kafka only.
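
One way to lock these znodes down by hand is with the plain ZooKeeper Java client.
A minimal sketch, assuming the brokers authenticate to ZooKeeper as the SASL
principal "kafka" and that the connection string and paths below match your setup:

{code}
import java.util.Arrays;
import java.util.List;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class LockDownScramNodes {
    public static void main(String[] args) throws Exception {
        // Connection string and paths are assumptions for this example.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
        List<ACL> kafkaOnly =
                Arrays.asList(new ACL(ZooDefs.Perms.ALL, new Id("sasl", "kafka")));
        for (String path : new String[] {"/kafka/config/users",
                                         "/kafka/config/users/alice"}) {
            // -1 matches any ACL version; restricts the node to the kafka principal.
            zk.setACL(path, kafkaOnly, -1);
        }
        zk.close();
    }
}
{code}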



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4943) SCRAM secret's should be better protected with Zookeeper ACLs

2017-04-04 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-4943:
--
Affects Version/s: 0.10.2.0
Fix Version/s: 0.10.2.1
  Component/s: security

> SCRAM secret's should be better protected with Zookeeper ACLs
> -
>
> Key: KAFKA-4943
> URL: https://issues.apache.org/jira/browse/KAFKA-4943
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.10.2.0
>Reporter: Johan Ström
>Assignee: Rajini Sivaram
> Fix For: 0.10.2.1
>
>
> With the new SCRAM authenticator the secrets are stored in Zookeeper:
> {code}
> get /kafka/config/users/alice
> {"version":1,"config":{"SCRAM-SHA-512":"salt=ODhnZjNkdWZibTV1cG1zdnV6bmh6djF3Mg==,stored_key=BAbHWHuGEb4m5+U+p0M9oFQmOPhU6M7q5jtZY8deDDoZCvxaqVNLz41yPzdgcp1WpiEBmfwYOuFlo9hMFKM7mA==,server_key=JW3KhpMeyUgh0OAC0kejuFUvUSlXBv/Z68tlfOWcMw5f5jrBwyBnjNQ9VZsSYz1AcI9IYaQ5S6H3yN39SieNiA==,iterations=4096"}}
> {code}
> These are stored without any ACL, and zookeeper-security-migration.sh does 
> not seem to change that either:
> {code}
> getAcl /kafka/config/users/alice
> 'world,'anyone
> : cdrwa
> getAcl /kafka/config/users
> 'world,'anyone
> : cdrwa
> getAcl /kafka
> 'world,'anyone
> : r
> 'sasl,'bob
> : cdrwa
> getAcl /kafka/config/changes
> 'world,'anyone
> : r
> 'sasl,'bob
> : cdrwa
> {code}
> The above output is after running the security migrator; for some reason 
> /kafka/config/users is ignored, but the others are fixed.
> Even if these were to be stored with the secure ZkUtils#DefaultAcls, they would 
> be world-readable.
> From my (limited) point of view, they should be readable by Kafka only.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2801: MINOR: Fix multiple org.apache.kafka.streams.Kafka...

2017-04-04 Thread original-brownbear
GitHub user original-brownbear opened a pull request:

https://github.com/apache/kafka/pull/2801

MINOR: Fix multiple 
org.apache.kafka.streams.KafkaStreams.StreamStateListener being instantiated

There should only be a single 
`org.apache.kafka.streams.KafkaStreams.StreamStateListener` to ensure 
synchronization of operations on 
`org.apache.kafka.streams.KafkaStreams.StreamStateListener#threadState`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/original-brownbear/kafka 
fix-stream-state-listener

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2801.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2801


commit 595b9641fc0c279eecd87a86a33b95629c699a90
Author: Armin Braun 
Date:   2017-04-04T07:41:53Z

MINOR: Fix multiple 
org.apache.kafka.streams.KafkaStreams.StreamStateListener being instantiated




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #1400

2017-04-04 Thread Apache Jenkins Server
See 


Changes:

[me] KAFKA-4855: Struct SchemaBuilder should not allow duplicate fields

--
[...truncated 342.63 KB...]

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch FAILED
java.lang.AssertionError: Partition [test1,0] metadata not propagated after 
15000 ms
at kafka.utils.TestUtils$.fail(TestUtils.scala:307)
at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:811)
at 
kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:844)
at 
kafka.utils.TestUtils$$anonfun$createTopic$1.apply(TestUtils.scala:250)
at 
kafka.utils.TestUtils$$anonfun$createTopic$1.apply(TestUtils.scala:249)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.Range.foreach(Range.scala:160)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at kafka.utils.TestUtils$.createTopic(TestUtils.scala:249)
at 
kafka.integration.PrimitiveApiTest$$anonfun$produceAndMultiFetch$1.apply(PrimitiveApiTest.scala:121)
at 
kafka.integration.PrimitiveApiTest$$anonfun$produceAndMultiFetch$1.apply(PrimitiveApiTest.scala:120)
at scala.collection.immutable.List.foreach(List.scala:381)
at 
kafka.integration.PrimitiveApiTest.produceAndMultiFetch(PrimitiveApiTest.scala:120)
at 
kafka.integration.PrimitiveApiTest.testProduceAndMultiFetch(PrimitiveApiTest.scala:185)

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest 

[jira] [Assigned] (KAFKA-3385) Need to log "Rejected connection" as WARNING message

2017-04-04 Thread Andrea Cosentino (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrea Cosentino reassigned KAFKA-3385:
---

Assignee: (was: Andrea Cosentino)

> Need to log "Rejected connection" as WARNING message
> 
>
> Key: KAFKA-3385
> URL: https://issues.apache.org/jira/browse/KAFKA-3385
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Xiaomin Zhang
>Priority: Minor
>
> We may find INFO messages like the one below in the log due to an inappropriate 
> configuration:
> INFO kafka.network.Acceptor: Rejected connection from /, address already 
> has the configured maximum of 10 connections.
> It would make more sense for Kafka to report the above message as "WARN", not just 
> "INFO", as it truly indicates something that needs to be checked. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2800: added interface to allow producers to create a Pro...

2017-04-04 Thread simplesteph
GitHub user simplesteph opened a pull request:

https://github.com/apache/kafka/pull/2800

added interface to allow producers to create a ProducerRecord without…

… specifying a partition, making it more obvious that the partition parameter 
can be null

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/simplesteph/kafka 
add-producer-record-timestamp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2800.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2800


commit 88b5fac31f0a7b819093f32a0d52bd9c104f4af3
Author: simplesteph 
Date:   2017-04-04T06:46:06Z

added interface to allow producers to create a ProducerRecord without 
specifying a partition, making it more obvious that the partition parameter 
can be null




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---