[jira] [Commented] (KAFKA-7652) Kafka Streams Session store performance degradation from 0.10.2.2 to 0.11.0.0

2019-01-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756771#comment-16756771
 ] 

ASF GitHub Bot commented on KAFKA-7652:
---

guozhangwang commented on pull request #6161: KAFKA-7652: Part II; Add 
single-point query for SessionStore and use for flushing / getter
URL: https://github.com/apache/kafka/pull/6161
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kafka Streams Session store performance degradation from 0.10.2.2 to 0.11.0.0
> -
>
> Key: KAFKA-7652
> URL: https://issues.apache.org/jira/browse/KAFKA-7652
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.0, 0.11.0.1, 0.11.0.2, 0.11.0.3, 1.1.1, 2.0.0, 
> 2.0.1
>Reporter: Jonathan Gordon
>Priority: Major
> Attachments: kafka_10_2_1_flushes.txt, kafka_11_0_3_flushes.txt
>
>
> I'm creating this issue in response to [~guozhang]'s request on the mailing 
> list:
> [https://lists.apache.org/thread.html/97d620f4fd76be070ca4e2c70e2fda53cafe051e8fc4505dbcca0321@%3Cusers.kafka.apache.org%3E]
> We are attempting to upgrade our Kafka Streams application from 0.10.2.1 but 
> are experiencing a severe performance degradation. Most of the CPU time seems 
> to be spent retrieving from the local cache. Here's an example thread profile 
> with 0.11.0.0:
> [https://i.imgur.com/l5VEsC2.png]
> When things are running smoothly we're gated by retrieving from the state 
> store with acceptable performance. Here's an example thread profile with 
> 0.10.2.1:
> [https://i.imgur.com/IHxC2cZ.png]
> Some investigation reveals that we appear to be performing about three orders 
> of magnitude more lookups on the NamedCache over a comparable time period. 
> I've attached the NamedCache flush logs for 0.10.2.1 and 0.11.0.3.
> We're using session windows and have the app configured for 
> commit.interval.ms = 30 * 1000 and cache.max.bytes.buffering = 10485760
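> For reference, a minimal sketch (a hedged illustration; the application id and 
> bootstrap servers are placeholders, while the two config keys are the standard 
> StreamsConfig constants) of the configuration described above:
> {code:java}
> import java.util.Properties;
> import org.apache.kafka.streams.StreamsConfig;
> 
> Properties props = new Properties();
> props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");    // placeholder
> props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
> props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 30 * 1000);
> props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 10485760);
> {code}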
> I'm happy to share more details if they would be helpful. Also happy to run 
> tests on our data.
> I also found this issue, which seems like it may be related:
> https://issues.apache.org/jira/browse/KAFKA-4904
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3522) Consider adding version information into rocksDB storage format

2019-01-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756719#comment-16756719
 ] 

ASF GitHub Bot commented on KAFKA-3522:
---

mjsax commented on pull request #6149: KAFKA-3522: Add RocksDBTimestampedStore
URL: https://github.com/apache/kafka/pull/6149
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Consider adding version information into rocksDB storage format
> ---
>
> Key: KAFKA-3522
> URL: https://issues.apache.org/jira/browse/KAFKA-3522
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Matthias J. Sax
>Priority: Major
>  Labels: architecture
>
> Kafka Streams does not introduce any modifications to the data format in the 
> underlying Kafka protocol, but it does use RocksDB for persistent state 
> storage, and currently its data format is fixed and hard-coded. We want to 
> consider the evolution path for when we change the data format in the future, 
> and hence having some version info stored along with the storage file / 
> directory would be useful.
> This information could even live outside the storage file; for example, we 
> could just use a small "version indicator" file in the rocksdb directory for 
> this purpose. Thoughts? [~enothereska] [~jkreps]
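> A minimal sketch (the marker file name and helper class are invented for 
> illustration, not part of any actual implementation) of such a version 
> indicator file:
> {code:java}
> import java.io.IOException;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import java.util.Collections;
> 
> public class StoreVersionIndicator {
> 
>     // Hypothetical marker file inside the RocksDB state directory.
>     private static final String VERSION_FILE = ".storage-format-version";
> 
>     static void writeVersion(Path storeDir, int version) throws IOException {
>         Files.write(storeDir.resolve(VERSION_FILE),
>                 Collections.singletonList(Integer.toString(version)));
>     }
> 
>     static int readVersion(Path storeDir) throws IOException {
>         Path file = storeDir.resolve(VERSION_FILE);
>         if (!Files.exists(file)) {
>             return 0; // no marker file: pre-versioned format
>         }
>         return Integer.parseInt(Files.readAllLines(file).get(0).trim());
>     }
> }
> {code}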



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3522) Consider adding version information into rocksDB storage format

2019-01-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756698#comment-16756698
 ] 

ASF GitHub Bot commented on KAFKA-3522:
---

mjsax commented on pull request #6204: KAFKA-3522: Replace RecordConverter with 
TimestampedBytesStore
URL: https://github.com/apache/kafka/pull/6204
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Consider adding version information into rocksDB storage format
> ---
>
> Key: KAFKA-3522
> URL: https://issues.apache.org/jira/browse/KAFKA-3522
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Matthias J. Sax
>Priority: Major
>  Labels: architecture
>
> Kafka Streams does not introduce any modifications to the data format in the 
> underlying Kafka protocol, but it does use RocksDB for persistent state 
> storage, and currently its data format is fixed and hard-coded. We want to 
> consider the evolution path for when we change the data format in the future, 
> and hence having some version info stored along with the storage file / 
> directory would be useful.
> This information could even live outside the storage file; for example, we 
> could just use a small "version indicator" file in the rocksdb directory for 
> this purpose. Thoughts? [~enothereska] [~jkreps]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6278) Allow multiple concurrent transactions on a single producer

2019-01-30 Thread Ask Solem (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756554#comment-16756554
 ] 

Ask Solem commented on KAFKA-6278:
--

@Jason can you use multiple transactional ids over the same network socket?  I 
tried sending multiple InitProducerIDRequests on the same socket and it doesn't 
complain, but would this be safe to do?

 

I'm trying to implement exactly-once in our Kafka Streams port, and reading the 
Kafka Streams source code it seems they are actually using one producer per 
TaskId+partition?! That would mean you could end up having hundreds of 
producers, not just one per thread, so I'm now investigating using multiple 
transactional ids in the same producer.
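
For context, a minimal sketch (a hedged illustration using the standard Java 
producer API; the bootstrap server, topic name, and id scheme are placeholders) 
of the one-producer-per-transactional-id pattern this design implies:
{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PerTaskProducers {

    // Hypothetical helper: one producer per task, each with its own transactional.id.
    static KafkaProducer<String, String> producerForTask(String taskId) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-app-" + taskId); // placeholder scheme
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions(); // triggers an InitProducerIdRequest for this id
        return producer;
    }

    public static void main(String[] args) {
        // Each task gets its own producer; the public API keeps at most one
        // open transaction per producer instance.
        KafkaProducer<String, String> producer = producerForTask("0_0");
        producer.beginTransaction();
        producer.send(new ProducerRecord<>("output-topic", "key", "value")); // placeholder topic
        producer.commitTransaction();
        producer.close();
    }
}
{code}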

> Allow multiple concurrent transactions on a single producer
> ---
>
> Key: KAFKA-6278
> URL: https://issues.apache.org/jira/browse/KAFKA-6278
> Project: Kafka
>  Issue Type: New Feature
>  Components: producer 
>Reporter: Tim Cuthbertson
>Priority: Major
>
> It's recommended to share a producer between threads, because it's likely 
> faster / cheaper.
> However with the transactional API there's a big caveat. If you're using 
> transactions, every message sent to a given producer instance will be 
> considered part of the "active transaction" regardless of what thread it came 
> from. Furthermore, if two threads want to use transactions on a shared 
> producer instance, it (probably) won't work.
> Possible fix: add an API which exposes the transaction ID to the user, 
> instead of making it internal state of the producer. e.g.:
> {noformat}
> Transaction tx = producer.beginTransaction();
> producer.send(tx, message);
> producer.commitTransaction(tx);
> {noformat}
> That way, it's explicit which transaction a message will be part of, rather 
> than the current state which is "the open transaction, which may have been 
> opened by an unrelated thread".
> See also initial discussion on slack: 
> https://confluentcommunity.slack.com/archives/C488525JT/p151173973412



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-7877) Connect DLQ not used in SinkTask put()

2019-01-30 Thread Andrew Bourgeois (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756534#comment-16756534
 ] 

Andrew Bourgeois edited comment on KAFKA-7877 at 1/30/19 8:26 PM:
--

The following sentence:

{noformat}
For sink connectors, we will write the original record (from the Kafka topic 
the sink connector is consuming from) which caused the failure to a 
configurable Kafka topic.{noformat}
in 
[KIP-298|https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect]
 doesn't state that the DLQ feature only applies to the conversion and 
transformation phases.

 


was (Author: andrew bourgeois):
The following sentence:

{noformat}
For sink connectors, we will write the original record (from the Kafka topic 
the sink connector is consuming from) which caused the failure to a 
configurable Kafka topic.{noformat}
in 
[KIP-298|https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect]
 doesn't state that the DLQ feature only applies to the conversion and 
transformation phases. The delivery phase is excluded.

 

> Connect DLQ not used in SinkTask put()
> --
>
> Key: KAFKA-7877
> URL: https://issues.apache.org/jira/browse/KAFKA-7877
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Andrew Bourgeois
>Priority: Major
>
> The Dead Letter Queue isn't implemented for the put() operation.
> In 
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java]
>  the "retryWithToleranceOperator" gets used during the conversion and 
> transformation phases, but not when delivering the messages to the sink task.
> According to KIP-298, the Dead Letter Queue should be used as long as we 
> throw org.apache.kafka.connect.errors.RetriableException.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7877) Connect DLQ not used in SinkTask put()

2019-01-30 Thread Andrew Bourgeois (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756534#comment-16756534
 ] 

Andrew Bourgeois commented on KAFKA-7877:
-

The following sentence:

{noformat}
For sink connectors, we will write the original record (from the Kafka topic 
the sink connector is consuming from) which caused the failure to a 
configurable Kafka topic.{noformat}
in 
[KIP-298|https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect]
 doesn't state that the DLQ feature only applies to the conversion and 
transformation phases. The delivery phase is excluded.

 

> Connect DLQ not used in SinkTask put()
> --
>
> Key: KAFKA-7877
> URL: https://issues.apache.org/jira/browse/KAFKA-7877
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Andrew Bourgeois
>Priority: Major
>
> The Dead Letter Queue isn't implemented for the put() operation.
> In 
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java]
>  the "retryWithToleranceOperator" gets used during the conversion and 
> transformation phases, but not when delivering the messages to the sink task.
> According to KIP-298, the Dead Letter Queue should be used as long as we 
> throw org.apache.kafka.connect.errors.RetriableException.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7877) Connect DLQ not used in SinkTask put()

2019-01-30 Thread Arjun Satish (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756507#comment-16756507
 ] 

Arjun Satish commented on KAFKA-7877:
-

This is not a bug. We decided not to implement the DLQ for failures in the 
sink task for the following reasons:
 # When SinkTask#put() is invoked, a collection of SinkRecords is passed into 
it. It is not clear which of these records, if any, caused the failure. In the 
case of a connection error, there are no bad records per se; the connector 
either needs to be resilient to such connection errors and retry internally, 
or throw a RetriableException.
 # RetriableExceptions thrown from the SinkTask are retried indefinitely (with 
the same set of SinkRecords). Records should make it to the DLQ only once the 
framework has decided it has hit a hard error and simply cannot proceed.
 # In some cases, the records provided to SinkTask#put() could be cached by a 
Connector optimized for high throughput. A later call to this method might 
then fail because of those cached records.

Maybe we should add a new exception type (say, BadDataException) that can 
return the set of bad records to the framework for more precise error reporting.
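
A hedged sketch of what such an exception type might look like (BadDataException 
is hypothetical, as noted above, and not part of the Connect API):
{code:java}
import java.util.ArrayList;
import java.util.Collection;

import org.apache.kafka.connect.errors.ConnectException;
import org.apache.kafka.connect.sink.SinkRecord;

// Hypothetical exception type: lets a SinkTask report exactly which records
// failed, so the framework could route only those records to the DLQ instead
// of retrying or failing the whole batch.
public class BadDataException extends ConnectException {

    private final Collection<SinkRecord> badRecords;

    public BadDataException(String message, Collection<SinkRecord> badRecords) {
        super(message);
        this.badRecords = new ArrayList<>(badRecords);
    }

    public Collection<SinkRecord> badRecords() {
        return badRecords;
    }
}
{code}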

Could you please point to the phrasing in the KIP that led to the confusion? 
I'll update it so the behavior is clearer.

> Connect DLQ not used in SinkTask put()
> --
>
> Key: KAFKA-7877
> URL: https://issues.apache.org/jira/browse/KAFKA-7877
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Andrew Bourgeois
>Priority: Major
>
> The Dead Letter Queue isn't implemented for the put() operation.
> In 
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java]
>  the "retryWithToleranceOperator" gets used during the conversion and 
> transformation phases, but not when delivering the messages to the sink task.
> According to KIP-298, the Dead Letter Queue should be used as long as we 
> throw org.apache.kafka.connect.errors.RetriableException.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7885) Streams: TopologyDescription violates equals-hashCode contract.

2019-01-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756494#comment-16756494
 ] 

ASF GitHub Bot commented on KAFKA-7885:
---

jCalamari commented on pull request #6210: KAFKA-7885: TopologyDescription 
violates equals-hashCode contract.
URL: https://github.com/apache/kafka/pull/6210
 
 
   Overrode the hashCode implementation in StaticTopicNameExtractor.
   Added a unit test confirming the equals-hashCode contract.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Streams: TopologyDescription violates equals-hashCode contract.
> ---
>
> Key: KAFKA-7885
> URL: https://issues.apache.org/jira/browse/KAFKA-7885
> Project: Kafka
>  Issue Type: Bug
>Reporter: Piotr Fras
>Priority: Major
>
> As per JavaSE documentation:
> > If two objects are *equal* according to the *equals*(Object) method, then 
> >calling the *hashCode* method on each of the two objects must produce the 
> >same integer result.
>  
> This is not the case for TopologyDescription.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7885) Streams: TopologyDescription violates equals-hashCode contract.

2019-01-30 Thread Piotr Fras (JIRA)
Piotr Fras created KAFKA-7885:
-

 Summary: Streams: TopologyDescription violates equals-hashCode 
contract.
 Key: KAFKA-7885
 URL: https://issues.apache.org/jira/browse/KAFKA-7885
 Project: Kafka
  Issue Type: Bug
Reporter: Piotr Fras


As per JavaSE documentation:

> If two objects are *equal* according to the *equals*(Object) method, then 
>calling the *hashCode* method on each of the two objects must produce the same 
>integer result.

 

This is not the case for TopologyDescription.
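
As an illustration of the contract (a generic, hedged sketch; this is not the 
actual TopologyDescription code):
{code:java}
import java.util.Objects;

// Whenever equals() is overridden, hashCode() must be overridden over the same
// fields, or two equal objects may return different hash codes.
final class TopicName {

    private final String name;

    TopicName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TopicName)) return false;
        return Objects.equals(name, ((TopicName) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name); // same field as equals, so the contract holds
    }
}
{code}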



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7374) Tiered Storage

2019-01-30 Thread Mark Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756316#comment-16756316
 ] 

Mark Thomas commented on KAFKA-7374:


Note: Attachment deleted due to high likelihood of being spam

> Tiered Storage
> --
>
> Key: KAFKA-7374
> URL: https://issues.apache.org/jira/browse/KAFKA-7374
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Maciej Bryński
>Priority: Major
>
> Both Pravega and Pulsar give the possibility to use tiered storage.
> We can store old messages on a different FS like S3 or HDFS.
> Kafka should offer a similar possibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7374) Tiered Storage

2019-01-30 Thread Mark Thomas (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Thomas updated KAFKA-7374:
---
Attachment: (was: Google Photos)

> Tiered Storage
> --
>
> Key: KAFKA-7374
> URL: https://issues.apache.org/jira/browse/KAFKA-7374
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Maciej Bryński
>Priority: Major
>
> Both Pravega and Pulsar give the possibility to use tiered storage.
> We can store old messages on a different FS like S3 or HDFS.
> Kafka should offer a similar possibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7719) Improve fairness in SocketServer processors

2019-01-30 Thread Mark Thomas (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756315#comment-16756315
 ] 

Mark Thomas commented on KAFKA-7719:


Note: Attachment deleted due to high likelihood of being spam

> Improve fairness in SocketServer processors
> ---
>
> Key: KAFKA-7719
> URL: https://issues.apache.org/jira/browse/KAFKA-7719
> Project: Kafka
>  Issue Type: Improvement
>  Components: network
>Affects Versions: 2.1.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 2.2.0
>
>
> See 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-402%3A+Improve+fairness+in+SocketServer+processors
>  for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7719) Improve fairness in SocketServer processors

2019-01-30 Thread Mark Thomas (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Thomas updated KAFKA-7719:
---
Attachment: (was: 20181128_005530_609.backup)

> Improve fairness in SocketServer processors
> ---
>
> Key: KAFKA-7719
> URL: https://issues.apache.org/jira/browse/KAFKA-7719
> Project: Kafka
>  Issue Type: Improvement
>  Components: network
>Affects Versions: 2.1.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 2.2.0
>
>
> See 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-402%3A+Improve+fairness+in+SocketServer+processors
>  for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7697) Possible deadlock in kafka.cluster.Partition

2019-01-30 Thread Rajini Sivaram (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756129#comment-16756129
 ] 

Rajini Sivaram commented on KAFKA-7697:
---

Apache Kafka 2.1.1 containing the fix is currently going through the release 
process and RC1 is available for testing and voting - see 
http://mail-archives.apache.org/mod_mbox/kafka-users/201901.mbox/%3c67fc2ed5-0cc3-4fc6-8e14-ba562f6e4...@www.fastmail.com%3E

> Possible deadlock in kafka.cluster.Partition
> 
>
> Key: KAFKA-7697
> URL: https://issues.apache.org/jira/browse/KAFKA-7697
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Gian Merlino
>Assignee: Rajini Sivaram
>Priority: Blocker
> Fix For: 2.2.0, 2.1.1
>
> Attachments: threaddump.txt
>
>
> After upgrading a fairly busy broker from 0.10.2.0 to 2.1.0, it locked up 
> within a few minutes (by "locked up" I mean that all request handler threads 
> were busy, and other brokers reported that they couldn't communicate with 
> it). I restarted it a few times and it did the same thing each time. After 
> downgrading to 0.10.2.0, the broker was stable. I attached a thread dump from 
> the last attempt on 2.1.0 that shows lots of kafka-request-handler- threads 
> trying to acquire the leaderIsrUpdateLock lock in kafka.cluster.Partition.
> It jumps out that there are two threads that already have some read lock 
> (can't tell which one) and are trying to acquire a second one (on two 
> different read locks: 0x000708184b88 and 0x00070821f188): 
> kafka-request-handler-1 and kafka-request-handler-4. Both are handling a 
> produce request, and in the process of doing so, are calling 
> Partition.fetchOffsetSnapshot while trying to complete a DelayedFetch. At the 
> same time, both of those locks have writers from other threads waiting on 
> them (kafka-request-handler-2 and kafka-scheduler-6). Neither of those locks 
> appears to be held by a writer (if only because no threads in the dump are 
> deep enough in inWriteLock to indicate that).
> ReentrantReadWriteLock in nonfair mode prioritizes waiting writers over 
> readers. Is it possible that kafka-request-handler-1 and 
> kafka-request-handler-4 are each trying to read-lock the partition that is 
> currently locked by the other one, and they're both parked waiting for 
> kafka-request-handler-2 and kafka-scheduler-6 to get write locks, which they 
> never will, because the former two threads own read locks and aren't giving 
> them up?
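> A minimal standalone sketch (a hedged illustration; all names are invented) of 
> the suspected interleaving: each reader holds one read lock and tries to 
> read-lock the other lock while a writer is already queued on it.
> {code:java}
> import java.util.concurrent.locks.ReentrantReadWriteLock;
> 
> public class RwLockDeadlockSketch {
> 
>     static final ReentrantReadWriteLock lockA = new ReentrantReadWriteLock(); // nonfair by default
>     static final ReentrantReadWriteLock lockB = new ReentrantReadWriteLock();
> 
>     public static void main(String[] args) throws InterruptedException {
>         new Thread(() -> crossRead(lockA, lockB), "reader-1").start();
>         new Thread(() -> crossRead(lockB, lockA), "reader-2").start();
>         Thread.sleep(100); // let both readers take their first read lock
>         // Writers queue on each lock; a nonfair lock parks new readers behind them.
>         new Thread(() -> lockA.writeLock().lock(), "writer-A").start();
>         new Thread(() -> lockB.writeLock().lock(), "writer-B").start();
>         // Both readers park on their second read lock behind a queued writer,
>         // and both writers park behind a held read lock: all four threads hang.
>     }
> 
>     static void crossRead(ReentrantReadWriteLock first, ReentrantReadWriteLock second) {
>         first.readLock().lock();
>         try {
>             Thread.sleep(500); // give the writers time to queue on the other lock
>         } catch (InterruptedException e) {
>             return;
>         }
>         second.readLock().lock(); // parks behind the queued writer
>     }
> }
> {code}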



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-7884) Docs for message.format.version and log.message.format.version show invalid (corrupt?) "valid values"

2019-01-30 Thread Lee Dongjin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Dongjin reassigned KAFKA-7884:
--

Assignee: Lee Dongjin

> Docs for message.format.version and log.message.format.version show invalid 
> (corrupt?) "valid values"
> -
>
> Key: KAFKA-7884
> URL: https://issues.apache.org/jira/browse/KAFKA-7884
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Reporter: James Cheng
>Assignee: Lee Dongjin
>Priority: Major
>
> In the docs for message.format.version and log.message.format.version, the 
> list of valid values is
>  
> {code:java}
> kafka.api.ApiVersionValidator$@56aac163 
> {code}
>  
> It appears it's simply doing a .toString on the class/instance.
> At a minimum, we should remove this Java-y-ness.
> Even better, it should show all the valid values.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7884) Docs for message.format.version and log.message.format.version show invalid (corrupt?) "valid values"

2019-01-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756007#comment-16756007
 ] 

ASF GitHub Bot commented on KAFKA-7884:
---

dongjinleekr commented on pull request #6209: KAFKA-7884: Docs for 
message.format.version and log.message.format.version show invalid (corrupt?) 
"valid values"
URL: https://github.com/apache/kafka/pull/6209
 
 
   ![apache 
kafka](https://user-images.githubusercontent.com/2375128/51978989-33f47600-24cf-11e9-89ef-c8116ff85947.png)
   The reason for the problem is simple: `ApiVersionValidator#toString` is 
missing. In contrast, all other Validators, like `ThrottledReplicaListValidator` 
or `Range`, have their own `toString` method.
   
   This update solves the problem by adding `ApiVersionValidator#toString`. It 
also provides a unit test for it.
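   For illustration, a hedged sketch (not the actual patch; the validator class 
and its version list are invented) of why a missing `toString` yields the 
garbled docs and what the fix looks like on a `ConfigDef.Validator`:
{code:java}
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;

// Hypothetical validator: without toString(), the docs generator falls back to
// Object#toString and prints something like "MessageFormatValidator@56aac163".
public class MessageFormatValidator implements ConfigDef.Validator {

    private static final String[] VALID = {"0.10.2", "0.11.0", "1.0", "1.1", "2.0", "2.1"};

    @Override
    public void ensureValid(String name, Object value) {
        for (String v : VALID) {
            if (v.equals(value)) {
                return;
            }
        }
        throw new ConfigException(name, value, "Version must be one of " + this);
    }

    @Override
    public String toString() {
        return "[" + String.join(", ", VALID) + "]"; // rendered into the config docs
    }
}
{code}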
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Docs for message.format.version and log.message.format.version show invalid 
> (corrupt?) "valid values"
> -
>
> Key: KAFKA-7884
> URL: https://issues.apache.org/jira/browse/KAFKA-7884
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Reporter: James Cheng
>Priority: Major
>
> In the docs for message.format.version and log.message.format.version, the 
> list of valid values is
>  
> {code:java}
> kafka.api.ApiVersionValidator$@56aac163 
> {code}
>  
> It appears it's simply doing a .toString on the class/instance.
> At a minimum, we should remove this Java-y-ness.
> Even better, it should show all the valid values.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7882) StateStores are frequently closed during the 'transform' method

2019-01-30 Thread Mateusz Owczarek (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mateusz Owczarek updated KAFKA-7882:

Description: 
Hello, I have a problem with the state store being closed frequently while 
transforming incoming records. To ensure that only one record per key and 
window reaches the aggregate, I have created a custom Transformer (I know 
something similar is going to be introduced with the suppress method on KTable 
in the future, but my implementation is quite simple and imo should work 
correctly) with the following implementation:
{code:java}
override def transform(key: Windowed[K], value: V): (Windowed[K], V) = {
  val partition = context.partition()
  if (partition != -1)
    store.put(key.key(), (value, partition), key.window().start())
  else
    logger.warn(s"-1 partition")

  // Ensuring no 1:1 forwarding; context.forward and commit logic is in the
  // punctuator callback.
  null
}
{code}
 

What I do get is the following error:
{code:java}
Store MyStore is currently closed{code}
This problem appears only when the number of streaming threads (or input topic 
partitions) is greater than 1, even if I'm just saving to the store and have 
turned punctuation off.

If punctuation is present, however, I sometimes get -1 as the partition value 
in the transform method. I'm familiar with the basic docs, but I haven't found 
anything that could help me here.

I build my state store like this:
{code:java}
val stateStore = Stores.windowStoreBuilder(
  Stores.persistentWindowStore(
    stateStoreName,
    timeWindows.maintainMs() + timeWindows.sizeMs + TimeUnit.DAYS.toMillis(1),
    timeWindows.segments,
    timeWindows.sizeMs,
    false
  ),
  serde[K],
  serde[(V, Int)]
)
{code}
and include it in a DSL API like this:
{code:java}
builder.addStateStore(stateStore)
(...).transform(new MyTransformer(...), "MyStore")
{code}
INB4: I don't close any state stores manually, and I gave them as long a 
retention time as possible for the debugging stage. I tried to hotfix this with 
a retry in the transform method; the state stores do eventually reopen and the 
`put` method works, but this approach is questionable and I am not convinced it 
actually works.

Edit:
Could this be because {code:java}StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG{code} 
is set to a low value? If I understand correctly, spilling to disk then happens 
more frequently; could that cause the store to be closed?

  was:
Hello, I have a problem with the state store being closed frequently while 
transforming incoming records. To ensure that only one record per key and 
window reaches the aggregate, I have created a custom Transformer (I know 
something similar is going to be introduced with the suppress method on KTable 
in the future, but my implementation is quite simple and imo should work 
correctly) with the following implementation:
{code:java}
override def transform(key: Windowed[K], value: V): (Windowed[K], V) = {
  val partition = context.partition()
  if (partition != -1)
    store.put(key.key(), (value, partition), key.window().start())
  else
    logger.warn(s"-1 partition")

  // Ensuring no 1:1 forwarding; context.forward and commit logic is in the
  // punctuator callback.
  null
}
{code}
 

What I do get is the following error:
{code:java}
Store MyStore is currently closed{code}
This problem appears only when the number of streaming threads (or input topic 
partitions) is greater than 1, even if I'm just saving to the store and have 
turned punctuation off.

If punctuation is present, however, I sometimes get -1 as the partition value 
in the transform method. I'm familiar with the basic docs, but I haven't found 
anything that could help me here.

I build my state store like this:
{code:java}
val stateStore = Stores.windowStoreBuilder(
  Stores.persistentWindowStore(
    stateStoreName,
    timeWindows.maintainMs() + timeWindows.sizeMs + TimeUnit.DAYS.toMillis(1),
    timeWindows.segments,
    timeWindows.sizeMs,
    false
  ),
  serde[K],
  serde[(V, Int)]
)
{code}
and include it in a DSL API like this:
{code:java}
builder.addStateStore(stateStore)
(...).transform(new MyTransformer(...), "MyStore")
{code}
INB4: I don't close any state stores manually, and I gave them as long a 
retention time as possible for the debugging stage. I tried to hotfix this with 
a retry in the transform method; the state stores do eventually reopen and the 
`put` method works, but this approach is questionable and I am not convinced it 
actually works.


> StateStores are frequently closed during the 'transform' method
> ---
>
> Key: KAFKA-7882
> URL: https://issues.apache.org/jira/browse/KAFKA-7882
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>

[jira] [Commented] (KAFKA-7870) Error sending fetch request (sessionId=1578860481, epoch=INITIAL) to node 2: java.io.IOException: Connection to 2 was disconnected before the response was read.

2019-01-30 Thread Maurits Hartman (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755870#comment-16755870
 ] 

Maurits Hartman commented on KAFKA-7870:


Any progress on this issue? We are actually experiencing the same issue after 
upgrading from v2.0.0 to v2.1.0. 10-30 minutes after a restart, all brokers 
start logging errors about being unable to fetch from one of the brokers.

> Error sending fetch request (sessionId=1578860481, epoch=INITIAL) to node 2: 
> java.io.IOException: Connection to 2 was disconnected before the response was 
> read.
> 
>
> Key: KAFKA-7870
> URL: https://issues.apache.org/jira/browse/KAFKA-7870
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.1.0
>Reporter: Chakhsu Lau
>Priority: Blocker
>
> We built a Kafka cluster with 5 brokers, but one of the brokers suddenly 
> stopped running, and it happened twice on the same broker. Here is the log; 
> is this a bug in Kafka?
> {code:java}
> [2019-01-25 12:57:14,686] INFO [ReplicaFetcher replicaId=3, leaderId=2, 
> fetcherId=0] Error sending fetch request (sessionId=1578860481, 
> epoch=INITIAL) to node 2: java.io.IOException: Connection to 2 was 
> disconnected before the response was read. 
> (org.apache.kafka.clients.FetchSessionHandler)
> [2019-01-25 12:57:14,687] WARN [ReplicaFetcher replicaId=3, leaderId=2, 
> fetcherId=0] Error in response for fetch request (type=FetchRequest, 
> replicaId=3, maxWait=500, minBytes=1, maxBytes=10485760, 
> fetchData={api-result-bi-heatmap-8=(offset=0, logStartOffset=0, 
> maxBytes=1048576, currentLeaderEpoch=Optional[4]), 
> api-result-bi-heatmap-save-12=(offset=0, logStartOffset=0, maxBytes=1048576, 
> currentLeaderEpoch=Optional[4]), api-result-bi-heatmap-task-2=(offset=2, 
> logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[4]), 
> api-result-bi-flow-39=(offset=1883206, logStartOffset=0, maxBytes=1048576, 
> currentLeaderEpoch=Optional[4]), __consumer_offsets-47=(offset=349437, 
> logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[4]), 
> api-result-bi-heatmap-track-6=(offset=1039889, logStartOffset=0, 
> maxBytes=1048576, currentLeaderEpoch=Optional[4]), 
> api-result-bi-heatmap-task-17=(offset=0, logStartOffset=0, maxBytes=1048576, 
> currentLeaderEpoch=Optional[4]), __consumer_offsets-2=(offset=0, 
> logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[4]), 
> api-result-bi-heatmap-aggs-19=(offset=1255056, logStartOffset=0, 
> maxBytes=1048576, currentLeaderEpoch=Optional[4])}, 
> isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=1578860481, 
> epoch=INITIAL)) (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 2 was disconnected before the response was 
> read
> at 
> org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97)
> at 
> kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:97)
> at 
> kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190)
> at 
> kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241)
> at 
> kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130)
> at 
> kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129)
> at scala.Option.foreach(Option.scala:257)
> at 
> kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129)
> at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)