[jira] [Commented] (KAFKA-6441) FetchRequest populates buffer of size MinBytes, even if response is smaller

2018-01-12 Thread Ivan Babrou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324857#comment-16324857
 ] 

Ivan Babrou commented on KAFKA-6441:


Looks like the issue is in Sarama, which only reads one record batch:

* https://github.com/Shopify/sarama/issues/1022
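
To illustrate what "only reads one record batch" means on the wire, here is a minimal sketch (not Sarama's code) of walking every batch in a partition's record buffer. It relies on the v2 record batch framing, where each batch begins with a big-endian int64 baseOffset and an int32 batchLength that counts the bytes following those two fields. The helper names `iter_record_batches` and `fake_batch` are hypothetical, and the batch bodies are opaque filler rather than real records.

```python
import struct

# Each v2 record batch starts with baseOffset (int64) and batchLength (int32),
# both big-endian; batchLength counts the bytes that follow those 12 bytes.
LOG_OVERHEAD = 12


def iter_record_batches(buf):
    """Yield (base_offset, batch_bytes) for every complete batch in buf."""
    pos = 0
    while pos + LOG_OVERHEAD <= len(buf):
        base_offset, batch_length = struct.unpack_from(">qi", buf, pos)
        end = pos + LOG_OVERHEAD + batch_length
        if batch_length < 0 or end > len(buf):
            break  # incomplete trailing batch: stop instead of misparsing it
        yield base_offset, buf[pos:end]
        pos = end


def fake_batch(base_offset, payload):
    """Hypothetical helper: real framing, opaque body (illustration only)."""
    return struct.pack(">qi", base_offset, len(payload)) + payload


# Three complete 910-byte batches followed by a batch cut short at 50 bytes.
buf = b"".join(fake_batch(o, b"x" * 898) for o in (100, 101, 102))
buf += fake_batch(103, b"x" * 898)[:50]

offsets = [o for o, _ in iter_record_batches(buf)]
print(offsets)
```

A reader that returns after the first iteration of this loop would see one message per fetch, no matter how many batches the broker packed into the buffer.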

> FetchRequest populates buffer of size MinBytes, even if response is smaller
> ---
>
> Key: KAFKA-6441
> URL: https://issues.apache.org/jira/browse/KAFKA-6441
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.11.0.1
> Reporter: Ivan Babrou
>
> We're using Sarama Go client as consumer, but I don't think it's relevant. 
> Producer is syslog-ng with Kafka output, I'm not quite sure which log format 
> Kafka itself is using, but I can assume 0.11.0.0, because that's what is set 
> in topic settings.
> Our FetchRequest has minSize = 16MB, maxSize = 64, maxWait = 500ms. For a 
> silly reason, Kafka decides to reply with at least minSize buffer with just 
> one 1KB log message. When Sarama was using older consumer API, everything was 
> okay. When we upgraded to 0.11.0.0 consumer API, consumer traffic for 
> 125Mbit/s topic spiked to 55000Mbit/s on the wire and consumer wasn't even 
> able to keep up.
> 1KB message in a 16MB buffer is 1,600,000% overhead.
> I don't think there's any valid reason to do this.
> It's also mildly annoying that there is no tag 0.11.0.1 in git, looking at 
> changes is harder than it should be.
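
As a rough check of the figures quoted in the description, the arithmetic can be sketched as follows. It assumes binary units (1KB = 1024 bytes, 1MB = 1024KB), which the description does not state.

```python
KIB = 1024
MIB = 1024 * KIB

message_bytes = 1 * KIB   # one log message, as quoted
buffer_bytes = 16 * MIB   # reply size, as quoted

# Bytes sent per byte of useful payload.
ratio = buffer_bytes / message_bytes

# Overhead relative to the payload, in percent.
overhead_pct = (buffer_bytes - message_bytes) / message_bytes * 100

# Reported consumer traffic versus the topic's own rate, both in Mbit/s.
amplification = 55000 / 125

print(ratio, overhead_pct, amplification)
```

Under these assumptions the overhead comes out close to the quoted 1,600,000%, and the reported traffic is 440 times the topic's rate.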



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6441) FetchRequest populates buffer of size MinBytes, even if response is smaller

2018-01-11 Thread Ivan Babrou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323603#comment-16323603
 ] 

Ivan Babrou commented on KAFKA-6441:


I dumped the raw bytes of the Kafka responses, and the buffers appear to be fully 
populated with messages. I'm digging deeper to find out what causes Sarama to 
read only the first message.



[jira] [Commented] (KAFKA-6441) FetchRequest populates buffer of size MinBytes, even if response is smaller

2018-01-11 Thread Ivan Babrou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323073#comment-16323073
 ] 

Ivan Babrou commented on KAFKA-6441:


I think it's a bit different. Buffers for each partition are allocated based on 
maxBytes:

{noformat}
2018/01/11 21:48:58 Request: max wait time = 500, min bytes = 1, max bytes = 104857600, isolation = 0, num blocks = 1
2018/01/11 21:48:58   fetch request block for partition 0: {fetchOffset:7075063209, maxBytes:2097152}
2018/01/11 21:48:58 Request: max wait time = 500, min bytes = 1, max bytes = 104857600, isolation = 0, num blocks = 1
2018/01/11 21:48:58   fetch request block for partition 0: {fetchOffset:7075063209, maxBytes:2097152}
{noformat}

Here fetchRequestBlock translates roughly to PartitionData(offset, 
logStartOffset, maxBytes).

If I dump individual segments from the log, I see individual messages:

{noformat}
baseOffset: 15165279076 lastOffset: 15165279076 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9241092 CreateTime: 1515699408944 isvalid: true size: 910 magic: 2 compresscodec: NONE crc: 456596511
baseOffset: 15165279077 lastOffset: 15165279077 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9242002 CreateTime: 1515699408955 isvalid: true size: 910 magic: 2 compresscodec: NONE crc: 465015653
baseOffset: 15165279078 lastOffset: 15165279078 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9242912 CreateTime: 1515699408960 isvalid: true size: 908 magic: 2 compresscodec: NONE crc: 1569816164
baseOffset: 15165279079 lastOffset: 15165279079 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9243820 CreateTime: 1515699408997 isvalid: true size: 915 magic: 2 compresscodec: NONE crc: 1894915965
baseOffset: 15165279080 lastOffset: 15165279080 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9244735 CreateTime: 1515699409010 isvalid: true size: 916 magic: 2 compresscodec: NONE crc: 2124364233
baseOffset: 15165279081 lastOffset: 15165279081 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9245651 CreateTime: 1515699409035 isvalid: true size: 918 magic: 2 compresscodec: NONE crc: 1889246530
baseOffset: 15165279082 lastOffset: 15165279082 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9246569 CreateTime: 1515699409038 isvalid: true size: 914 magic: 2 compresscodec: NONE crc: 877751927
baseOffset: 15165279083 lastOffset: 15165279083 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9247483 CreateTime: 1515699409061 isvalid: true size: 915 magic: 2 compresscodec: NONE crc: 3313577153
baseOffset: 15165279084 lastOffset: 15165279084 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9248398 CreateTime: 1515699409132 isvalid: true size: 912 magic: 2 compresscodec: NONE crc: 1951840175
baseOffset: 15165279085 lastOffset: 15165279085 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9249310 CreateTime: 1515699409133 isvalid: true size: 915 magic: 2 compresscodec: NONE crc: 1357735233
baseOffset: 15165279086 lastOffset: 15165279086 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9250225 CreateTime: 1515699409137 isvalid: true size: 920 magic: 2 compresscodec: NONE crc: 899719626
baseOffset: 15165279087 lastOffset: 15165279087 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 140 isTransactional: false position: 9251145 CreateTime: 1515699409162 isvalid: true size: 915 magic: 2 compresscodec: NONE crc: 1993963751
{noformat}

These should be combined when returned to the consumer if the buffer is large 
enough, but for some reason they are not.
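
A quick sanity check on the dump above, as a sketch: the position and size values are copied from the segment dump, and the limit is the per-partition maxBytes from the logged fetch request. It checks that the batches sit back to back in the segment and estimates how many batches of this size one partition response could carry. The variable names are illustrative only.

```python
# (position, size) pairs copied from the segment dump above.
batches = [
    (9241092, 910), (9242002, 910), (9242912, 908), (9243820, 915),
    (9244735, 916), (9245651, 918), (9246569, 914), (9247483, 915),
    (9248398, 912), (9249310, 915), (9250225, 920), (9251145, 915),
]

# Per-partition maxBytes from the logged fetch request block.
PARTITION_MAX_BYTES = 2097152

# Each batch should start exactly where the previous one ends.
contiguous = all(
    pos + size == next_pos
    for (pos, size), (next_pos, _) in zip(batches, batches[1:])
)

# Average batch size in this sample, and how many such batches fit in the limit.
mean_size = sum(size for _, size in batches) / len(batches)
fit = int(PARTITION_MAX_BYTES // mean_size)

print(contiguous, mean_size, fit)
```

With batches of roughly 900 bytes each holding a single message, a 2MB partition limit leaves room for a couple of thousand of them per response, which is why seeing only one per fetch looks wrong.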


[jira] [Commented] (KAFKA-6441) FetchRequest populates buffer of size MinBytes, even if response is smaller

2018-01-10 Thread Ivan Babrou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321715#comment-16321715
 ] 

Ivan Babrou commented on KAFKA-6441:


With the 0.10.2.0 consumer API, Sarama is able to get multiple messages in one 
FetchResponse.

It doesn't seem right to get only one with the 0.11.0.0 API.


