[jira] [Created] (KAFKA-5465) FetchResponse v0 does not return any messages when max_bytes smaller than v2 message set

2017-06-17 Thread Dana Powers (JIRA)
Dana Powers created KAFKA-5465:
--

 Summary: FetchResponse v0 does not return any messages when 
max_bytes smaller than v2 message set 
 Key: KAFKA-5465
 URL: https://issues.apache.org/jira/browse/KAFKA-5465
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.11.0.0
Reporter: Dana Powers
Priority: Minor


In prior releases, when consuming uncompressed messages, FetchResponse v0 
returns a message if it is smaller than the max_bytes sent in the FetchRequest. 
In 0.11.0.0 RC0, when messages are stored as v2 internally, the response will 
be empty unless the full message set is smaller than max_bytes. In some 
configurations, this may cause some old consumers to get stuck on large 
messages where previously they were able to make progress one message at a time.

For example, when I produce 10 5KB messages using ProduceRequest v0 and then 
attempt FetchRequest v0 with partition max bytes = 6KB (larger than a single 
message but smaller than all 10 messages together), I get an empty message set 
from 0.11.0.0. Previous brokers would have returned a single message.
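
For anyone who wants to poke at this from the client side, here is a minimal 
kafka-python sketch of the same scenario (the topic name, bootstrap address, 
and the api_version pin are illustrative, not part of this report). Pinning 
api_version to (0, 8, 2) forces FetchRequest/Response v0, and 
max_partition_fetch_bytes mirrors the 6KB partition max bytes above:

{code}
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(bootstrap_servers='localhost:9092')
for _ in range(10):
    producer.send('kafka-5465-repro', b'x' * 5000)   # ten ~5KB messages
producer.flush()

consumer = KafkaConsumer(
    'kafka-5465-repro',
    bootstrap_servers='localhost:9092',
    group_id=None,
    auto_offset_reset='earliest',
    api_version=(0, 8, 2),               # old protocol => FetchRequest/Response v0
    max_partition_fetch_bytes=6 * 1024,  # > one message, < all ten together
    consumer_timeout_ms=10000,
)

# Pre-0.11 brokers return at least one message per fetch here; 0.11.0.0 RC0
# with the v2 on-disk format returns an empty message set, so this prints 0.
print(sum(1 for _ in consumer))
{code}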



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3878) Support exponential backoff for broker reconnect attempts

2016-06-20 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339941#comment-15339941
 ] 

Dana Powers commented on KAFKA-3878:


Proposed implementation at https://github.com/apache/kafka/pull/1523

> Support exponential backoff for broker reconnect attempts
> -
>
> Key: KAFKA-3878
> URL: https://issues.apache.org/jira/browse/KAFKA-3878
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, network
>Reporter: Dana Powers
>
> The client currently uses a constant backoff policy, configured via 
> 'reconnect.backoff.ms' . To reduce network load during longer broker outages, 
> it would be useful to support an optional exponentially increasing backoff 
> policy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3878) Support exponential backoff for broker reconnect attempts

2016-06-20 Thread Dana Powers (JIRA)
Dana Powers created KAFKA-3878:
--

 Summary: Support exponential backoff for broker reconnect attempts
 Key: KAFKA-3878
 URL: https://issues.apache.org/jira/browse/KAFKA-3878
 Project: Kafka
  Issue Type: Improvement
  Components: clients, network
Reporter: Dana Powers


The client currently uses a constant backoff policy, configured via 
'reconnect.backoff.ms' . To reduce network load during longer broker outages, 
it would be useful to support an optional exponentially increasing backoff 
policy.
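
A rough sketch of what an exponential policy could look like (the parameter 
names here are illustrative, not proposed config keys): double the delay after 
each consecutive failure, cap it, and add jitter so a fleet of clients does 
not reconnect in lockstep.

{code}
import random

def reconnect_backoff_ms(failures, base_ms=50, max_ms=30000, jitter=0.2):
    """Exponential backoff sketch: base_ms * 2^failures, capped at max_ms,
    with +/- jitter; resetting failures to zero after a successful connect
    restores the current constant behavior for transient blips."""
    delay = min(base_ms * (2 ** failures), max_ms)
    return delay * random.uniform(1 - jitter, 1 + jitter)
{code}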



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3615) Exclude test jars in CLASSPATH of kafka-run-class.sh

2016-04-30 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265561#comment-15265561
 ] 

Dana Powers commented on KAFKA-3615:


This PR has a bug that breaks the classpath setup for bin scripts in the rc2 
release. Should we reopen this and follow up, or open a new issue?

> Exclude test jars in CLASSPATH of kafka-run-class.sh
> 
>
> Key: KAFKA-3615
> URL: https://issues.apache.org/jira/browse/KAFKA-3615
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, build
>Affects Versions: 0.10.0.0
>Reporter: Liquan Pei
>Assignee: Liquan Pei
>  Labels: newbie
> Fix For: 0.10.1.0, 0.10.0.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3550) Broker does not honor MetadataRequest api version; always returns v0 MetadataResponse

2016-04-12 Thread Dana Powers (JIRA)
Dana Powers created KAFKA-3550:
--

 Summary: Broker does not honor MetadataRequest api version; always 
returns v0 MetadataResponse
 Key: KAFKA-3550
 URL: https://issues.apache.org/jira/browse/KAFKA-3550
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.9.0.1, 0.8.2.2, 0.8.1.1, 0.8.0
Reporter: Dana Powers


To reproduce:

Send a MetadataRequest (api key 3) with incorrect api version (e.g., 1234).
The expected behavior is for the broker to reject the request as unrecognized.
Broker (incorrectly) responds with MetadataResponse v0.

The problem here is that any "new" MetadataRequest (i.e., the KIP-4 version) 
sent to an old broker will generate an incorrect response instead of being 
rejected.
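
A bare-bones repro sketch over a raw socket (host, port, and client id are 
arbitrary): the request header declares api_version=1234, which the broker 
should reject, yet the bytes that come back parse as a v0 MetadataResponse.

{code}
import socket
import struct

def send_bogus_metadata_request(host='localhost', port=9092):
    api_key, api_version, correlation_id, client_id = 3, 1234, 1, b'repro'
    header = struct.pack('>hhih', api_key, api_version, correlation_id,
                         len(client_id)) + client_id
    body = struct.pack('>i', 0)  # empty topic array = "all topics" in v0
    request = header + body
    with socket.create_connection((host, port)) as sock:
        sock.sendall(struct.pack('>i', len(request)) + request)
        (size,) = struct.unpack('>i', sock.recv(4))
        # A single recv is good enough for a sketch; old brokers answer with a
        # v0 MetadataResponse here instead of an error or a closed connection.
        return sock.recv(size)
{code}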



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3160) Kafka LZ4 framing code miscalculates header checksum

2016-04-10 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234221#comment-15234221
 ] 

Dana Powers commented on KAFKA-3160:


Magnus: have you made any progress on this? The more I think about it, the more 
I think this needs to get included w/ KIP-31. If the goal of KIP-31 is to avoid 
recompression, and the goal of this JIRA is to fix the compression format, and 
in all cases we need to maintain compatibility with old clients, then I think 
the only way to solve all conditions is to make the pre-KIP-31 FetchRequest / 
ProduceRequest versions use the broken LZ4 format, and require the fixed format 
in the new FetchRequest / ProduceRequest version:

Old 0.8/0.9 clients (current behavior): produce messages w/ broken checksum; 
consume messages w/ incorrect checksum only
New 0.10 clients (proposed behavior): produce messages in "new KIP-31 format" 
w/ correct checksum; consume messages in "new KIP-31 format" w/ correct 
checksum only

Proposed behavior for 0.10 broker:
 - convert all "old format" messages to "new KIP-31 format" + fix checksum to 
correct value
 - require incoming "new KIP-31 format" messages to have correct checksum, 
otherwise throw error
 - when serving requests for "old format", fixup checksum to be incorrect when 
converting "new KIP-31 format" messages to old format

Thoughts?
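
As a very rough sketch of the broker-side fixup I have in mind (constants and 
function names are made up for illustration, and this assumes Kafka's current 
framing with no content-size field), the only thing that changes across the 
version boundary is the single header-checksum byte:

{code}
import struct
import xxhash  # third-party: pip install xxhash

HC_OFFSET = 6  # magic(4) + FLG(1) + BD(1); no content-size field in Kafka's framing

def _header_checksum(frame, include_magic):
    # Spec: HC = (xxh32(frame descriptor) >> 8) & 0xFF. The broken variant also
    # hashes the 4-byte magic number, which is what old Kafka code expects.
    assert frame[:4] == struct.pack('<I', 0x184D2204), "not an LZ4 frame"
    start = 0 if include_magic else 4
    return (xxhash.xxh32(frame[start:HC_OFFSET]).intdigest() >> 8) & 0xFF

def fixup_lz4_frame(frame, fetch_api_version, first_fixed_version=2):
    # Pre-KIP-31 fetch versions get the old (broken) checksum; new versions
    # get the spec-compliant one.
    broken = fetch_api_version < first_fixed_version
    hc = _header_checksum(frame, include_magic=broken)
    return frame[:HC_OFFSET] + bytes([hc]) + frame[HC_OFFSET + 1:]
{code}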

> Kafka LZ4 framing code miscalculates header checksum
> 
>
> Key: KAFKA-3160
> URL: https://issues.apache.org/jira/browse/KAFKA-3160
> Project: Kafka
>  Issue Type: Bug
>  Components: compression
>Affects Versions: 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.8.2.2, 0.9.0.1
>Reporter: Dana Powers
>Assignee: Magnus Edenhill
>  Labels: compatibility, compression, lz4
>
> KAFKA-1493 partially implements the LZ4 framing specification, but it 
> incorrectly calculates the header checksum. This causes 
> KafkaLZ4BlockInputStream to raise an error 
> [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed 
> LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly 
> framed LZ4 data, which means clients decoding LZ4 messages from kafka will 
> always receive incorrectly framed data.
> Specifically, the current implementation includes the 4-byte MagicNumber in 
> the checksum, which is incorrect.
> http://cyan4973.github.io/lz4/lz4_Frame_format.html
> Third-party clients that attempt to use off-the-shelf lz4 framing find that 
> brokers reject messages as having a corrupt checksum. So currently non-java 
> clients must 'fixup' lz4 packets to deal with the broken checksum.
> Magnus first identified this issue in librdkafka; kafka-python has the same 
> problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3160) Kafka LZ4 framing code miscalculates header checksum

2016-04-10 Thread Dana Powers (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dana Powers updated KAFKA-3160:
---
Description: 
KAFKA-1493 partially implements the LZ4 framing specification, but it 
incorrectly calculates the header checksum. This causes 
KafkaLZ4BlockInputStream to raise an error 
[IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed 
LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly 
framed LZ4 data, which means clients decoding LZ4 messages from kafka will 
always receive incorrectly framed data.

Specifically, the current implementation includes the 4-byte MagicNumber in the 
checksum, which is incorrect.
http://cyan4973.github.io/lz4/lz4_Frame_format.html

Third-party clients that attempt to use off-the-shelf lz4 framing find that 
brokers reject messages as having a corrupt checksum. So currently non-java 
clients must 'fixup' lz4 packets to deal with the broken checksum.

Magnus first identified this issue in librdkafka; kafka-python has the same 
problem.

  was:
KAFKA-1493 partially implements the LZ4 framing specification, but it 
incorrectly calculates the header checksum. This causes 
KafkaLZ4BlockInputStream to raise an error 
[IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed 
LZ4 data. It also causes the kafka broker to always return incorrectly framed 
LZ4 data to clients.

Specifically, the current implementation includes the 4-byte MagicNumber in the 
checksum, which is incorrect.
http://cyan4973.github.io/lz4/lz4_Frame_format.html

Third-party clients that attempt to use off-the-shelf lz4 framing find that 
brokers reject messages as having a corrupt checksum. So currently non-java 
clients must 'fixup' lz4 packets to deal with the broken checksum.

Magnus first identified this issue in librdkafka; kafka-python has the same 
problem.


> Kafka LZ4 framing code miscalculates header checksum
> 
>
> Key: KAFKA-3160
> URL: https://issues.apache.org/jira/browse/KAFKA-3160
> Project: Kafka
>  Issue Type: Bug
>  Components: compression
>Affects Versions: 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.8.2.2, 0.9.0.1
>Reporter: Dana Powers
>Assignee: Magnus Edenhill
>  Labels: compatibility, compression, lz4
>
> KAFKA-1493 partially implements the LZ4 framing specification, but it 
> incorrectly calculates the header checksum. This causes 
> KafkaLZ4BlockInputStream to raise an error 
> [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed 
> LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly 
> framed LZ4 data, which means clients decoding LZ4 messages from kafka will 
> always receive incorrectly framed data.
> Specifically, the current implementation includes the 4-byte MagicNumber in 
> the checksum, which is incorrect.
> http://cyan4973.github.io/lz4/lz4_Frame_format.html
> Third-party clients that attempt to use off-the-shelf lz4 framing find that 
> brokers reject messages as having a corrupt checksum. So currently non-java 
> clients must 'fixup' lz4 packets to deal with the broken checksum.
> Magnus first identified this issue in librdkafka; kafka-python has the same 
> problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-23 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208721#comment-15208721
 ] 

Dana Powers commented on KAFKA-3442:


+1. verified test passes on trunk commit `7af67ce` 

> FetchResponse size exceeds max.partition.fetch.bytes
> 
>
> Key: KAFKA-3442
> URL: https://issues.apache.org/jira/browse/KAFKA-3442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Dana Powers
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Produce 1 byte message to topic foobar
> Fetch foobar w/ max.partition.fetch.bytes=1024
> Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
> this test, but 0.10 FetchResponse has full message, exceeding the max 
> specified in the FetchRequest.
> I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-22 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207142#comment-15207142
 ] 

Dana Powers commented on KAFKA-3442:


sounds good to me!

> FetchResponse size exceeds max.partition.fetch.bytes
> 
>
> Key: KAFKA-3442
> URL: https://issues.apache.org/jira/browse/KAFKA-3442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Dana Powers
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Produce 1 byte message to topic foobar
> Fetch foobar w/ max.partition.fetch.bytes=1024
> Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
> this test, but 0.10 FetchResponse has full message, exceeding the max 
> specified in the FetchRequest.
> I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-22 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206876#comment-15206876
 ] 

Dana Powers commented on KAFKA-3442:


[~junrao] Actually I meant adding the error code to v0 and v1 responses. I 
think that would be helpful if the response behavior changes to no longer 
include partial messages. But if the behavior doesn't change for v0/v1, then I 
agree it is probably better to defer to a future release.

> FetchResponse size exceeds max.partition.fetch.bytes
> 
>
> Key: KAFKA-3442
> URL: https://issues.apache.org/jira/browse/KAFKA-3442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Dana Powers
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Produce 1 byte message to topic foobar
> Fetch foobar w/ max.partition.fetch.bytes=1024
> Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
> this test, but 0.10 FetchResponse has full message, exceeding the max 
> specified in the FetchRequest.
> I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-22 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206790#comment-15206790
 ] 

Dana Powers commented on KAFKA-3442:


Yes, kafka-python uses the same check for partial messages as the java client 
and will raise a RecordTooLargeException to the user if there is only a partial 
message. I think it would be best if the 0.10 broker continued to return 
partial messages, but since the protocol spec refers to this behavior as an 
optimization, I think it would be acceptable to change the behavior and return 
an empty payload. Would it be possible to include an error_code with the empty 
payload?

> FetchResponse size exceeds max.partition.fetch.bytes
> 
>
> Key: KAFKA-3442
> URL: https://issues.apache.org/jira/browse/KAFKA-3442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Dana Powers
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Produce 1 byte message to topic foobar
> Fetch foobar w/ max.partition.fetch.bytes=1024
> Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
> this test, but 0.10 FetchResponse has full message, exceeding the max 
> specified in the FetchRequest.
> I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-21 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205655#comment-15205655
 ] 

Dana Powers commented on KAFKA-3442:


In all prior broker releases, clients check the response for a "partial" 
message -- i.e., the MessageSetSize is less than the MessageSize. This follows 
from this statement in the protocol wiki:

"As an optimization the server is allowed to return a partial message at the 
end of the message set. Clients should handle this case."

So in this case the client is checking for a partial message, not an error code.
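
For reference, a stripped-down sketch of that check (not the actual 
kafka-python code): a trailing entry whose declared size overruns the fetched 
buffer is treated as the partial message the wiki describes, rather than as an 
error.

{code}
import struct

def ends_with_partial_message(message_set):
    """Walk (offset:int64, size:int32, message) entries and report whether the
    buffer ends in a message truncated at the fetch size limit."""
    pos = 0
    while pos + 12 <= len(message_set):
        _offset, size = struct.unpack_from('>qi', message_set, pos)
        if pos + 12 + size > len(message_set):
            return True            # declared size overruns the buffer => partial
        pos += 12 + size
    return pos < len(message_set)  # leftover bytes too short for a full header
{code}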

> FetchResponse size exceeds max.partition.fetch.bytes
> 
>
> Key: KAFKA-3442
> URL: https://issues.apache.org/jira/browse/KAFKA-3442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Dana Powers
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Produce 1 byte message to topic foobar
> Fetch foobar w/ max.partition.fetch.bytes=1024
> Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
> this test, but 0.10 FetchResponse has full message, exceeding the max 
> specified in the FetchRequest.
> I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-21 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205411#comment-15205411
 ] 

Dana Powers edited comment on KAFKA-3442 at 3/22/16 12:13 AM:
--

{code}
# clone kafka-python repo
git clone https://github.com/dpkp/kafka-python.git

# install kafka fixture binaries
tar xzvf kafka_2.10-0.10.0.0.tgz -C servers/0.10.0.0/
mv servers/0.10.0.0/kafka_2.10-0.10.0.0 servers/0.10.0.0/kafka-bin

# you can install other versions of kafka to compare test results
KAFKA_VERSION=0.9.0.1 ./build_integration.sh

# install python test harness
pip install tox

# run just the failing test [replace py27 w/ py## as needed: 
py26,py27,py33,py34,py35]
KAFKA_VERSION=0.10.0.0 tox -e py27 
test/test_consumer_integration.py::TestConsumerIntegration::test_huge_messages
{code}


was (Author: dana.powers):
# clone kafka-python repo
git clone https://github.com/dpkp/kafka-python.git

# install kafka fixture binaries
tar xzvf kafka_2.10-0.10.0.0.tgz -C servers/0.10.0.0/
mv servers/0.10.0.0/kafka_2.10-0.10.0.0 servers/0.10.0.0/kafka-bin

# you can install other versions of kafka to compare test results
KAFKA_VERSION=0.9.0.1 ./build_integration.sh

# install python test harness
pip install tox

# run just the failing test [replace py27 w/ py## as needed: 
py26,py27,py33,py34,py35]
KAFKA_VERSION=0.10.0.0 tox -e py27 
test/test_consumer_integration.py::TestConsumerIntegration::test_huge_messages

> FetchResponse size exceeds max.partition.fetch.bytes
> 
>
> Key: KAFKA-3442
> URL: https://issues.apache.org/jira/browse/KAFKA-3442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Dana Powers
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Produce 1 byte message to topic foobar
> Fetch foobar w/ max.partition.fetch.bytes=1024
> Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
> this test, but 0.10 FetchResponse has full message, exceeding the max 
> specified in the FetchRequest.
> I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-21 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205411#comment-15205411
 ] 

Dana Powers commented on KAFKA-3442:


# clone kafka-python repo
git clone https://github.com/dpkp/kafka-python.git

# install kafka fixture binaries
tar xzvf kafka_2.10-0.10.0.0.tgz -C servers/0.10.0.0/
mv servers/0.10.0.0/kafka_2.10-0.10.0.0 servers/0.10.0.0/kafka-bin

# you can install other versions of kafka to compare test results
KAFKA_VERSION=0.9.0.1 ./build_integration.sh

# install python test harness
pip install tox

# run just the failing test [replace py27 w/ py## as needed: 
py26,py27,py33,py34,py35]
KAFKA_VERSION=0.10.0.0 tox -e py27 
test/test_consumer_integration.py::TestConsumerIntegration::test_huge_messages

> FetchResponse size exceeds max.partition.fetch.bytes
> 
>
> Key: KAFKA-3442
> URL: https://issues.apache.org/jira/browse/KAFKA-3442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Dana Powers
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Produce 1 byte message to topic foobar
> Fetch foobar w/ max.partition.fetch.bytes=1024
> Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
> this test, but 0.10 FetchResponse has full message, exceeding the max 
> specified in the FetchRequest.
> I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-21 Thread Dana Powers (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dana Powers updated KAFKA-3442:
---
Priority: Blocker  (was: Major)

> FetchResponse size exceeds max.partition.fetch.bytes
> 
>
> Key: KAFKA-3442
> URL: https://issues.apache.org/jira/browse/KAFKA-3442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Dana Powers
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Produce 1 byte message to topic foobar
> Fetch foobar w/ max.partition.fetch.bytes=1024
> Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
> this test, but 0.10 FetchResponse has full message, exceeding the max 
> specified in the FetchRequest.
> I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes

2016-03-21 Thread Dana Powers (JIRA)
Dana Powers created KAFKA-3442:
--

 Summary: FetchResponse size exceeds max.partition.fetch.bytes
 Key: KAFKA-3442
 URL: https://issues.apache.org/jira/browse/KAFKA-3442
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.0.0
Reporter: Dana Powers


Produce 1 byte message to topic foobar
Fetch foobar w/ max.partition.fetch.bytes=1024

Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass 
this test, but 0.10 FetchResponse has full message, exceeding the max specified 
in the FetchRequest.

I tested with v0 and v1 apis, both fail. Have not tested w/ v2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3200) MessageSet from broker seems invalid

2016-02-03 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131142#comment-15131142
 ] 

Dana Powers commented on KAFKA-3200:


the minimum Message payload size is 26 bytes (8 offset, 4 size, 14 for an 
'empty' message), so generally I would break if there are less than 26 bytes 
left and then also break if the decoded size is larger than the remaining 
buffer.

for reference, the code I wrote to handle message set decoding in kafka-python 
is here: 
https://github.com/dpkp/kafka-python/blob/master/kafka/protocol/message.py#L123-L158
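
As a quick sanity check of the sizes in the report below (uncompressed v0 
messages with a null key and a 23-byte value):

{code}
OFFSET, SIZE = 8, 4                                  # message-set entry header
CRC, MAGIC, ATTRS, KEYLEN, VALLEN = 4, 1, 1, 4, 4    # v0 message with null key

empty_entry = OFFSET + SIZE + CRC + MAGIC + ATTRS + KEYLEN + VALLEN        # 26
full_entry = empty_entry + 23                                              # 49
assert (empty_entry, full_entry) == (26, 49)
assert 1000 % full_entry == 20   # 20 whole entries fit, 20 bytes are left over
{code}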

> MessageSet from broker seems invalid
> 
>
> Key: KAFKA-3200
> URL: https://issues.apache.org/jira/browse/KAFKA-3200
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0
> Environment: Linux,  running Oracle JVM 1.8
>Reporter: Rajiv Kurian
>
> I am writing a java consumer client for Kafka and using the protocol guide at 
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>  to parse buffers. I am currently running into a problem parsing certain 
> fetch responses. Many times it works fine but some other times it does not. 
> It might just be a bug with my implementation in which case I apologize.
> My messages are uncompressed and exactly 23 bytes in length and has null 
> keys. So each Message in my MessageSet is exactly size 4 (crc) + 
> 1(magic_bytes) + 1 (attributes) + 4(key_num_bytes) + 0 (key is null) + 
> 4(num_value_bytes) + 23(value_bytes) = 37 bytes.
> So each element of the MessageSet itself is exactly 37 (size of message) + 8 
> (offset) + 4 (message_size) = 49 bytes.
> In comparison an empty message set element should be of size 8 (offset) + 4 
> (message_size) + 4 (crc) + 1(magic_bytes) + 1 (attributes) + 4(key_num_bytes) 
> + 0 (key is null) + 4(num_value_bytes) + 0(value is null)  = 26 bytes
> I occasionally receive a MessageSet which says size is 1000. A size of 1000 
> is not divisible by my MessageSet element size which is 49 bytes. When I 
> parse such a message set I can actually read 20 of message set elements(49 
> bytes) which is 980 bytes. I have 20 extra bytes to parse now which is 
> actually less than even an empty message (26 bytes). At this moment I don't 
> know how to parse the messages any more.
> I will attach a file for a response that can actually cause me to run into 
> this problem as well as the sample code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3177) Kafka consumer can hang when position() is called on a non-existing partition.

2016-02-01 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127701#comment-15127701
 ] 

Dana Powers commented on KAFKA-3177:


A similar infinite loop happens when the partition exists but has no leader b/c 
it is under-replicated. In that case, Fetcher.listOffset infinitely retries on 
the leaderNotAvailableError returned by sendListOffsetRequest.

> Kafka consumer can hang when position() is called on a non-existing partition.
> --
>
> Key: KAFKA-3177
> URL: https://issues.apache.org/jira/browse/KAFKA-3177
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> This can be easily reproduced as following:
> {code}
> {
> ...
> consumer.assign(SomeNonExsitingTopicParition);
> consumer.position();
> ...
> }
> {code}
> It seems when position is called we will try to do the following:
> 1. Fetch committed offsets.
> 2. If there is no committed offsets, try to reset offset using reset 
> strategy. in sendListOffsetRequest(), if the consumer does not know the 
> TopicPartition, it will refresh its metadata and retry. In this case, because 
> the partition does not exist, we fall in to the infinite loop of refreshing 
> topic metadata.
> Another orthogonal issue is that if the topic in the above code piece does 
> not exist, position() call will actually create the topic due to the fact 
> that currently topic metadata request could automatically create the topic. 
> This is a known separate issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1493) Use a well-documented LZ4 compression format and remove redundant LZ4HC option

2016-01-27 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15119703#comment-15119703
 ] 

Dana Powers commented on KAFKA-1493:


Hi all - it appears that the header checksum (HC) byte is incorrect. The Kafka 
implementation hashes the magic bytes + header, but the spec is to only hash 
header (don't include magic).

We are having some trouble encoding/decoding from non-java clients because the 
framing must be munged before reading / writing to kafka. Is this known? I 
don't see another JIRA for it. Should I file separately or should this be 
reopened?

> Use a well-documented LZ4 compression format and remove redundant LZ4HC option
> --
>
> Key: KAFKA-1493
> URL: https://issues.apache.org/jira/browse/KAFKA-1493
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.0
>Reporter: James Oliver
>Assignee: James Oliver
>Priority: Blocker
> Fix For: 0.8.2.0
>
> Attachments: KAFKA-1493.patch, KAFKA-1493.patch, 
> KAFKA-1493_2014-10-16_13:49:34.patch, KAFKA-1493_2014-10-16_21:25:23.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1493) Use a well-documented LZ4 compression format and remove redundant LZ4HC option

2016-01-27 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120072#comment-15120072
 ] 

Dana Powers commented on KAFKA-1493:


filed KAFKA-3160

> Use a well-documented LZ4 compression format and remove redundant LZ4HC option
> --
>
> Key: KAFKA-1493
> URL: https://issues.apache.org/jira/browse/KAFKA-1493
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.0
>Reporter: James Oliver
>Assignee: James Oliver
>Priority: Blocker
> Fix For: 0.8.2.0
>
> Attachments: KAFKA-1493.patch, KAFKA-1493.patch, 
> KAFKA-1493_2014-10-16_13:49:34.patch, KAFKA-1493_2014-10-16_21:25:23.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3160) Kafka LZ4 framing code miscalculates header checksum

2016-01-27 Thread Dana Powers (JIRA)
Dana Powers created KAFKA-3160:
--

 Summary: Kafka LZ4 framing code miscalculates header checksum
 Key: KAFKA-3160
 URL: https://issues.apache.org/jira/browse/KAFKA-3160
 Project: Kafka
  Issue Type: Bug
  Components: compression
Affects Versions: 0.9.0.0, 0.8.2.1, 0.8.2.0, 0.8.2.2
Reporter: Dana Powers


KAFKA-1493 implements the LZ4 framing specification, but it incorrectly 
calculates the header checksum. Specifically, the current implementation 
includes the 4-byte MagicNumber in the checksum, which is incorrect.
http://cyan4973.github.io/lz4/lz4_Frame_format.html

Third-party clients that attempt to use off-the-shelf lz4 framing find that 
brokers reject messages as having a corrupt checksum. So currently non-java 
clients must 'fixup' lz4 packets to deal with the broken checksum.

Magnus first identified this issue in librdkafka; kafka-python has the same 
problem.
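
To illustrate the mismatch (the FLG/BD descriptor bytes below are just an 
example pair, not anything Kafka-specific), the spec checksum and the checksum 
Kafka currently computes come out as different bytes:

{code}
import xxhash  # third-party: pip install xxhash

MAGIC = b'\x04\x22\x4d\x18'   # LZ4 frame magic, little-endian on the wire
flg, bd = 0x60, 0x70          # example frame descriptor bytes

spec_hc   = (xxhash.xxh32(bytes([flg, bd])).intdigest() >> 8) & 0xFF
broken_hc = (xxhash.xxh32(MAGIC + bytes([flg, bd])).intdigest() >> 8) & 0xFF
# The two values (almost always) differ, so off-the-shelf LZ4 encoders and the
# current Kafka framing disagree on this one header byte.
print(hex(spec_hc), hex(broken_hc))
{code}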



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3088) 0.9.0.0 broker crash on receipt of produce request with empty client ID

2016-01-21 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111020#comment-15111020
 ] 

Dana Powers commented on KAFKA-3088:


But client ids are not globally unique, even in the java implementation, right? 
Start 2 consumers in different processes / different servers, and you'll get 
two identical client-ids (consumer-1) as I understand that code. Also note that 
a user-supplied client-id does not get any incremented value added, so all 
metrics get blended in that case. And most third-party clients that set a 
client-id don't attempt to add a unique incremented number at all. So I don't 
think option-2 adds much value.

Another alternative to consider is skipping client metrics if there is no 
client-id.

> 0.9.0.0 broker crash on receipt of produce request with empty client ID
> ---
>
> Key: KAFKA-3088
> URL: https://issues.apache.org/jira/browse/KAFKA-3088
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.9.0.0
>Reporter: Dave Peterson
>Assignee: Jun Rao
>
> Sending a produce request with an empty client ID to a 0.9.0.0 broker causes 
> the broker to crash as shown below.  More details can be found in the 
> following email thread:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201601.mbox/%3c5693ecd9.4050...@dspeterson.com%3e
>[2016-01-10 23:03:44,957] ERROR [KafkaApi-3] error when handling request 
> Name: ProducerRequest; Version: 0; CorrelationId: 1; ClientId: null; 
> RequiredAcks: 1; AckTimeoutMs: 1 ms; TopicAndPartition: [topic_1,3] -> 37 
> (kafka.server.KafkaApis)
>java.lang.NullPointerException
>   at 
> org.apache.kafka.common.metrics.JmxReporter.getMBeanName(JmxReporter.java:127)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.addAttribute(JmxReporter.java:106)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:76)
>   at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162)
>   at 
> kafka.server.ClientQuotaManager.getOrCreateQuotaSensors(ClientQuotaManager.scala:209)
>   at 
> kafka.server.ClientQuotaManager.recordAndMaybeThrottle(ClientQuotaManager.scala:111)
>   at 
> kafka.server.KafkaApis.kafka$server$KafkaApis$$sendResponseCallback$2(KafkaApis.scala:353)
>   at 
> kafka.server.KafkaApis$$anonfun$handleProducerRequest$1.apply(KafkaApis.scala:371)
>   at 
> kafka.server.KafkaApis$$anonfun$handleProducerRequest$1.apply(KafkaApis.scala:371)
>   at 
> kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:348)
>   at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:366)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:68)
>   at 
> kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1649) Protocol documentation does not indicate that ReplicaNotAvailable can be ignored

2015-01-14 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277690#comment-14277690
 ] 

Dana Powers commented on KAFKA-1649:


I think this could be a problem -- it is not updated in the wiki and there 
appears to be a server-side change between 0.8.1.1 and 0.8.2.0 that could 
expose third-party clients to uncaught errors, assuming that the client 
authors/maintainers were not aware that the ReplicaNotAvailable Error should be 
ignored.
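
A sketch of the client-side handling this implies (field names follow 
kafka-python's metadata tuples and are illustrative, not any particular 
client's API): treat error 9 as ignorable as long as a leader is reported.

{code}
REPLICA_NOT_AVAILABLE = 9

def usable_partitions(partition_metadata_list):
    """Yield (partition, leader) pairs, ignoring ReplicaNotAvailable (error 9):
    the partition is still usable as long as a leader is present."""
    for pm in partition_metadata_list:
        if pm.error in (0, REPLICA_NOT_AVAILABLE) and pm.leader != -1:
            yield pm.partition, pm.leader
{code}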

 Protocol documentation does not indicate that ReplicaNotAvailable can be 
 ignored
 

 Key: KAFKA-1649
 URL: https://issues.apache.org/jira/browse/KAFKA-1649
 Project: Kafka
  Issue Type: Improvement
  Components: website
Affects Versions: 0.8.1.1
Reporter: Hernan Rivas Inaka
Priority: Minor
  Labels: protocol-documentation
   Original Estimate: 10m
  Remaining Estimate: 10m

 The protocol documentation here 
 https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-ErrorCodes
  should indicate that error 9 (ReplicaNotAvailable) can be safely ignored on 
 producers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1649) Protocol documentation does not indicate that ReplicaNotAvailable can be ignored

2015-01-14 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277807#comment-14277807
 ] 

Dana Powers commented on KAFKA-1649:


I am only testing from the wire-protocol level.  Running a broker failure test 
with 2 brokers, 1 topic w/ num.partitions=2 and default.replication.factor=2 .  
Send 100 random messages directly to partition 0, kill the leader for partition 
0, attempt to write messages to partition 0, with retries and metadata reloads.

Running the test against 0.8.2.0 returns ReplicaNotAvailable error code in the 
PartitionMetadata, whereas 0.8.1.1 does not.

This is the metadata w/ both brokers up:
Topic metadata: [TopicMetadata(topic='test_switch_leader-qkUBJTZGLA', error=0, 
partitions=[PartitionMetadata(topic='test_switch_leader-qkUBJTZGLA', 
partition=1, leader=1, replicas=(1, 0), isr=(1, 0), error=0), 
PartitionMetadata(topic='test_switch_leader-qkUBJTZGLA', partition=0, leader=0, 
replicas=(0, 1), isr=(0, 1), error=0)])]

And this is the metadata after killing one broker (0.8.2.0):
Topic metadata: [TopicMetadata(topic='test_switch_leader-qkUBJTZGLA', error=0, 
partitions=[PartitionMetadata(topic='test_switch_leader-qkUBJTZGLA', 
partition=1, leader=1, replicas=(1,), isr=(1,), error=9), 
PartitionMetadata(topic='test_switch_leader-qkUBJTZGLA', partition=0, leader=1, 
replicas=(1,), isr=(1,), error=9)])]

The 0.8.1.1 output is slightly different -- and, significantly, there is no 
error in the PartitionMetadata.
Before killing partition 0 leader (broker 1):
Topic metadata: [TopicMetadata(topic='test_switch_leader-eMbCMlVrOC', error=0, 
partitions=[PartitionMetadata(topic='test_switch_leader-eMbCMlVrOC', 
partition=0, leader=1, replicas=(1, 0), isr=(1, 0), error=0), 
PartitionMetadata(topic='test_switch_leader-eMbCMlVrOC', partition=1, leader=0, 
replicas=(0, 1), isr=(0, 1), error=0)])]

After killing partition 0 leader:
Topic metadata: [TopicMetadata(topic='test_switch_leader-eMbCMlVrOC', error=0, 
partitions=[PartitionMetadata(topic='test_switch_leader-eMbCMlVrOC', 
partition=0, leader=0, replicas=(1, 0), isr=(0,), error=0), 
PartitionMetadata(topic='test_switch_leader-eMbCMlVrOC', partition=1, leader=0, 
replicas=(0, 1), isr=(0, 1), error=0)])]

 Protocol documentation does not indicate that ReplicaNotAvailable can be 
 ignored
 

 Key: KAFKA-1649
 URL: https://issues.apache.org/jira/browse/KAFKA-1649
 Project: Kafka
  Issue Type: Improvement
  Components: website
Affects Versions: 0.8.1.1
Reporter: Hernan Rivas Inaka
Priority: Minor
  Labels: protocol-documentation
   Original Estimate: 10m
  Remaining Estimate: 10m

 The protocol documentation here 
 https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-ErrorCodes
  should indicate that error 9 (ReplicaNotAvailable) can be safely ignored on 
 producers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1841) OffsetCommitRequest API - timestamp field is not versioned

2015-01-05 Thread Dana Powers (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dana Powers updated KAFKA-1841:
---
Description: 
Timestamp field was added to the OffsetCommitRequest wire protocol api for 
0.8.2 by KAFKA-1012 .  The 0.8.1.1 server does not support the timestamp field, 
so I think the api version of OffsetCommitRequest should be incremented and 
checked by the 0.8.2 kafka server before attempting to read a timestamp from 
the network buffer in OffsetCommitRequest.readFrom 
(core/src/main/scala/kafka/api/OffsetCommitRequest.scala)

It looks like a subsequent patch (KAFKA-1462) added another api change to 
support a new constructor w/ params generationId and consumerId, calling that 
version 1, and a pending patch (KAFKA-1634) adds retentionMs as another field, 
while possibly removing timestamp altogether, calling this version 2.  So the 
fix here is not straightforward enough for me to submit a patch.

This could possibly be merged into KAFKA-1634, but opening as a separate Issue 
because I believe the lack of versioning in the current trunk should block 
0.8.2 release.

  was:
Timestamp field was added to the OffsetCommitRequest wire protocol api for 
0.8.2 by KAFKA-1012 .  The 0.8.1.1 server does not support the timestamp field, 
so I think the api version of OffsetCommitRequest should be incremented and 
checked by the 0.8.2 kafka server before attempting to read a timestamp from 
the network buffer in OffsetCommitRequest.readFrom 
(core/src/main/scala/kafka/api/OffsetCommitRequest.scala)

It looks like a subsequent patch (kafka-1462) added another api change to 
support a new constructor w/ params generationId and consumerId, calling that 
version 1, and a pending patch (kafka-1634) adds retentionMs as another field, 
while possibly removing timestamp altogether, calling this version 2.  So the 
fix here is not straightforward enough for me to submit a patch.

This could possibly be merged into KAFKA-1634, but opening as a separate Issue 
because I believe the lack of versioning in the current trunk should block 
0.8.2 release.


 OffsetCommitRequest API - timestamp field is not versioned
 --

 Key: KAFKA-1841
 URL: https://issues.apache.org/jira/browse/KAFKA-1841
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.2
 Environment: wire-protocol
Reporter: Dana Powers
Priority: Blocker

 Timestamp field was added to the OffsetCommitRequest wire protocol api for 
 0.8.2 by KAFKA-1012 .  The 0.8.1.1 server does not support the timestamp 
 field, so I think the api version of OffsetCommitRequest should be 
 incremented and checked by the 0.8.2 kafka server before attempting to read a 
 timestamp from the network buffer in OffsetCommitRequest.readFrom 
 (core/src/main/scala/kafka/api/OffsetCommitRequest.scala)
 It looks like a subsequent patch (KAFKA-1462) added another api change to 
 support a new constructor w/ params generationId and consumerId, calling that 
 version 1, and a pending patch (KAFKA-1634) adds retentionMs as another 
 field, while possibly removing timestamp altogether, calling this version 2.  
 So the fix here is not straightforward enough for me to submit a patch.
 This could possibly be merged into KAFKA-1634, but opening as a separate 
 Issue because I believe the lack of versioning in the current trunk should 
 block 0.8.2 release.
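
For illustration only (this is not the broker's Scala readFrom, and the field 
layout is my reading of the wire docs), the kind of version check the 
description asks for looks like:

{code}
import struct

def read_partition_offset_data(buf, pos, api_version):
    """Illustrative sketch: parse one partition entry of an OffsetCommitRequest,
    reading the timestamp field only for an api_version that carries it."""
    partition, offset = struct.unpack_from('>iq', buf, pos)
    pos += 12
    timestamp = None
    if api_version == 1:  # assume only one wire version has a per-partition timestamp
        (timestamp,) = struct.unpack_from('>q', buf, pos)
        pos += 8
    (metadata_len,) = struct.unpack_from('>h', buf, pos)
    pos += 2
    metadata = None
    if metadata_len >= 0:   # length -1 encodes a null string
        metadata = buf[pos:pos + metadata_len].decode('utf-8')
        pos += metadata_len
    return (partition, offset, timestamp, metadata), pos
{code}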



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation

2015-01-05 Thread Dana Powers (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265042#comment-14265042
 ] 

Dana Powers commented on KAFKA-1634:


possibly related to this JIRA: KAFKA-1841 .  The timestamp field itself was not 
in the released api version 0 and if it is to be included in 0.8.2 (this JIRA 
suggests it is, but to be removed in 0.8.3 ?) then I think it will need to be 
versioned.

 Improve semantics of timestamp in OffsetCommitRequests and update 
 documentation
 ---

 Key: KAFKA-1634
 URL: https://issues.apache.org/jira/browse/KAFKA-1634
 Project: Kafka
  Issue Type: Bug
Reporter: Neha Narkhede
Assignee: Guozhang Wang
Priority: Blocker
 Fix For: 0.8.3

 Attachments: KAFKA-1634.patch, KAFKA-1634_2014-11-06_15:35:46.patch, 
 KAFKA-1634_2014-11-07_16:54:33.patch, KAFKA-1634_2014-11-17_17:42:44.patch, 
 KAFKA-1634_2014-11-21_14:00:34.patch, KAFKA-1634_2014-12-01_11:44:35.patch, 
 KAFKA-1634_2014-12-01_18:03:12.patch


 From the mailing list -
 following up on this -- I think the online API docs for OffsetCommitRequest
 still incorrectly refer to client-side timestamps:
 https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest
 Wasn't that removed, and isn't it now always handled server-side?  Would one of
 the devs mind updating the API spec wiki?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)