Kamal,
Thanks very much for testing. I also tried the script provided with Kafka
(kafka-console-producer.sh) and found that it does work in this situation.
My original testing was done with a test program we wrote ourselves. I'll
try to find the difference.
Thanks for your help!

Regards,
Aggie
-----Original Message-----
From: Kamal C [mailto:kamaltar...@gmail.com] 
Sent: Tuesday, September 27, 2016 6:09 PM
To: users@kafka.apache.org
Subject: Re: producer can't push msg sometimes with 1 broker recoved

Aggie,

I'm not able to reproduce this behavior in 0.10.0.1.

> I did more testing and found a pattern (the topic is created with
> "--replication-factor 2 --partitions 1" in the following cases):
> node 1               node 2
> down (lead)          down (replica)
> down (replica)       up   (lead)          producer send fail !!!

When node 2 is up, after the metadata update, the producer is able to connect
and send messages to it.

Logs:

[2016-09-27T15:18:17,907] NetworkClient: handleDisconnections(): Node 1 disconnected.
[2016-09-27T15:18:18,007] NetworkClient: initiateConnect(): Initiating connection to node 1 at localhost:9093.
[2016-09-27T15:18:18,008] Selector: pollSelectionKeys(): Connection with localhost/127.0.0.1 disconnected
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_45]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_45]
    at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51) ~[kafka-clients-0.10.0.1.jar:?]
    at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73) ~[kafka-clients-0.10.0.1.jar:?]
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309) [kafka-clients-0.10.0.1.jar:?]
    at org.apache.kafka.common.network.Selector.poll(Selector.java:283) [kafka-clients-0.10.0.1.jar:?]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260) [kafka-clients-0.10.0.1.jar:?]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229) [kafka-clients-0.10.0.1.jar:?]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134) [kafka-clients-0.10.0.1.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
[2016-09-27T15:18:18,008] NetworkClient: handleDisconnections(): Node 1 disconnected.
[2016-09-27T15:18:18,043] NetworkClient: maybeUpdate(): Sending metadata request {topics=[hello]} to node 0
[2016-09-27T15:18:18,052] Metadata: update(): Updated cluster metadata version 4 to Cluster(nodes = [tcltest1.nmsworks.co.in:9092 (id: 0 rack: null)], partitions = [Partition(topic = hello, partition = 0, leader = none, replicas = [0,1,], isr = []])
[2016-09-27T15:18:19,053] NetworkClient: maybeUpdate(): Sending metadata request {topics=[hello]} to node 0
[2016-09-27T15:18:19,056] Metadata: update(): Updated cluster metadata version 5 to Cluster(nodes = [tcltest1.nmsworks.co.in:9092 (id: 0 rack: null)], partitions = [Partition(topic = hello, partition = 0, leader = 0, replicas = [0,1,], isr = [0,]])
[2016-09-27T15:18:19,081] KafkaProducer: main(): Batch : 4 sent
[2016-09-27T15:18:19,182] KafkaProducer: main(): Batch : 5, Sending the record with key : 0
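
For readers comparing their own producer against the behavior in the logs
above, here is a minimal sketch of the settings discussed in this thread,
written as plain java.util.Properties so it runs without the Kafka client on
the classpath. The class name ProducerConfigSketch, the broker addresses, and
the chosen values are assumptions for illustration; only the config keys
(bootstrap.servers, metadata.max.age.ms, retries, key.serializer,
value.serializer) are standard producer settings. Whether any of these
settings explains the difference between the custom test program and
kafka-console-producer.sh is not established in this thread.

```java
import java.util.Properties;

public class ProducerConfigSketch {

    // Builds producer settings as plain key/value pairs.
    // The keys are standard producer config names; the values passed in
    // (and the fixed ones below) are illustrative assumptions.
    public static Properties build(String bootstrapServers, long metadataMaxAgeMs) {
        Properties props = new Properties();
        // List every broker so the client can bootstrap from whichever one is up.
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Upper bound on how long cached cluster metadata is used before a
        // forced refresh (the default is 5 minutes).
        props.setProperty("metadata.max.age.ms", Long.toString(metadataMaxAgeMs));
        // Retry failed sends a few times instead of failing immediately (assumed value).
        props.setProperty("retries", "3");
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical broker addresses; replace with the real cluster.
        Properties props = build("broker1.example:9092,broker2.example:9093", 10_000L);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real client these properties would be passed to new KafkaProducer<>(props).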

- Kamal

On Mon, Sep 26, 2016 at 8:53 AM, FEI Aggie <aggie....@alcatel-lucent.com>
wrote:

> Kamal,
> Thanks for your response. I tried testing with metadata.max.age.ms
> reduced to 10s, but the behavior did not change, and the producer still
> can't find the live broker.
>
> I did more testing and found a pattern (the topic is created with
> "--replication-factor 2 --partitions 1" in the following cases):
> node 1               node 2
> down (lead)          down (replica)
> down (replica)       up   (lead)          producer send fail !!!
>
> down (lead)          down (replica)
> up   (lead)          down (replica)       producer send ok !!!
>
> If the only node that comes up is the one holding the original leader
> partition, everything is fine.
> If the only node that comes up is the one holding the original replica,
> the producer can't connect to the live broker (it always tries to connect
> to the original leader broker, node 1 in my case).
>
> Can't Kafka recover from this situation? Does anyone have a clue about this?
>
> Thanks!
> Aggie
> -----Original Message-----
> From: Kamal C [mailto:kamaltar...@gmail.com]
> Sent: Saturday, September 24, 2016 1:37 PM
> To: users@kafka.apache.org
> Subject: Re: producer can't push msg sometimes with 1 broker recoved
>
> Reduce the metadata refresh interval 'metadata.max.age.ms' from the
> default of 5 min to your desired time interval.
> This may shorten the window during which the producer cannot find an
> available broker.
>
> -- Kamal
>
