[ https://issues.apache.org/jira/browse/KAFKA-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248554#comment-15248554 ]
Greg Zoller commented on KAFKA-3568:
------------------------------------

OK... I don't believe my test results, but I have repeated them.

I found that build 9cfb99 (Mar 22) worked for me: KafkaProducer didn't hang or time out. I got meaningful commit metadata in the send() callback with null as the exception, and it returned instantly, as expected. With the next-newer build, 73470b0, KafkaProducer send() calls hang for 60 seconds and then time out, with null callback metadata and a timeout exception populated.

The only differences between the contents of these two builds were documentation changes and a tiny config file change. So I started making the changes manually, one at a time, until I found the issue... and I can't believe it. Build 73470b0 removed two lines in config/server.properties:

    #advertised.host.name=<hostname routable by clients>
    #advertised.port=<port accessible by clients>

(It also commented out a listeners config variable, but turning this on or off made no difference to my issue.)

Note: THE TWO LINES ARE COMMENTED OUT! I flipped this a few times in disbelief, but the results were the same.

My process:

1) ./gradlew clean
2) ./gradlew -PscalaVersion=2.11 releaseTarGz -x signArchives
3) (build a Docker image with the tgz file, just like in spotify/kafka)
4) ./gradlew -PscalaVersion=2.11 install_2_11
5) clean, update, then recompile my test code in sbt, built against the release in the .m2 local maven repo
6) run

This process is repeatable, and I applied it consistently through all tests.

Not convinced yet? I went back to trunk and pasted those 2 commented-out lines into config/server.properties, leaving everything else as-is, and rebuilt using the process above. Worked!

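One way a commented-out line could still change behavior is if something edits the file as text instead of parsing it as properties. The sketch below illustrates that hypothetical mechanism only; it is not confirmed to be what happens in this setup. The ADVERTISED_HOST value, the temporary file names, and the rewrite() helper are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: a start script that rewrites server.properties textually.
# ADVERTISED_HOST, the file names, and rewrite() are illustrative assumptions.
set -eu

ADVERTISED_HOST="broker.example"
workdir=$(mktemp -d)

# Config that still contains the commented-out placeholder line.
printf '%s\n' 'broker.id=0' \
  '#advertised.host.name=<hostname routable by clients>' \
  > "$workdir/with.properties"

# Config where the placeholder line has been removed.
printf '%s\n' 'broker.id=0' > "$workdir/without.properties"

# Textual rewrite: uncomment the placeholder and fill in the host.
# If the placeholder is absent, sed matches nothing and the file is unchanged.
rewrite() {
  sed "s/^#\(advertised\.host\.name\)=.*/\1=$ADVERTISED_HOST/" "$1"
}

with_out=$(rewrite "$workdir/with.properties")
without_out=$(rewrite "$workdir/without.properties")

echo "--- with placeholder ---"
echo "$with_out"
echo "--- without placeholder ---"
echo "$without_out"

rm -rf "$workdir"
```

With the placeholder present, the rewrite produces an active advertised.host.name setting. With the placeholder removed, the same command silently does nothing, so the broker would fall back to its default advertised address. Checking the Docker image's start script for sed or grep calls against server.properties would confirm or rule this out.
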
Here's the git diff on my 2-line change to trunk:

$ git diff
diff --git a/config/server.properties b/config/server.properties
index aebcb87..d9d17e8 100644
--- a/config/server.properties
+++ b/config/server.properties
@@ -29,6 +29,9 @@ broker.id=0
 # listeners = PLAINTEXT://your.host.name:9092
 #listeners=PLAINTEXT://:9092
 
+#advertised.host.name=<hostname routable by clients>
+#advertised.port=<port accessible by clients>
+
 # Hostname and port the broker will advertise to producers and consumers. If not set,
 # it uses the value for "listeners" if configured. Otherwise, it will use the value
 # returned from java.net.InetAddress.getCanonicalHostName().

I dunno what to say. I'll keep flipping this and have a friend confirm the results. Is some code or script textually looking at this file (not parsing it) for either of these 2 properties?

> KafkaProducer fails with timeout on message send()
> --------------------------------------------------
>
>                 Key: KAFKA-3568
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3568
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.10.1.0
>         Environment: MacOS Docker
>            Reporter: Greg Zoller
>              Labels: producer
>
> I had a KafkaProducer working fine in 0.9.0.1. I was having unrelated
> problems in that version so thought to try 0.10.1.0. I built it as I did
> 0.9.0.1:
> Fresh build against Scala 2.11.7. Built the tgz build plus local maven
> install. From the tgz I created a Docker image similar to spotify/kafka. I
> linked my producer code to the maven jars. This process worked in 0.9.
> Code is here:
> https://gist.github.com/gzoller/145faef1fefc8acea212e87e06fc86e8
> At the bottom you can see a clip from the output... there's a warning about
> metadata (not sure if it's important or not) and then it's trying to send()
> messages and timing out. I clipped the output, but it does fail the same way
> for each message sent in 0.10.1.0. Same code compiled against 0.9.0.1
> populates the topic's partitions w/o problem.
> Was there a breaking change between 0.9 and 0.10, or is this a bug?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)