[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2015-04-04 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps updated KAFKA-1501:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed. [~ewencp] is my personal hero.

 transient unit tests failures due to port already in use
 

 Key: KAFKA-1501
 URL: https://issues.apache.org/jira/browse/KAFKA-1501
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Jun Rao
Assignee: Ewen Cheslack-Postava
  Labels: newbie
 Attachments: KAFKA-1501-choosePorts.patch, KAFKA-1501.patch, 
 KAFKA-1501.patch, KAFKA-1501.patch, KAFKA-1501.patch, 
 KAFKA-1501_2015-03-09_11:41:07.patch, KAFKA-1501_2015-03-25_00:44:50.patch, 
 test-100.out, test-100.out, test-27.out, test-29.out, test-32.out, 
 test-35.out, test-38.out, test-4.out, test-42.out, test-45.out, test-46.out, 
 test-51.out, test-55.out, test-58.out, test-59.out, test-60.out, test-69.out, 
 test-72.out, test-74.out, test-76.out, test-84.out, test-87.out, test-91.out, 
 test-92.out


 Saw the following transient failures.
 kafka.api.ProducerFailureHandlingTest  testTooLargeRecordWithAckOne FAILED
 kafka.common.KafkaException: Socket server failed to bind to localhost:59909: Address already in use.
     at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
     at kafka.network.Acceptor.<init>(SocketServer.scala:141)
     at kafka.network.SocketServer.startup(SocketServer.scala:68)
     at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
     at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
     at kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2015-03-26 Thread Guozhang Wang (JIRA)


Guozhang Wang updated KAFKA-1501:
-
Assignee: Ewen Cheslack-Postava  (was: Guozhang Wang)



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2015-03-25 Thread Ewen Cheslack-Postava (JIRA)


Ewen Cheslack-Postava updated KAFKA-1501:
-
Attachment: KAFKA-1501_2015-03-25_00:44:50.patch



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2015-03-09 Thread Ewen Cheslack-Postava (JIRA)


Ewen Cheslack-Postava updated KAFKA-1501:
-
Attachment: KAFKA-1501_2015-03-09_11:41:07.patch



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2015-03-06 Thread Ewen Cheslack-Postava (JIRA)


Ewen Cheslack-Postava updated KAFKA-1501:
-
Attachment: KAFKA-1501.patch



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2014-12-05 Thread Guozhang Wang (JIRA)


Guozhang Wang updated KAFKA-1501:
-
Attachment: test-4.out
test-29.out
test-32.out
test-35.out
test-38.out
test-42.out
test-45.out
test-46.out
test-51.out
test-55.out
test-58.out
test-59.out
test-60.out
test-69.out
test-72.out
test-74.out
test-76.out
test-84.out
test-87.out
test-91.out
test-92.out
test-100.out
test-27.out
test-100.out



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2014-12-03 Thread Ewen Cheslack-Postava (JIRA)


Ewen Cheslack-Postava updated KAFKA-1501:
-
Attachment: KAFKA-1501.patch



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2014-10-29 Thread Ewen Cheslack-Postava (JIRA)


Ewen Cheslack-Postava updated KAFKA-1501:
-
Attachment: KAFKA-1501-choosePorts.patch

Did anyone actually verify that a port is getting into TIME_WAIT, or was that 
just a hunch? It seems unlikely, since the port was returned by 
choosePorts() and there's no threading that would allow the socket to still be 
bound. And if it was a socket previously used for accept(), the only way it 
should end up in TIME_WAIT is if there was an outstanding connection request 
that hadn't been handled when the socket was closed.

I think a much simpler explanation is that a port is being allocated twice 
within each test. I suspect you're seeing these errors on ZooKeeperTestHarness 
tests because it uses a single port that is allocated in the TestZKUtils object 
-- that port is used for *all* tests. This means that there are plenty of times 
when that port is not bound (before a test has started) and choosePort() or 
choosePorts() is called (during test class instantiation), which could then 
return that same port and cause a conflict. Unfortunately, I am not able to 
reproduce this issue so I can't verify that. If someone else wants to try to 
verify, just logging the values returned by choosePort and the value of 
TestZKUtils.zookeeperConnect would make this issue easy to track down in a log.
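
As a rough illustration of the logging suggested above, here is a minimal sketch in Java (used only for illustration; the Kafka test utilities themselves are Scala). The PortAudit class and its method names are hypothetical and not part of Kafka. The idea is just to record every port handed out together with who asked for it, then report any port that was handed out more than once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Kafka: remembers which caller was given
// which port so that a double allocation shows up in the test log.
final class PortAudit {
    private final Map<Integer, List<String>> owners = new TreeMap<>();

    // Log one allocation, e.g. a value returned by choosePorts() or the
    // port taken from the ZooKeeper connect string.
    void record(String owner, int port) {
        owners.computeIfAbsent(port, p -> new ArrayList<>()).add(owner);
        System.out.println("allocated port " + port + " to " + owner);
    }

    // Return only the ports that were handed to more than one owner.
    Map<Integer, List<String>> conflicts() {
        Map<Integer, List<String>> result = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> e : owners.entrySet()) {
            if (e.getValue().size() > 1) {
                result.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        return result;
    }
}
```

If the theory is right, a failing run would show the ZooKeeper port and one of the broker ports as the same number in conflicts().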

What we really need is to make sure that tests use a single call to 
choosePorts() to allocate *all* the ports they'll need. The attached patch 
should do this. It's obviously possible to call choosePorts() twice, but I've 
tried to discourage it. The choosePort() variant is removed and a warning is 
added to the choosePorts() documentation. The patch uses a new base class, 
NetworkTestHarness, for all tests that need to coordinate multiple ports (i.e., 
anything that uses ZooKeeperTestHarness, since at that point both 
ZooKeeperTestHarness and the test class will probably need to call 
choosePorts()). Because of the way KafkaServerTestHarness works, I made the ports 
all get allocated at initialization (so configs for KafkaServerTestHarness can 
still be generated at test class instantiation). You have to know how many to 
allocate up front, but by default it allocates 5 so that all the current tests 
don't need to override anything.
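
The single-call idea can be sketched as follows, again in Java purely for illustration. This is not the code from the attached patch; the PortAllocator class is hypothetical and only shares the choosePorts name with the Kafka helper. It assumes the usual trick of binding to port 0 so the OS picks a free port, and it keeps every socket open until all ports have been chosen so that one call cannot return the same port twice.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the patch itself: allocate all ports a test
// needs in one call.
final class PortAllocator {
    static int[] choosePorts(int count) {
        List<ServerSocket> sockets = new ArrayList<>();
        try {
            int[] ports = new int[count];
            for (int i = 0; i < count; i++) {
                // Port 0 asks the OS for any currently free port.
                ServerSocket socket = new ServerSocket(0);
                sockets.add(socket);
                ports[i] = socket.getLocalPort();
            }
            return ports;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Release the ports only after all of them have been chosen.
            for (ServerSocket socket : sockets) {
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // Nothing useful to do if close fails in a test helper.
                }
            }
        }
    }
}
```

Because the sockets are closed before the method returns, a second call (or another process) may be given one of the same ports again, which is why a test should make a single call for everything it needs.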

[~copester] - can you test out this patch since you can reliably reproduce the 
issue? And can you give an idea of the type of hardware you're able to 
reproduce it on since you mentioned it seems common on beefier hardware?



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2014-10-26 Thread Guozhang Wang (JIRA)


Guozhang Wang updated KAFKA-1501:
-
Attachment: KAFKA-1501.patch



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2014-10-23 Thread Guozhang Wang (JIRA)


Guozhang Wang updated KAFKA-1501:
-
Attachment: KAFKA-1501.patch



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2014-10-23 Thread Guozhang Wang (JIRA)


Guozhang Wang updated KAFKA-1501:
-
Assignee: Guozhang Wang
  Status: Patch Available  (was: Open)



[jira] [Updated] (KAFKA-1501) transient unit tests failures due to port already in use

2014-06-19 Thread Jun Rao (JIRA)


Jun Rao updated KAFKA-1501:
---

Labels: newbie  (was: )
