Mickael Maison created KAFKA-3404:
-------------------------------------

             Summary: Issues running the kafka system test Sasl
                 Key: KAFKA-3404
                 URL: https://issues.apache.org/jira/browse/KAFKA-3404
             Project: Kafka
          Issue Type: Bug
          Components: system tests
    Affects Versions: 0.9.0.1
         Environment: Intel x86_64
                      Ubuntu 14.04
            Reporter: Mickael Maison


Hi,

I'm trying to run the test_console_consumer.py system test and it's failing while testing the SASL protocols.

[INFO - 2016-03-15 14:41:58,533 - runner - log - lineno:211]: SerialTestRunner: kafkatest.sanity_checks.test_console_consumer.ConsoleConsumerTest.test_lifecycle.security_protocol=SASL_SSL: Summary: Kafka server didn't finish startup
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/ducktape/tests/runner.py", line 102, in run_all_tests
    result.data = self.run_single_test()
  File "/usr/local/lib/python2.7/dist-packages/ducktape/tests/runner.py", line 154, in run_single_test
    return self.current_test_context.function(self.current_test)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/mark/_mark.py", line 331, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/mickael/ibm/messagehub/kafka/tests/kafkatest/sanity_checks/test_console_consumer.py", line 54, in test_lifecycle
    self.kafka.start()
  File "/home/mickael/ibm/messagehub/kafka/tests/kafkatest/services/kafka/kafka.py", line 81, in start
    Service.start(self)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/services/service.py", line 140, in start
    self.start_node(node)
  File "/home/mickael/ibm/messagehub/kafka/tests/kafkatest/services/kafka/kafka.py", line 124, in start_node
    monitor.wait_until("Kafka Server.*started", timeout_sec=30, err_msg="Kafka server didn't finish startup")
  File "/usr/local/lib/python2.7/dist-packages/ducktape/cluster/remoteaccount.py", line 303, in wait_until
    return wait_until(lambda: self.acct.ssh("tail -c +%d %s | grep '%s'" % (self.offset+1, self.log, pattern), allow_fail=True) == 0, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/ducktape/utils/util.py", line 36, in wait_until
    raise TimeoutError(err_msg)
TimeoutError: Kafka server didn't finish startup

Looking at the logs from the Kafka worker, I can see that Kafka is not able to connect to the Kerberos server:

[2016-03-15 14:41:28,751] FATAL [Kafka Server 1], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: javax.security.auth.login.LoginException: Connection refused
        at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:74)
        at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:60)
        at kafka.network.Processor.<init>(SocketServer.scala:379)
        at kafka.network.SocketServer$$anonfun$startup$1$$anonfun$apply$1.apply$mcVI$sp(SocketServer.scala:96)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
        at kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:95)
        at kafka.network.SocketServer$$anonfun$startup$1.apply(SocketServer.scala:91)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.MapLike$DefaultValuesIterable.foreach(MapLike.scala:206)
        at kafka.network.SocketServer.startup(SocketServer.scala:91)
        at kafka.server.KafkaServer.startup(KafkaServer.scala:179)
        at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
        at kafka.Kafka$.main(Kafka.scala:67)
        at kafka.Kafka.main(Kafka.scala)

Looking at the Kerberos worker, I can see that it started fine:

Standalone MiniKdc Running
---------------------------------------------------
  Realm           : EXAMPLE.COM
  Running at      : worker4:worker4
  krb5conf        : /mnt/minikdc/krb5.conf

  created keytab  : /mnt/minikdc/keytab
  with principals : [client, kafka/worker2]

 Do <CTRL-C> or kill <PID> to stop it
---------------------------------------------------

Running netstat on the Kerberos worker, I can see that it's listening on 47385:

vagrant@worker4:~$ netstat -ano
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       Timer
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      off (0.00/0/0)
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      off (0.00/0/0)
tcp        0      0 0.0.0.0:44313           0.0.0.0:*               LISTEN      off (0.00/0/0)
tcp        0      0 10.0.2.15:22            10.0.2.2:56153          ESTABLISHED keepalive (7165.86/0/0)
tcp        0      0 127.0.0.1:47747         127.0.1.1:47385         TIME_WAIT   timewait (30.08/0/0)
tcp6       0      0 :::111                  :::*                    LISTEN      off (0.00/0/0)
tcp6       0      0 :::22                   :::*                    LISTEN      off (0.00/0/0)
tcp6       0      0 127.0.1.1:47385         :::*                    LISTEN      off (0.00/0/0)
udp6       0      0 :::45368                :::*                                off (0.00/0/0)

From the same worker, I can connect fine to 47385 (the Kerberos port):

vagrant@worker4:~$ nc -vvv worker4 47385
Connection to worker4 47385 port [tcp/*] succeeded!

But this does not work from any of the other workers. It seems strange that Kerberos is listening on a loopback address (127.0.1.1) and not on the public one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
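
The observation in the report (the KDC port is bound to 127.0.1.1, so only connections made on worker4 itself succeed) can be checked mechanically. The following is a minimal sketch, not part of kafkatest or ducktape: the helper names parse_listener, reachable_from_other_hosts and can_connect are hypothetical, and the parser assumes the column layout of the `netstat -ano` output quoted in the report.

```python
import ipaddress
import socket


def parse_listener(netstat_line):
    """Return (address, port) for a LISTEN row of `netstat -ano`, else None.

    Assumes the column order shown in the report:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State Timer
    """
    fields = netstat_line.split()
    if len(fields) < 6 or fields[5] != "LISTEN":
        return None
    # rpartition splits on the last ':' so IPv6 forms such as ':::22' work.
    host, _, port = fields[3].rpartition(":")
    return host, int(port)


def reachable_from_other_hosts(address):
    """A socket bound to a loopback address only accepts local connections."""
    return not ipaddress.ip_address(address).is_loopback


def can_connect(host, port, timeout_sec=3.0):
    """Plain TCP probe, roughly what `nc -vvv host port` checks."""
    try:
        with socket.create_connection((host, port), timeout=timeout_sec):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    row = "tcp6 0 0 127.0.1.1:47385 :::* LISTEN off (0.00/0/0)"
    address, port = parse_listener(row)
    print(address, port, reachable_from_other_hosts(address))
```

Running can_connect("worker4", 47385) from another worker would reproduce the failing side of the nc test. Treating every 127.0.0.0/8 address as loopback is the point of the check: 127.0.1.1 is in that range even though it is not 127.0.0.1.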