Hi,

This is about the interplay of gRPC and Kubernetes, but I'm pretty sure the problem is with gRPC name resolution. I have a gRPC application running on a Kubernetes cluster, exposed using a headless service. For those who don't know, this means (as I understand it) that Kubernetes sets up a DNS entry pointing to the application container. If I then try to run the client in the same cluster, it hangs during name resolution of the Kubernetes service. Connecting directly to the IP address of the service works fine. For example, with the server already running and listening on port 50051:

kubectl run -it --rm --image=myimage --env="GRPC_TRACE=all" --env="GRPC_VERBOSITY=DEBUG" testclient --command /bin/sh

> ping myservice
PING myservice (10.0.0.13): 56 data bytes
64 bytes from 10.0.0.13: seq=0 ttl=64 time=0.121 ms
(etc; service name resolves fine)

> myapp -c 10.0.0.13:50051 # connect as client to the given server and port using the ip address resolved by ping
(loads of debug output, but I get the expected response and everything works fine)

> myapp -c myservice:50051
(hangs indefinitely trying to resolve the name, full output below)

The fact it hangs indefinitely appears to be related to https://github.com/grpc/grpc/issues/9481 (DNS resolver in C core never gives up resolving a nonexistent hostname), except in my case the hostname is valid, as shown by the ping.

The container image is busybox plus the C++ application and required libraries. Is there anything gRPC needs in the OS (i.e. my container image) in order to perform name resolution? Ping manages it fine, why not gRPC? Forgive my ignorance of how the underlying name resolution is provided.

Thanks,
Mark.

Output when trying to resolve to the service:

I0315 17:08:46.596502176 15 ev_epoll_linux.c:85] epoll engine will be using signal: 36
D0315 17:08:46.596889365 15 ev_posix.c:106] Using polling engine: epoll
I0315 17:08:46.597180247 15 init.c:193] grpc_init(void)
I0315 17:08:46.597458188 15 completion_queue.c:139] grpc_completion_queue_create(reserved=(nil))
I0315 17:08:46.597765727 14 init.c:193] grpc_init(void)
I0315 17:08:46.598071841 14 channel_create.c:235] grpc_insecure_channel_create(target=0xaf2e48, args=0x7ffd594b2fd0, reserved=(nil))
I0315 17:08:46.598358396 14 init.c:193] grpc_init(void)
I0315 17:08:46.598675472 14 init.c:198] grpc_shutdown(void)
I0315 17:08:46.598951207 14 channel.c:263] grpc_channel_register_call(channel=0xaeba20, method=/grpcendpoint.Endpoint/endpointVersion, host=(null), reserved=(nil))
I0315 17:08:46.599216112 14 channel.c:263] grpc_channel_register_call(channel=0xaeba20, method=/grpcendpoint.Endpoint/listCollections, host=(null), reserved=(nil))
I0315 17:08:46.599476727 14 channel.c:263] grpc_channel_register_call(channel=0xaeba20, method=/grpcendpoint.Endpoint/getCollection, host=(null), reserved=(nil))
I0315 17:08:46.599896046 14 channel.c:291] grpc_channel_create_registered_call(channel=0xaeba20, parent_call=(nil), propagation_mask=ffff, completion_queue=0x7fa7c8001b40, registered_call_handle=0xaf1910, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=(nil))
I0315 17:08:46.600314315 14 grpc_context.c:41] grpc_census_call_set_context(call=0xaf4ec0, census_context=(nil))
I0315 17:08:46.600804961 14 call.c:1690] grpc_call_start_batch(call=0xaf4ec0, ops=0x7ffd594b2cf0, nops=3, tag=0xaf0318, reserved=(nil))
I0315 17:08:46.601111821 14 call.c:1366] ops[0]: SEND_INITIAL_METADATA
I0315 17:08:46.601392365 14 call.c:1366] ops[1]: SEND_MESSAGE ptr=0xaebc90
I0315 17:08:46.601653874 14 call.c:1366] ops[2]: SEND_CLOSE_FROM_CLIENT
I0315 17:08:46.602006491 14 client_channel.c:109] OP[client-channel:0xaf54c8]: SEND_INITIAL_METADATA{key=3a 70 61 74 68 ':path' value=2f 67 72 70 63 65 6e 64 70 6f 69 6e 74 2e 45 6e 64 70 6f 69 6e 74 2f 65 6e 64 70 6f 69 6e 74 56 65 72 73 69 6f 6e '/grpcendpoint.Endpoint/endpointVersion', key=3a 61 75 74 68 6f 72 69 74 79 ':authority' value=74 65 6c 65 68 69 73 74 3a 35 30 30 35 31 'myservice:50051', key=67 72 70 63 2d 65 6e 63 6f 64 69 6e 67 'grpc-encoding' value=69 64 65 6e 74 69 74 79 'identity', key=67 72 70 63 2d 61 63 63 65 70 74 2d 65 6e 63 6f 64 69 6e 67 'grpc-accept-encoding' value=69 64 65 6e 74 69 74 79 2c 64 65 66 6c 61 74 65 2c 67 7a 69 70 'identity,deflate,gzip'} SEND_MESSAGE:flags=0x00000000:len=0 SEND_TRAILING_METADATA{}
I0315 17:08:46.602499932 14 call.c:1690] grpc_call_start_batch(call=0xaf4ec0, ops=0x7ffd594b2d30, nops=3, tag=0xaf03b8, reserved=(nil))
I0315 17:08:46.602729139 14 call.c:1366] ops[0]: RECV_INITIAL_METADATA ptr=0xaf03e0
I0315 17:08:46.603054019 14 call.c:1366] ops[1]: RECV_MESSAGE ptr=0xaf0408
I0315 17:08:46.603314856 14 call.c:1366] ops[2]: RECV_STATUS_ON_CLIENT metadata=0xaf0428 status=0xaf0440 details=0xaf0448
I0315 17:08:46.603639848 14 client_channel.c:109] OP[client-channel:0xaf54c8]: RECV_INITIAL_METADATA RECV_MESSAGE RECV_TRAILING_METADATA
I0315 17:08:46.603924761 15 completion_queue.c:333] grpc_completion_queue_next(cc=0x7fa7c8001b40, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=(nil))
D0315 17:08:46.604521071 16 dns_resolver.c:192] dns resolution failed: {"created":"@1489597726.604500252","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:46.604820308 16 dns_resolver.c:201] retrying immediately
D0315 17:08:46.605088317 16 connectivity_state.c:156] SET: 0xaebb68 client_channel: IDLE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000b90 {"created":"@1489597726.605077608","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
D0315 17:08:47.605552537 17 dns_resolver.c:192] dns resolution failed: {"created":"@1489597727.605530416","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:47.606272692 17 dns_resolver.c:201] retrying immediately
D0315 17:08:47.606719769 17 connectivity_state.c:156] SET: 0xaebb68 client_channel: TRANSIENT_FAILURE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000bf0 {"created":"@1489597727.606709040","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
D0315 17:08:49.501768132 18 dns_resolver.c:192] dns resolution failed: {"created":"@1489597729.501745552","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:49.502278763 18 dns_resolver.c:201] retrying immediately
D0315 17:08:49.502683440 18 connectivity_state.c:156] SET: 0xaebb68 client_channel: TRANSIENT_FAILURE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000b90 {"created":"@1489597729.502672595","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
D0315 17:08:52.737487943 19 dns_resolver.c:192] dns resolution failed: {"created":"@1489597732.737457632","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:52.737695112 19 dns_resolver.c:201] retrying immediately
D0315 17:08:52.737751585 19 connectivity_state.c:156] SET: 0xaebb68 client_channel: TRANSIENT_FAILURE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000bf0 {"created":"@1489597732.737742549","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
D0315 17:08:56.890396846 20 dns_resolver.c:192] dns resolution failed: {"created":"@1489597736.890373998","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:56.890592615 20 dns_resolver.c:201] retrying immediately
D0315 17:08:56.890664375 20 connectivity_state.c:156] SET: 0xaebb68 client_channel: TRANSIENT_FAILURE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000b90 {"created":"@1489597736.890640201","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
(etc in a seemingly infinite loop)
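
The log above reports the failure as coming from the getaddrinfo call ("Name or service not known"). One way to narrow this down might be to call getaddrinfo directly from a small C++ helper that is built and linked the same way as the application, and see whether the failure reproduces without gRPC involved. The following is only a minimal sketch under that assumption, not gRPC's own resolver code: the function name ResolveWithGetaddrinfo and the hint values (AF_UNSPEC, SOCK_STREAM) are hypothetical choices made for illustration.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not part of gRPC): resolves host/port with the C
// library's getaddrinfo and returns the numeric addresses it found.
// On failure it returns an empty vector and stores the gai_strerror text
// in *error.
std::vector<std::string> ResolveWithGetaddrinfo(const std::string& host,
                                                const std::string& port,
                                                std::string* error) {
  std::vector<std::string> addresses;
  error->clear();

  // Assumed hints: any address family, TCP-style stream sockets.
  addrinfo hints;
  std::memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;

  addrinfo* results = nullptr;
  const int rc = getaddrinfo(host.c_str(), port.c_str(), &hints, &results);
  if (rc != 0) {
    // EAI_NONAME is the code behind "Name or service not known" on glibc.
    *error = gai_strerror(rc);
    return addresses;
  }

  // Convert each returned socket address back to a numeric string.
  for (const addrinfo* p = results; p != nullptr; p = p->ai_next) {
    char buffer[NI_MAXHOST];
    if (getnameinfo(p->ai_addr, p->ai_addrlen, buffer, sizeof(buffer),
                    nullptr, 0, NI_NUMERICHOST) == 0) {
      addresses.push_back(buffer);
    }
  }
  freeaddrinfo(results);
  return addresses;
}
```

Calling ResolveWithGetaddrinfo("myservice", "50051", &err) from a small main() inside the test container and printing the result or err would show whether the lookup fails the same way outside gRPC. If it does fail while ping succeeds, that would suggest the difference lies in how the application's C library performs name resolution in this image rather than in gRPC itself, though this sketch cannot confirm the cause on its own.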
