Hi,
This is about the interplay of gRPC and Kubernetes, but I'm pretty sure the 
problem is with gRPC name resolution.
I have a gRPC application running on a Kubernetes cluster, exposed using a 
headless service.  For those who don't know, this means (as I understand 
it) Kubernetes sets up a DNS entry pointing directly to the application 
container.  If I then try to run the client in the same cluster, it hangs 
during name resolution of the Kubernetes service.  Connecting directly to 
the IP address of the service works fine.
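
For context on how a short name like myservice normally gets resolved inside a pod: the libc resolver expands it using the search list in /etc/resolv.conf. The Python sketch below only mimics that expansion for illustration. The sample file contents (nameserver address, the `default` namespace, the `cluster.local` domain, `ndots:5`) are assumptions and are not taken from the cluster described here, and the helper names are made up for this example.

```python
def parse_resolv_conf(text):
    """Return (search_domains, ndots) from resolv.conf-style text."""
    search, ndots = [], 1  # the resolver default for ndots is 1
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue
        if fields[0] == "search":
            search = fields[1:]  # a later search line replaces earlier ones
        elif fields[0] == "options":
            for opt in fields[1:]:
                if opt.startswith("ndots:"):
                    ndots = int(opt.split(":", 1)[1])
    return search, ndots


def candidate_names(name, search, ndots):
    """List the names a resolver would try, in order, for `name`."""
    if name.endswith("."):
        return [name]  # already fully qualified, search list is skipped
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + expanded  # enough dots: try the name as given first
    return expanded + [name]


# Hypothetical pod resolv.conf; every value here is an assumption.
SAMPLE = """\
nameserver 10.0.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
"""

if __name__ == "__main__":
    search, ndots = parse_resolv_conf(SAMPLE)
    for candidate in candidate_names("myservice", search, ndots):
        print(candidate)
```

Comparing this against the real /etc/resolv.conf in the client container would show whether the short name can be expanded at all there.
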
E.g. with the server already running and listening on port 50051:

kubectl run -it --rm --image=myimage --env="GRPC_TRACE=all" --env="GRPC_VERBOSITY=DEBUG" testclient --command /bin/sh

> ping myservice
PING myservice (10.0.0.13): 56 data bytes
64 bytes from 10.0.0.13: seq=0 ttl=64 time=0.121 ms
(etc; service name resolves fine)

> myapp -c 10.0.0.13:50051  # connect as client to the given server and port, using the IP address resolved by ping
(loads of debug output, but I get the expected response and everything works fine)

> myapp -c myservice:50051
(hangs indefinitely trying to resolve the name, full output below)

The fact that it hangs indefinitely appears to be related to 
https://github.com/grpc/grpc/issues/9481 (DNS resolver in C core never 
gives up resolving a nonexistent hostname), except that in my case the 
hostname is valid, as shown by the ping.  The container image is busybox 
plus the C++ application and its required libraries.  Is there anything 
gRPC needs in the OS (i.e. in my container image) in order to perform name 
resolution?  Ping manages it fine, so why not gRPC?  Forgive my ignorance 
of how the underlying name resolution is provided.
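
The log further down shows the failure coming from the getaddrinfo call, so one way to narrow this down is to call getaddrinfo directly, outside gRPC, from inside the same container. The following is a minimal Python sketch of such a probe. It assumes a Python interpreter linked against the same libc as the application is available, which a bare busybox image will not have. `split_target`, `resolve`, and the default port are hypothetical names and values chosen for this example, and IPv6 literals in brackets are not handled.

```python
import socket
import sys


def split_target(target, default_port=443):
    """Split a 'host:port' target string into (host, port).

    default_port is an arbitrary placeholder for targets without a port.
    """
    host, sep, port = target.rpartition(":")
    if not sep:
        return target, default_port
    return host, int(port)


def resolve(target):
    """Resolve `target` through libc getaddrinfo; return sorted unique IPs.

    Raises socket.gaierror on failure, for example "Name or service not
    known" as seen in the gRPC log.
    """
    host, port = split_target(target)
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "myservice:50051"
    try:
        print(target, "->", resolve(target))
    except socket.gaierror as err:
        print(target, "failed:", err.errno, err.strerror)
```

If this probe fails in the same way while ping succeeds, that would suggest the two programs are not using the same resolver path, rather than a problem specific to gRPC.
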

Thanks,

Mark.

Output when trying to resolve to the service:

I0315 17:08:46.596502176      15 ev_epoll_linux.c:85]        epoll engine will be using signal: 36
D0315 17:08:46.596889365      15 ev_posix.c:106]             Using polling engine: epoll
I0315 17:08:46.597180247      15 init.c:193]                 grpc_init(void)
I0315 17:08:46.597458188      15 completion_queue.c:139]     grpc_completion_queue_create(reserved=(nil))
I0315 17:08:46.597765727      14 init.c:193]                 grpc_init(void)
I0315 17:08:46.598071841      14 channel_create.c:235]       grpc_insecure_channel_create(target=0xaf2e48, args=0x7ffd594b2fd0, reserved=(nil))
I0315 17:08:46.598358396      14 init.c:193]                 grpc_init(void)
I0315 17:08:46.598675472      14 init.c:198]                 grpc_shutdown(void)
I0315 17:08:46.598951207      14 channel.c:263]              grpc_channel_register_call(channel=0xaeba20, method=/grpcendpoint.Endpoint/endpointVersion, host=(null), reserved=(nil))
I0315 17:08:46.599216112      14 channel.c:263]              grpc_channel_register_call(channel=0xaeba20, method=/grpcendpoint.Endpoint/listCollections, host=(null), reserved=(nil))
I0315 17:08:46.599476727      14 channel.c:263]              grpc_channel_register_call(channel=0xaeba20, method=/grpcendpoint.Endpoint/getCollection, host=(null), reserved=(nil))
I0315 17:08:46.599896046      14 channel.c:291]              grpc_channel_create_registered_call(channel=0xaeba20, parent_call=(nil), propagation_mask=ffff, completion_queue=0x7fa7c8001b40, registered_call_handle=0xaf1910, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=(nil))
I0315 17:08:46.600314315      14 grpc_context.c:41]          grpc_census_call_set_context(call=0xaf4ec0, census_context=(nil))
I0315 17:08:46.600804961      14 call.c:1690]                grpc_call_start_batch(call=0xaf4ec0, ops=0x7ffd594b2cf0, nops=3, tag=0xaf0318, reserved=(nil))
I0315 17:08:46.601111821      14 call.c:1366]                ops[0]: SEND_INITIAL_METADATA
I0315 17:08:46.601392365      14 call.c:1366]                ops[1]: SEND_MESSAGE ptr=0xaebc90
I0315 17:08:46.601653874      14 call.c:1366]                ops[2]: SEND_CLOSE_FROM_CLIENT
I0315 17:08:46.602006491      14 client_channel.c:109]       OP[client-channel:0xaf54c8]: SEND_INITIAL_METADATA{key=3a 70 61 74 68 ':path' value=2f 67 72 70 63 65 6e 64 70 6f 69 6e 74 2e 45 6e 64 70 6f 69 6e 74 2f 65 6e 64 70 6f 69 6e 74 56 65 72 73 69 6f 6e '/grpcendpoint.Endpoint/endpointVersion', key=3a 61 75 74 68 6f 72 69 74 79 ':authority' value=74 65 6c 65 68 69 73 74 3a 35 30 30 35 31 'myservice:50051', key=67 72 70 63 2d 65 6e 63 6f 64 69 6e 67 'grpc-encoding' value=69 64 65 6e 74 69 74 79 'identity', key=67 72 70 63 2d 61 63 63 65 70 74 2d 65 6e 63 6f 64 69 6e 67 'grpc-accept-encoding' value=69 64 65 6e 74 69 74 79 2c 64 65 66 6c 61 74 65 2c 67 7a 69 70 'identity,deflate,gzip'} SEND_MESSAGE:flags=0x00000000:len=0 SEND_TRAILING_METADATA{}
I0315 17:08:46.602499932      14 call.c:1690]                grpc_call_start_batch(call=0xaf4ec0, ops=0x7ffd594b2d30, nops=3, tag=0xaf03b8, reserved=(nil))
I0315 17:08:46.602729139      14 call.c:1366]                ops[0]: RECV_INITIAL_METADATA ptr=0xaf03e0
I0315 17:08:46.603054019      14 call.c:1366]                ops[1]: RECV_MESSAGE ptr=0xaf0408
I0315 17:08:46.603314856      14 call.c:1366]                ops[2]: RECV_STATUS_ON_CLIENT metadata=0xaf0428 status=0xaf0440 details=0xaf0448
I0315 17:08:46.603639848      14 client_channel.c:109]       OP[client-channel:0xaf54c8]: RECV_INITIAL_METADATA RECV_MESSAGE RECV_TRAILING_METADATA
I0315 17:08:46.603924761      15 completion_queue.c:333]     grpc_completion_queue_next(cc=0x7fa7c8001b40, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=(nil))
D0315 17:08:46.604521071      16 dns_resolver.c:192]         dns resolution failed: {"created":"@1489597726.604500252","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:46.604820308      16 dns_resolver.c:201]         retrying immediately
D0315 17:08:46.605088317      16 connectivity_state.c:156]   SET: 0xaebb68 client_channel: IDLE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000b90 {"created":"@1489597726.605077608","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
D0315 17:08:47.605552537      17 dns_resolver.c:192]         dns resolution failed: {"created":"@1489597727.605530416","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:47.606272692      17 dns_resolver.c:201]         retrying immediately
D0315 17:08:47.606719769      17 connectivity_state.c:156]   SET: 0xaebb68 client_channel: TRANSIENT_FAILURE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000bf0 {"created":"@1489597727.606709040","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
D0315 17:08:49.501768132      18 dns_resolver.c:192]         dns resolution failed: {"created":"@1489597729.501745552","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:49.502278763      18 dns_resolver.c:201]         retrying immediately
D0315 17:08:49.502683440      18 connectivity_state.c:156]   SET: 0xaebb68 client_channel: TRANSIENT_FAILURE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000b90 {"created":"@1489597729.502672595","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
D0315 17:08:52.737487943      19 dns_resolver.c:192]         dns resolution failed: {"created":"@1489597732.737457632","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:52.737695112      19 dns_resolver.c:201]         retrying immediately
D0315 17:08:52.737751585      19 connectivity_state.c:156]   SET: 0xaebb68 client_channel: TRANSIENT_FAILURE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000bf0 {"created":"@1489597732.737742549","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
D0315 17:08:56.890396846      20 dns_resolver.c:192]         dns resolution failed: {"created":"@1489597736.890373998","description":"OS Error","errno":-2,"file":"src/core/lib/iomgr/resolve_address_posix.c","file_line":115,"os_error":"Name or service not known","syscall":"getaddrinfo","target_address":"myservice:50051"}
D0315 17:08:56.890592615      20 dns_resolver.c:201]         retrying immediately
D0315 17:08:56.890664375      20 connectivity_state.c:156]   SET: 0xaebb68 client_channel: TRANSIENT_FAILURE --> TRANSIENT_FAILURE [new_lb+resolver] error=0x7fa7c0000b90 {"created":"@1489597736.890640201","description":"No load balancing policy","file":"src/core/ext/client_config/client_channel.c","file_line":188}
(etc in a seemingly infinite loop)
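
A side note on the retry timing: although each failure is followed by "retrying immediately", the "created" timestamps of the getaddrinfo errors in the log above are not evenly spaced. The short Python sketch below computes the gaps between them. The timestamps are copied from the log; the function name is just for illustration.

```python
from decimal import Decimal

# "created" timestamps of the five getaddrinfo failures in the log above.
FAILURES = [
    "1489597726.604500252",
    "1489597727.605530416",
    "1489597729.501745552",
    "1489597732.737457632",
    "1489597736.890373998",
]


def retry_gaps(timestamps):
    """Return the gap in seconds between consecutive timestamps."""
    # Decimal keeps the nanosecond digits exact, unlike float.
    values = [Decimal(t) for t in timestamps]
    return [later - earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    for gap in retry_gaps(FAILURES):
        print(f"{gap:.3f} s")
```

The gaps come out at roughly 1.0, 1.9, 3.2 and 4.2 seconds, so the retries appear to be spaced further and further apart rather than running in a tight loop.
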
