Hi,
On one of our servers we have a few grpc processes that talk to each other.
Some processes are servers only (in C++) and others are both clients and
servers(noodejs). The nodejs process are getting stuck after we reboot the
server (the various processes get started about the same time by the init
system). But if we restart/kill the processes they work fine.

Here is a stack trace and debug log. Please let me know if this looks
familiar, otherwise I will keep digging.

Thanks
Abhishek

Backtrace from stuck nodejs process

#0  0x00007fc359f8626d in __lll_lock_wait () from
/lib/x86_64-linux-gnu/libpthread.so.0

#1  0x00007fc359f7fdbd in pthread_mutex_lock () from
/lib/x86_64-linux-gnu/libpthread.so.0

#2  0x00007fc3445ad319 in gpr_mu_lock (mu=<optimized out>) at
../deps/grpc/src/core/lib/gpr/sync_posix.cc:47

#3  0x00007fc34453747f in connected (arg=0x2758090, error=0x28d70a0) at
../deps/grpc/src/core/ext/transport/chttp2/client/chttp2_connector.cc:172

#4  0x00007fc3445132c1 in exec_ctx_run (error=0x28d70a0, closure=<optimized
out>) at ../deps/grpc/src/core/lib/iomgr/exec_ctx.cc:40

#5  grpc_core::ExecCtx::Flush (this=this@entry=0x7fffe53398e0) at
../deps/grpc/src/core/lib/iomgr/exec_ctx.cc:128

#6  0x00007fc34459fffe in ~ExecCtx (this=0x7fffe53398e0,
__in_chrg=<optimized out>) at ../deps/grpc/src/core/lib/iomgr/exec_ctx.h:108

#7  custom_connect_callback (socket=<optimized out>, error=0x28d70a0) at
../deps/grpc/src/core/lib/iomgr/tcp_client_custom.cc:100

#8  0x00007fc34453716a in chttp2_connector_connect (con=0x2758090,
args=0x7fffe5339a50, result=0x299cde0, notify=0x299cdf0) at
../deps/grpc/src/core/ext/transport/chttp2/client/chttp2_connector.cc:215

#9  0x00007fc3445457dc in continue_connect_locked (c=0x299cdb0) at
../deps/grpc/src/core/ext/filters/client_channel/subchannel.cc:408

#10 maybe_start_connecting_locked (c=0x299cdb0) at
../deps/grpc/src/core/ext/filters/client_channel/subchannel.cc:483

#11 0x00007fc344546415 in grpc_subchannel_notify_on_state_change
(c=0x299cdb0, interested_parties=0xdeafbeef, state=0x2800bbc,
notify=<optimized out>) at
../deps/grpc/src/core/ext/filters/client_channel/subchannel.cc:532

#12 0x00007fc344590e07 in grpc_core::SubchannelData<grpc_core::(anonymous
namespace)::PickFirst::PickFirstSubchannelList, grpc_core::(anonymous
namespace)::PickFirst::PickFirstSubchannelData>::StartConnectivityWatchLocked
(

    this=<optimized out>) at
../deps/grpc/src/core/ext/filters/client_channel/lb_policy/subchannel_list.h:304

#13 0x00007fc344591c70 in StartPickingLocked (this=0x2800770) at
../deps/grpc/src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc:249

#14 grpc_core::(anonymous namespace)::PickFirst::PickLocked
(this=0x2800770, pick=0x28052a0) at
../deps/grpc/src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc:270

#15 0x00007fc344540e2a in StartLocked (elem=<optimized out>) at
../deps/grpc/src/core/ext/filters/client_channel/client_channel.cc:2687

#16 process_service_config_and_start_lb_pick_locked (elem=0x2805170) at
../deps/grpc/src/core/ext/filters/client_channel/client_channel.cc:2815

#17 grpc_core::ResolverResultWaiter::DoneLocked (arg=0x28d26c0, error=0x0)
at ../deps/grpc/src/core/ext/filters/client_channel/client_channel.cc:2920

#18 0x00007fc344559332 in grpc_combiner_continue_exec_ctx () at
../deps/grpc/src/core/lib/iomgr/combiner.cc:260

#19 0x00007fc3445132e4 in grpc_core::ExecCtx::Flush
(this=this@entry=0x7fffe5339c10)
at ../deps/grpc/src/core/lib/iomgr/exec_ctx.cc:131

#20 0x00007fc34455a64a in ~ExecCtx (this=0x7fffe5339c10,
__in_chrg=<optimized out>) at ../deps/grpc/src/core/lib/iomgr/exec_ctx.h:108

#21 grpc_custom_resolve_callback (r=0x27f3b40, result=0x2730ac0,
error=<optimized out>) at
../deps/grpc/src/core/lib/iomgr/resolve_address_custom.cc:85

#22 0x000000000143c395 in ?? ()

#23 0x000000000143e44b in ?? ()

#24 0x000000000144ffa8 in ?? ()

#25 0x000000000143edd6 in uv_run ()

#26 0x000000000122c1bd in node::Start(uv_loop_s*, int, char const* const*,
int, char const* const*) ()

#27 0x0000000001224d03 in node::Start(int, char**) ()

#28 0x00007fc359bcc830 in __libc_start_main () from
/lib/x86_64-linux-gnu/libc.so.6

#29 0x00000000008aee41 in _start ()


verbose grpc log


2018-10-22T16:15:39,402380574-0700 E: D1022 16:15:39.401908859    2103
dns_resolver.cc:339]        Using native dns resolver

2018-10-22T16:15:39,402862016-0700 E: I1022 16:15:39.401936640    2103
timer_manager.cc:85]        Spawn timer thread

2018-10-22T16:15:39,403382056-0700 E: I1022 16:15:39.401955746    2103
init.cc:153]                grpc_init(void)

2018-10-22T16:15:39,403875574-0700 E: I1022 16:15:39.401960577    2198
timer_manager.cc:235]       timers not checked: expect another thread to

2018-10-22T16:15:39,404425511-0700 E: I1022 16:15:39.401969527    2198
timer_manager.cc:183]       sleep until kicked

2018-10-22T16:15:39,404906467-0700 E: I1022 16:15:39.402227144    2103
completion_queue.cc:431]
grpc_completion_queue_create_internal(completion_type=0, polling_type=0)

2018-10-22T16:15:40,582796872-0700 E: I1022 16:15:40.582192594    2103
server.cc:944]              grpc_server_create(0x2e29c70, (nil))

2018-10-22T16:15:40,583379934-0700 E: I1022 16:15:40.582212478    2103
server.cc:930]
grpc_server_register_completion_queue(server=0x2ddd2b0, cq=0x2de06b0,
reserved=(nil))

2018-10-22T16:15:40,583888435-0700 E: Method handler requestAction for
/c3d.AdminServiceInterface/RequestAction expected but not provided

2018-10-22T16:15:40,584442345-0700 E: I1022 16:15:40.582867963    2103
server_chttp2.cc:34]
grpc_server_add_insecure_http2_port(server=0x2ddd2b0, addr=0.0.0.0:1111)

2018-10-22T16:15:40,584957465-0700 E: I1022 16:15:40.582899401    2103
tcp_server_custom.cc:380]   SERVER 0x2dbb660 add_port [::]:1111 error="No
Error"

2018-10-22T16:15:40,590308691-0700 E: I1022 16:15:40.589690840    2103
channel_create.cc:93]
 grpc_insecure_channel_create(target=myteddy:1100, args=0x2eed230,
reserved=(nil))

2018-10-22T16:15:40,591341636-0700 E: I1022 16:15:40.590835146    2103
channel.cc:267]             grpc_channel_get_target(channel=0x2e29600)

2018-10-22T16:15:40,591929944-0700 E: I1022 16:15:40.591224466    2103
metadata_array.cc:29]       grpc_metadata_array_init(array=0x2d3a048)

2018-10-22T16:15:40,592455185-0700 E: I1022 16:15:40.591235015    2103
metadata_array.cc:29]       grpc_metadata_array_init(array=0x2d7fd48)

2018-10-22T16:15:40,592997632-0700 E: I1022 16:15:40.591240219    2103
call.cc:1983]               grpc_call_start_batch(call=0x2ed3770,
ops=0x2ea14d0, nops=2, tag=0x2d388f0, reserved=(nil))

2018-10-22T16:15:40,593465266-0700 E: I1022 16:15:40.591244610    2103
call.cc:1593]               ops[0]: SEND_INITIAL_METADATA(nil)

2018-10-22T16:15:40,593928046-0700 E: I1022 16:15:40.591247538    2103
call.cc:1593]               ops[1]: RECV_INITIAL_METADATA ptr=0x2d7fd48

2018-10-22T16:15:40,594908524-0700 E: I1022 16:15:40.591251816    2103
call_combiner.cc:67]        ==> grpc_call_combiner_start() [0x2ed3780]
closure=0x2ed4690 [executing batch] error="No Error"

2018-10-22T16:15:40,595398831-0700 E: I1022 16:15:40.591254354    2103
call_combiner.cc:76]          size: 0 -> 1

2018-10-22T16:15:40,595937480-0700 E: I1022 16:15:40.591256477    2103
call_combiner.cc:85]          EXECUTING IMMEDIATELY

2018-10-22T16:15:40,596469538-0700 E: I1022 16:15:40.591263238    2103
call.cc:638]                OP[client-channel:0x2ed4110]:
SEND_INITIAL_METADATA{key=3a 70 61 74 68 ':path' value=2f 63 33 64 2e 7a 69
6e 63 2e 5a 69 6e 63 2f 41 63 74 69 6f 6e 53 74 72 65 61 6d
'/c3d.zinc.Zinc/ActionStream'} RECV_INITIAL_METADATA

2018-10-22T16:15:40,597051866-0700 E: I1022 16:15:40.591266381    2103
client_channel.cc:1116]     chand=0x2e296c0 calld=0x2ed4130: adding pending
batch at index 0

2018-10-22T16:15:40,597583780-0700 E: I1022 16:15:40.591268938    2103
client_channel.cc:3064]     chand=0x2e296c0 calld=0x2ed4130: entering
client_channel combiner

2018-10-22T16:15:40,598104257-0700 E: I1022 16:15:40.591274987    2103
client_channel.cc:218]      chand=0x2e296c0: starting name resolution

2018-10-22T16:15:40,599053615-0700 E: D1022 16:15:40.591277545    2103
dns_resolver.cc:280]        Start resolving.

2018-10-22T16:15:40,599580345-0700 E: I1022 16:15:40.591284535    2103
client_channel.cc:2828]     chand=0x2e296c0 calld=0x2ed4130: deferring pick
pending resolver result

2018-10-22T16:15:40,600090104-0700 E: I1022 16:15:40.591287019    2103
call_combiner.cc:165]       call_combiner=0x2ed3780: setting
notify_on_cancel=0x2de2850

2018-10-22T16:15:40,600662796-0700 E: I1022 16:15:40.591299744    2103
metadata_array.cc:29]       grpc_metadata_array_init(array=0x2da7898)

2018-10-22T16:15:40,601247786-0700 E: I1022 16:15:40.591303804    2103
call.cc:1983]               grpc_call_start_batch(call=0x2ed3770,
ops=0x2eed050, nops=1, tag=0x2d23af0, reserved=(nil))

2018-10-22T16:15:40,601765545-0700 E: I1022 16:15:40.591307696    2103
call.cc:1593]               ops[0]: RECV_STATUS_ON_CLIENT
metadata=0x2da7898 status=0x2da78b0 details=0x2da78b8

2018-10-22T16:15:40,602248272-0700 E: I1022 16:15:40.591310486    2103
call_combiner.cc:67]        ==> grpc_call_combiner_start() [0x2ed3780]
closure=0x2ed4790 [executing batch] error="No Error"

2018-10-22T16:15:40,602770292-0700 E: I1022 16:15:40.591312823    2103
call_combiner.cc:76]          size: 1 -> 2

2018-10-22T16:15:40,603249135-0700 E: I1022 16:15:40.591315028    2103
call_combiner.cc:91]          QUEUING

2018-10-22T16:15:40,603754515-0700 E: I1022 16:15:40.596813629    2103
server.cc:1024]             grpc_server_start(server=0x2ddd2b0)

2018-10-22T16:15:40,604252793-0700 E: I1022 16:15:40.596825047    2103
executor.cc:200]            EXECUTOR schedule 0x2ed5650 inline

2018-10-22T16:15:40,604735654-0700 E: I1022 16:15:40.596829255    2103
tcp_server_custom.cc:422]   SERVER_START 0x2dbb660

2018-10-22T16:15:40,605185438-0700 E: I1022 16:15:40.596835319    2103
call_details.cc:31]         grpc_call_details_init(cd=0x2dabdf0)

2018-10-22T16:15:40,605648237-0700 E: I1022 16:15:40.596838124    2103
metadata_array.cc:29]       grpc_metadata_array_init(array=0x2dabe50)

2018-10-22T16:15:40,606157918-0700 E: I1022 16:15:40.596843515    2103
server.cc:1352]             grpc_server_request_call(server=0x2ddd2b0,
call=0x2dabde8, details=0x2dabdf0, initial_metadata=0x2dabe50,
cq_bound_to_call=0x2de06b0, cq_for_notification=0x2de06b0, tag=0x2dbd1d0)

2018-10-22T16:15:40,639462920-0700 E: Mon, 22 Oct 2018 23:15:40 GMT
sequelize deprecated String based operators are now deprecated. Please use
Symbol based operators for better security, read more at
http://docs.sequelizejs.com/manual/tutorial/querying.html#operators at
../../node_modules/sequelize/lib/sequelize.js:242:13

2018-10-22T16:15:40,644245283-0700 E: (node:2103) DeprecationWarning:
grpc.load: Use the @grpc/proto-loader module with
grpc.loadPackageDefinition instead

2018-10-22T16:15:40,644791789-0700 E: (node:2103) DeprecationWarning:
Server#addProtoService: Use Server#addService instead

2018-10-22T16:15:40,645336094-0700 E: I1022 16:15:40.643930321    2103
call.cc:1983]               grpc_call_start_batch(call=0x2ed3770,
ops=0x2ea14d0, nops=1, tag=0x2da4260, reserved=(nil))

2018-10-22T16:15:40,645932732-0700 E: I1022 16:15:40.643939646    2103
call.cc:1593]               ops[0]: RECV_MESSAGE ptr=0x2da4298

2018-10-22T16:15:40,646470991-0700 E: I1022 16:15:40.643952248    2103
call_combiner.cc:67]        ==> grpc_call_combiner_start() [0x2ed3780]
closure=0x2ed4890 [executing batch] error="No Error"

2018-10-22T16:15:40,647038084-0700 E: I1022 16:15:40.643954807    2103
call_combiner.cc:76]          size: 2 -> 3

2018-10-22T16:15:40,647569655-0700 E: I1022 16:15:40.643956732    2103
call_combiner.cc:91]          QUEUING

2018-10-22T16:15:40,648086615-0700 E: I1022 16:15:40.647123516    2103
call.cc:1983]               grpc_call_start_batch(call=0x2ed3770,
ops=0x2ea4370, nops=1, tag=0x2ee2140, reserved=(nil))

2018-10-22T16:15:40,648577680-0700 E: I1022 16:15:40.647131625    2103
call.cc:1593]               ops[0]: SEND_MESSAGE ptr=0x2d6bee0

2018-10-22T16:15:40,649081272-0700 E: I1022 16:15:40.647135137    2103
call_combiner.cc:67]        ==> grpc_call_combiner_start() [0x2ed3780]
closure=0x2ed4990 [executing batch] error="No Error"

2018-10-22T16:15:40,649534561-0700 E: I1022 16:15:40.647136958    2103
call_combiner.cc:76]          size: 3 -> 4

2018-10-22T16:15:40,649982921-0700 E: I1022 16:15:40.647138389    2103
call_combiner.cc:91]          QUEUING

2018-10-22T16:15:40,650453200-0700 E: I1022 16:15:40.647217372    2103
completion_queue.cc:851]    grpc_completion_queue_next(cq=0x2de06b0,
deadline=gpr_timespec { tv_sec: -9223372036854775808, tv_nsec: 0,
clock_type: 0 }, reserved=(nil))

2018-10-22T16:15:40,650934867-0700 E: I1022 16:15:40.647225719    2103
completion_queue.cc:951]    RETURN_EVENT[0x2de06b0]: QUEUE_TIMEOUT

2018-10-22T16:15:40,651414350-0700 E: I1022 16:15:40.647566577    2103
client_channel.cc:485]      chand=0x2e296c0: got resolver result:
resolver_result=0x2d80140 error="No Error"

2018-10-22T16:15:40,651904090-0700 E: I1022 16:15:40.647574780    2103
pick_first.cc:154]          Pick First 0x2db0d10 created.

2018-10-22T16:15:40,652397957-0700 E: I1022 16:15:40.647578030    2103
pick_first.cc:382]          Pick First 0x2db0d10 received update with 1
addresses

2018-10-22T16:15:40,652877632-0700 E: I1022 16:15:40.647580977    2103
subchannel_list.h:458]      [pick_first 0x2db0d10] Creating subchannel list
0x2ea39f0 for 1 subchannels

2018-10-22T16:15:40,653356788-0700 E: I1022 16:15:40.647619797    2103
subchannel_list.h:498]      [pick_first 0x2db0d10] subchannel list
0x2ea39f0 index 0: Created subchannel 0x2d86ed0 for address uri ipv4:
10.12.61.201:1100

2018-10-22T16:15:40,653804302-0700 E: I1022 16:15:40.647624418    2103
client_channel.cc:398]      chand=0x2e296c0: created new LB policy
"pick_first" (0x2db0d10)

2018-10-22T16:15:40,654245777-0700 E: I1022 16:15:40.647627929    2103
connectivity_state.cc:92]   CONWATCH: 0x2db0ef8 pick_first: get IDLE

2018-10-22T16:15:40,654759956-0700 E: I1022 16:15:40.647630687    2103
connectivity_state.cc:116]  CONWATCH: 0x2db0ef8 pick_first: from IDLE
[cur=IDLE] notify=0x2d871a8

2018-10-22T16:15:40,655252548-0700 E: I1022 16:15:40.647634030    2103
client_channel.cc:177]      chand=0x2e296c0: setting connectivity state to
IDLE

2018-10-22T16:15:40,655740066-0700 E: I1022 16:15:40.647636773    2103
connectivity_state.cc:164]  SET: 0x2e29748 client_channel: IDLE --> IDLE
[resolver_result] error=(nil) "No Error"

2018-10-22T16:15:40,656232847-0700 E: I1022 16:15:40.647639358    2103
client_channel.cc:2917]     chand=0x2e296c0 calld=0x2ed4130: resolver
returned, doing LB pick

2018-10-22T16:15:40,656696185-0700 E: I1022 16:15:40.647641353    2103
client_channel.cc:2756]     chand=0x2e296c0 calld=0x2ed4130: applying
service config to call

2018-10-22T16:15:40,657147226-0700 E: I1022 16:15:40.647642836    2103
client_channel.cc:2665]     chand=0x2e296c0 calld=0x2ed4130: starting pick
on lb_policy=0x2db0d10

2018-10-22T16:15:40,657601911-0700 E: I1022 16:15:40.647645073    2103
subchannel_list.h:292]      [pick_first 0x2db0d10] subchannel list
0x2ea39f0 index 0 of 1 (subchannel 0x2d86ed0): starting watch: requesting
connectivity change notification (from IDLE)

2018-10-22T16:15:40,658032165-0700 E: I1022 16:15:40.647646950    2103
connectivity_state.cc:116]  CONWATCH: 0x2d86fa0 subchannel: from IDLE
[cur=IDLE] notify=0x2d38578

2018-10-22T16:15:40,658462296-0700 E: I1022 16:15:40.647649274    2103
connectivity_state.cc:164]  SET: 0x2d86fa0 subchannel: IDLE --> CONNECTING
[state_change] error=(nil) "No Error"

2018-10-22T16:15:40,658876797-0700 E: I1022 16:15:40.647650773    2103
connectivity_state.cc:190]  NOTIFY: 0x2d86fa0 subchannel: 0x2d38578

2018-10-22T16:15:40,659315104-0700 E: I1022 16:15:40.647657057    2103
tcp_client_custom.cc:139]   CLIENT_CONNECT: 0x2da77c0 ipv4:10.12.61.201:1100:
asynchronously connecting

2018-10-22T16:15:40,659745439-0700 E: I1022 16:15:40.647676597    2103
tcp_client_custom.cc:69]    CLIENT_CONNECT: ipv4:10.12.61.201:1100:
on_alarm: error="Cancelled"

-- 
You received this message because you are subscribed to the Google Groups 
"grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/grpc-io/CAD2hskyh26_bjcvoy9d0tsMMgvKVSk_T7KyOV5GRjw0Dc4_%3Dxw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to