Re: RocksDB segfault inside timer when accessing/clearing state

2017-10-09 Thread Kien Truong
Hi Stephan, I guess this is the case. Our cluster is a bit overloaded network-wise, so sometimes a Task Manager gets disconnected, which causes the entire job to restart. That leads to multiple segfaults in other Task Managers, prolonging recovery. We're upgrading the network, hopefully the

Re: RocksDB segfault inside timer when accessing/clearing state

2017-10-08 Thread Stefan Richter
Hi, I would assume that those segfaults are only observed *after* a job is already in the process of canceling? This is a known problem, but it is currently "accepted" behaviour after discussions with Stephan and Aljoscha (in CC). From that discussion, the background is that the native RocksDB