Hi Kotresh,

If you are asking about the overall policy, i.e. whether to abort glusterd or to handle the error gracefully, then the answer below may not really apply. IMO that is a question about the design of glusterd, and I am still learning it. :)

All in all, the errors from pthread_mutex_lock() depend on how the mutex is initialized (DEFAULT, RECURSIVE, ERRORCHECK). Going by the default behavior of locks, pthread_mutex_lock() usually fails with EAGAIN or EINVAL. EINVAL is the critical error and may require attention, up to the level of aborting the process, IMO. EAGAIN can be handled, since it only concerns recursive locks (the maximum number of recursive locks was exceeded), so it depends on the type of lock.

Maybe using pthread_mutex_trylock() with an optional timeout would be a good option, since it returns proper error codes (such as EBUSY when the mutex is already held), and having error handling around it would help with intermittent conditions.

My 2 cents.

Thanks,
Chetan Risbud.

----- Original Message -----
From: "Kotresh Hiremath Ravishankar" <khire...@redhat.com>
To: "Gluster Devel" <gluster-devel@nongnu.org>
Sent: Tuesday, April 22, 2014 6:07:36 PM
Subject: [Gluster-devel] Help on pthread library error Handling!

Hi All,

I am using pthread_mutex_lock in the changelog translator, which is part of the brick process (glusterfsd). I see that pthread errors are not caught and handled in the current gluster code, except in qemu, where the process is aborted. What is the correct way to handle errors when pthread library routines fail in gluster? Just logging the error and continuing would lead to data corruption and deadlocks. Aborting would take down the whole glusterfsd process.

Thanks and Regards,
Kotresh H R

_______________________________________________
Gluster-devel mailing list
Gluster-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/gluster-devel
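
For illustration only, here is a minimal sketch of the pthread_mutex_trylock() plus timeout idea from the reply at the top of this thread. It is not taken from the gluster code base. The names lock_with_timeout, elapsed_ms and enum lock_result are hypothetical, and the 1 ms polling interval is an arbitrary assumption. The helper only classifies the return code; whether a fatal result should lead to an abort or to a graceful error path is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical outcome classes; not part of pthreads or GlusterFS. */
enum lock_result {
    LOCK_OK,    /* mutex acquired, caller must unlock it later */
    LOCK_RETRY, /* EBUSY after the timeout, or EAGAIN: caller may back off */
    LOCK_FATAL  /* EINVAL or anything unexpected: caller decides about abort */
};

/* Milliseconds elapsed between two CLOCK_MONOTONIC readings. */
static long long elapsed_ms(const struct timespec *start,
                            const struct timespec *now)
{
    long long ns = (long long)(now->tv_sec - start->tv_sec) * 1000000000LL
                 + (now->tv_nsec - start->tv_nsec);
    return ns / 1000000LL;
}

/*
 * Poll pthread_mutex_trylock() until it stops returning EBUSY or until
 * timeout_ms milliseconds have passed. A timeout of 0 means one attempt.
 */
static enum lock_result lock_with_timeout(pthread_mutex_t *m, long timeout_ms)
{
    const struct timespec nap = { 0, 1000000L }; /* assumed 1 ms interval */
    struct timespec start, now;
    int ret;

    assert(m != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        ret = pthread_mutex_trylock(m);
        if (ret != EBUSY)
            break;                      /* success or a real error */
        clock_gettime(CLOCK_MONOTONIC, &now);
        if (elapsed_ms(&start, &now) >= timeout_ms)
            break;                      /* still busy, give up */
        nanosleep(&nap, NULL);
    }

    switch (ret) {
    case 0:
        return LOCK_OK;
    case EBUSY:  /* held by someone else for the whole timeout */
    case EAGAIN: /* recursive mutex: maximum lock count exceeded */
        return LOCK_RETRY;
    default:     /* e.g. EINVAL */
        return LOCK_FATAL;
    }
}
```

A caller would then branch on the result: proceed on LOCK_OK, log and fail the current operation on LOCK_RETRY, and treat LOCK_FATAL as the case where aborting the process may be justified.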