[GitHub] Dead2 commented on issue #1775: Deadlock in shm statistics
Dead2 commented on issue #1775: Deadlock in shm statistics URL: https://github.com/apache/incubator-pagespeed-mod/issues/1775#issuecomment-396602343 Increasing the deadlines to allow blocking rewrites is unfortunately not acceptable in our case. We are testing out the patch posted earlier that removes the locking, it has been running on a single server for about a week now without any new crashes (that pretty much only means it didn't make things worse though). A thought for long-term fix might be to either have a global limit of how many processes can linger and do cleanup/rewrites in the background, or set a timelimit for how long each it can delay an exit before it is forced. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Dead2 commented on issue #1775: Deadlock in shm statistics
Dead2 commented on issue #1775: Deadlock in shm statistics URL: https://github.com/apache/incubator-pagespeed-mod/issues/1775#issuecomment-396490225 Apache supposedly has hooks to prevent shutdown of the process until dependent tasks have finished, does mod_pagespeed register with these at all? Does mod_pagespeed register any atexit() code for example? I'd guess this is probably the most robust solution to this problem and to handle the case of killed processes for whatever reason. Apache connection-process exit code: ```C++ ap_run_process_connection(c); ap_lingering_close(c); exit(0); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Dead2 commented on issue #1775: Deadlock in shm statistics
Dead2 commented on issue #1775: Deadlock in shm statistics URL: https://github.com/apache/incubator-pagespeed-mod/issues/1775#issuecomment-394993817 The other proxy is an Nginx one for load-balancing and caching without touching the application server, and we don't want mod_pagespeed at that level. Thus it would have to be a middle-proxy if anything. LoadFromFile is in use now, and no other user than the owner is allowed access to the files/folders as a security measure, thus the need for the whole apache process to change user. But it seems to me that we don't have any concrete evidence that mpm-itk kills a process prematurely. I had a look into the code, but I don't know the apache code well enough to be sure of anything, but from what I saw, there are very few cases where mpm-itk will kill the process (error handling mostly). But if mod_pagespeed can deadlock due to a killed/crashed apache child process, that in itself is not good, is it? Sounds like it needs some way to detect the deadlock and either recover or somehow notify the rest of the system of the situation instead of resulting in a DOS-situation for the whole server. At the very least, a timeout on the lock, and fail the connection instead of waiting endlessly for something that should take <1ms. Ideally it would know what pid holds the lock, and print a warning to that effect to ease debugging, or check whether that pid still exists and remove the lock if the pid is gone. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Dead2 commented on issue #1775: Deadlock in shm statistics
Dead2 commented on issue #1775: Deadlock in shm statistics URL: https://github.com/apache/incubator-pagespeed-mod/issues/1775#issuecomment-394993817 The other proxy is an Nginx one for load-balancing and caching without touching the application server. LoadFromFile is in use now, and no other user than the owner is allowed access to the files/folders as a security measure, thus the need for the whole apache process to change user. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Dead2 commented on issue #1775: Deadlock in shm statistics
Dead2 commented on issue #1775: Deadlock in shm statistics URL: https://github.com/apache/incubator-pagespeed-mod/issues/1775#issuecomment-394840120 @oschaaf This has been considered a few times as a workaround, but due to the way things are, it would mean having two layers of proxies, so it would further complicate things. Also, it would not be able to load files directly instead of doing http requests. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Dead2 commented on issue #1775: Deadlock in shm statistics
Dead2 commented on issue #1775: Deadlock in shm statistics URL: https://github.com/apache/incubator-pagespeed-mod/issues/1775#issuecomment-394729292 @morlovich No crashes in relation to this unfortunately. @jmarantz That is indeed a concern with mpm-itk. But having it live through several connections is impossible, since it cannot re-acquire root permissions once it has given them up. So it has to be a fresh process for each connection. What would help was if it was possible to tell mpm-itk to wait until pagespeed is done processing data before ending the process. I'm not sure how mpm-itk ends the process, but shouldn't mod_pagespeed ideally get notified and be able to release the lock before going away? An alternative would be to spawn a process(instead of a thread) for handling background rewrites and cache maintenance etc (possibly with the optimization that only one process per vhost exist, and that process gets launched on demand and optionally quits after X minutes idle.). This problem cannot easily be solved by a central daemon/process, since it requires permissions to access the various users files, and permanently running complex code as root is not wanted. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Dead2 commented on issue #1775: Deadlock in shm statistics
Dead2 commented on issue #1775: Deadlock in shm statistics URL: https://github.com/apache/incubator-pagespeed-mod/issues/1775#issuecomment-394729292 @morlovich No crashes in relation to this unfortunately. @jmarantz That is indeed a concern with mpm-itk. But having it live through several connections is impossible, since it cannot re-acquire root permissions once it has given them up. So it has to be a fresh process for each connection. What would help was if it was possible to tell mpm-itk to wait until pagespeed is done processing data before ending the process. I'm not sure how mpm-itk ends the process, but shouldn't mod_pagespeed ideally get notified and be able to release the lock before going away? An alternative would be to spawn off a process(instead of a thread) for handling background rewrites and cache maintenance etc (possibly with the optimization that only one process per vhost exist, and that process gets launched on demand and optionally quits after X minutes idle.). This problem cannot easily be solved by a central daemon/process, since it requires permissions to access the various users files, and permanently running complex code as root is not wanted. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Dead2 commented on issue #1775: Deadlock in shm statistics
Dead2 commented on issue #1775: Deadlock in shm statistics URL: https://github.com/apache/incubator-pagespeed-mod/issues/1775#issuecomment-394678320 We do not have a snapshot of the logs from this deadlock unfortunately. We did consider turning statistics off completely, but that would cause ImageMaxRewritesAtOnce to be ignored and potentially turn into a DOS attack vector. I do wish there was a setting for disabling all "fluff" statistics, only keeping the few things needed for communication between processes. Investigating this also made me wonder why ImageMaxRewritesAtOnce is global while NumRewriteThreads and NumExpensiveRewriteThreads are process-local (so global if using a threaded apache and per-connection if using prefork). So this might be something worth changing long-term too. But I am confused, the documentation says NumRewriteThreads and NumExpensiveRewriteThreads are per-process, so those should not be able to cause a deadlock across processes. And ImageMaxRewritesAtOnce is global, so it could cause this, but two of the processes above are trying to rewrite .css files and should thus not be affected. So, either I am missing something or the documentation is misleading, probably the former. :) We will try out your patch, but proving the problem is fixed can take a while, since it can take weeks to trigger the bug (sometimes twice in a few days, sometimes once in a month, so it seems very random). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services