Hi and thanks for reacting promptly!

On 25 November 2015 at 13:52, Robby Findler <ro...@eecs.northwestern.edu> wrote:
> I don't know what's going on here, but could it be that two threads
> are, in parallel, trying to load the same implementation of an
> unloaded checker and then stomping on each other?
Interesting! Sounds consistent with the logs:

[2|2015-11-23T14:51:31] (re)loading module from (file /var/handin_config/info1-teaching-material/checkers/06-Datentypen/REDACTED-USER-NAME/checker.rkt)
[1|2015-11-23T14:51:31] (re)loading module from (file /var/handin_config/info1-teaching-material/checkers/06-Datentypen/REDACTED-USER-NAME/checker.rkt)
[... error from thread 1, then error from thread 2...]

> The file handin-server/private/reloadable has some dynamic-requires
> without appropriate synchronization around them, at least that I see,
> which seems suspicious.

I don't know if that's *the* problem, but I've probably built a testcase for it. (I also wonder about the set!s, but hopefully they don't modify global variables.)

And contrary to what I thought, no compiled files get written on the server (only when testing checkers locally). Is it still plausible that avoiding checker-extras (or precompiling it) would help?

BTW, the docs don't seem to mention synchronization:
http://docs.racket-lang.org/reference/Module_Names_and_Loading.html?q=dynamic-require#%28def._%28%28quote._~23~25kernel%29._dynamic-require%29%29
Should I file an issue on those docs? (I'm afraid I couldn't say much, though.)

Cheers,
Paolo

> On Wed, Nov 25, 2015 at 6:35 AM, Paolo Giarrusso <p.giarru...@gmail.com>
> wrote:
>> Hi all,
>> it's me, handin server guy again. Sorry to bother.
>>
>> Our handin server started "crashing" with "bad variable linkage" errors at
>> deadline time (presumably under somewhat high load), and since it happened
>> twice, I thought I'd report it. Any ideas on what's causing this?
>>
>> After this "crash", the server keeps running, but rejects all submissions
>> because the same checker keeps not loading.
>>
>> ==
>>
>> [1|2015-11-23T14:51:31] (re)loading module from (file
>> /var/handin_config/info1-teaching-material/checkers/06-Datentypen/REDACTED-USER-NAME/../checker.rkt)
>> [1|2015-11-23T14:51:33] ERROR: link: bad variable linkage;
>> [1|2015-11-23T14:51:33] reference to a variable that is uninitialized
>> [1|2015-11-23T14:51:33] reference phase level: 0
>> [1|2015-11-23T14:51:33] variable module:
>> "/var/handin_home/handin/handin-server/checker.rkt"
>> [1|2015-11-23T14:51:33] variable phase: 0
>> [1|2015-11-23T14:51:33] reference in module:
>> "/var/handin_config/info1-teaching-material/checkers/checker-extras.rkt"
>> [1|2015-11-23T14:51:33] in: submission-eval
>>
>> Bigger log fragment available at
>> https://gist.github.com/Blaisorblade/7f9c6e7f4f456b588a8a
>>
>> Other info:
>> - Restarting the server does fix the error. Somehow.
>> - For those unfamiliar with the handin server: it has code which
>> automatically reloads checkers, as witnessed by the log above
>> (https://github.com/ps-tuebingen/handin/blob/master/handin-server/private/reloadable.rkt).
>> But that code doesn't fix the problem.
>> - Googling suggests that stale compiled code might be there. But the source
>> code hadn't changed. (Also, I found no description of how this arises.)
>> - Since the server sometimes gets "stuck", I built a trivial watchdog (a
>> cronjob) that restarts the server if the status server becomes too slow. The
>> above happened after the server was restarted by the watchdog.
>>
>> One set of hypotheses:
>> is it possible that stopping the server at the wrong moment corrupts
>> compiled files? (But then, why does the first restart not fix the problem?)
>> Do you take care to make compilation atomic with `rename`?
>>
>> However, according to the docs, the server is designed to survive brutal
>> restarts.
>>
>> One non-standard thing I do is that I have a `checker-extras.rkt` module
>> with some utilities shared across checkers*, and that's not deployed as part
>> of the server (for various reasons), but together with the checkers, so it's
>> loaded with (require "../checker-extras.rkt"), and seems to be compiled,
>> probably when starting the server. Could this interfere badly with the
>> reloading code or with restarting?
>>
>> *I'm aware of your checker utilities, but here we have slightly different
>> requirements.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Racket Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to racket-users+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

--
Paolo G. Giarrusso - Ph.D. Student, Tübingen University
http://ps.informatik.uni-tuebingen.de/team/giarrusso/
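
To make the suspected race concrete: if two handler threads can reach dynamic-require for the same checker at the same time, one way to rule that out is to serialize the loads behind a lock. The following is only a minimal sketch under that assumption, not a patch to reloadable.rkt. The names `load-lock`, `call-with-load-lock`, `load-checker` and `fake-load` are hypothetical and do not exist in the handin server, and whether unsynchronized loading really is the cause of the "bad variable linkage" error is still unconfirmed.

```racket
#lang racket/base

;; Hypothetical sketch: a binary semaphore used as a mutex around module loading.
(define load-lock (make-semaphore 1))

;; Run `thunk` while holding the lock. call-with-semaphore waits on the
;; semaphore, calls the thunk, and posts the semaphore again afterwards,
;; including when the thunk escapes with an exception.
(define (call-with-load-lock thunk)
  (call-with-semaphore load-lock thunk))

;; Hypothetical wrapper: the actual load happens only while the lock is held.
(define (load-checker mod-path name)
  (call-with-load-lock (lambda () (dynamic-require mod-path name))))

;; Demo with a stand-in for a slow load: record how many threads are
;; inside the critical section at the same time.
(define inside 0)
(define max-inside 0)

(define (fake-load)
  (set! inside (add1 inside))
  (set! max-inside (max max-inside inside))
  (sleep 0.05) ; stand-in for the time a real load takes
  (set! inside (sub1 inside)))

;; Two threads try to "load" at once, as threads 1 and 2 do in the log.
(define threads
  (for/list ([i (in-range 2)])
    (thread (lambda () (call-with-load-lock fake-load)))))

(for-each thread-wait threads)
```

With the lock in place, `max-inside` never exceeds 1 in this demo. Calling `fake-load` directly from both threads instead lets the second thread enter while the first is sleeping, which is the kind of overlap the two "(re)loading module" lines with the same timestamp suggest.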