Thanks. I've pushed the synchronization code to the handin repo.

And yes, you're right that `thread` (which is what I think the handin
server uses) doesn't make use of multiple cores. Places sound like the
right construct to use here, but I can already predict one problem:
2htdp/universe depends on racket/gui so it can't be run except in the
main place.
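
For reference, here is a minimal sketch of the place mechanism, not
handin-server code. The names `busy-sum` and `run-in-place` are
hypothetical, and `busy-sum` is only a stand-in for CPU-bound work such
as checking one submission. It assumes the code lives in a module,
since the `place` form lifts its body into a submodule and can only
refer to module-level bindings.

```racket
#lang racket/base
;; Hypothetical sketch: run CPU-bound work in a separate place so that
;; it can use another core, unlike work started with `thread`.
(require racket/place)

(provide run-in-place)

;; Stand-in for real work (an assumption, not part of the handin server).
(define (busy-sum n)
  (for/sum ([i (in-range n)]) i))

;; Start a place, send it a number, and wait for the computed result.
(define (run-in-place n)
  (define p
    (place ch
      ;; This body runs in the new place; `ch` is its channel back to us.
      (define k (place-channel-get ch))
      (place-channel-put ch (busy-sum k))))
  (place-channel-put p n)              ; messages are copied between places
  (define result (place-channel-get p))
  (place-wait p)                       ; block until the place has exited
  result)
```

Calling `(run-in-place 10)` from the requiring module would hand the
work to a fresh place and return its answer. Each place has its own
instance of every module it loads, which is why a library that requires
racket/gui is a problem there.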

My best guess at a solution would be to provide an alternative version
of 2htdp/universe that doesn't depend on racket/gui (and so doesn't
actually open any windows or do any GUI stuff but instead either
raises errors or simply simulates the tick handler only) and then set
up a namespace for evaluating student programs in another place that
has that dummy version of 2htdp/universe. (I think that the dummy
version of 2htdp/universe would be useful even without different
places for running tests on student code so the server doesn't have
windows popping up anyway!).

That is a more significant change, however, and would probably
take the form of a patch to the handin server that sets up the
separate places. I'd be happy to provide advice if you want to look
into it. I don't think it will be too hard, but it will
require actual work. Or I could put it on a list of things I want to
work on (that sadly doesn't seem to ever shrink...).

I imagine it would be possible to do this in the checker itself if you
wanted to experiment there. (That wouldn't help with the
2htdp/universe dependency problem, however.)

Providing an alternate version of 2htdp/universe is also in the
category of not-too-hard, but requiring-work. Mostly, I guess,
refactoring the library to move dependencies around and then building
a simple layer on top of the refactored code that avoids the
racket/gui dependency. (Or maybe it's already factored well!)

Robby



On Tue, Dec 1, 2015 at 6:55 AM, Paolo Giarrusso <p.giarru...@gmail.com> wrote:
> Hi!
> After a new deadline, I got good news and bad news.
>
> # Good news
>
> I think *this* bug is fixed. Evidence: instead of crashing at the
> first reboot under load, the server survived 10-20 automated
> reboots with the students submitting en masse without ever showing
> the bug. So not only does the patch make sense, but it seems to be for
> the same bug.
>
> # Bad news
>
> The setup still didn't scale, though this wasn't as bad, and part of
> it was due to my setup. One student compared it to new releases from
> Blizzard. While our beefy server isn't even remotely sweating O_O.
>
> So I'd like to understand Racket threads and the handin server, to
> plan accordingly:
>
> - Does the whole handin server actually run on *one* processor,
> because of Racket multithreading?
> - What's your largest deployment with active checkers?
> - Do you actually use this for HtDP courses, or only for advanced
> classes (as a number of signs suggest)?
>
> 1. Under load, requesting the home page takes more than 20 seconds, so
> my watchdog script restarts the server. We have a watchdog script
> because when we didn't, the server just hung sometimes, so for our
> previous lecture (smaller, only ~100 students instead of 500, and no
> checkers) this watchdog script did wonders.
> 2. Here's the watchdog:
>   curl --max-time 20 -s https://handin-ps.informatik.uni-tuebingen.de:7979/ > /dev/null ||
>     { docker restart handin-server-production; }
> That's even running every minute :-(
>
> Usually that request takes 20 ms, so (naive me thought) how on Earth
> could this balloon to 20 seconds?
> Now that I know of Racket threads, I understand: that includes both
> the web server and the checkers, together with an unspecified number
> of big-bang instances from students. For extra fun, one student
> called animate with big-bang as step function — essentially, a sweet
> HtDP fork bomb.
>
> I don't expect a patch for this, I'm just trying to understand things
> and contemplating workarounds, beyond a more lenient watchdog (or
> disabling it altogether and acting by hand), which I guess won't be
> enough.
>
> Cheers,
> Paolo
>
> On 29 November 2015 at 16:12, Robby Findler <ro...@eecs.northwestern.edu> wrote:
>>
>>
>> On Sunday, November 29, 2015, Paolo Giarrusso <p.giarru...@gmail.com> wrote:
>>>
>>> On Friday, November 27, 2015 at 3:44:20 AM UTC+1, Robby Findler wrote:
>>> > Yes, I think you're right. I originally wrote that because I was
>>> > thinking that this code might be involved in evaluating the user's
>>> > submission, but I am not pretty sure I was wrong about that.
>>>
>>> "not pretty sure"?
>>
>>
>> Sorry. No "not".
>>
>>
>>>
>>>
>>> AFAICS, `auto-reload-value` is used to extract the `checker` binding from
>>> the various checker.rkt files, but the lock will not be held while running
>>> `checker`. (Luckily we're not using hooks; I haven't studied that code.)
>>
>>
>> Yes that's also what I noticed and why I sent a second diff. Or did I miss
>> another place?
>
> Was just rechecking because of the above confusion. We agree.
>
> --
> Paolo G. Giarrusso - Ph.D. Student, Tübingen University
> http://ps.informatik.uni-tuebingen.de/team/giarrusso/
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to racket-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
