On 27 September 2017 at 01:17, David E Kim <de...@u.washington.edu> wrote:

> Would a reversion or fix to the BOINC client be the best option to save
> wasted computing and bandwidth?  If indeed there is a bug?
>

I added Rosetta to my Windows host earlier today and ran it on a self-built
development version of BOINC. On that version the app worked fine, which
strongly hints that the problem is the client's use of dirty slot
directories. I installed 7.8.2 a few minutes ago to confirm that I can
reproduce the bug with that version. It will take about an hour to get
results.

I did see that the graphics app opens the stderrgfx.txt file in the slot
directory and keeps it open as long as the graphics are open, and that the
graphics app runs a few seconds longer than the corresponding task. In
working clients the open stderrgfx.txt prevents the client from reusing the
slot directory until the graphics app is closed.

In 7.8.2 the client's check for an empty slot directory is broken, and the
client uses slot directories for new tasks even if the directory still has
files left over from the previous task. In Rosetta's case this means
the client starts a new task in a slot that was just used by the previous
task. The app then tries to create a shared memory segment named
"boinc_minirosetta_<slot#>". But a segment with this name is still open
in the graphics app for the previous task, so creating a new one fails.

The bug with the client using dirty slot directories is already fixed (the
fix was actually available well before 7.8.2 was released). I don't know
when there will be a bug fix release.

Until everyone has either downgraded to 7.6.33 or upgraded to a bug fix
release, you could change your app so that it doesn't crash if
boinc_graphics_make_shmem() fails. If you can't easily disable graphics
when boinc_graphics_make_shmem() fails, you could malloc() a memory chunk
of the same size and use that as dummy shared memory. Some tasks might then
not have graphics available, but at least they won't crash.
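
A minimal sketch of that fallback, not Rosetta's actual code. It assumes
boinc_graphics_make_shmem() returns a null pointer on failure. The maker is
passed in as a function pointer so the sketch compiles without the BOINC
headers. GraphicsBuffer, get_graphics_buffer() and failing_make_shmem() are
names made up for this illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Signature modeled on boinc_graphics_make_shmem(prog_name, size).
// Assumption: the real call returns a null pointer when it fails.
typedef void* (*make_shmem_fn)(const char* prog_name, int size);

// Hypothetical stand-in that always fails, mimicking the case where the
// previous task's graphics app still holds the segment name.
static void* failing_make_shmem(const char*, int) { return nullptr; }

struct GraphicsBuffer {
    void* data;      // null only if size was invalid or calloc() failed
    bool is_shared;  // false when we fell back to private memory
};

// Try real shared memory first. On failure, fall back to a zeroed private
// buffer of the same size so the app can keep writing graphics state to it.
GraphicsBuffer get_graphics_buffer(make_shmem_fn make_shmem,
                                   const char* prog_name, int size) {
    GraphicsBuffer buf = { nullptr, false };
    if (size <= 0) return buf;

    buf.data = make_shmem(prog_name, size);
    if (buf.data) {
        buf.is_shared = true;
        return buf;
    }

    std::fprintf(stderr,
                 "shmem for %s failed; no graphics for this task\n",
                 prog_name);
    // Dummy "shared" memory: nobody else can see it, but writes are safe.
    buf.data = std::calloc(1, static_cast<std::size_t>(size));
    return buf;
}
```

In the app you would pass boinc_graphics_make_shmem itself as the first
argument. The dummy buffer is only visible to the task, so the graphics app
would show nothing for that task, but the science code keeps running.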

> Do you know if the scheduler has a way to skip client versions per platform
> type?
>

I don't see a way to block a specific client version. There are
<max_core_client_version> and <min_core_client_version> in plan classes, but
having to use those to block client versions could get ugly pretty fast.
http://boinc.berkeley.edu/trac/wiki/AppPlanSpec
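
To illustrate why this gets ugly, here is a rough sketch of the logic, not
scheduler code. It assumes client versions are compared as a single integer
of the form major*10000 + minor*100 + release (so 7.8.2 becomes 70802), and
that each plan class can express only one min/max window. The function
names are invented for this example. Blocking a single version then needs
two windows, i.e. two plan classes per affected app version.

```cpp
// Assumed encoding: major*10000 + minor*100 + release, e.g. 7.8.2 -> 70802.
int encode_version(int major, int minor, int release) {
    return major * 10000 + minor * 100 + release;
}

// One plan class can express a single [min, max] window.
// A bound of 0 means "no limit" in this sketch.
bool in_window(int version, int min_version, int max_version) {
    if (min_version && version < min_version) return false;
    if (max_version && version > max_version) return false;
    return true;
}

// Blocking exactly one version takes two windows: everything below it
// and everything above it.
bool allowed_when_blocking(int version, int blocked) {
    return in_window(version, 0, blocked - 1) ||
           in_window(version, blocked + 1, 0);
}
```

Every additional blocked version splits one of the windows again, so the
number of plan classes grows with each bad release.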


> So this has been a general finding with other projects?
>

Rosetta is the first one I know of to have tasks crash after failing to
create a shared memory segment. Other projects have had tasks aborted by
the client because the task supposedly exceeded its disk limit. The limit
was exceeded because the slot directory had files left over from the
previous task, often a VM task with a GB-sized VM image file.

And there was one report of a VM task using the VM image from another app.

> Sorry for this long thread.
>

No problem.

-Juha
_______________________________________________
boinc_dev mailing list
boinc_dev@ssl.berkeley.edu
https://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
