Hi Faré, 

Thanks for taking the time to understand my comments.  I’ve tried to respond to
some of your questions below.  Sorry if my original post didn’t give enough
explanation of what I’m trying to do.


>>>> If I run several sbcl processes on different nodes in my compute cluster, 
>>>> it might happen that
>>>> two different runs notice the same file needs to be recompiled (via asdf),
>>>> and they might try to compile it at the same time.  What is the best way 
>>>> to prevent this?
>>>> 
> You mean that this machines share the same host directory? Interesting.
> 

Yes, the cluster shares some disk and shares the home directory.  And I believe
two cores on the same physical host share /tmp, but I’m not 100% sure about that.


>>>> 
> That's an option. It is expensive, though: it means no sharing of fasl
> files between hosts. If you have cluster of 200 machines, that means
> 200x the disk space.

With regard to the question of efficient reuse of fasl files: this is
completely irrelevant for my case.  My code takes hours to run (10 to 12 hours
in the worst case), but only 20 seconds or less to compile.  I’m very happy to
completely remove the fasl files and regenerate them before each 10-hour run.
(Note to self: I need to double-check that I do in fact delete the fasl files
every time.)  Besides, my current flow allows me simply to check a change in to
git and re-launch the code on the cluster in batch.  I don’t really want to add
an error-prone manual local-build-and-deploy step if that can be avoided,
unless of course there is some great advantage to that approach.

> 
> What about instead building your application as an executable and
> delivering that to the cluster?

One difficulty with your build-then-deliver suggestion is that my local machine
is running macOS, and the cluster is running Linux.  I don’t think I can build
Linux executables on my Mac.


>> 
> You can have different ASDF_OUTPUT_TRANSLATIONS or
> asdf:*output-translations-parameter*
> on each machine, or you can indeed have the user cache depend on
> uiop:hostname and more.
> 

This is what I’ve ended up doing, and it seems to work.  Here is the code I
have inserted into all my scripts.

;; Give each process its own fasl cache under /tmp, keyed by the home
;; directory, the uid, and the pid, so that concurrent jobs never write
;; to the same fasl files.
(let ((home (directory-namestring (user-homedir-pathname)))
      (uid  (sb-posix:getuid))
      (pid  (sb-posix:getpid)))
  (setf asdf::*user-cache*
        (ensure-directories-exist
         (format nil "/tmp~A~D/~D/" home uid pid))))
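
If per-process directories turn out to be overkill, I suppose I could instead
key the cache on uiop:hostname as you suggested, so that jobs on the same node
share fasls while different nodes stay separate, though two jobs landing on the
same node could then still collide, which is why I went with the pid.  Something
like the untested sketch below (the "fasl-cache" directory name is just made up):

;; Untested sketch: per-host rather than per-process cache, following the
;; uiop:hostname suggestion.  The "fasl-cache" path component is arbitrary.
(let ((home (directory-namestring (user-homedir-pathname))))
  (setf asdf::*user-cache*
        (ensure-directories-exist
         (format nil "/tmp~Afasl-cache/~A/" home (uiop:hostname)))))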




> The Right Thing™ is still to build and test then deploy, rather than
> deploy then build.

In response to your suggestion to build then deploy: this seems very dangerous
and error-prone to me.  For example, what if different hosts want to run the
same source code but with different optimization settings?  This is a real
possibility, as some of my processes are running with profiling (debug 3) and
collecting profiling results, while others are running super-optimized
(speed 3) code to try to find the fastest something-or-other.

I don’t even know whether it is possible to write the .asd files so that
changing an optimization declaration will trigger everything depending on it to
be recompiled.  And even if I think I’ve written my .asd files that way, how
would I know whether they are really correct?
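
The only approach I can imagine is something along the lines of the sketch
below: put the global declaim in its own file and make everything else depend
on it, so that editing it bumps the timestamp and ASDF recompiles the
dependents.  The system and file names here are invented, and I haven’t
verified that this actually behaves the way I’d hope.

;; Hypothetical sketch: the optimization settings live in "settings.lisp"
;; (which holds something like (declaim (optimize (speed 3)))), and every
;; other file depends on it, so changing the declaim forces ASDF to
;; recompile the rest.  All names here are invented for illustration.
(asdf:defsystem "my-system"
  :components ((:file "settings")
               (:file "core" :depends-on ("settings"))
               (:file "main" :depends-on ("core"))))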

It is not the case currently, but it may very well be the case in the future
that I want different jobs in the cluster running different git branches of my
code.  That would be a nightmare to manage if I try to share fasl files.
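
If it ever comes to that, I suppose the same trick would extend to folding the
branch name into the cache path, roughly as in the untested sketch below (it
assumes git is on the PATH and the job is started from inside the checkout):

;; Untested sketch: include the current git branch in the per-process
;; cache path so that jobs on different branches never share fasls.
;; Assumes git is available and we are inside the working tree.
(let ((home   (directory-namestring (user-homedir-pathname)))
      (branch (string-trim '(#\Newline #\Space)
                           (uiop:run-program '("git" "rev-parse" "--abbrev-ref" "HEAD")
                                             :output :string))))
  (setf asdf::*user-cache*
        (ensure-directories-exist
         (format nil "/tmp~A~A/~D/" home branch (sb-posix:getpid)))))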

> Using Bazel, you might even be able to build in parallel on your cluster.

Bazel sounds interesting, but I don’t really see the advantage of building in
parallel when it only takes a few seconds to build, but half a day to execute.

> I still don't understand why your use case uses deploy-then-build
> rather than build-then-deploy.


I hope it is now clear why I can’t: (1) my local machine is macOS while the
cluster is Linux; (2) different jobs in the cluster are using different
optimization settings; (3) a future enhancement is to have different cluster
nodes running different branches of the code.

Kind regards
Jim
