Dear Henrik,

Thank you very much for changing the code for getAverageFile(). I will
try it and let you know.

Thank you also for explaining why you write to a temporary file; I now
understand your intention.

Regarding race conditions: No, I do not assume that aroma.* takes care
of potential race conditions. Here is what I do:

Assume that I have downloaded from GEO a prostate cancer dataset
consisting of 40 CEL-files. Then I create a directory "Prostate" and
subdirectories "Prostate/annotationData" and "Prostate/rawData"
following your required file structure.

However, starting with the 2nd CEL-file I create subdirectories
"Prostate/Prostate2",...,"Prostate/Prostate40", each containing
symbolic links to "../annotationData" and "../rawData" from "Prostate".
Thus, when running GLAD, each cluster node has its own directory to
write to, e.g. "Prostate/Prostate21/reports" for creating the images.
Only after all nodes have finished their computations do I move the
relevant files to the main directory, e.g. all images are moved to
"Prostate/reports". Afterwards I delete the subdirectories
"Prostate2",...,"Prostate40" and their contents.
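
As a rough sketch in base R, the per-node directories could be created
like this. The helper name setupNodeDirs() is hypothetical, and tempdir()
merely stands in for the real project location:

```r
# Sketch: give each cluster node its own working directory that shares
# annotationData/ and rawData/ with the main project directory via
# relative symbolic links. 'setupNodeDirs' is a hypothetical helper.
setupNodeDirs <- function(root, indices) {
  paths <- file.path(root, sprintf("%s%d", basename(root), indices))
  for (path in paths) {
    dir.create(path, recursive = TRUE, showWarnings = FALSE)
    for (shared in c("annotationData", "rawData")) {
      link <- file.path(path, shared)
      # Relative target "../<shared>" resolves to the main directory
      if (!file.exists(link)) {
        file.symlink(file.path("..", shared), link)
      }
    }
  }
  invisible(paths)
}

# Stand-in project directory for illustration
root <- file.path(tempdir(), "Prostate")
dir.create(file.path(root, "annotationData"), recursive = TRUE,
           showWarnings = FALSE)
dir.create(file.path(root, "rawData"), showWarnings = FALSE)
nodeDirs <- setupNodeDirs(root, 2:40)
```

Each node then uses its own element of 'nodeDirs' as its working
directory, so that "reports" is written separately per node.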

As you can see, with this setup there should not be any race
conditions. The only remaining problem is the temporary files, which
you store in ".Rcache" in my home directory.

I know that you store the monocell files in ".Rcache/aroma.affymetrix",
so that the monocell files have to be created only once. However, for
the temporary files please allow me to suggest that you create a
temporary directory in your file structure, e.g. "Prostate/tmp", where
these files are stored. In my case this would definitely solve my
problem, since each subdirectory would contain its own temporary
directory, e.g. "Prostate/Prostate21/tmp". I do not know whether this
change would break any code or cause any problems; it is only a naive
suggestion. What is your opinion?
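
A minimal sketch of what such a per-node temporary directory could look
like from the user's side. The helper useNodeCache() is hypothetical, and
the sketch rests on the assumption that the temporary files are located
through the R.cache root path, which I have not verified:

```r
# Sketch only: 'useNodeCache' is a hypothetical helper, not part of aroma.*.
useNodeCache <- function(nodeDir) {
  tmpPath <- file.path(nodeDir, "tmp")
  dir.create(tmpPath, recursive = TRUE, showWarnings = FALSE)
  # Assumption: the cache location is resolved through R.cache, so that
  # redirecting its root path also redirects the temporary files.
  if (requireNamespace("R.cache", quietly = TRUE)) {
    R.cache::setCacheRootPath(tmpPath)
  }
  invisible(tmpPath)
}

# tempdir() stands in for the real project location
tmpPath <- useNodeCache(file.path(tempdir(), "Prostate", "Prostate21"))
```

Note that redirecting the whole cache root in this way would also move
the monocell files, so they would no longer be shared between nodes.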

Best regards
Christian


On Jul 21, 6:46 pm, Henrik Bengtsson <henrik.bengts...@gmail.com>
wrote:
> Hi Christian.
>
> On Wed, Jul 21, 2010 at 2:59 PM, cstratowa
> <christian.strat...@vie.boehringer-ingelheim.com> wrote:
> > Dear Henrik,
>
> > Thank you for this extensive explanation and sorry for the late reply
> > but I was pretty busy.
>
> > Yes, it did work before! As I mentioned, with versions
> > aroma.affymetrix_1.1.0 and earlier I have never had a problem doing
> > the analyses on cluster nodes.
>
> > Looking at the source code of different versions of saveObject() I
> > realize that using "saveObject(..,safe=FALSE)" would be the same as
> > using saveObject() from R.utils_0.9.1. Thus in principle this could
> > solve my problem. Is this correct?
>
> > Sadly, method AffymetrixCelSet::getAverageFile() in
> > aroma.affymetrix_1.6.2 does not allow passing the parameter
> > "safe=FALSE" to saveObject(). Is it possible for you to change it?
>
> I have decided to remove that debug code that calls saveObject(),
> because it is not really needed anymore.  The main reason why I remove
> it is because it is obsolete code.  The intention of that code snippet
> in getAverageFile() was never to protect against race conditions (it
> was just an unplanned side effect).
>
> Until next release, you can get a patched version as:
>
> library("aroma.affymetrix");
> downloadPackagePatch("aroma.affymetrix");
>
> Note, as I said in my previous reply, by processing (=here calling
> getAverageFile() on) the same data set on multiple hosts, you are
> potentially running into race conditions resulting in corrupt data.
> You should at least be aware of it and understand why this is the
> case.
>
>
>
> > It is still not clear to me why you create first a temporary file
> > which you then rename (although you mention power failures etc).
> > However, would it be possible to add a random number to the temporary
> > filename, e.g. "*.tmp.1948234", so that the problem with the existing
> > temporary file could be avoided?
>
> The main purpose of writing to a temporary file and then renaming is
> to make sure that the file is complete.  If something happens while
> writing the temporary file, the final file will not exist/be created.
> If one would write to the final file from the beginning, there is no
> way for us to know if the file was correctly created or not.  So,
> writing via a temporary file, we effectively have a way of creating
> files in one atomic action.
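
A minimal base-R sketch of this write-then-rename pattern, for
illustration only. saveAtomically() is a hypothetical name and this is
not the actual R.utils code; the sketch assumes that renaming a file
within one directory is atomic on the file system in use:

```r
# Sketch of write-to-temporary-then-rename; 'saveAtomically' is a
# hypothetical name, not the R.utils implementation.
saveAtomically <- function(object, pathname) {
  pathnameT <- sprintf("%s.tmp", pathname)
  # Refuse to touch a temporary file that another process may be writing
  if (file.exists(pathnameT)) {
    stop("Temporary file already exists: ", pathnameT)
  }
  # If writing fails part-way, only the temporary file is affected
  save(object, file = pathnameT)
  # The final pathname appears only once the file is complete
  if (!file.rename(pathnameT, pathname)) {
    stop("Failed to rename temporary file: ", pathnameT)
  }
  invisible(pathname)
}

pathname <- file.path(tempdir(), "average.RData")
saveAtomically(list(id = 1L), pathname)
```

A reader that finds 'pathname' can therefore assume the file was
written completely; an interrupted write leaves only the *.tmp file.
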
>
>
>
> > Probably you only need to change line 59 to:
>
> > pathnameT <- sprintf("%s.tmp.%i", pathname,
> > as.integer(runif(1,1,99999999)))
>
> In order not to corrupt the temporary file, we check whether it
> already exists, as protection against it being overwritten/added to by
> another process.  Yes, you could randomize the name of the temporary
> file, lowering the risk of two hosts writing to the same temporary
> file.  However, when done, both hosts will try to rename their
> temporary files to the same pathname.  If done at the same time, we
> may still have problems.
>
>
>
> > Regarding your suggestion to wrap getAverageFile() in Mutex calls, I
> > have no idea whether an R package exists for this purpose. Neither
> > Rmpi nor snow seems to be suitable (at least not without a complete
> > re-write of my package).
>
> No, I do not know of a functional mutex implementation in R either.
> You can achieve something similar by utilizing the lock mechanisms of
> database servers (not SQLite), but nothing ready-made is available to
> my knowledge.
>
> Again, you seem to assume that aroma.* takes care of potential race
> conditions for you - it does not.  It only tries to detect them
> without warranty - and indeed, the fact that you got the error in the
> first place indicates that you are pushing the system and that race
> conditions may very well happen.  If you run things in parallel and
> you are updating/writing the *same data resource*, you should really
> have protection against race conditions.  This is a generic problem
> unrelated to aroma.*.
>
> /Henrik
>
>
>
> > One other question:
> > Is it allowed to delete the contents of directory
> > .Rcache/aroma.affymetrix/idChecks?
>
> Yes, it should be safe to delete any .Rcache/ as long as no R session
> is in the process of writing to it.  It's a cache containing redundant
> information.
>
>
>
> > Best regards
> > Christian
>
> > On Jul 2, 12:47 am, Henrik Bengtsson <h...@stat.berkeley.edu> wrote:
>
> > > Hi Christian.
>
> > > On Tue, Jun 29, 2010 at 3:39 PM, cstratowa
>
> > > <christian.strat...@vie.boehringer-ingelheim.com> wrote:
> > > > Dear Henrik,
>
> > > > Until now I have used aroma.affymetrix_1.1.0 with R-2.8.1 and could
> > > > run my analysis on our sge-cluster w/o any problems.
>
> > > > Now I have upgraded to R-2.11.1 and to aroma.affymetrix_1.6.2 and am
> > > > currently testing with 8 chips whether my package based on
> > > > aroma.affymetrix still works on the cluster. The normalization step on
> > > > a server ran fine; however, distributing the 8 samples on the
> > > > cluster to run GladModel() resulted in the problem that 3 of 8 cluster
> > > > nodes stopped with the following error message:
>
> > > > Loading required package: GLAD
> > > > ...
> > > > Loading required package: RColorBrewer
> > > > Loading required package: Cairo
> > > > Error in list(`computeCN(aroma, model = model, arrays = arrays[i],
> > > > chromosomes = 1:23, ref` = <environment>,  :
>
> > > > [2010-06-29 15:08:49] Exception: Cannot save to file. Temporary file
> > > > already exists: ~/.Rcache/aroma.affymetrix/idChecks/
> > > > a1c33926939ee43fbed83ae69301d215.tmp
> > > >  at throw(Exception(...))
> > > >  at throw.default("Cannot save to file. Temporary file already
> > > > exists: ", pathn
> > > >  at throw("Cannot save to file. Temporary file already exists: ",
> > > > pathnameT)
> > > >  at saveObject.default(list(key = key, keyIds = lapply(key, digest2),
> > > > id = id),
> > > >  at saveObject(list(key = key, keyIds = lapply(key, digest2), id =
> > > > id), idPathn
> > > >  at getAverageFile.AffymetrixCelSet(ces, force = force, verbose =
> > > > less(verbose)
> > > >  at NextMethod(generic = "getAverageFile", object = this, indices =
> > > > indices, ..
> > > >  at getAverageFile.ChipEffectSet(ces, force = force, verbose =
> > > > less(verbose))
> > > >  at NextMethod(generic = "getAverageFile", object = this, ...)
> > > >  at getAverageFile.SnpChipEffectSet(ces, force = force, verbose =
> > > > less(verbose)
> > > >  at NextMethod(generic = "getAverageFile", object = this, ...)
> > > >  at getAverageFile.CnChipEffectS
> > > > Calls: computeCN ... saveObject.default -> throw -> throw.default ->
> > > > throw -> throw.Exception
> > > > Execution halted
>
> > > > Interestingly, on the other 5 nodes GladModel() seems to run fine.
>
> > > > Do you have any idea what the reason for this problem might be?
>
> > > This seems to be due to a race condition, because several processes
> > > call getAverageFile() on the same data set (set of data files).  It
> > > has nothing to do with the GladModel - that is only calling
> > > getAverageFile() in order to calculate the average signal across all
> > > samples in the data set.
>
> > > More precisely, in this particular case it is saveObject() of R.utils
> > > that detects that there already exists a temporary file (added file
> > > name extension *.tmp) that is currently being created and written to
> > > by another process.  This temporary file is renamed to its final name
> > > when done.  The reason why you didn't observe it before is most likely
> > > because this additional feature was added to saveObject() in R.utils
> > > v1.2.4:
>
> > > Version: 1.2.4 [2009-10-30]
> > > o ROBUSTIFICATION: Lowered the risk for saveObject() to leave an
> > >   incomplete file due to say power failures etc.  This is done by
> > >   first writing to a temporary file, which is then renamed.  If the
> > >   temporary file already exists, an exception is thrown.
>
> > > Ok, that's the details explaining the error message and the traceback
> > > you report.
>
> > > So, did it work before?  Did you get valid estimates?  Probably,
> > > because the way getAverageFile() is written it is unlikely that a
> > > corrupt result file is created. What is certain is that the
> > > calculations were done multiple times if there were race conditions.
>
> > > I'd like to put out a little disclaimer: although I try to write
> > > methods so that they work even when there are race conditions, I am
> > > also, as you've noticed, very conservative; that is, I would rather
> > > detect the race condition and throw an exception than silently
> > > ignore it.  The plan is to loosen this up in the future. I just want
> > > to say this here so that you understand my current design
> > > decisions/plans.
>
> > > I have to think about this particular case, because I could loosen up
> > > getAverageFile() a bit, I think. However, at the moment it is better
> > > if you take care of the race conditions yourself.  Assume your current
> > > code looks something like this:
>
> > > fln <- FragmentLengthNormalization(ces);
> > > cesN <- process(fln);
> > > seg <- GladModel(cesN);
> > > process(seg);
>
> > > Then first you should know that the latter two lines are
> > > computationally identical to [it is only slightly more complicated if
> > > you use chip type pairs]:
>
> > > ceR <- getAverageFile(cesN);
> > > seg <- GladModel(cesN, ceR);
> > > process(seg);
>
> > > So, if you can synchronize the averaging by (conceptually only):
>
> > > mutex <- waitForMutex("foo");
> > > ceR <- getAverageFile(cesN);
> > > releaseMutex(mutex);
>
> > > then it should all be fine.  Replace waitForMutex()/releaseMutex()
> > > with your favorite synchronization mechanism.  FYI, if there would be
> > > a cross-platform bullet proof and generic synchronization mechanism in
> > > R, I would internally add synchronization to lots of methods.
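
One way such a pair of functions could be sketched in base R is with a
lock directory. This is an illustration under assumptions, not part of
aroma.* or R.utils: it relies on directory creation being atomic on the
shared file system, which is not guaranteed on every network file
system, and the arguments 'path', 'interval' and 'timeout' are made up
for the example:

```r
# Sketch: a lock directory used as a simple mutex. Assumes that creating
# a directory is atomic on the file system holding 'path'.
waitForMutex <- function(name, path = tempdir(), interval = 1,
                         timeout = 600) {
  lock <- file.path(path, sprintf("%s.lock", name))
  waited <- 0
  # dir.create() returns FALSE if the directory already exists, i.e.
  # if another process currently holds the lock
  while (!dir.create(lock, showWarnings = FALSE)) {
    if (waited >= timeout) {
      stop("Timed out waiting for lock: ", lock)
    }
    Sys.sleep(interval)
    waited <- waited + interval
  }
  lock
}

releaseMutex <- function(lock) {
  unlink(lock, recursive = TRUE)
  invisible(!file.exists(lock))
}

mutex <- waitForMutex("foo")
# ... the call ceR <- getAverageFile(cesN) would go here ...
released <- releaseMutex(mutex)
```

On a cluster, 'path' would have to point to a directory visible to all
nodes; the tempdir() default is only useful for trying the sketch out.
A process that dies while holding the lock leaves the directory behind,
so stale locks would need to be removed by hand.
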
>
> > > Hope this helps(?)
>
> > > Henrik
>
> > > >> sessionInfo()
> > > > R version 2.11.1 (2010-05-31)
> > > > x86_64-unknown-linux-gnu
>
> > > > locale:
> > > > [1] C
>
> > > > attached base packages:
> > > > [1] stats     graphics  grDevices
>
> ...
>

-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/
