Dear Henrik,

Thank you for this extensive explanation, and sorry for the late reply; I was pretty busy.

Yes, it did work before! As I mentioned, with aroma.affymetrix_1.1.0 and earlier versions I never had a problem running the analyses on cluster nodes.

Looking at the source code of different versions of saveObject(), I realize that using "saveObject(..,safe=FALSE)" would be the same as using saveObject() from R.utils_0.9.1. Thus, in principle, this could solve my problem. Is this correct?

Sadly, the method AffymetrixCelSet::getAverageFile() in aroma.affymetrix_1.6.2 does not allow passing the parameter "safe=FALSE" to saveObject(). Would it be possible for you to change this?

It is still not clear to me why you first create a temporary file which you then rename (although you mention power failures etc.). However, would it be possible to add a random number to the temporary filename, e.g. "*.tmp.1948234", so that the problem with the existing temporary file could be avoided? Probably you only need to change line 59 to:

pathnameT <- sprintf("%s.tmp.%i", pathname, as.integer(runif(1,1,99999999)))

Regarding your suggestion to wrap getAverageFile() in mutex calls, I do not know whether an R package exists for this purpose. Neither Rmpi nor snow seems to be suitable (at least not without a complete rewrite of my package).

One other question: Is it allowed to delete the contents of the directory .Rcache/aroma.affymetrix/idChecks?

Best regards
Christian

On Jul 2, 12:47 am, Henrik Bengtsson <h...@stat.berkeley.edu> wrote:
> Hi Christian.
>
> On Tue, Jun 29, 2010 at 3:39 PM, cstratowa
> <christian.strat...@vie.boehringer-ingelheim.com> wrote:
> > Dear Henrik,
> >
> > Until now I have used aroma.affymetrix_1.1.0 with R-2.8.1 and could
> > run my analysis on our sge-cluster w/o any problems.
> >
> > Now I have upgraded to R-2.11.1 and to aroma.affymetrix_1.6.2 and am
> > currently testing with 8 chips whether my package based on
> > aroma.affymetrix still works on the cluster.
> > The normalization step on a server did run fine, however, distributing
> > the 8 samples on the cluster to run GladModel() resulted in the problem
> > that 3 of 8 cluster nodes did stop with the following error message:
> >
> > Loading required package: GLAD
> > ...
> > Loading required package: RColorBrewer
> > Loading required package: Cairo
> > Error in list(`computeCN(aroma, model = model, arrays = arrays[i],
> > chromosomes = 1:23, ref` = <environment>, :
> >
> > [2010-06-29 15:08:49] Exception: Cannot save to file. Temporary file
> > already exists: ~/.Rcache/aroma.affymetrix/idChecks/
> > a1c33926939ee43fbed83ae69301d215.tmp
> > at throw(Exception(...))
> > at throw.default("Cannot save to file. Temporary file already
> > exists: ", pathn
> > at throw("Cannot save to file. Temporary file already exists: ",
> > pathnameT)
> > at saveObject.default(list(key = key, keyIds = lapply(key, digest2),
> > id = id),
> > at saveObject(list(key = key, keyIds = lapply(key, digest2), id =
> > id), idPathn
> > at getAverageFile.AffymetrixCelSet(ces, force = force, verbose =
> > less(verbose)
> > at NextMethod(generic = "getAverageFile", object = this, indices =
> > indices, ..
> > at getAverageFile.ChipEffectSet(ces, force = force, verbose =
> > less(verbose))
> > at NextMethod(generic = "getAverageFile", object = this, ...)
> > at getAverageFile.SnpChipEffectSet(ces, force = force, verbose =
> > less(verbose)
> > at NextMethod(generic = "getAverageFile", object = this, ...)
> > at getAverageFile.CnChipEffectS
> > Calls: computeCN ... saveObject.default -> throw -> throw.default ->
> > throw -> throw.Exception
> > Execution halted
> >
> > Interestingly, on the other 5 nodes GladModel() seems to run fine.
> >
> > Do you have any idea what the reason for this problem might be?
>
> This seems to be due to a race condition, because several processes
> call getAverageFile() on the same data set (set of data files).
> It has nothing to do with the GladModel - that is only calling
> getAverageFile() in order to calculate the average signal across all
> samples in the data set.
>
> More precisely, in this particular case it is saveObject() of R.utils
> that detects that there already exists a temporary file (added file
> name extension *.tmp) that is currently being created and written to
> by another process. This temporary file is renamed to its final name
> when done. The reason why you didn't observe it before is most likely
> because this additional feature was added to saveObject() in R.utils
> v1.2.4:
>
> Version: 1.2.4 [2009-10-30]
> o ROBUSTIFICATION: Lowered the risk for saveObject() to leave an
>   incomplete file due to say power failures etc. This is done by
>   first writing to a temporary file, which is then renamed. If the
>   temporary file already exists, an exception is thrown.
>
> Ok, those are the details explaining the error message and the
> traceback you report.
>
> So, did it work before? Did you get valid estimates? Probably,
> because the way getAverageFile() is written it is unlikely that a
> corrupt result file is created. What is certain is that the
> calculations were done multiple times if there were race conditions.
>
> I'd like to put out a little disclaimer: I try to write methods so
> that they work even when there are race conditions. However, as
> you've noticed, I am also very conservative, that is, I'd rather
> detect the race condition and throw an exception than silently
> ignore it. The plan is to loosen this up in the future. I just like
> to say this here so that you understand my current design
> decisions/plans.
>
> I have to think about this particular case, because I could loosen up
> getAverageFile() a bit, I think. However, at the moment it is better
> if you take care of the race conditions yourself.
> Assume your current code looks something like this:
>
> fln <- FragmentLengthNormalization(ces);
> cesN <- process(fln);
> seg <- GladModel(cesN);
> process(seg);
>
> Then first you should know that the latter two lines are
> computationally identical to [it is only slightly more complicated if
> you use chip type pairs]:
>
> ceR <- getAverageFile(cesN);
> seg <- GladModel(cesN, ceR);
> process(seg);
>
> So, if you can synchronize the averaging by (conceptually only):
>
> mutex <- waitForMutex("foo");
> ceR <- getAverageFile(cesN);
> releaseMutex(mutex);
>
> then it should all be fine. Replace waitForMutex()/releaseMutex()
> with your favorite synchronization mechanism. FYI, if there were
> a cross-platform, bulletproof and generic synchronization mechanism in
> R, I would internally add synchronization to lots of methods.
>
> Hope this helps(?)
>
> Henrik
>
> > > sessionInfo()
> > R version 2.11.1 (2010-05-31)
> > x86_64-unknown-linux-gnu
> >
> > locale:
> > [1] C
> >
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> >
> > other attached packages:
> > [1] Cairo_1.4-5 RColorBrewer_1.0-2 GLAD_2.10.0
> > [4] biasnp_0.2.54 RODBC_1.3-1 aroma.affymetrix_1.6.2
> > [7] aroma.apd_0.1.7 affxparser_1.20.0 R.huge_0.2.0
> > [10] aroma.core_1.6.2 aroma.light_1.16.0 matrixStats_0.2.1
> > [13] R.rsp_0.3.6 R.cache_0.3.0 R.filesets_0.8.2
> > [16] digest_0.4.2 R.utils_1.4.2 R.oo_1.7.3
> > [19] R.methodsS3_1.2.0
> >
> > loaded via a namespace (and not attached):
> > [1] tools_2.11.1
> > Warning message:
> > 'DESCRIPTION' file has 'Encoding' field and re-encoding is not
> > possible
> >
> > Best regards
> > Christian
> >
> > P.S.:
> > Is Google groups still the place to post questions?
> >
> > --
> > When reporting problems on aroma.affymetrix, make sure 1) to run the latest
> > version of the package, 2) to report the output of sessionInfo() and
> > traceback(), and 3) to post a complete code example.
> >
> > You received this message because you are subscribed to the Google Groups
> > "aroma.affymetrix" group with website http://www.aroma-project.org/.
> > To post to this group, send email to aroma-affymetrix@googlegroups.com
> > To unsubscribe and other options, go to http://www.aroma-project.org/forum/
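
For reference, the waitForMutex()/releaseMutex() placeholders in the quoted message could be approximated with a lock directory on a filesystem that all cluster nodes share. The following is only a minimal sketch under stated assumptions, not something provided by R.utils or aroma.affymetrix: the function bodies, the withMutex() wrapper, and the arguments lockRoot, timeout and poll are hypothetical. It also assumes that directory creation on the shared filesystem is atomic, which should be checked for the cluster's setup.

```r
# Hypothetical helpers (not part of R.utils or aroma.affymetrix).
# A lock is represented by a directory; dir.create() returns TRUE only
# for the process that actually created it.
waitForMutex <- function(name, lockRoot, timeout = 600, poll = 0.5) {
  # lockRoot must already exist and be visible to all processes
  lockDir <- file.path(lockRoot, paste(name, ".lock", sep = ""))
  deadline <- Sys.time() + timeout
  while (!dir.create(lockDir, showWarnings = FALSE)) {
    if (Sys.time() > deadline) {
      stop("Timed out waiting for lock: ", lockDir)
    }
    Sys.sleep(poll)  # another process holds the lock; try again later
  }
  lockDir
}

releaseMutex <- function(lockDir) {
  unlink(lockDir, recursive = TRUE)
  invisible(!file.exists(lockDir))
}

# Evaluate 'expr' while holding the lock; release it even on error.
withMutex <- function(name, expr, ...) {
  mutex <- waitForMutex(name, ...)
  on.exit(releaseMutex(mutex))
  force(expr)  # 'expr' is evaluated lazily, i.e. only after the lock is held
}

# Usage on the cluster (commented out because it needs the data set;
# the lock directory shown here is an assumption):
# ceR <- withMutex("getAverageFile", getAverageFile(cesN),
#                  lockRoot = "~/.Rcache")
```

One caveat of this design: if a process is killed while holding the lock, the lock directory stays behind and has to be removed by hand, otherwise the other nodes wait until the timeout.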