WGCNA maintainer here. When working with a large data set, you have a few options.
1. Without being snarky, the best option is to get (or get access to) a computer with large-enough RAM. Many universities, departments, and other research institutes have compute clusters with nodes that have at least 64 GB of memory. If you are working on your own computer under Windows, make sure you run the 64-bit version of R, and consider buying additional RAM if you can.

2. Reduce the number of features (usually probes/probesets). This is not always possible in general applications, but in gene expression studies of a single tissue one would not expect more than about 10k genes to be expressed, so using all probes of a modern array is probably overkill - most of them will not be expressed. If you do reduce the data, I recommend filtering out probes whose expression values are low in a suitable fraction of the samples (depending on the experimental design).

If you can't get a computer big enough to handle a reduced data set, the next options are these:

3. Use blockwiseModules with an appropriately set argument maxBlockSize. See WGCNA tutorial I, section 2c, at http://labs.genetics.ucla.edu/horvath/htdocs/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/index.html , and pay careful attention to the paragraphs discussing the choice of maxBlockSize. The function blockwiseModules can be instructed to save the TOM matrix for each block to disk; you can then load the matrices one by one later if you need them.

4. If you want to do the analysis on your own, use the function projectiveKMeans to pre-cluster the genes into blocks, then run the analysis in each block separately. Remember to remove all large objects from memory and call garbage collection to free up enough memory before going to the next block.

HTH,

Peter

On Fri, May 9, 2014 at 5:37 AM, KK <kidist.kib...@gmail.com> wrote:
> I am also working on co-expression analysis. It seems like there is no way
> to use TOMsimilarityFromExpr for large datasets.
> The option 'maxBlockSize' exists for module detection but not for
> TOMsimilarity? The only solution seems to be reducing the dataset.
>
> On Sunday, July 8, 2012 1:02:54 PM UTC+2, deeksha.malhan wrote:
>>
>> Hi,
>> I am working on co-expression analysis of a rice dataset with the help of
>> WGCNA and R, but I am now stuck at a point where R shows the error below:
>>
>> dissTOM = 1-TOMsimilarityFromExpr(datExpr, power = 8);
>> Error: cannot allocate vector of size 2.8 Gb
>> In addition: Warning messages:
>> 1: In matrix(0, nGenes, nGenes) :
>>   Reached total allocation of 2550Mb: see help(memory.size)
>> 2: In matrix(0, nGenes, nGenes) :
>>   Reached total allocation of 2550Mb: see help(memory.size)
>> 3: In matrix(0, nGenes, nGenes) :
>>   Reached total allocation of 2550Mb: see help(memory.size)
>> 4: In matrix(0, nGenes, nGenes) :
>>   Reached total allocation of 2550Mb: see help(memory.size)
>>
>> Help me to resolve this problem.
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Error-during-working-with-wgcna-and-R-tp4635768.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> r-h...@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
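
P.S. Options 2 and 3 above can be sketched in a few lines of R. The filtering rule (25% quantile cutoff, "expressed in at least half the samples") and the maxBlockSize value are hypothetical illustrations - adapt them to your data and your available RAM:

```r
library(WGCNA)

# Option 2: filter out probes with low expression in most samples.
# datExpr is assumed to be a samples x probes matrix/data frame;
# the cutoff and the "half the samples" rule are placeholders.
cutoff = quantile(as.matrix(datExpr), probs = 0.25)
keep   = colSums(datExpr > cutoff) >= nrow(datExpr)/2
datExprFiltered = datExpr[, keep]

# Option 3: block-wise network construction and module detection.
# Choose maxBlockSize to fit your RAM (see tutorial I, section 2c);
# saveTOMs = TRUE writes each block's TOM to disk for later use.
net = blockwiseModules(datExprFiltered, power = 8,
                       maxBlockSize = 5000,
                       saveTOMs = TRUE,
                       saveTOMFileBase = "riceTOM")

# The TOM for each block can then be re-loaded one at a time, e.g.:
# load("riceTOM-block.1.RData")
```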