Re: [aroma.affymetrix] Speeding up RmaBackgroundCorrection
Hi Henrik,

Lowering memory helped - it's drinks on me when we meet. It has been running for approximately 7 days now (ETA for unit type 'expression': 20140320 23:20:26). With the updated affxparser, can I speed it up by increasing the memory/ram setting? The current memory allocation is approximately 3 GB. I don't want to cancel the run if it means losing the progress, though.

Best,
Damian

On Tuesday, March 4, 2014 8:32:17 PM UTC-5, Henrik Bengtsson wrote: [quoted history trimmed; see the full messages below]
Re: [aroma.affymetrix] Speeding up RmaBackgroundCorrection
On Fri, Mar 7, 2014 at 8:05 AM, Damian Plichta wrote: [...]

PLM fitting is done in chunks of units. When a new chunk starts, the estimates of the previous one are guaranteed to have been saved to disk. After that point you can interrupt the processing at any time and safely restart: all previously processed chunks will be skipped, and only the chunk that was interrupted has to be redone from scratch.

When you increase the memory/ram option, the chunks become bigger, that is, more units are processed per chunk. Given a fixed memory/ram setting, the number of units per chunk goes down as the number of arrays increases, e.g. doubling the number of arrays will halve the number of units processed per chunk.

Increasing memory/ram makes a big difference particularly when there are only a small number of units per chunk. Then there is a relatively large disk I/O overhead for reading probe intensities and storing parameter estimates. This is mainly because the file system cannot possibly cache the content of all of the thousands of arrays, i.e. it reads a few units of one array, then goes to the next array, and so on. Also, the more the probes are scattered across the array, the more they are also scattered in the CEL files, meaning that when reading those units from one file, the file system has to skip through a large portion of the file. (Skipping is cheap, but it is still more efficient to read nearby data than scattered data, and nearby reads are more likely to hit the file cache.)
When you increase memory/ram you read more units per pass, and therefore you lower the fraction of skipped bytes versus read bytes. This is what I believe brings the most speedup when increasing memory/ram. Eventually, I *think*, this payoff becomes relatively small and there is little or no need to increase memory/ram further.

So, yes, you can safely interrupt your script, update affxparser, increase memory/ram, restart R, and restart your script. After each chunk is completed, some timing statistics on read, write, and fitting overhead are reported in addition to the ETA estimate. Have a look at those to see whether changing the settings makes a difference. Please report back to share your experience - there are some other user benchmarks related to this on http://aroma-project.org/howtos/ImproveProcessingTime

/Henrik
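The scaling Henrik describes (chunk size proportional to memory/ram, inversely proportional to the number of arrays) can be sketched with a toy model. The formula and the `base_units` constant below are purely illustrative and are not aroma.affymetrix's actual internal computation:

```python
# Toy model (illustrative only): a per-chunk memory budget that scales
# with the memory/ram option is divided among the arrays, so
# units per chunk ~ ram / n_arrays.
def units_per_chunk(ram, n_arrays, base_units=100_000):
    return max(1, int(base_units * ram / n_arrays))

# Doubling memory/ram doubles the chunk size...
assert units_per_chunk(2.0, 100) == 2 * units_per_chunk(1.0, 100)
# ...while doubling the number of arrays halves it.
assert units_per_chunk(1.0, 200) == units_per_chunk(1.0, 100) // 2
```

With 5622 arrays, even a generous memory/ram setting leaves comparatively few units per chunk, which is why the per-chunk I/O overhead dominates in this thread's scenario.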
Re: [aroma.affymetrix] Speeding up RmaBackgroundCorrection
Did lowering memory/ram solve your problem? Also, an updated version of affxparser that should no longer overflow in the integer multiplication is available (on Bioconductor).

Cheers,
Henrik

On Thu, Feb 27, 2014 at 12:36 PM, Henrik Bengtsson wrote: [quoted history trimmed; see the full messages below]
Re: [aroma.affymetrix] Speeding up RmaBackgroundCorrection
Congratulations Damian, I think you're the first one to hit a limit of the Aroma Framework (remind me to buy you a drink whenever you see me in person). I narrowed it down to the affxparser(*) package and I'll investigate further how to fix this. It should not occur, and I'm confident that it can be avoided internally. In the meanwhile, try to lower your 'memory/ram' setting, e.g.

setOption(aromaSettings, "memory/ram", 10.0)

or less. I'm not 100% sure it'll help, but if it does, that's a good clue (for me) on what's causing it.

/Henrik

DETAILS: The below illustrates the issue in affxparser::readCelUnits():

> .Machine$integer.max
[1] 2147483647
> nbrOfArrays <- 5622L
> .Machine$integer.max / nbrOfArrays
[1] 381978.6
> nbrOfCells <- 381978L
> nbrOfCells * nbrOfArrays
[1] 2147480316
> nbrOfCells <- 381979L
> nbrOfCells * nbrOfArrays
[1] NA
Warning message:
In nbrOfCells * nbrOfArrays : NAs produced by integer overflow

By decreasing 'memory/ram' I *hope* that 'nbrOfCells' effectively becomes smaller.

On Wed, Feb 26, 2014 at 9:15 PM, Damian Plichta wrote: [quoted history trimmed; see the full message below]
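The boundary in the R session above follows directly from 32-bit integer arithmetic (R's integer type is 32-bit, so products above 2^31 - 1 become NA). The same boundary can be checked in any language; a small standalone sketch:

```python
# Reproduce the boundary arithmetic behind the readCelUnits() overflow
# for 5622 arrays. INT32_MAX matches .Machine$integer.max in R.
INT32_MAX = 2**31 - 1          # 2147483647
nbr_of_arrays = 5622

# Largest per-chunk cell count whose product still fits in int32:
max_cells = INT32_MAX // nbr_of_arrays
print(max_cells)                            # 381978
print(max_cells * nbr_of_arrays)            # 2147480316, still representable
print((max_cells + 1) * nbr_of_arrays)      # 2147485938, past the int32 limit
```

Python integers do not overflow, so the last product is simply larger than `INT32_MAX`; in R the same multiplication of two 32-bit integers yields NA with an overflow warning, which is what `vector("double", nbrOfCells * nbrOfArrays)` then chokes on.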
Re: [aroma.affymetrix] Speeding up RmaBackgroundCorrection
Hi Henrik,

Thank you, that was helpful. I ran into another problem, though. I am trying to perform ExonRmaPlm(csQN, mergeGroups=TRUE), but this produces the following error:

20140226 23:25:33| Identifying CDF cell indices...done
Error in vector("double", nbrOfCells * nbrOfArrays) :
  vector size cannot be NA
In addition: Warning message:
In nbrOfCells * nbrOfArrays : NAs produced by integer overflow
20140226 23:28:35| Reading probe intensities from 5622 arrays...done
20140226 23:28:35| Fitting chunk #1 of 1 of 'expression' units (code=1) with various dimensions...done
20140226 23:28:35| Unit dimension #3 (various dimensions) of 3...done
20140226 23:28:35| Fitting the model by unit dimensions (at least for the large classes)...done
20140226 23:28:35| Unit type #1 ('expression') of 1...done
20140226 23:28:35| Fitting ExonRmaPlm for each unit type separately...done
20140226 23:28:35| Fitting model of class ExonRmaPlm...done

I tested whether it worked anyway, but the expression is zero across all arrays when I access it. Do you know what could be causing the problem?
Best,
Damian

The code I run is below:

library(aroma.affymetrix)
library(aroma.core)
setOption(aromaSettings, "memory/ram", 500.0)
verbose <- Arguments$getVerbose(-8, timestamp=TRUE)
chipType <- "HuEx-1_0-st-v2-core"
cdf <- AffymetrixCdfFile$byChipType(chipType)
#print(cdf)
cs <- AffymetrixCelSet$byName("experiment1", cdf=cdf)
bc <- RmaBackgroundCorrection(cs)
csBC <- process(bc, verbose=verbose)
qn <- QuantileNormalization(csBC, typesToUpdate="pm")
target <- getTargetDistribution(qn, verbose=verbose)
qn <- QuantileNormalization(csBC, typesToUpdate="pm", targetDistribution=target)
csQN <- process(qn, verbose=verbose)
csPLM <- ExonRmaPlm(csQN, mergeGroups=TRUE)
fit(csPLM, verbose=verbose)
date()
ces <- getChipEffectSet(csPLM)
gExprs <- extractDataFrame(ces, units=1:3, addNames=TRUE)

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=C                 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] preprocessCore_1.23.0   aroma.light_1.31.8 matrixStats_0.8.14
 [4] aroma.affymetrix_2.11.1 aroma.core_2.11.0  R.devices_2.8.2
 [7] R.filesets_2.3.0        R.utils_1.29.8     R.oo_1.17.0
[10] affxparser_1.34.0       R.methodsS3_1.6.1

loaded via a namespace (and not attached):
[1] aroma.apd_0.4.0 base64enc_0.1-1 digest_0.6.4 DNAcopy_1.35.1
[5] PSCBS_0.40.4    R.cache_0.9.2   R.huge_0.6.0 R.rsp_0.9.28
[9] tools_3.0.2

On Thursday, February 20, 2014 1:21:25 PM UTC-5, Henrik Bengtsson wrote:

On Tue, Feb 18, 2014 at 7:30 PM, Damian Plichta wrote:

Thanks, that helped a lot. It took me less than 3 hours to perform the background correction. Now I'm wondering if, for the next step, quantile normalization, I could do a similar trick.
Is there a way to precompute the target empirical distribution based on all arrays and then do the normalization on chunks of data (thus in an independent manner)? I can see the option targetDistribution under QuantileNormalization.

# Calculate the target distribution based on *all* arrays [not parallelized]
qn <- QuantileNormalization(dsC, typesToUpdate="pm")
target <- getTargetDistribution(qn, verbose=verbose)

# Normalize array by array toward the same target distribution [in chunks]
dsCs <- extract(dsC, 1:100)
qn <- QuantileNormalization(dsCs, typesToUpdate="pm", targetDistribution=target)
csNs <- process(qn, verbose=verbose)

Hope this helps

/Henrik

Kind regards,
Damian Plichta

On Monday, February 17, 2014 4:03:54 PM UTC-5, Henrik Bengtsson wrote:

Hi. On Sun, Feb 16, 2014 at 6:53 PM, Damian Plichta wrote:

Hi, I'm processing around 5500 Affymetrix exon arrays. RmaBackgroundCorrection() is pretty slow, 1-2 minutes/array. I played with setOption(aromaSettings, "memory/ram", X) and increased X up to 100, but it didn't have any effect on this stage of the analysis.

If you don't notice any difference in processing time by changing memory/ram from the default (1.0) to 100, then memory is not your bottleneck.

Any way to speed the process up?

If you haven't already, make sure to read 'How to: Improve processing time': http://aroma-project.org/howtos/ImproveProcessingTime  If you have access to multiple machines on the same file system, you can do poor man's parallel processing for the *background correction*, because each array is corrected independently of the others. You can do this by
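The normalization snippet above processes arrays 1:100 against the precomputed target; covering all arrays means repeating it over consecutive index blocks. A generic sketch of that chunking (a hypothetical helper, not part of aroma.affymetrix; it only produces the 1-based index ranges you would pass to extract()):

```python
# Hypothetical helper: split n_arrays array indices (1-based, matching
# R's extract(dsC, 1:100)) into consecutive blocks of at most chunk_size.
def array_chunks(n_arrays, chunk_size=100):
    return [list(range(start, min(start + chunk_size, n_arrays + 1)))
            for start in range(1, n_arrays + 1, chunk_size)]

parts = array_chunks(5622, 1000)
print(len(parts))        # 6 chunks for the thread's 5622 arrays
print(len(parts[-1]))    # last chunk holds the remaining 622
```

Because each chunk is normalized toward the same shared target distribution, the chunks are independent and could be run as separate jobs, in the same "poor man's parallel processing" spirit as the background-correction advice above.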