Hello,

I have a question about the most efficient way to handle my use case.

I have obtained a matchMatrix from an overlap query, then split it by region:

regionSiteMap <- findOverlaps(regions, sites)@matchMatrix
indexList <- split(regionSiteMap[, "subject"], regionSiteMap[, "query"])

Now, for each region, I'd like to use its site indices to look up the 
sites' scores in a vector and take the mean, like this:

means <- sapply(indexList, function(indices) mean(scoreVect[indices]))
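
To make the setup concrete, here is a toy version of the same computation. 
The match matrix and the scores are made-up values standing in for the real 
findOverlaps() result and score vector; only the names indexList and 
scoreVect come from the code above.

```r
# Toy stand-in for the match matrix: 3 regions, 6 sites (made-up values).
regionSiteMap <- cbind(query   = c(1, 1, 2, 3, 3, 3),
                       subject = c(1, 2, 3, 4, 5, 6))

# One made-up score per site.
scoreVect <- c(0.2, 0.4, 1.0, 0.3, 0.6, 0.9)

# Group the site indices by region, then average the scores per region.
indexList <- split(regionSiteMap[, "subject"], regionSiteMap[, "query"])
means <- sapply(indexList, function(indices) mean(scoreVect[indices]))
```

Here means is a named numeric vector with one element per region that has 
at least one overlapping site.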

The problem is that I have ~ 8 million 'regions' and ~ 28 million 'sites'. 
So indexList is a list of ~ 8 million elements with a few indices in each 
one, and scoreVect is a numeric vector of scores of length ~ 28 million.
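
One direction that might avoid building the list at all is a grouped sum 
with base R's rowsum(), dividing by the per-region counts. This is only a 
sketch on made-up toy data, not something timed on the full data set; the 
variable names query, subject, sums and counts are just for illustration.

```r
# Toy stand-ins for the two columns of the match matrix (made-up values).
query   <- c(1, 1, 2, 3, 3, 3)   # region index of each overlap
subject <- c(1, 2, 3, 4, 5, 6)   # site index of each overlap
scoreVect <- c(0.2, 0.4, 1.0, 0.3, 0.6, 0.9)

# Sum the scores per region, and count the overlaps per region.
# Using rowsum() for both keeps the two results in the same group order.
sums   <- rowsum(scoreVect[subject], query)
counts <- rowsum(rep(1, length(query)), query)

# Per-region mean, named by region index.
means <- setNames(sums[, 1] / counts[, 1], rownames(sums))
```

This works on two long vectors instead of calling a function once per 
region, but I do not know how it behaves at this scale.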

Can anyone suggest the fastest way to do this?

--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
