Dear snpStats users,

I'm working with a large SnpMatrix object (roughly 5000 samples x 200K SNPs) and I've noticed that numeric indexing is extremely slow. As the timings below show, it takes over 1.5 seconds to retrieve a single cell, [1,1], from the SnpMatrix, versus effectively 0 seconds to access the same data point in raw format. Accessing an entire row or column, [1,] or [,1], takes no longer (still about 1.5 s).

Is snpStats::SnpMatrix doing something unnecessary before returning the matrix entry? [NB: 'chopsticks' seems to give the same slow result.] Is there any way around this delay other than copying the entire SnpMatrix into raw format? I want to access specific cell ranges many times in an algorithm I'm writing, and at 1.5 s per access this would be excessively slow.

Code to show this is below.

Many thanks,
N.

library(snpStats)

# generate raw matrix (5000 x 200000 = 10^9 cells)
rawd <- as.raw(sample(0:3, 10^9, replace = TRUE))
dim(rawd) <- c(5000, 200000)

# copy to a SnpMatrix object
snpd <- new("SnpMatrix", rawd)

# show class details
> class(snpd)
[1] "SnpMatrix"
attr(,"package")
[1] "snpStats"

# access times in SnpMatrix format
> system.time(snpd[1,])
   user  system elapsed
  0.876   0.681   1.554
> system.time(snpd[1,1])
   user  system elapsed
  0.872   0.668   1.538
> system.time(snpd[,1])
   user  system elapsed
  0.896   0.644   1.540

# access times in raw format
> system.time(rawd[1,])
   user  system elapsed
  0.012   0.004   0.011
> system.time(rawd[,1])
   user  system elapsed
      0       0       0
> system.time(rawd[1,1])
   user  system elapsed
      0       0       0