Just a note that in this particular case, selfmatch(annotatedsrf) would be a fast way to generate a grouping vector, like plyranges::group_by(annotatedsrf, selfmatch(annotatedsrf)).
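To illustrate, here is a minimal sketch on a toy GRanges. The object `gr` and its gene ids are made up for this example and stand in for annotatedsrf; it assumes the GenomicRanges and S4Vectors packages are installed. selfmatch() compares the ranges themselves (seqnames, start, end, strand) and ignores metadata columns, so identical ranges end up with the same group id:

```r
# Minimal sketch, not the plyranges internals.
# Assumes GenomicRanges (which brings in IRanges and S4Vectors) is installed.
library(GenomicRanges)

# Hypothetical toy data: ranges 1 and 3 are identical, as they would be
# after a left overlap join that hits two genes for the same range.
gr <- GRanges(
    seqnames = c("chr1", "chr1", "chr1", "chr2"),
    ranges   = IRanges(start = c(100, 200, 100, 100), width = 50),
    strand   = c("+", "+", "+", "-"),
    gene_id  = c("geneA", "geneB", "geneC", NA)
)

# For each range, the index of its first identical occurrence.
grp <- S4Vectors::selfmatch(gr)

# Use the grouping vector directly, e.g. to collapse gene ids per range.
collapsed <- tapply(
    gr$gene_id, grp,
    function(ids) paste(ids[!is.na(ids)], collapse = ";")
)
```

Here `grp` can be passed on as the grouping variable, so no grouping over seqnames, start, end and strand is needed.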
Michael

On Wed, Oct 16, 2019 at 2:48 AM Bhagwat, Aditya <aditya.bhag...@mpi-bn.mpg.de> wrote:

> Hi Stuart, Michael,
>
> Your plyranges package is really cool - now I am using it for left joining
> GRanges (I am facing a minor issue there
> <https://support.bioconductor.org/p/125623/>, but that is not the topic
> of this email - I have been asked by Lori not to double-post :-)).
>
> This email is about the plyranges functionality for grouping GRanges.
> That is cool, but I found that it does not perform well for large numbers
> of ranges.
> My R session hangs when I do:
>
>     bedfile <- paste0('https://gitlab.gwdg.de/loosolab/software/multicrispr/wikis',
>                       '/uploads/a51e98516c1e6b71441f5b5a5f741fa1/SRF.bed')
>     srfranges <- rtracklayer::import.bed(bedfile, genome = 'mm10')
>     txdb <- TxDb.Mmusculus.UCSC.mm10.ensGene::TxDb.Mmusculus.UCSC.mm10.ensGene
>     generanges <- GenomicFeatures::genes(txdb)
>     annotatedsrf <- plyranges::join_overlap_left(srfranges, generanges)
>     plyranges::group_by(annotatedsrf, seqnames, start, end, strand)
>
> For my purposes, I worked around it by performing the grouping in data.table:
>
>     data.table::as.data.table(annotatedsrf)[
>         !is.na(gene_id),
>         gene_id := paste0(gene_id, collapse = ';'),
>         by = c('seqnames', 'start', 'end', 'strand')]
>
> I was wondering whether, in general, it would be useful to have a
> data.table-based backend for plyranges::group_by().
> And whether all of this is actually a non-issue caused by my improper use
> of plyranges::group_by.
>
> Thank you for your feedback :-)
>
> Aditya

--
Michael Lawrence
Scientist, Bioinformatics and Computational Biology
Genentech, A Member of the Roche Group

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel