From your description it very much sounds like creating a new package is the way to go.

On Thu, Oct 24, 2019 at 3:03 PM Pages, Herve <[email protected]> wrote:

> Hi Panagiotis,
>
> Avoiding code repetition is always a good idea. An alternative to the
> creation of a 3rd package would be to have one of the 2 packages depend
> on the other. If that is not a good option (and there might be some
> valid reasons for that) then yes, factorizing out the repeated stuff and
> putting it in a 3rd package is a good option.
>
> Note that your subject line is confusing: You're asking opinions on a
> meta-annotation package but IIUC this is about the creation of a
> **software** package that would provide tools for building and/or
> querying a certain type of annotations right? I think of a
> meta-annotation package as a data package that would contain searchable
> meta data about existing biological annotations but that is not what we
> are talking about here is it?
>
> Also I wonder how much overlap there would be between this new package
> and packages like AnnotationDbi, AnnotationForge, GenomicFeatures,
> ensembldb which also provide functionalities for creating and querying
> annotations. For example AnnotationForge and AnnotationDbi are used to
> create and query the hundreds of "classic" *db packages.
>
> Best,
> H.
>
> On 10/20/19 19:56, Panagiotis Moulos wrote:
> > Dear developers,
> >
> > I maintain two packages (metaseqR, recoup) and about to submit an
> > enhanced (but different in many points, thus a new package) version of
> > the 1st (metaseqR2). During their course of development, maintenance
> > and usage, these packages have somehow come to use a common underlying
> > annotation system for the genomic regions they operate on, which of
> > course makes use of Bioconductor facilities and of course structures
> > (GenomicRanges, GenomicAlignments, BSgenome, GenomicFeatures etc.)
> >
> > This annotation system:
> > - Builds a local SQLite database
> > - Supports certain "custom" genomic features which are required for the
> > modeling made by these packages
> > - Is currently embedded to each package
> > - Has almost evolved to a package of its own with respect to independent
> > functionalities
> >
> > The reason for this mail/question is that I would like to ask your
> > opinion whether it is worthy to create a new package to host the
> > annotation functions and detach from the other two. Some points to
> > support this idea:
> >
> > 1. It's used in the same manner by two other packages, thus there is a
> > lot of code repetition
> > 2. Users (including myself) often load one of these packages just to use
> > it to fetch genomic region annotations for other purposes outside the
> > scope of each package (metaseqR - RNA-Seq data analysis, recoup - NGS
> > signal visualization).
> > 3. It automatically constructs the required annotation regions to
> > analyze Lexogen Quant-Seq data (a protocol we are using a lot), a
> > function which may be useful to many others
> > 4. The database created can be expanded with custom user annotations
> > using a GTF file to create it (making use of makeTxDbFromGFF)
> > 5. Supports various annotation sources (Ensembl, UCSC, RefSeq, custom)
> > in one place
> > 6. Has a versioning system, allowing transparency and reproducibility
> > when required
> >
> > Some (maybe obvious) points against this idea:
> >
> > 1. Bioconductor has already a robust and rich genomic annotation system
> > which can be used and re-used as necessary
> > 2. Maybe there is no need for yet another annotation-related package
> > 3. There is possibly no wide acceptance for such a package, other than
> > my usage in the other two, and maybe a few more users that make use of
> > the annotation functionalities
> > 4. Does not follow standard Bioconductor guidelines for creating
> > annotation packages (on the other hand it's not an annotation package
> > in the strict sense, but more a meta-annotation package).
> >
> > Do you have any thoughts or opinions on the best way of action?
> >
> > Best regards,
> >
> > Panagiotis
> >
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: [email protected]
> Phone: (206) 667-5791
> Fax: (206) 667-1319
> _______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

--
Best,
Kasper
