Breaker breaker.
1) A package to define a base (virtual) "Interactions" class. This would
basically have a constant "Vector" store with a "Hits" object specifying
the pairwise interactions between elements in the constant store. One
could also distinguish between "SelfInteractions" (constant store) and
the more general "Interactions" (two stores, possibly of different
types, e.g., genomic interval -> protein interactions). A variety of
methods would be available here to do manipulations and such.
https://github.com/LTLA/IndexedRelations (WIP)
2) A package to define an "Interactions" subclass where the store is a
genomic interval, with basic methods to operate on such classes. Methods
such as findOverlaps(), linkOverlaps() and boundingBox() would probably
go here. @Luke, a binning method could also conceivably go here.
https://github.com/ComputationalRegulatoryGenomicsICL/GenomicInteractions/issues/37
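To make the proposed layout concrete, here is a rough, language-neutral sketch in plain Python. It is not the IndexedRelations or GenomicInteractions API: the class name, the field names (left, right, hits) and the helper self_interactions() are all hypothetical and chosen only for illustration. The idea shown is that the stores are held constant and each interaction is just a pair of indices into them.

```python
from dataclasses import dataclass
from typing import Generic, Sequence, Tuple, TypeVar

S = TypeVar("S")
T = TypeVar("T")


@dataclass(frozen=True)
class Interactions(Generic[S, T]):
    """Two constant stores plus index pairs ("hits") linking their elements.

    Hypothetical layout for illustration only.
    """
    left: Sequence[S]
    right: Sequence[T]
    hits: Tuple[Tuple[int, int], ...]  # (index into left, index into right)

    def __post_init__(self):
        # Every hit must point at an existing element of each store.
        for i, j in self.hits:
            if not (0 <= i < len(self.left) and 0 <= j < len(self.right)):
                raise IndexError(f"hit ({i}, {j}) points outside the stores")

    def pairs(self):
        """Materialise each hit as the pair of elements it refers to."""
        return [(self.left[i], self.right[j]) for i, j in self.hits]


def self_interactions(store, hits):
    """Special case: both sides of every hit index the same store."""
    return Interactions(store, store, tuple(hits))


# Genomic intervals as (chrom, start, end) tuples; purely made-up data.
regions = [("chr1", 1, 100), ("chr1", 201, 300), ("chr2", 1, 50)]
within = self_interactions(regions, [(0, 1), (0, 2)])
mixed = Interactions(regions, ("proteinA", "proteinB"), ((2, 0),))
```

Because the hits are only integers, subsetting or reordering interactions never copies the stores, which is presumably the appeal of keeping them constant.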
All of this is open for discussion, if people are interested and willing
to volunteer. These changes will not make the next release anyway.
What he said.
-A
On 22/03/2019 19:54, Aaron Lun wrote:
Hi Luke,
Do you mean bins or bin pairs?
If you want to just bin the coverage in terms of the linear genome,
there should be ways to do that outside of InteractionSet or
GenomicInteractions. This is just dealing with standard genomic
interval data; extract the anchor coordinates and plug it in elsewhere.
If you want to collate region pairs into bin pairs, I don't know of a
dedicated function to do this from a GInteractions object (diffHic
only does this from raw read data). You'll need to figure out what to
do with regions that cross bin boundaries.
The simplest way to mimic this behaviour right now is to generate
another GInteractions object containing ALL POSSIBLE bin pairs (use
combn with a constant set of bin regions) and plug that into
countOverlaps. This will generate loads of zeroes, though, so it is not
the most efficient way to do this. You could get a sparser form with
linkOverlaps but this requires more work to get the counts.
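The logic of the "all possible bin pairs" approach can be sketched in plain Python. This is not the InteractionSet API: overlaps() and count_all_bin_pairs() are hypothetical helpers, intervals are assumed to be (chrom, start, end) tuples with 1-based closed coordinates, and itertools.combinations stands in for combn. Like combn(n, 2), it yields pairs of distinct bins only, so within-bin pairs are not included here.

```python
from itertools import combinations


def overlaps(a, b):
    """True if two (chrom, start, end) intervals (1-based, closed) overlap."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def count_all_bin_pairs(bins, interactions):
    """Dense counts: one entry for every unordered pair of distinct bins."""
    counts = {}
    for i, j in combinations(range(len(bins)), 2):
        n = 0
        for a1, a2 in interactions:
            # An interaction hits the bin pair if its two anchors overlap
            # the two bins, in either order.
            if (overlaps(a1, bins[i]) and overlaps(a2, bins[j])) or \
               (overlaps(a1, bins[j]) and overlaps(a2, bins[i])):
                n += 1
        counts[(i, j)] = n
    return counts


# Made-up example: four 100 bp bins and three interactions.
bins = [("chr1", 1, 100), ("chr1", 101, 200),
        ("chr1", 201, 300), ("chr1", 301, 400)]
interactions = [
    (("chr1", 10, 20), ("chr1", 150, 160)),
    (("chr1", 120, 130), ("chr1", 30, 40)),
    (("chr1", 90, 110), ("chr1", 250, 260)),  # first anchor spans two bins
]
dense = count_all_bin_pairs(bins, interactions)
sparse = {pair: n for pair, n in dense.items() if n > 0}
```

Two things are visible even in this toy case: half of the entries in dense are zero, and the third interaction is counted in two bin pairs because its first anchor crosses a bin boundary.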
I have some more thoughts about the Bioconductor Hi-C infrastructure,
but my laptop battery's running out and I left my charger in my new
apartment. So that'll have to wait until tomorrow.
-A
On 22/03/2019 09:31, Luke Klein wrote:
I am writing a package that will extend the GenomicInteractions
class. I am a statistician, so I may not know best practices when
it comes to extending existing classes (e.g., should I make a new slot
or simply add a column to the `elementMetadata`? Are there existing
functions that already do what I am attempting?).
I am not familiar with Bioc-Devel decorum, so if asking this here is
inappropriate, kindly let me know.
About my project:
In the first step, I am hoping to implement a HiC binning function on
HiC data contained in a GenomicInteractions set. I aim to:
- Reorder the anchor pairs (I will explain in more detail to anyone
who wants to help)
- Collapse the regions to the desired bin width
- Sum the counts within each bin
- Update the anchors to agree with the new/updated regions
This will set the stage for the new class that I hope to create for
HiC domain calling, but I need to achieve the above tasks first.
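The four steps listed above can be sketched in plain Python, under several assumptions that are not in the original message: anchors are (chrom, start, end) tuples with 1-based closed coordinates, "reordering" is taken to mean sorting each pair so the first anchor is not after the second, and each anchor is assigned to the single bin containing its midpoint. The function name bin_interactions() is hypothetical, and chromosome lengths are ignored, so the last bin may run past the end of a chromosome.

```python
from collections import Counter


def bin_interactions(interactions, bin_width):
    """Collapse (anchor1, anchor2, count) records into bin-pair counts."""

    def to_bin(anchor):
        # Assumption: an anchor belongs to the bin holding its midpoint.
        chrom, start, end = anchor
        mid = (start + end) // 2
        idx = (mid - 1) // bin_width
        # The bin coordinates become the updated anchor.
        return (chrom, idx * bin_width + 1, (idx + 1) * bin_width)

    totals = Counter()
    for a1, a2, n in interactions:
        # Reorder so that equivalent pairs collapse onto the same key.
        b1, b2 = sorted((to_bin(a1), to_bin(a2)))
        # Sum the counts within each bin pair.
        totals[(b1, b2)] += n
    return dict(totals)


# Made-up example: three interactions with counts, binned at 100 bp.
records = [
    (("chr1", 10, 20), ("chr1", 150, 160), 3),
    (("chr1", 120, 130), ("chr1", 30, 40), 2),
    (("chr1", 90, 110), ("chr1", 250, 260), 1),
]
binned = bin_interactions(records, bin_width=100)
```

The midpoint rule is only one way to handle anchors that cross bin boundaries; splitting the count across bins or counting it in each overlapped bin would be equally defensible choices.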
All the best to everyone!
Luke Klein
PhD Student
Department of Statistics
University of California, Riverside
lklei...@ucr.edu
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel