The human genome is arranged in 46 chromosomes (23 pairs). The longest, chromosome 1, is ~250 Mb (~2^28 bases). While a Hilbert curve layout of a single chromosome tends to be informative, there is no obvious meaning in treating the complete human genome as a single 3 Gb linear sequence.
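
To make the sizes concrete, here is a minimal base-R sketch of the arithmetic. The helper hilbert_level() is hypothetical (it is not part of any package), and the lengths are only the round figures quoted in this thread.

```r
# Smallest Hilbert curve level whose 4^level points cover n positions.
# hilbert_level() is a hypothetical helper, not part of any package.
hilbert_level <- function(n) {
  level <- 0
  while (4^level < n) level <- level + 1
  level
}

chr_len    <- 250e6   # approximate length of the longest chromosome
genome_len <- 3e9     # approximate length of the whole genome

hilbert_level(chr_len)      # 14, since 4^14 = 2^28 = 268435456
hilbert_level(genome_len)   # 16, since 4^15 < 3e9 < 4^16

# Only the whole-genome curve exceeds the classic vector length limit
# of 2^31 - 1 elements.
4^14 <= .Machine$integer.max   # TRUE
4^16 <= .Machine$integer.max   # FALSE
```

So a per-chromosome curve at single-base resolution should stay below the long-vector threshold, whereas the level 16 curve does not.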

Wolfgang

On 10/03/2017 21:54, Wolfgang Huber wrote:
Two replies:

1. Downsampling?
In case you want to use the Hilbert curve for visualisation, please note
that you will need a graphics device with resolution 65536 x 65536 to
display it. Many people have smaller screens, so binning the genome
(e.g. into bins of 10x10=100nt) could be a practical solution, and more
efficient than computing a large intermediate object that your
graphics device will then downsample anyway.
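
As a rough sketch of the binning idea, using base R only: bin_means() is a hypothetical helper, the toy vector stands in for a real per-base signal, and 3e9 is the approximate genome size from the original question.

```r
# bin_means() is a hypothetical helper: average a numeric vector over
# consecutive bins of 'bin_size' positions (the last bin may be shorter).
bin_means <- function(x, bin_size) {
  bin_id <- (seq_along(x) - 1) %/% bin_size
  as.vector(tapply(x, bin_id, mean))
}

x <- c(1, 3, 5, 7, 9, 11, 13)   # toy per-base signal
bin_means(x, 3)                 # 3 9 13

# Effect on the curve size for a ~3 Gb genome and 100 nt bins:
n_bins <- ceiling(3e9 / 100)               # 3e7 bins
level  <- ceiling(log(n_bins, base = 4))   # 13, i.e. 8192 x 8192 pixels
```

With 100 nt bins the curve needs 4^13 (about 67 million) points rather than 4^16, which is well below the 2^31 - 1 limit for ordinary vectors.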

2. Long vector
In case you really need the big curve: I just had a look at the C code
in the "HilbertVis" package, which anyway uses long ints, and it does
not look difficult to modify the R wrapper so that it uses a long
vector. I assume that Simon would welcome the patch.

Wolfgang

On 9.3.17 08:44, Sohaib Ghani wrote:
I am trying to generate a Hilbert curve of level 16 in R with the
HilbertCurve package from Bioconductor. It takes about 4^16 = 4.3
billion points. I want to generate the Hilbert curve of a genome (size
about 3 billion bases).

But I am getting this error:

long vectors not supported yet: memory.c:1668

I am using the 64-bit version of R (3.3.2), so my guess is that I can
use vectors of length > 2^31. Also, my RAM is about 350 GB.

The command I am using is

library(HilbertCurve)
itr = 4^16
hc = HilbertCurve(1, itr, 16, mode = "pixel", title = "pixel mode",
                  start_from = "topleft")

Even when I am only reading in the whole genome, R sometimes crashes
in the process.

I have read the other similar questions on this topic but could not
find a solution. Please advise what I should use for this problem.


Thanks


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
Best wishes
Wolfgang

-------
Wolfgang Huber
Principal Investigator, EMBL Senior Scientist
European Molecular Biology Laboratory (EMBL)
Heidelberg, Germany

wolfgang.hu...@embl.de
http://www.huber.embl.de
