Hello,

These "empty" regions are informative: they represent regions of the genome with no alignments, gaps, etc. Please see the README file in the Downloads FTP location for a complete description:
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/multiz28way/
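For anyone who wants to measure how much of the file is "empty", here is a minimal Python sketch. It assumes the standard MAF layout (blocks start with an "a" line, sequence rows start with "s", and the aligned text is the seventh whitespace-separated field) and it assumes that "empty" means a row whose aligned text is nothing but "-" gap characters. That second assumption should be checked against the README. The function names and the sample rows are invented for illustration and are not taken from upstream5000.maf.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, str]  # (source name, aligned text)


def read_maf_blocks(lines: Iterable[str]) -> Iterator[List[Row]]:
    """Yield one list of (source, aligned text) rows per alignment block."""
    block: List[Row] = []
    in_block = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("#"):
            continue  # header or comment line
        if not line:
            # A blank line ends the current block, if one is open.
            if in_block:
                yield block
            block, in_block = [], False
            continue
        fields = line.split()
        if fields[0] == "a":
            # A new "a" line also ends any block that is still open.
            if in_block:
                yield block
            block, in_block = [], True
        elif fields[0] == "s" and in_block and len(fields) >= 7:
            # s <src> <start> <size> <strand> <srcSize> <text>
            block.append((fields[1], fields[6]))
    if in_block:
        yield block


def gap_only_sources(block: List[Row]) -> List[str]:
    """Return the sources whose aligned text consists only of '-'."""
    return [src for src, text in block if text and set(text) <= {"-"}]


# Invented sample data, for illustration only.
sample = """##maf version=1
a score=0.0
s hg18.chr1    1000 8 + 247249719 ACGTACGT
s panTro2.chr1 2000 8 + 229974691 ACGTACGA
s mm8.chr4        0 0 + 155029701 --------

a score=0.0
s hg18.chr1    1008 4 + 247249719 GGCC
"""

if __name__ == "__main__":
    for i, blk in enumerate(read_maf_blocks(sample.splitlines())):
        print(i, gap_only_sources(blk))
```

To run this on the real file, pass an open file handle instead of `sample.splitlines()`. The reader works line by line, so a file of around 2 GB does not need to fit in memory.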
We hope this helps to explain the data format,

Jennifer Jackson
UCSC Genome Bioinformatics Group

Harshvardhan Kelkar wrote:
> I needed to parse the upstream5000.maf (28 species) file from the hg18
> series.
> I was encountering blocks and blocks of empty space within the file,
> with the size dropping to 200 MB from nearly 2 GB.
> Being a computer scientist, it is more like an enormous text file for me.
> What is the significance of the empty space within this file?
