Hello Ari, I'd like to point you at some information:
Here is the definition of the AGP format:
http://www.ncbi.nlm.nih.gov/projects/genome/assembly/agp/AGP_Specification.shtml

And the assembly procedure:
http://www.ncbi.nlm.nih.gov/projects/genome/assembly/

The AGP file defines the construction of the genome from the various contigs. Not all of a contig is necessarily used. In the case of AC174931.2, only the first 91835 bases are used, out of a contig that is a full 184558 bases long.

The AGP files are in:
ftp://hgdownload.cse.ucsc.edu/goldenPath/mm9/bigZips/chromAgp.tar.gz

In the case of AC102110.8, only the bases from 23576 to 213732 are used from that contig, which is a total of 213732 bases long, so some bits at the beginning are not used.

I hope this helps explain what you are seeing.

Kayla Smith
UCSC Genome Bioinformatics Group

----- "Fungazid" <[email protected]> wrote:

> hello,
>
> here are 2 lines in mm9.gold table list:
>
> _____________________________________________________
>
> #bin,chrom,chromStart,chromEnd,ix,type,frag,fragStart,fragEnd,strand
>
> 76,chr1,3586316,3776472,5,F,AC102110.8,23576,213732,+
> 76,chr1,3776472,3868307,6,F,AC174931.2,0,91835,-
> ____________________________________________________
>
>
> I took the second line and looked in ncbi site for this accession:
>
> http://www.ncbi.nlm.nih.gov/nuccore/85702722
>
> what I see is a sequence of 184558 bp, while (chromEnd-chromStart+1)
> equals 91835 bp at most. In addition I see that if strand is '-' then
> fragStart=0, and if strand is '+' then fragStart is some number >0.
>
> So what is the meaning of these columns (specifically
> chromStart,chromEnd,fragStart,fragEnd) and how to determine the
> start and end position of mm9.gold fragments relative to the data in
> the NCBI site ?
>
> Many thanks,
> Avi
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
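
P.S. To make the arithmetic concrete, here is a minimal Python sketch that works through the two mm9.gold rows quoted above. It assumes the table uses the usual UCSC convention of 0-based, half-open coordinates, so a span is simply end minus start with no +1. The contig lengths are the ones mentioned in this thread, hard-coded rather than fetched from NCBI, and the helper name summarize_gold is made up for illustration.

```python
import csv
from io import StringIO

# The two mm9.gold rows quoted above. The leading '#' of the header
# line has been dropped so the first column is simply named "bin".
GOLD_ROWS = """\
bin,chrom,chromStart,chromEnd,ix,type,frag,fragStart,fragEnd,strand
76,chr1,3586316,3776472,5,F,AC102110.8,23576,213732,+
76,chr1,3776472,3868307,6,F,AC174931.2,0,91835,-
"""

# Full contig lengths as stated in this thread (assumed, not looked up).
CONTIG_LENGTHS = {"AC102110.8": 213732, "AC174931.2": 184558}


def summarize_gold(text, contig_lengths):
    """Return one summary dict per gold row (hypothetical helper)."""
    summaries = []
    for row in csv.DictReader(StringIO(text)):
        frag_start = int(row["fragStart"])
        frag_end = int(row["fragEnd"])
        total = contig_lengths[row["frag"]]
        summaries.append({
            "frag": row["frag"],
            "strand": row["strand"],
            # Length covered on the chromosome (half-open: end - start).
            "chrom_span": int(row["chromEnd"]) - int(row["chromStart"]),
            # Length taken from the contig; should equal chrom_span.
            "frag_span": frag_end - frag_start,
            # Contig bases left out before and after the used piece.
            "unused_at_start": frag_start,
            "unused_at_end": total - frag_end,
        })
    return summaries


for summary in summarize_gold(GOLD_ROWS, CONTIG_LENGTHS):
    print(summary)
```

For both rows the chromosome span and the fragment span come out equal (190156 bases for AC102110.8 and 91835 bases for AC174931.2), and the remaining 92723 bases at the end of AC174931.2 are simply not part of the assembly. A '-' in the strand column means the used piece is placed on the chromosome in reverse-complement orientation.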
