Hi Mikhail,

Are you by any chance using line numbers to select positions from the fixedStep files? Unfortunately that doesn't work because we don't have data for every base in the chromosome, so we have multiple fixedStep sections in the file.
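A minimal sketch of position-aware lookup, in case it helps illustrate the point: rather than counting lines, track the current base from each fixedStep declaration line. The function names (`parse_declaration`, `value_at`) and the toy scores below are made up for illustration and are not part of any UCSC tool; the sketch also assumes span=1 and ignores any other wiggle features.

```python
def parse_declaration(line):
    """Split 'fixedStep chrom=chr2L start=5243 step=1' into (chrom, start, step)."""
    fields = dict(token.split("=", 1) for token in line.split()[1:])
    # Assumption: fall back to step=1 if the declaration omits it.
    return fields["chrom"], int(fields["start"]), int(fields.get("step", 1))


def value_at(lines, chrom, pos):
    """Return the score at 1-based position `pos` on `chrom`, or None if no data."""
    cur_chrom, next_pos, step = None, None, 1
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("fixedStep"):
            # A new section resets the coordinate; this is why line numbers drift.
            cur_chrom, next_pos, step = parse_declaration(line)
            continue
        if line.startswith("variableStep"):
            cur_chrom = None  # this sketch does not handle variableStep sections
            continue
        if cur_chrom is None:
            continue
        if cur_chrom == chrom and next_pos == pos:
            return float(line)
        next_pos += step
    return None


# Toy data (hypothetical scores): two sections separated by a gap in coverage.
sample = [
    "fixedStep chrom=chr2L start=1 step=1",
    "0.771", "0.780", "0.772",
    "fixedStep chrom=chr2L start=5243 step=1",
    "0.100", "0.200", "0.300",
]

print(value_at(sample, "chr2L", 5244))  # 0.2, although it sits on line 7
print(value_at(sample, "chr2L", 4))     # None: base 4 falls in the gap
```

For a compressed download, the same function should work on an open text handle, for example `gzip.open("postprobsDownload/chr2L.pp.gz", "rt")` from Python's standard `gzip` module, passed as `lines`.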
For example, at line 5206 of chr2L.pp:

    fixedStep chrom=chr2L start=5243 step=1

So line 5207 is base 5243. To get base 5400, look at line (5207 + (5400 - 5243)) = line 5364:

    zcat postprobsDownload/chr2L.pp.gz | tail -n +5364 | head
    0.960
    0.968
    0.971
    0.968
    0.958
    0.940
    0.911
    0.855
    0.750
    0.693

fixedStep format needs to be parsed, as you're already doing for variableStep.

Hope that helps, and if you have further questions, please send them to us at [email protected].

Angie

On Fri, 26 Jun 2009, Mikhail Spivakov wrote:

> Hello,
> I've noticed an inexplicably large difference between the
> D.melanogaster 15-way phastcons data available for download in the
> fixed step format
> (in particular,
> http://hgdownload.cse.ucsc.edu/goldenPath/dm3/phastCons15way/chr2L.pp.gz)
> and the same in the variableStep format accessible from the Tables
> view for the same D.melanogaster assembly.
>
> They do look very similar at the beginning of the chromosome, as
> expected; for example, here are the first 20:
>
> pos_on_chr2L  variableStep  fixedStep
> 1             0.770953      0.771
> 2             0.77874       0.78
> 3             0.770953      0.772
> 4             0.77874       0.785
> 5             0.77874       0.78
> 6             0.755378      0.761
> 7             0.716441      0.722
> 8             0.708654      0.71
> 9             0.708654      0.716
> 10            0.700866      0.703
> 11            0.708654      0.712
> 12            0.700866      0.701
> 13            0.661929      0.667
> 14            0.59963       0.604
> 15            0.475031      0.479
> 16            0.264772      0.27
> 17            0.140173      0.146
> 18            0.132386      0.138
> 19            0.101236      0.109
> 20            0.101236      0.103
>
> But by as early as 5400 they already look completely out of sync:
>
> pos_on_chr2L  variableStep  fixedStep
> 5400          0.958661      0.251
> 5401          0.966331      0.345
> 5402          0.966331      0.383
> 5403          0.966331      0.39
> 5404          0.950992      0.376
> 5405          0.935654      0.229
> 5406          0.904976      0.211
> 5407          0.851291      0.166
> 5408          0.743921      0.152
> 5409          0.690236      0.116
> 5410          0.582866      0.044
> 5411          0.575197      0.017
> 5412          0.552189      0.019
> 5413          0.490835      0.017
> 5414          0.368126      0.01
> 5415          0.368126      0.018
> 5416          0.345118      0.022
> 5417          0.291433      0.039
> 5418          0.184063      0.05
> 5419          0.130378      0.054
> 5420          0.10737       0.054
>
> This goes well beyond the listed loss of resolution for this fragment
> (Worst case: 0.00760937).
>
> From some functional tests it seems that the variableStep data makes
> more sense, although I've got no solid proof.
> Can you please look into this - maybe I'm just confusing something?
>
> Many thanks
> Mikhail
>
> --
> Mikhail Spivakov, PhD
> Postdoctoral Fellow
> EMBL-EBI
> Hinxton
> Cambridge CB10 1SD
> UK
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
