Hi Vanessa,

Thanks for your consideration and open response/confirmation of the issue.

I have found that deleting a single line from the phastCons file is sufficient 
for all 4 occurrences of the error in dm3 to go away.

However, I wonder if there is any guidance regarding exactly which line is in 
error.

I assume it is most likely the first or the last score in a block that should 
be removed, but I have no means of confirming this.
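
For reference, the overlap can at least be located mechanically. What follows 
is a minimal Python sketch, not a vetted tool. It assumes the .pp files are 
plain fixedStep wiggle with one score per line and that blocks appear in 
ascending order within a chromosome. The function names (parse_blocks, overrun, 
find_overlaps, trim_overlaps) are hypothetical. Dropping the trailing scores of 
the earlier block mirrors the deletion I made by hand; whether those are the 
scores that are actually in error is exactly the open question.

```python
import re

# Matches headers such as: fixedStep chrom=chr2R start=5000001 step=1
HEADER = re.compile(r"fixedStep\s+chrom=(\S+)\s+start=(\d+)\s+step=(\d+)")

def parse_blocks(lines):
    """Group fixedStep wiggle lines into [chrom, start, step, values] blocks."""
    blocks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        m = HEADER.match(line)
        if m:
            blocks.append([m.group(1), int(m.group(2)), int(m.group(3)), []])
        elif blocks:
            blocks[-1][3].append(line)  # keep the score text unchanged
    return blocks

def overrun(prev, nxt):
    """Count trailing values in prev that sit at or beyond the start of nxt."""
    chrom, start, step, values = prev
    if chrom != nxt[0] or nxt[1] < start:
        return 0  # different chromosome or out-of-order blocks: not handled here
    first_clash = -(-(nxt[1] - start) // step)  # ceiling division
    return max(0, len(values) - first_clash)

def find_overlaps(blocks):
    """Return (chrom, start_of_next_block, n_extra) for each overrunning block."""
    found = []
    for prev, nxt in zip(blocks, blocks[1:]):
        n_extra = overrun(prev, nxt)
        if n_extra:
            found.append((prev[0], nxt[1], n_extra))
    return found

def trim_overlaps(blocks):
    """Drop the overrunning trailing values of each earlier block, in place."""
    for prev, nxt in zip(blocks, blocks[1:]):
        n_extra = overrun(prev, nxt)
        if n_extra:
            del prev[3][-n_extra:]
    return blocks
```

For a real file, the lines could be read with gzip.open("chr2R.pp.gz", "rt") 
and passed to parse_blocks; find_overlaps then lists each base that would 
receive more than one value.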

It might matter to a subsequent analysis I have planned since I will be looking 
at conservation in coding regions as it relates to 'phase' (expecting more 
conservation in the wobble position of regulatory regions than that of 
non-regulatory coding regions).  For this, being "off-by-one" in selected 
regions would throw off the analysis.
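
To illustrate the concern, here is a small Python sketch of averaging scores 
by codon phase. It assumes a plus-strand, single-exon coding region and 1-based 
coordinates. The function name mean_by_phase and the scores are made up purely 
for illustration and are not real phastCons values.

```python
def mean_by_phase(scores, cds_start, first_pos):
    """Average scores per codon phase (0, 1, 2).

    scores[i] is taken to be the score at 1-based position first_pos + i;
    cds_start is the 1-based position of the first base of the first codon.
    Phase 2 is the third (wobble) position of each codon.
    """
    totals = [0.0, 0.0, 0.0]
    counts = [0, 0, 0]
    for i, score in enumerate(scores):
        pos = first_pos + i
        if pos < cds_start:
            continue  # upstream of the coding region
        phase = (pos - cds_start) % 3
        totals[phase] += score
        counts[phase] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]

# Illustrative scores only: high, high, low repeating over four codons.
scores = [0.9, 0.8, 0.1] * 4
aligned = mean_by_phase(scores, cds_start=101, first_pos=101)
shifted = mean_by_phase(scores, cds_start=101, first_pos=102)  # off by one
```

With the scores shifted by a single base, the wobble-position mean takes on the 
value that belonged to the second codon position, which is the kind of 
distortion I would like to rule out.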

Sincerely,

Malcolm Cook
Stowers Institute for Medical Research -  Bioinformatics
Kansas City, Missouri  USA
 
 

> -----Original Message-----
> From: Vanessa Kirkup Swing [mailto:[email protected]] 
> Sent: Friday, January 14, 2011 12:18 PM
> To: Cook, Malcolm
> Cc: Cook, Malcolm; [email protected]
> Subject: Re: [Genome] converting phastCons to bigWig
> 
> Dear Malcolm,
> 
> Thank you for your patience. Your assumption to remove the 
> line from chr2R.pp was correct. The reason you were getting 
> the error is because there is a slight overlap. This is a 
> known problem with older phastCons files and has been 
> corrected with our newer phastCons files. We will put this on 
> our long list of things to look into and fix.
> 
> If you have further questions, please contact the mailing list.
> 
> Vanessa Kirkup Swing
> UCSC Genome Bioinformatics Group
> 
> 
> ----- Original Message -----
> From: "Malcolm Cook" <[email protected]>
> To: "Malcolm Cook" <[email protected]>, "[email protected]" 
> <[email protected]>
> Sent: Thursday, January 13, 2011 3:35:59 PM
> Subject: Re: [Genome] converting phastCons to bigWig
> 
> To follow up a little myself, I find that deleting a single 
> line from chr2R.pp, namely the line immediately preceding
> 
> fixedStep chrom=chr2R start=5000001 step=1
> 
> which is:
> 
> 0.997
> 
> allows wigToBigWig to complete without error.
> 
> I suspect there is an off-by-one error upstream.
> 
> Can you confirm?
> 
> Any suggested workaround?
> 
> Thanks.
> 
> 
> 
> 
> _____________________________________________
> From:   Cook, Malcolm
> Sent:   Thursday, January 13, 2011 3:59 PM
> To:     '[email protected]'
> Subject:        converting phastCons to bigWig
> 
> 
> I get the error messages shown below while attempting to convert 
> the dm3 phastCons scores (as downloaded from 
> ftp://hgdownload.cse.ucsc.edu/goldenPath/dm3/phastCons15way) 
> into bigWig format.
> 
> The error occurs only with chromosomes chr2R, chrX, chr3L, and chr3R.
> 
> gunzip -c chr2R.pp.gz | wigToBigWig -clip stdin dm3_chromeSizes.tab dm3_chr2R_phastCons15way.bw
> There's more than one value for chr2R base 5000001 (in coordinates that start with 1).
> 
> gunzip -c chr3R.pp.gz | wigToBigWig -clip stdin dm3_chromeSizes.tab dm3_chr3R_phastCons15way.bw
> There's more than one value for chr3R base 7000001 (in coordinates that start with 1).
> 
> gunzip -c /chrX.pp.gz | wigToBigWig -clip stdin dm3_chromeSizes.tab dm3_chrX_phastCons15way.bw
> There's more than one value for chrX base 12000001 (in coordinates that start with 1).
> 
> gunzip -c chr3L.pp.gz | wigToBigWig -clip stdin dm3_chromeSizes.tab dm3_chr3L_phastCons15way.bw
> There's more than one value for chr3L base 7000001 (in coordinates that start with 1).
> 
> 
> I'd appreciate understanding whether
>         I am misusing the tools,
>         there are errors in the upstream processes generating 
> the phastCons wig files,
>         there are issues with wigToBigWig, or
>         something else is going on.
> 
> My purpose in this conversion is to provide fast random 
> access to the phastCons scores via subsequent calls to 
> bigWigToWig for an analysis I am developing.
> 
> If there are other (better) ways of achieving this, I would 
> be similarly obliged to learn.
> 
> Thanks,
> 
> Malcolm Cook
> Stowers Institute for Medical Research -  Bioinformatics 
> Kansas City, Missouri  USA
> 
> 
> 
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
> 
