Hello, Steve,
What about representing insertions in BED files? Could you please tell me how you would represent an insertion in a BED file? Wouldn't it be as below?

Let's say that we want to represent the following insertion, which is given on the COSMIC website, in a BED file:

http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=mut_summary&id=681

On the COSMIC website, the start and end positions for an insertion of CTGTGGGCT at chr17:37881006..37881007 (GRCh37) are represented as:

start: 37881006
end: 37881007

since both the start and end positions are 1-based in COSMIC.

Next, when we convert the variant above to a BED file, the start position becomes 0-based, which shifts the position to the left. So in a BED file, this would be the representation:

start: 37881006
end: 37881006

So the BED file for this insertion would look like:

CHR    START     END       UNIQUID  TYPE
chr17  37881006  37881006  id2      REF=;OBS=CTGTGGGCT

According to the example I have given, would you agree that it is possible for the "start" position to be "equal" to the end position in a given BED file?

You are right, the start position can NOT be numerically larger than the end position. However, they can sometimes be equal, as in the insertion case I have given above.

Thank you,
Laura

________________________________
From: Steve Heitner <[email protected]>
To: 'Laura Smith' <[email protected]>; [email protected]
Sent: Tuesday, March 27, 2012 11:46 AM
Subject: RE: [Genome] Can "end position" can ever be larger than "start position" in a BED format file?

Hello, Laura.

The simple answer to all three of your questions is yes. Whether they are on the + or - strand, both chromStart and thickStart must be numerically smaller than chromEnd and thickEnd, respectively. If chromStart or thickStart is not smaller than chromEnd or thickEnd, you will receive an error such as "Error line 3 of custom track: chromStart after chromEnd (6000 > 2000)" when trying to load your custom track.
Please contact us again at [email protected] if you have any further questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Laura Smith
Sent: Monday, March 26, 2012 6:10 PM
To: [email protected]
Subject: [Genome] Can "end position" can ever be larger than "start position" in a BED format file?

Hi,

I read the information on BED format on your website; however, something is not very clear to me.

1. In BED format, is the start position always less than or equal to the end position?
2. In other words, can the "end position" ever be larger than the "start position" in a BED format file?

The reason I am asking is that I want to know the following:

3. For a region on the negative strand, if someone wants to represent it in a BED file, is the start position always less than the end position?

Looking at the negative-strand examples provided on your website, it is easy to assume that the end position is always larger than or equal to the start position regardless of the strand. Is this correct?

If you could please answer the 3 questions above, I would appreciate it.

Thanks,
Laura

track name=pairedReads description="Clone Paired Reads" useScore=1
chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512
chr22 2000 6000 cloneB 900 - 2000 6000 0 2 433,399, 0,3601

BED format provides a flexible way to define the data lines that are displayed in an annotation track. BED lines have three required fields and nine additional optional fields. The number of fields per line must be consistent throughout any single set of data in an annotation track.

The first three required BED fields are:

chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or contig (e.g. ctgY1).
chromStart - The starting position of the feature in the chromosome or contig. The first base in a chromosome is numbered 0.
chromEnd - The ending position of the feature in the chromosome or contig.
The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.

The 9 additional optional BED fields are:

name - Defines the name of the BED line. This label is displayed to the left of the BED line in the Genome Browser window when the track is open to full display mode or directly to the left of the item in pack mode.
score - A score between 0 and 1000. If the track line useScore attribute is set to 1 for this annotation data set, the score value will determine the level of gray in which this feature is displayed (higher numbers = darker gray).
strand - Defines the strand - either '+' or '-'.
thickStart - The starting position at which the feature is drawn thickly (for example, the start codon in gene displays).
thickEnd - The ending position at which the feature is drawn thickly (for example, the stop codon in gene displays).
reserved - This should always be set to zero.
blockCount - The number of blocks (exons) in the BED line.
blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.

Example:
Here's an example of an annotation track that uses a complete BED definition:

track name=pairedReads description="Clone Paired Reads" useScore=1
chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512
chr22 2000 6000 cloneB 900 - 2000 6000 0 2 433,399, 0,3601

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
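
For readers who want to check the coordinate rules discussed in this thread before uploading a custom track, the following is a minimal Python sketch, not the Genome Browser's own validation code. The function names insertion_to_bed and check_bed_line are hypothetical. The sketch also makes one assumption that goes beyond the reply quoted above: it accepts chromStart equal to chromEnd (a zero-length feature, as proposed for the insertion case) and reports only chromStart greater than chromEnd. Whether a given loader accepts zero-length features should be verified separately.

```python
def insertion_to_bed(chrom, left_flank_1based, name, inserted_seq):
    """Build a zero-length BED line for an insertion (hypothetical helper).

    left_flank_1based is the 1-based position of the base immediately
    before the inserted sequence (37881006 in the COSMIC example).
    In 0-based, half-open coordinates the gap after that base has the
    same number, so chromStart == chromEnd == left_flank_1based.
    """
    pos = str(left_flank_1based)
    return "\t".join([chrom, pos, pos, name, "REF=;OBS=" + inserted_seq])


def check_bed_line(line):
    """Return a list of problems found in one BED data line (empty if none)."""
    fields = line.split()
    if len(fields) < 3:
        return ["fewer than 3 required fields"]

    problems = []
    start, end = int(fields[1]), int(fields[2])
    # Assumption: start == end is allowed here (zero-length insertion).
    if start > end:
        problems.append("chromStart after chromEnd (%d > %d)" % (start, end))

    if len(fields) >= 8:
        thick_start, thick_end = int(fields[6]), int(fields[7])
        if thick_start > thick_end:
            problems.append(
                "thickStart after thickEnd (%d > %d)" % (thick_start, thick_end)
            )

    if len(fields) >= 12:
        block_count = int(fields[9])
        # A trailing comma, as in "567,488,", yields an empty item; skip it.
        sizes = [int(x) for x in fields[10].split(",") if x]
        starts = [int(x) for x in fields[11].split(",") if x]
        if len(sizes) != block_count or len(starts) != block_count:
            problems.append("blockCount does not match blockSizes/blockStarts")

    return problems


if __name__ == "__main__":
    bed = insertion_to_bed("chr17", 37881006, "id2", "CTGTGGGCT")
    print(bed)
    print(check_bed_line(bed))
    print(check_bed_line("chr22 6000 2000 cloneB"))
```

The strand field is deliberately ignored in the checks, since the reply above states that the ordering rule is the same on the + and - strands.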
