Dear all,

My name is Carsten Raabe and I work at the Institute of Experimental Pathology in Münster, Germany.
I am having a hard time understanding what the columns in the wgEncodeCshlShortRnaSeqGm12878CytosolShortTransfrags.shortFrags file stand for. I went through the Table Browser to look at the fields offered for this table, and I assume these are the same fields as in the file named above. I have a number of questions about what specific columns mean. Please find them in detail below; my questions are marked with >>>.

bin
chrom            Reference sequence chromosome or scaffold
chromStart       Start position in chromosome
chromEnd         End position in chromosome
name             Name of item
score            Score from 0-1000
                 >>> What does the score indicate?
strand           + or -
length           >>> Difference between end and start position?
numUnique        >>> Unique reads in the contig?
numReads         >>> All reads forming the contig?
minSeqCount      >>> What is the difference between "seq" and "read"?
maxSeqCount
aveSeqCount
firstSeqCount    >>> Besides the question above, I am not clear on what "first" indicates here.
medSeqCount
thirdSeqCount    >>> The same question as for "first" above.
minReadCount     >>> Again, what would be the difference between "read" and "seq"?
maxReadCount
aveReadCount
firstReadCount
medReadCount
thirdReadCount   >>> What would "third" mean here?
numRegions       >>> I assume this refers to significant regions within the contig. How would these regions be defined?
regStart
regLength
seqCount
regCount
sumCount

I checked the mailing list and did not find much information on the questions above. I also saw the description of transfrags on the browser page:

"Small RNA reads were assembled into "Transfrags" by merging reads with one or more overlapping nucleotides. In order to minimize ambiguity from reads that have the potential to map to multiple genomic loci, only the uniquely mapping reads were used to generate Transfrags. The BED6+ format of the transfrag files are created from "intervals-to-contigs" Galaxy tool written by Assaf Gordon in the Hannon lab at CSHL.
A complete description of the columns in this format can be found here. The Transfrags view includes all transfrags before filtering."

However, the BED6 link ("here") does not work. I am also confused about the term "unique": if only unique reads (unique in position, as stated in the quoted description) were used to form the corresponding transfrags, why is numUnique not always identical to numReads?

I am sorry for troubling you with all these questions, but I would not know whom else to ask. Keep up the beautiful work.

Cheers,
Carsten
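
For reference, two of the guesses above can be checked directly against the file: whether length equals chromEnd minus chromStart, and how often numUnique differs from numReads. The following is only a minimal Python sketch of such a check, not a description of how the file is actually produced. It assumes the file is tab-separated, that its fields appear in the same order as listed in the Table Browser, and that the "bin" column is absent from the file (pass has_bin=True if it is present). The function names and the sample row are made up for illustration.

```python
import csv
import io

# Field names as listed in the Table Browser, in that order.
# Assumption: the downloaded file has no leading "bin" column.
COLUMNS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "length", "numUnique", "numReads",
    "minSeqCount", "maxSeqCount", "aveSeqCount",
    "firstSeqCount", "medSeqCount", "thirdSeqCount",
    "minReadCount", "maxReadCount", "aveReadCount",
    "firstReadCount", "medReadCount", "thirdReadCount",
    "numRegions", "regStart", "regLength", "seqCount", "regCount", "sumCount",
]


def read_transfrags(handle, has_bin=False):
    """Yield one dict per tab-separated row, keyed by the assumed column names."""
    names = (["bin"] if has_bin else []) + COLUMNS
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(row) != len(names):
            raise ValueError(f"expected {len(names)} fields, got {len(row)}")
        yield dict(zip(names, row))


def length_matches_span(rec):
    """True if 'length' equals chromEnd - chromStart (the guess being tested)."""
    return int(rec["length"]) == int(rec["chromEnd"]) - int(rec["chromStart"])


def unique_differs_from_reads(rec):
    """True if numUnique and numReads disagree for this transfrag."""
    return int(rec["numUnique"]) != int(rec["numReads"])


# Made-up row purely for illustration; these are NOT real values from the file.
fake_values = ["chr1", "1000", "1025", "frag1", "500", "+", "25", "3", "7"] + ["0"] * 18
records = list(read_transfrags(io.StringIO("\t".join(fake_values) + "\n")))
print(length_matches_span(records[0]), unique_differs_from_reads(records[0]))
```

To run this on the real file, the io.StringIO(...) object would be replaced by an opened file handle, for example open(path, newline=""), where path is wherever the .shortFrags file was saved. A ValueError about the field count would indicate that the assumed column layout does not match the file.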
