Hi Mary,

Thanks for your reply. I tried running the soap2sam.pl script; however, it
did not work, I think because the SOAP output that I was given (i.e. my
files) has been modified from the original SOAP output. Therefore, I was
wondering, based on the file format I outlined earlier, whether it would be
possible to simply host it as a BED file to view it in the browser. I am
hoping to get some advice on preparing my current file to fit a BED/bigBed
type format. I have read the FAQ on Data File Formats, but I am not sure how
to handle my files with the multiple coordinates so that they fit a
UCSC-compatible format.
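
To make the question concrete, the following is a minimal sketch of one
possible conversion of the eight-column file described in the quoted message
below into BED12, not a tested pipeline. It rests on several assumptions that
would need checking against the real data: that the columns are truly
tab-delimited, that the coordinates are 1-based and inclusive, and that the
strand is unknown (written as "."). The function names parse_coords,
end_to_bed12 and line_to_bed12 are hypothetical. Each read end gets its own
BED line, because BED blocks may not overlap while the two ends of a pair can.

```python
def parse_coords(field):
    """Split a coordinate list such as '75110593,75111737,' into integers."""
    return [int(x) for x in field.replace(" ", "").split(",") if x]


def end_to_bed12(chrom, name, starts, stops):
    """Build one BED12 line for a single read end.

    Assumption: input coordinates are 1-based and inclusive, so each start
    is shifted down by one to give BED's 0-based, half-open intervals.
    """
    blocks = sorted((s - 1, e) for s, e in zip(starts, stops))
    chrom_start = blocks[0][0]
    chrom_end = blocks[-1][1]
    sizes = ",".join(str(e - s) for s, e in blocks)
    # Block starts are relative to chromStart, so the first one is 0.
    offsets = ",".join(str(s - chrom_start) for s, _ in blocks)
    fields = [chrom, chrom_start, chrom_end, name, 0, ".",  # strand unknown
              chrom_start, chrom_end, "0", len(blocks), sizes, offsets]
    return "\t".join(str(f) for f in fields)


def line_to_bed12(line):
    """Convert one input record into two BED12 lines, one per read end."""
    read_id, _seq1, _seq2, chrom, s1, e1, s2, e2 = line.rstrip("\n").split("\t")
    return [
        end_to_bed12(chrom, read_id + "/1", parse_coords(s1), parse_coords(e1)),
        end_to_bed12(chrom, read_id + "/2", parse_coords(s2), parse_coords(e2)),
    ]


# The junction read from the quoted message, rebuilt as one tab-delimited line.
example = "\t".join([
    "GA2:1:1:105:706#0",
    "TGGCAGTGCAAATATCCAAGAAGAGGAAGTTTGTCG",
    "CCTGGTTGGTGTAACTCGCACCTCAACTCCAGAGTA",
    "chr11",
    "75110593,75111737,", "75110621,75111745,",
    "75111807,", "75111842,",
])
for bed_line in line_to_bed12(example):
    print(bed_line)
```

The output could then be loaded as a custom track, or sorted by chromosome
and start position and passed to bedToBigBed for a bigBed file.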

Thanks again.

Cheers,
Rathi


On Thu, 19 May 2011 04:38:46 +1000, Mary Goldman <[email protected]> wrote:

> Hi Rathi,
>
> One of our engineers recommended converting your output to BAM. In
> particular, please see the soap2sam.pl script at
> http://soap.genomics.org.cn/soapaligner.html
>
> You will also need samtools to convert the SAM to BAM, and to sort and
> build an index for the BAM.
> http://samtools.sourceforge.net/
>
> Our notes about using BAM are here:
> http://genome.ucsc.edu/goldenPath/help/bam.html
>
> I hope this information is helpful.  Please feel free to contact the
> mail list again if you require further assistance.
>
> Best,
> Mary
> ------------------
> Mary Goldman
> UCSC Bioinformatics Group
>
>
> On 5/17/11 7:35 AM, Rathi Thiagarajan wrote:
>> Hi there,
>>
>> I was given the following RNASeq paired-end data and am looking for ways
>> to visualize it on the genome browser. The file is processed,
>> SOAP-aligned, paired-end Illumina data mapped to hg19.
>>
>> The file contains the following columns (tab delimited):
>> ID seqOne seqTwo chromosome oneStarts oneStops twoStarts twoStops
>>
>> Looking at this example of a paired-end junction:
>>
>> HWUSI-EAS474_21_30E9BAAXX:2:1:766:164 is the ID
>> GCACAGCAGAAGTGTTTTTCTTTTTTTAATGAACAA is the left end
>> GTCCCATGTTGACAATTTGTATGGTTTACTTTTTCA is the right end
>> chr12 is the chromosome
>> 14954338,14956285, are the starts for the left end, which aligns to a junction
>> 14954362,14956297, are the stops for the left end
>> 14954311, is the right end start
>> 14954346, is the right end stop
>>
>> Here is a snapshot of a few lines from the actual file:
>>
>> GA2:1:1:32:1827#0 AAATTAGACAACTGATGTCATGCTGTCTTGGTCTCC
>> GTGGAAACAAGTAATGGAACCAACGCCCTGTGTGTA chr11 16779120, 16779155, 16779542,
>> 16779577,
>> GA2:1:1:34:1274#0 TGGTGACCTTCAAGGAATCTTTGAGGGCCTGGAGCT
>> TCCAGGAGCAGCTCCAGGCCCTCAAAGAGTCCTTGA chr11 71726406, 71726441, 71726415,
>> 71726450,
>>
>> (and a junction read) :
>> GA2:1:1:105:706#0 TGGCAGTGCAAATATCCAAGAAGAGGAAGTTTGTCG
>> CCTGGTTGGTGTAACTCGCACCTCAACTCCAGAGTA chr11 75110593,75111737,
>> 75110621,75111745, 75111807, 75111842,
>>
>>
>> I would really appreciate your advice on how to prepare this file to
>> visualize it in the UCSC Genome Browser, especially with the multiple
>> coordinates for the junction tags. I was advised that a BED file might
>> work here; however, I am not sure how to set up the files based on the
>> instructions that were provided in the FAQ on Data File Formats.
>>
>> Thanking you in advance.
>>
>> Cheers,
>> Rathi

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
