Hi Hiram,
But I did not see any explanation of twoBitToFa supporting BED format. Where is it documented?

Thanks :)

Best.

On Tue, Aug 9, 2011 at 9:08 PM, Hiram Clawson <[email protected]> wrote:

> You can always use twoBitToFa with its options of processing
> via a list file or via a bed file.
>
> --Hiram
>
> ----- Original Message -----
> From: "Daofeng Li" <[email protected]>
> To: "Ivan Adzhubey" <[email protected]>
> Cc: [email protected]
> Sent: Tuesday, August 9, 2011 2:11:55 PM
> Subject: Re: [Genome] batch extracting sequence by coordinates
>
> Thanks Ivan.
> Actually, I ended up using the fastaFromBed utility. It runs very fast,
> and I recommend this tool :)
>
> Best.
>
> On Tue, Aug 9, 2011 at 2:39 PM, Ivan Adzhubey <[email protected]> wrote:
>
> > Hi Daofeng,
> >
> > I suggest using nibFrag for this purpose. I found it generally faster
> > than twoBitToFa, since for each extraction operation it only reads a
> > (much smaller) per-chromosome nib file instead of a huge whole-genome
> > 2bit one. Also, nibFrag reverse-complements the extracted sequence
> > automatically when strand=m, while twoBitToFa does not have such an
> > option.
> >
> > The only downside is that you will need to convert the downloaded
> > chromosome .fa.gz files to nib format (UCSC does not provide
> > chromosomes in nib format for download). But you only have to do
> > this once.
> >
> > Best,
> > Ivan
> >
> > On Tuesday, August 09, 2011 03:27:09 PM Daofeng Li wrote:
> > > Hi list members,
> > >
> > > Is there an efficient way to extract sequences from the human
> > > genome hg19 by coordinates?
> > > I have millions of start-end positions, so this amount of data
> > > might not be suitable for the Table Browser.
> > > I was thinking of using the .2bit genome. Any suggestions?
> > > I am also thinking of using the following steps:
> > >
> > > twoBitToFa
> > >
> > > twoBitToFa - Convert all or part of .2bit file to fasta
> > > usage:
> > >    twoBitToFa input.2bit output.fa
> > > options:
> > >    -seq=name - restrict this to just one sequence
> > >    -start=X - start at given position in sequence (zero-based)
> > >    -end=X - end at given position in sequence (non-inclusive)
> > >
> > > faToNib
> > >
> > > faToNib - Convert from .fa to .nib format
> > > usage:
> > >    faToNib in.fa out.nib
> > >
> > > nibFrag
> > >
> > > nibFrag - Extract part of a nib file as .fa
> > > usage:
> > >    nibFrag file.nib start end strand out.fa
> > >
> > > Would this be the fast way?
> > >
> > > Thanks in advance.
> > >
> > > Best.

--
Daofeng Li
Postdoc Research Associate
Department of Genetics
Washington University in St.Louis School of Medicine
314-556-2832
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
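
For reference, a minimal sketch of the BED-driven extraction Hiram describes. It assumes a twoBitToFa build whose usage message (printed when the program is run with no arguments) lists the -bed=input.bed option; the usage text quoted in this thread does not show it, so please check your own build first. The file names hg19.2bit, regions.bed and out.fa, and the helper names write_bed, twobit_to_fa_command and extract, are hypothetical and chosen only for illustration. The sketch writes a 4-column BED file (with a name column, which is used here as the FASTA record ID) and builds a single twoBitToFa call for all regions rather than one call per region.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path


def write_bed(regions, bed_path):
    """Write (chrom, start, end) tuples as a 4-column BED file.

    BED coordinates are zero-based and half-open, the same convention as
    the -start/-end options in the twoBitToFa usage text quoted above.
    """
    with open(bed_path, "w") as handle:
        for chrom, start, end in regions:
            if not 0 <= start < end:
                raise ValueError(f"bad interval: {chrom}:{start}-{end}")
            name = f"{chrom}:{start}-{end}"  # name column, one per region
            handle.write(f"{chrom}\t{start}\t{end}\t{name}\n")


def twobit_to_fa_command(twobit_path, bed_path, out_path):
    """Build the argument list for a BED-driven twoBitToFa run.

    Assumption: the installed twoBitToFa accepts -bed=FILE.
    """
    return ["twoBitToFa", f"-bed={bed_path}", str(twobit_path), str(out_path)]


def extract(regions, twobit_path, out_path):
    """Run one twoBitToFa call for all regions (binary must be on PATH)."""
    with tempfile.TemporaryDirectory() as tmp:
        bed_path = Path(tmp) / "regions.bed"
        write_bed(regions, bed_path)
        command = twobit_to_fa_command(twobit_path, bed_path, out_path)
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only prints the command line; nothing is executed here.
    print(shlex.join(twobit_to_fa_command("hg19.2bit", "regions.bed", "out.fa")))
```

With millions of positions, calling extract(regions, "hg19.2bit", "out.fa") once should avoid reopening the 2bit file for every interval, which is the per-call cost Ivan raises above.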
