Re: [Genome] twoBit.h to allow 2bit access in standalone applications

Galt Barber Thu, 04 Aug 2011 11:53:52 -0700

These utilities can probably already do what you want.

64-bit Linux binaries are available for download here:
  http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/

  (you can also download the source and compile it yourself; see the FAQs)

[hgwdev:lib> twoBitToFa
twoBitToFa - Convert all or part of .2bit file to fasta
usage:
    twoBitToFa input.2bit output.fa
options:
    -seq=name - restrict this to just one sequence
    -start=X  - start at given position in sequence (zero-based)
    -end=X - end at given position in sequence (non-inclusive)
    -seqList=file - file containing list of the desired sequence names
                     in the format seqSpec[:start-end], e.g. chr1 or chr1:0-189
                     where coordinates are half-open zero-based, i.e. [start,end)
    -noMask - convert sequence to all upper case
    -bpt=index.bpt - use bpt index instead of built in one
    -bed=input.bed - grab sequences specified by input.bed. Will exclude introns

Sequence and range may also be specified as part of the input
file name using the syntax:
       /path/input.2bit:name
    or
       /path/input.2bit:name:start-end
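
For a standalone program that does not link against the browser source, one option is to build the twoBitToFa command line and run the utility as a subprocess. Below is a minimal Python sketch of that idea. The helper names (two_bit_spec, one_based_to_half_open, two_bit_to_fa_cmd) are made up for illustration, and genome.2bit and out.fa in the usage note are placeholder file names. Only the path:name[:start-end] syntax and the zero-based half-open coordinates come from the usage text above.

```python
from typing import List, Optional, Tuple


def two_bit_spec(path: str, name: str,
                 start: Optional[int] = None,
                 end: Optional[int] = None) -> str:
    """Build a path:name or path:name:start-end input spec.

    start and end are zero-based, half-open [start, end),
    matching the twoBitToFa usage text.
    """
    if (start is None) != (end is None):
        raise ValueError("give both start and end, or neither")
    if start is None:
        return f"{path}:{name}"
    if start < 0 or end <= start:
        raise ValueError("need 0 <= start < end")
    return f"{path}:{name}:{start}-{end}"


def one_based_to_half_open(first: int, last: int) -> Tuple[int, int]:
    """Convert a 1-based inclusive range to zero-based half-open."""
    return first - 1, last


def two_bit_to_fa_cmd(two_bit: str, out_fa: str, name: str,
                      start: Optional[int] = None,
                      end: Optional[int] = None) -> List[str]:
    """Argument list for: twoBitToFa input.2bit:name:start-end output.fa"""
    # Hypothetical helper: assumes the twoBitToFa binary is on PATH.
    return ["twoBitToFa", two_bit_spec(two_bit, name, start, end), out_fa]
```

Passing the result of two_bit_to_fa_cmd("genome.2bit", "out.fa", "chr1", 2000, 3000) to subprocess.run(cmd, check=True) should then write the requested fragment to out.fa, provided twoBitToFa is installed and on PATH.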

[hgwdev:lib> twoBitInfo
twoBitInfo - get information about sequences in a .2bit file
usage:
    twoBitInfo input.2bit output.tab
options:
    -nBed   instead of seq sizes, output BED records that define
            areas with N's in sequence
    -noNs   outputs the length of each sequence, but does not count Ns
Output file has the columns:
    seqName size

The 2bit file may be specified in the form path:seq or path:seq1,seq2,seqN...
so that information is returned only on the requested sequence(s).
If the form path:seq:start-end is used, start-end is ignored.
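
The seqName/size output can be used to check a requested range before extracting it. Here is a rough Python sketch, assuming the default two-column output (not the -nBed form) with whitespace-separated columns. The function names parse_two_bit_info and check_range are hypothetical, and the sizes in the usage note are invented sample values.

```python
from typing import Dict, Iterable


def parse_two_bit_info(lines: Iterable[str]) -> Dict[str, int]:
    """Map each sequence name to its size from seqName/size lines."""
    sizes: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: two whitespace-separated columns, seqName then size.
        name, size = line.split()[:2]
        sizes[name] = int(size)
    return sizes


def check_range(sizes: Dict[str, int], name: str,
                start: int, end: int) -> None:
    """Raise if the half-open range [start, end) does not fit in the sequence."""
    if name not in sizes:
        raise KeyError(name)
    if not 0 <= start < end <= sizes[name]:
        raise ValueError(
            f"{name}:{start}-{end} is outside 0-{sizes[name]}")
```

For example, with sizes = parse_two_bit_info(open("output.tab")), a call such as check_range(sizes, "chr1", 2000, 3000) returns silently when the range fits and raises ValueError when it runs past the end of the sequence.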

-Galt

8/4/2011 11:44 AM, Jens Lichtenberg:
> Hello,
>
> I would like to use the logic outlined in the genome browser source code
> for twoBit.h in another tool I am working on. Is it at all possible to use
> the components of the genome browser in a simple/meaningful way?
>
> In detail I would like to do something like this (extract the characters
> between position 2000 and 3000 of chromosome 1 in a 2bit file):
>
> #include<iostream>
> #include<twoBit.h>
>
> using namespace std;
> // I am not sure if this exists at all or would even make sense in this case
> using namespace genome_browser;
>
> int main (int argc, char *argv[])
> {
> string bit_file = argv[1];
>
> cout << twoBitReadSeqFrag(bit_file, 1, 2000, 3000) << endl;
> }
>
> Thank you so much for your help.
>
> Jens
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
