Hello Andreas,
We have put together some tools for you. These instructions should help 
your bioinformatics team get things set up for you.

General idea: download the program, compile it, and run it with a ".gcg" 
file, which is an expanded version of your regular expression. You can 
create similar .gcg files as needed and run them against any genome you 
choose.

Location of the .gcg file for McrBC:
http://hgwdev.cse.ucsc.edu/~aamp/

Sequence (genomic) files (inside each genome's sub-folder):
http://hgdownload.cse.ucsc.edu/downloads.html

Ftp:
http://genome.ucsc.edu/FAQ/FAQdownloads#download1
http://genome.ucsc.edu/FAQ/FAQdownloads#download32
 
Downloading our source:
http://genome.ucsc.edu/FAQ/FAQdownloads#download27

Creating/loading a custom track:
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks
http://genome.ucsc.edu/goldenPath/customTracks/custTracks.html
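
As a rough sketch of the custom-track step: a BED file (such as the one 
written by the search program described below) can be turned into a custom 
track by putting a "track" line in front of the records. The file names, 
track name, description, and coordinates here are made up for illustration 
and are not real McrBC hits.

```shell
#!/bin/sh
# Hypothetical file names; substitute the BED file written by findCutters.
bed="${TMPDIR:-/tmp}/mcrbc_sites.bed"
track="${TMPDIR:-/tmp}/mcrbc_track.bed"

# Stand-in records so the sketch runs on its own (made-up coordinates).
printf 'chr1\t1000\t1062\tMcrBC\n' > "$bed"
printf 'chr1\t5000\t5090\tMcrBC\n' >> "$bed"

# A custom track file is a "track" line followed by the BED records.
{
  printf 'track name="McrBC" description="Predicted McrBC sites" visibility=dense\n'
  cat "$bed"
} > "$track"

# Show the result; this file can be uploaded on the custom track page.
cat "$track"
```

The resulting file can then be pasted or uploaded through the custom track 
page linked above.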

Program to do the search:

findCutters - Find REBASE restriction enzymes using their GCG file
usage:
   findCutters rebase.gcg sequence output.bed
where "sequence" is a .fa, .nib, or .2bit file
options:
   -justThis=enzyme    Only search for this enzyme.
   -justThese=file     File of enzymes (one per line) to restrict search.
   -countsOnly         Only output the # of times each enzyme is found
                       in the sequence in a simple 2 column file.
   -consolidateCounts  This option is used in the situation that a bunch
                       of output files have been created and cat'ed
                       together (Like after a cluster run).  The program
                       usage then changes to:
   findCutters -consolidateCounts input.counts output.counts
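
For illustration, a typical pair of runs based on the usage above might look 
like the sketch below. The file names (mcrbc.gcg, genome.2bit, mcrbc.bed) 
are placeholders chosen for this example, not files named in this email. 
The script only prints the commands if findCutters or the inputs are not 
present yet.

```shell
#!/bin/sh
# Placeholder file names (assumptions for this example).
gcg="mcrbc.gcg"
seq="genome.2bit"
out="mcrbc.bed"
counts="${out%.bed}.counts"

if command -v findCutters >/dev/null 2>&1 && [ -f "$gcg" ] && [ -f "$seq" ]; then
  # Full search: write every match to a BED file.
  findCutters "$gcg" "$seq" "$out"
  # Counts only: simple 2 column file of enzyme and number of hits.
  findCutters -countsOnly "$gcg" "$seq" "$counts"
  status="ran"
else
  # Dry run: show what would be executed once everything is in place.
  echo "findCutters or input files not found; commands that would run:"
  echo "  findCutters $gcg $seq $out"
  echo "  findCutters -countsOnly $gcg $seq $counts"
  status="skipped"
fi
echo "status: $status"
```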

NOTE: a proper GCG file is the one available from NEB, using a command like:
  curl -A "Mozilla/4.0" http://rebase.neb.com/rebase/link_gcgenz > rebase.gcg

To compile, first build the kent libraries:

cd kent/src
make libs

Then compile findCutters:

cd hg/utils/findCutters
make

Note: the build expects a bin/x86_64 directory inside your home directory, 
and that directory should be on the PATH set in your shell startup file.
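
A minimal sketch of that setup, assuming a Bourne-style shell (sh or bash); 
the variable name bindir is just for this example:

```shell
#!/bin/sh
# Directory described in the note above.
bindir="$HOME/bin/x86_64"
mkdir -p "$bindir"

# Add it to PATH only if it is not already there.
case ":$PATH:" in
  *":$bindir:"*) ;;
  *) PATH="$bindir:$PATH"; export PATH ;;
esac
echo "$PATH"
```

To make the change permanent, the PATH line would go in your shell startup 
file (for example ~/.bashrc if you use bash).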


If you run into any problems, please let us know.
Thanks,
Jennifer Jackson
UCSC Genome Bioinformatics Group

Weinhäusel Andreas wrote:
> Dear Colleagues, 
>
> is there an possibility to visualise potentially McrBC enzyme recognition 
> sites and density within UCSC-GB?
>
>  
>
> McrBC has the preferred recognition seq. "RC(N*(55-103))RC"  - in case if I 
> would search a vertebrate genome for CpG methylation the preferred 
> recognition seq. would be "RCG(N*(55-103))RCG"  .
>
>  
>
> Would be nice to get this visualized....
>
>  
>
> Greetings ANDREAS
>
>  
>
> DIDr Andreas Weinhäusel 
>
> _____________________________________
>
>  
>
> Austrian Research Centers GmbH - ARC
>
> Life Sciences
>
> Research Center: 2444 Seibersdorf, Austria
>
> T +43 (0) 50 550-3402, F +43 (0) 50 550-3653
>
> [email protected] <mailto:[email protected]> 
>
> http://www.arcs.ac.at <http://www.arcs.ac.at/> 
>
> _____________________________________
>
>  
>
> http://www.lifesciences.at <http://www.lifesciences.at/> 
>
> _____________________________________
>
>  
>
> FBN: 115980i HG Wien, UID: ATU14703506
>
>  
>
> _______________________________________________
> Genome maillist  -  [email protected]
> http://www.soe.ucsc.edu/mailman/listinfo/genome
>   
