Angie,
Thanks for the reply. I now have some questions regarding tool behavior.

1. Following Max's guide, I started by masking both fasta files to be
compared with trfBig using Tandem Repeats Finder v4.04. However, upon
running I get messages from stdout such as the following:

Freeing Memory...
Resolving output...
Done.maybeSystem: system(cd .; trf ./chr.tf 2 7 7 80 10 50 2000 -m ) failed
(exit status 1): Unknown error: 0

and

Freeing Memory...
Resolving output...
Done.maybeSystem: system(cd .; trf ./chr2.tf 2 7 7 80 10 50 2000 -m ) failed
(exit status 1): Cannot allocate memory

I am running OS X v10.6.2 on a quad core with 6 GB of RAM and using the
newest trfBig as far as I know.

2. chainPreNet generates the following error:

Got 1 chroms in directory/chr1.sizes, 1 in directory/chr2.sizes
hashMustFindVal: 'chr2' not found
Finishing nets
writing stdout
writing /dev/null
Couldn't open /proc/self/stat , No such file or directory
hashMustFindVal: 'chr2' not found
Got 1 chroms in directory/chr1.sizes, 1 in directory/chr2.sizes
Finishing nets
writing stdout
writing /dev/null
Couldn't open /proc/self/stat , No such file or directory

when running line:

chainMergeSort *.chain > all.chain
chainPreNet all.chain chr1.sizes chr2.sizes stdout \
| chainNet stdin -minSpace=1 chr1.sizes chr2.sizes stdout /dev/null \
| netSyntenic stdin out.net

All the tools I am using were compiled from the newest version of jksrc. Any
ideas what the problems are?

Bremen

On Mon, Dec 7, 2009 at 11:57 PM, Angie Hinrichs <[email protected]> wrote:

> Hi Bremen,
>
> You have the basic flow correct.  Some details added below:
>
> > 2. Align both chromosomes with blastz
>
> We now use lastz, an improved version of blastz
> (http://www.bx.psu.edu/miller_lab/, search for lastz) but blastz is
> sufficient.  The output format of blastz, LAV, must be converted to
> either PSL or AXT so that axtChain can read the alignments (we have
> lavToAxt and lavToPsl programs; PSL is more compact).  lastz can
> produce axt directly with the --output=axt option.
>
>
> > 3. Chain using axtChain (QUESTION: I see this program takes two
> directories
> > as arguments. If I wish only to compare two chromosomes, would these
> > directories have only one file each?)
>
> Yes, that would work.  However, you can also use the -faQ and -faT
> options, and give fasta files instead of directories.  To see a
> description of all axtChain options, run "axtChain" with no arguments.
> Here is an example usage of -faQ and -faT:
>
> axtChain -faQ -faT in.axt oneChrom.fa otherChrom.fa chroms.chain
>
> Another alternative is to convert your fasta files into the compact
> format 2bit like this:
>
> faToTwoBit oneChrom.fa oneChrom.2bit
>
> -- then you can give axtChain [and lastz but not blastz] the .2bit file
> instead of the directory, and don't have to pass -faQ / -faT.
>
>
> > 4. Get size of each chromosome for chain filtering using faSize
>
> Yes.  Note that the sequence name and size must appear in a
> tab-separated file, so use the -detailed flag like this:
>
> faSize -detailed oneChrom.fa > oneChrom.sizes
>
>
> > 5. Sort and filter chains
> >    a. use chainMergeSort using chain from step 3
> >    b. prenet with chainPreNet using chain from 5.a., size of target
> >       chromosome gotten from step 4, and size of query chromosome from
> >       step 4 as arguments
> > 6. Netting?
>
> Yes.  We pipe the output of chainPreNet to chainNet (and pipe that to
> netSyntenic) like this:
>
> chainPreNet chroms.chain oneChrom.sizes otherChrom.sizes stdout \
> | chainNet stdin -minSpace=1 oneChrom.sizes otherChrom.sizes stdout
> /dev/null \
> | netSyntenic stdin chroms.net
>
>
> > 7. Convert to .maf and use phyloFit
>
> Yes.  For historical reasons we use netToAxt | axtToMaf, we don't have
> a netToMaf.  You can see an example of how phyloFit was run for the
> D. melanogaster Conservation track in our source tree
> (kent/src/hg/makeDb/doc/dm2.txt, search for "PHASTCONS 15WAY" and then
> search for phyloFit).
>
> If you have only two species, there is not much phylogenetic
> information for phyloFit, but I suppose it could still make a
> substitution rate model.  If you have more than two species, you will
> also need multiz from http://www.bx.psu.edu/miller_lab/ .
>
>
> > 8. Run phastCons to get .wig output which can be uploaded as a
> conservation
> > track
>
> Yes.  Adam Siepel also has a new method of scoring conservation,
> phyloP, which we offer in addition to phastCons scores on newer
> Conservation tracks.  I have no idea whether one would be more
> appropriate than the other when working with only two species.
>
>
> > Since I am only comparing two chromosomes at a time, I think only a
> single
> > chain is generated which doesn't have to be netted. Am I missing
> anything?
>
> Netting is still necessary, in order to get single-coverage alignments
> on which conservation scores are computed.  axtChain usually produces
> many chains that cover the same position on the reference
> genome/chromosome.  The netting process selects the highest-scoring
> chain as the top level, and then fills in gaps (unaligned areas) using
> the next highest-scoring chain, and then fills in that chain's gaps
> using the next highest-scoring chain and so on.  It is kind of like
> extracting a global alignment from a sea of chained local alignments.
>
>
> Max Haeussler contributed a very useful genomewiki page about
> reconstructing our alignment & conservation pipeline (perhaps you have
> seen it already?):
>
> http://genomewiki.ucsc.edu/index.php/Whole_genome_alignment_howto
>
> Hope that helps, and please send more questions to [email protected]
> as you have them,
>
> Angie
>
> ----- "Bremen Braun" <[email protected]> wrote:
>
> > From: "Bremen Braun" <[email protected]>
> > To: [email protected]
> > Sent: Thursday, December 3, 2009 9:31:48 AM GMT -08:00 US/Canada Pacific
> > Subject: [Genome] Generate conservation info for interspecies genome
> comparisons
> >
> > Hello,
> >
> > I have sequences of similar chromosomes for different species that I
> > want to
> > compare. I would like to be able to generate a conservation track such
> > as
> > the one seen here:
> > http://tinyurl.com/yz6o6fu
> >
> > I looked at an example of steps to be taken. Could you please
> > verify/clarify
> > for me? Let's assume I want to compare 2 chromosomes.
> > 1. Mask both chromosomes
> > 2. Align both chromosomes with blastz
> > 3. Chain using axtChain (QUESTION: I see this program takes two
> > directories
> > as arguments. If I wish only to compare two chromosomes, would these
> > directories have only one file each?)
> > 4. Get size of each chromosome for chain filtering using faSize
> > 5. Sort and filter chains
> >    a. use chainMergeSort using chain from step 3
> >    b. prenet with chainPreNet using chain from 5.a., size of target
> > chromosome gotten from step 4, and size of query chromosome from step
> > 4 as
> > arguments
> > 6. Netting?
> > 7. Convert to .maf and use phyloFit
> > 8. Run phastCons to get .wig output which can be uploaded as a
> > conservation
> > track
> >
> > Since I am only comparing two chromosomes at a time, I think only a
> > single
> > chain is generated which doesn't have to be netted. Am I missing
> > anything?
> >
> > Thanks,
> > Bremen
> > _______________________________________________
> > Genome maillist  -  [email protected]
> > https://lists.soe.ucsc.edu/mailman/listinfo/genome
>
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome

Reply via email to