On 08/11/13 11:41 AM, JC Grenier wrote:
> Hi there,
>

Hi,

> I have a couple of questions for you today. If I want to get the information 
> about the Biological Abundance and the GeneOntology Terms in the same 
> analysis, how should I proceed? If I put two -search options on the 
> command line, will that work, and will Ray automatically detect where to look 
> for each analysis step?
>
> Ex :
> mpiexec -n 48 Ray -k 31 -p Sample.filtered.R1.fq.gz Sample.filtered.R2.fq.gz 
> -s Sample.filtered.single.fq.gz -o RAY_k31_c100 -search 
> ~/NCBI-taxonomy/NCBI-Finished-Bacterial-Genomes -with-taxonomy 
> ~/NCBI-taxonomy/Genome-to-Taxon.tsv ~/NCBI-taxonomy/TreeOfLife-Edges.tsv 
> ~/NCBI-taxonomy/Taxon-Names.tsv -search ~/GeneOntology/EMBL_CDS_Sequences 
> -gene-ontology ~/GeneOntology/OntologyTerms.txt ~/GeneOntology/Annotations.txt
>

I believe this will work quite well.
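
For readability, the same command can be laid out with one option group per line, so that each -search sits next to the option that uses it. The sketch below only prints the command and does not run it. The function name print_ray_command and its two directory arguments are made up for this example; every option and file name comes from the command quoted above.

```shell
# Print (do not run) the Ray command from the question, one option group per line.
# print_ray_command and its two arguments are hypothetical helpers for this sketch.
print_ray_command() {
    taxonomy_dir=$1
    ontology_dir=$2
    # Every line except the last ends with a shell line continuation.
    printf '%s \\\n' \
        "mpiexec -n 48 Ray -k 31" \
        "  -p Sample.filtered.R1.fq.gz Sample.filtered.R2.fq.gz" \
        "  -s Sample.filtered.single.fq.gz" \
        "  -o RAY_k31_c100" \
        "  -search $taxonomy_dir/NCBI-Finished-Bacterial-Genomes" \
        "  -with-taxonomy $taxonomy_dir/Genome-to-Taxon.tsv $taxonomy_dir/TreeOfLife-Edges.tsv $taxonomy_dir/Taxon-Names.tsv" \
        "  -search $ontology_dir/EMBL_CDS_Sequences"
    printf '%s\n' \
        "  -gene-ontology $ontology_dir/OntologyTerms.txt $ontology_dir/Annotations.txt"
}

print_ray_command "$HOME/NCBI-taxonomy" "$HOME/GeneOntology"
```

The printed text can be pasted into a job script once the paths have been checked.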

Basically, each -search option adds colors to the kmers of the graph.
If the headers of your NCBI genomes have "gi" numbers, then this creates 
a separate namespace, and -with-taxonomy picks them up to perform an LCA 
(Last Common Ancestor) in the taxonomic tree for each kmer.

See https://github.com/sebhtml/ray/blob/master/Documentation/Taxonomy.txt
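
To illustrate what a common ancestor in a tree means, here is a toy sketch. It is not Ray's implementation. It assumes a kmer colored by two genomes assigned to two taxa and reports the first taxon shared by both paths to the root. The taxon numbers and the parent_of table are invented for the example and are not taken from TreeOfLife-Edges.tsv.

```shell
# Toy tree (made-up taxon identifiers): 1 is the root,
# 2 and 3 are children of 1, 4 and 5 are children of 2, 6 is a child of 3.
parent_of() {
    case "$1" in
        2|3) echo 1 ;;
        4|5) echo 2 ;;
        6)   echo 3 ;;
        *)   echo "" ;;
    esac
}

# Print the taxa from a node up to the root, space-separated and space-padded.
path_to_root() {
    node=$1
    path=""
    while [ -n "$node" ]; do
        path="$path $node"
        node=$(parent_of "$node")
    done
    echo "$path "
}

# Print the first taxon on the first node's path that also lies on the second's.
lca() {
    path_b=$(path_to_root "$2")
    for node in $(path_to_root "$1"); do
        case "$path_b" in
            *" $node "*) echo "$node"; return 0 ;;
        esac
    done
    return 1
}

lca 4 5   # prints 2
lca 4 6   # prints 1
```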

> Another question: I tried to rebuild the same files as in your 2012 paper, 
> using the scripts you provide on GitHub:
>
> https://github.com/sebhtml/Paper-Replication-2012/tree/master/Build-Input-Files-for-Gene-Ontology
>
> But I'm not able to get this completed. The script Rebuild-Fasta.py 
> apparently does nothing in the end. It creates the EMBL_CDS_Sequences 
> folder, but the only file in it is empty, and the rest of the procedure 
> does not produce the expected output.
>

I tested that and it worked:


git clone https://github.com/sebhtml/Paper-Replication-2012.git
cd Paper-Replication-2012
cd Build-Input-Files-for-Gene-Ontology
./Main.sh


And I got this:

Generating parts.
Thank you for your kind patience !

To use these builds with Ray Ontologies (via the Ray binary), add these options:

-search \
     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences \
-gene-ontology \
     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/OntologyTerms.txt \
     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/Annotations.txt \


2.8G    /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/Annotations.txt
28M     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/OntologyTerms.txt


[boiseb01@ls30 ~]$ du -sh /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/* | head
97M     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.0.fasta
87M     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.100.fasta
81M     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.101.fasta
82M     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.102.fasta
123M    /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.103.fasta
110M    /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.104.fasta
108M    /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.105.fasta
89M     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.106.fasta
88M     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.107.fasta
82M     /home/boiseb01/test-paper-replication/Paper-Replication-2012/Build-Input-Files-for-Gene-Ontology/EMBL_CDS_Sequences/000-Sequences.Part.108.fasta
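
A quick way to compare a build against the listing above is to look for zero-byte files, which is the symptom described in the question. This is a minimal sketch. The function name list_empty_files is made up for the example, and the relative path assumes the command is run from inside Build-Input-Files-for-Gene-Ontology.

```shell
# Print every zero-byte regular file found under the given directory.
list_empty_files() {
    find "$1" -type f -size 0
}

# Only look if the directory exists in the current working directory.
if [ -d EMBL_CDS_Sequences ]; then
    list_empty_files EMBL_CDS_Sequences
fi
```

No output means no empty files were found. Any path that is printed points at a part that was created but never filled.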



Do you have wget and gzip installed?
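
One way to check, as a sketch: the helper below reports whether each named tool is on the PATH. The name check_tools is hypothetical, and passing this check does not show that these are the only tools the scripts need.

```shell
# Report, for each tool name given, whether it can be found on the PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools wget gzip || echo "Install the missing tools, then run ./Main.sh again."
```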

> Can you help me with this or indicate a way to get the proper files to launch 
> the gene-ontology search?
>
> Thanks a lot.
> --
> Jean-Christophe Grenier, M.Sc.
>
> -----------------------------------------
> /Bio-informaticien/
> /Laboratoire de Philip Awadalla/
> /Laboratoire de Luis Barreiro/
> /CHU Sainte-Justine/
> //3175, Côte Sainte-Catherine, local B-607
> ///Tél : 514-345-4931 poste 5199/
> -----------------------------------------


_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
