Hi Robert,

The fasta file that you created the indexes from should be located in the same directory hierarchy as the indexes themselves. For some tools (Bowtie is one of them), a symbolic link to the fasta file is also required in the directory that holds the indexes.

General instructions to set up indexes:
http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup

Instructions for setting up the builds.txt and other precursor steps:
http://wiki.g2.bx.psu.edu/Admin/Data%20Integration


For your example, the data could be organized like:

/share/shared/data/genomes/hg18/
/share/shared/data/genomes/hg18/seq
/share/shared/data/genomes/hg18/seq/Homo_sapiens_assembly18.fasta
/share/shared/data/genomes/hg18/bowtie/
/share/shared/data/genomes/hg18/bowtie/<bowtie_index_files>
/share/shared/data/genomes/hg18/bowtie/<symbolic_link_to_fasta>

-- where <symbolic_link_to_fasta> is named exactly like the original fasta file (in your case, "Homo_sapiens_assembly18.fasta")
-- and where each of the <bowtie_index_files> is named with the full original fasta file name followed by the index suffix (your example has this correct)
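
As a rough illustration of the layout above, here is a minimal Python sketch that builds the seq/ and bowtie/ directories and creates the symbolic link. It works inside a temporary sandbox directory (my assumption, so that nothing on a real file system is touched), and the tiny fasta it writes is placeholder content only. On your server the root would be /share/shared/data/genomes/hg18 and the fasta would already exist.

```python
import os
import tempfile

# Hypothetical sandbox root so the sketch does not touch a real file system;
# on a real server this would be /share/shared/data/genomes/hg18.
root = os.path.join(tempfile.mkdtemp(), "hg18")
seq_dir = os.path.join(root, "seq")
bowtie_dir = os.path.join(root, "bowtie")
os.makedirs(seq_dir)
os.makedirs(bowtie_dir)

fasta_name = "Homo_sapiens_assembly18.fasta"
fasta_path = os.path.join(seq_dir, fasta_name)

# Stand-in for the real genome fasta (placeholder content only).
with open(fasta_path, "w") as handle:
    handle.write(">chrTest\nACGT\n")

# The link carries exactly the same name as the original fasta file
# and sits next to the Bowtie index files.
link_path = os.path.join(bowtie_dir, fasta_name)
os.symlink(fasta_path, link_path)

print(link_path, "->", os.readlink(link_path))
```

The same thing can of course be done by hand with "ln -s"; the point is only that the link name matches the fasta file name and lives in the index directory.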

Then, in the bowtie_indices.loc file, there should be only one row for the genome, not one row for each individual index file.

So, one row, tab-delimited, 4 fields:
<unique_build_id>   <dbkey>   <display_name>   <file_base_path>

Where each field could be, for example:

Homo_sapiens_assembly18
Hs18
Human (Homo sapiens)
/share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18

-- note that there is no ".fasta" in the <file_base_path> field
-- put all four of these fields in one single row in the actual file; I only put them on individual lines to make the contents of each field clear
-- be sure to use only tabs, no extra spaces, to delimit the fields
-- use the actual file system path in the <file_base_path> (avoid paths that pass through symbolic links, as these have been problematic in the past for some users)
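
To make the single-row format concrete, here is a small Python sketch that assembles the four example fields into one tab-delimited line and checks the points in the notes above. The helper name make_loc_row is made up for this illustration and is not part of Galaxy; the field values are the ones from the example.

```python
fields = [
    "Homo_sapiens_assembly18",  # <unique_build_id>
    "Hs18",                     # <dbkey>
    "Human (Homo sapiens)",     # <display_name>
    "/share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18",  # <file_base_path>
]


def make_loc_row(values):
    """Join exactly four fields with single tabs (hypothetical helper).

    Rejects fields that contain a tab or have leading/trailing whitespace,
    since stray whitespace is an easy way to break a .loc row.
    """
    if len(values) != 4:
        raise ValueError("expected exactly 4 fields, got %d" % len(values))
    for value in values:
        if "\t" in value or value != value.strip():
            raise ValueError("unexpected whitespace in field: %r" % value)
    return "\t".join(values)


row = make_loc_row(fields)

# The base path is the index prefix, so it must not end in ".fasta".
assert not fields[3].endswith(".fasta")

# repr() makes the tab separators visible as \t.
print(repr(row))
```

Spaces inside a field (as in the display name) are fine; only the separators between fields need to be tabs.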

The sample .loc files have this information plus more examples:
http://bitbucket.org/galaxy/galaxy-central/src/a10bb73f5793/tool-data/bowtie_indices.loc.sample

We just started up an rsync server that hosts the same genomes as those available on Galaxy Main. Alternatively, you can obtain genomes from any source; the only requirement is that the data be available in fasta format. Full wiki documentation for the rsync server, linked in with the other NGS setup wikis, and a broader announcement will be coming out later this week, but this prior post covers the basics:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-July/010607.html

Hopefully this helps you to get set up,

Jen
Galaxy team


On 8/8/12 2:32 PM, Robert Chase wrote:
Hello,

I am trying to get the reference genomes to appear in our NGS tools. In
my bowtie.loc file for instance I have the following line:

hg18
/share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18.fasta.1.ebwt

[galaxy@ tool-data]$ cd /share/shared/data/genomes/hg18/bowtie/
[galaxy]$ ls
Homo_sapiens_assembly18.fasta.1.ebwt
Homo_sapiens_assembly18.fasta.2.ebwt
Homo_sapiens_assembly18.fasta.3.ebwt
Homo_sapiens_assembly18.fasta.4.ebwt
Homo_sapiens_assembly18.fasta.rev.1.ebwt
Homo_sapiens_assembly18.fasta.rev.2.ebwt

Do the files provided in the .loc file have to be fasta files? Where can
these fasta files be obtained?

-Rob


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

   http://lists.bx.psu.edu/


--
Jennifer Jackson
http://galaxyproject.org