Hi Greg,

Even though you are not copying the data into Galaxy's default data store, 
Galaxy determines and stores certain metadata for each of the data files to 
which you are linking. One of the types of metadata defined for the Bam 
datatype is its index, which is created by a call to samtools.

Unfortunately there is really no way around this because Galaxy requires the 
index file to be in a correct state, and I believe the test to determine 
correctness is at least as intensive as generating the index in the first 
place.  It's been a while since I was involved in this (specifically setting 
metadata for bam files using samtools), so perhaps samtools has been recently 
improved in this regard.  If so, I'll look to others to let me know I'm now 
"outdated" in my understanding of this.  If we need to update samtools used by 
the Galaxy code to take advantage of newer features, we can certainly do so.
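
For illustration only, here is a minimal sketch of the trade-off described 
above. It is not Galaxy's code; the function names and the idea of comparing 
file timestamps are assumptions made for this example. A timestamp comparison 
is cheap, but it only shows that an index file exists and is not older than 
the BAM file; it does not prove the index is correct, which is why it cannot 
replace the real check. The only samtools usage assumed is the standard 
"samtools index <in.bam> <out.bai>" form.

```python
import os
import subprocess


def index_looks_fresh(bam_path, index_path):
    """Cheap heuristic (hypothetical helper, not part of Galaxy).

    Returns True only if both files exist and the index is at least as
    new as the BAM file. This checks timestamps, NOT index correctness.
    """
    if not (os.path.isfile(bam_path) and os.path.isfile(index_path)):
        return False
    return os.path.getmtime(index_path) >= os.path.getmtime(bam_path)


def build_index_command(bam_path, index_path):
    """Argument list for 'samtools index <in.bam> <out.bai>'."""
    return ["samtools", "index", bam_path, index_path]


def ensure_index(bam_path, index_path):
    """Run samtools only when the cheap check fails.

    Returns True if samtools was invoked, False if the existing index
    was accepted on the basis of its timestamp alone.
    """
    if index_looks_fresh(bam_path, index_path):
        return False
    subprocess.check_call(build_index_command(bam_path, index_path))
    return True
```

Calling ensure_index("sample.bam", "sample.bam.bai") would skip the samtools 
run whenever a newer index is already present (the file names here are 
placeholders). Whether skipping on that basis is acceptable depends on how 
much you trust the existing index.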

Greg Von Kuster

On Apr 23, 2012, at 2:51 PM, Gregory Miles wrote:

> Thank you very much for your help with this - we got that settled. One other 
> question...we are importing sorted, indexed bam files into a galaxy data 
> library and we are not having galaxy copy over the files (they are large) but 
> rather just setting up galaxy such that it points to the relevant directory. 
> We noticed that the file (160 GB in size) is taking a long time to import 
> considering all it should be doing is creating a link. When we examined 
> processes that are running, we noticed that samtools is running. From 
> searching around a bit, it seems that Galaxy does this in order to groom the 
> bam file (sort/index) and ensure that it is in the format necessary for 
> galaxy to be able to interpret it. Is there any way around this? We did the 
> sorting and indexing prior to import and it's taking quite a while to perform 
> an unnecessary function. Thanks.
> 
> Greg
> 
> Dr. Gregory Miles
> Bioinformatics Specialist
> Cancer Institute of New Jersey @ UMDNJ
> Office: (732) 235 8817
> 
> -----------------------------------------------------------------------------------------------------------------------------
> CONFIDENTIALITY NOTICE: This email communication may contain private,
> confidential, or legally privileged information intended for the sole
> use of the designated and/or duly authorized recipient(s). If you are
> not the intended recipient or have received this email in error, please
> notify the sender immediately by email and permanently delete all copies
> of this email including all attachments without reading them. If you are
> the intended recipient, secure the contents in a manner that conforms to
> all applicable state and/or federal requirements related to privacy and
> confidentiality of such information.
> 
> 
> On Mon, Apr 23, 2012 at 12:55 PM, Greg Von Kuster <g...@bx.psu.edu> wrote:
> Hi Greg,
> 
> Upload your files to a Galaxy data library using a combination of "Upload 
> files from filesystem paths" without copying data into Galaxy's default data 
> store.
> 
> See the following wiki for all the details:
> 
> http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries/Uploading%20Library%20Files
> 
> For all of the details about data libraries, see:
> 
> http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries
> 
> Greg Von Kuster
> 
> 
> On Apr 23, 2012, at 11:26 AM, Gregory Miles wrote:
> 
> > We have large files that cannot be uploaded using the "file upload" command 
> > and instead would need to be uploaded using a URL. Unfortunately, we are 
> > using a "local" install on a non-local machine, so setting up an FTP server 
> > on this machine is a security issue. The files are located on this computer 
> > already anyhow, and Galaxy would simply be copying from one folder to 
> > another in order to perform the "get data" step. Is there a simple way to 
> > have a "pointer" of some sort such that galaxy knows where this file is and:
> >
> > 1) Would not have to copy it and could simply refer to the file location.
> > 2) Could perform data analysis steps on this file and push the output to 
> > the usual location (not the location of the data files).
> >
> > Any help would be greatly appreciated. Thanks.
> >
> > Dr. Gregory Miles
> > Bioinformatics Specialist
> > Cancer Institute of New Jersey @ UMDNJ
> > Office: (732) 235 8817
> >
> > ___________________________________________________________
> > The Galaxy User list should be used for the discussion of
> > Galaxy analysis and other features on the public server
> > at usegalaxy.org.  Please keep all replies on the list by
> > using "reply all" in your mail client.  For discussion of
> > local Galaxy instances and the Galaxy source code, please
> > use the Galaxy Development list:
> >
> >  http://lists.bx.psu.edu/listinfo/galaxy-dev
> >
> > To manage your subscriptions to this and other Galaxy lists,
> > please use the interface at:
> >
> >  http://lists.bx.psu.edu/
> 
> 

