Re: [galaxy-user] Data upload issues on Galaxy
Hi Jennifer,

I had some uploads that were probably stuck; they had been running for 5-7 days. I deleted them and uploaded 2 new files. Can you please check whether the upload is proceeding normally for these files? Thanks a lot for your help and patience.

Thanks,
Fatih

On Wed, Jun 20, 2012 at 10:43 PM, Jennifer Jackson j...@bx.psu.edu wrote:

Hello Fatih,

I found five jobs under your account in the NGS cluster queue. Job processing is normal; the Galaxy main instance is just very busy today. Your jobs will process in the order that they were queued. The size of a job does not affect how long it waits before it begins to run; that depends only on when it was queued relative to other jobs and on how long those earlier jobs take to complete. There are substantial resources dedicated to the public instance, so I would expect your jobs to begin processing within the next 24 hrs (at the latest). If your work is urgent, a cloud instance is the recommended alternative: http://getgalaxy.org/cloud

Best,
Jen
Galaxy team

On 6/20/12 6:44 PM, Fatih Ozsolak wrote:

Hi,

I submitted Bowtie and BWA alignment jobs on two relatively small fastq files, and the jobs still appear as gray after multiple hours. Can you please check whether the system is functioning properly?

Thanks,
Fatih

--
Jennifer Jackson
http://galaxyproject.org

___
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
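
As a rough illustration of Jen's point about queue order, the sketch below models a single-worker, first-in, first-out queue. The job names and durations are invented, and the real cluster runs many jobs in parallel; the only thing it is meant to show is that a job's start time depends on the jobs ahead of it, not on its own size.

```python
def fifo_start_times(jobs):
    """Return {job_name: start_time} for jobs run one at a time in queue order.

    `jobs` is a list of (name, duration_hours) tuples in the order queued.
    Hypothetical single-worker model, not how Galaxy actually schedules jobs.
    """
    start_times = {}
    clock = 0.0
    for name, duration in jobs:
        start_times[name] = clock  # a job starts once everything ahead of it is done
        clock += duration
    return start_times


# Invented example: two long jobs are queued ahead of a small alignment job.
queue = [("big_job_a", 10.0), ("big_job_b", 8.0), ("small_bowtie_job", 0.5)]
starts = fifo_start_times(queue)
print(starts["small_bowtie_job"])  # 18.0
```

In this toy model the small job waits 18 hours even though it needs only half an hour of compute, which matches the behaviour described in the reply above.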
Re: [galaxy-user] Data upload...
Hi Greg,

Upload your files to a Galaxy data library using the "Upload files from filesystem paths" option, without copying data into Galaxy's default data store. See the following wiki page for all the details: http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries/Uploading%20Library%20Files

For all of the details about data libraries, see: http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries

Greg Von Kuster

On Apr 23, 2012, at 11:26 AM, Gregory Miles wrote:

We have large files that cannot be uploaded using the file upload command and instead would need to be uploaded using a URL. Unfortunately, we are using a local install on a non-local machine, so setting up an FTP server on this machine is a security issue. The files are already located on this computer anyhow, and Galaxy would simply be copying from one folder to another in order to perform the get data step. Is there a simple way to set up a pointer of some sort so that Galaxy knows where this file is and:

1) Would not have to copy it and could simply refer to the file location.
2) Could perform data analysis steps on this file and push the output to the usual location (not the location of the data files).

Any help would be greatly appreciated. Thanks.

Dr. Gregory Miles
Bioinformatics Specialist
Cancer Institute of New Jersey @ UMDNJ
Office: (732) 235 8817

CONFIDENTIALITY NOTICE: This email communication may contain private, confidential, or legally privileged information intended for the sole use of the designated and/or duly authorized recipient(s). If you are not the intended recipient or have received this email in error, please notify the sender immediately by email and permanently delete all copies of this email including all attachments without reading them. If you are the intended recipient, secure the contents in a manner that conforms to all applicable state and/or federal requirements related to privacy and confidentiality of such information.
Re: [galaxy-user] Data upload...
Thank you very much for your help with this; we got that settled.

One other question: we are importing sorted, indexed BAM files into a Galaxy data library, and we are not having Galaxy copy over the files (they are large) but rather just setting up Galaxy so that it points to the relevant directory. We noticed that the file (160 GB in size) is taking a long time to import, considering that all it should be doing is creating a link. When we examined the running processes, we noticed that samtools is running. From searching around a bit, it seems that Galaxy does this in order to groom the BAM file (sort/index) and ensure that it is in the format necessary for Galaxy to be able to interpret it. Is there any way around this? We did the sorting and indexing prior to import, and it's taking quite a while to perform an unnecessary function.

Thanks.
Greg

On Mon, Apr 23, 2012 at 12:55 PM, Greg Von Kuster g...@bx.psu.edu wrote:

Hi Greg,

Upload your files to a Galaxy data library using the "Upload files from filesystem paths" option, without copying data into Galaxy's default data store. See the following wiki page for all the details: http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries/Uploading%20Library%20Files

For all of the details about data libraries, see: http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries

Greg Von Kuster
Re: [galaxy-user] Data upload...
Hi Greg,

Even though you are not copying the data into Galaxy's default data store, Galaxy determines and stores certain metadata for each of the data files to which you are linking. One of the types of metadata defined for the Bam datatype is its index, which is created by a call to samtools. Unfortunately, there is really no way around this, because Galaxy requires the index file to be in a correct state, and I believe the test to determine correctness is at least as intensive as generating the index in the first place.

It's been a while since I was involved in this (specifically, setting metadata for BAM files using samtools), so perhaps samtools has recently been improved in this regard. If so, I'll look to others to let me know that my understanding is now outdated. If we need to update the samtools used by the Galaxy code to take advantage of newer features, we can certainly do so.

Greg Von Kuster

On Apr 23, 2012, at 2:51 PM, Gregory Miles wrote:

Thank you very much for your help with this; we got that settled.

One other question: we are importing sorted, indexed BAM files into a Galaxy data library, and we are not having Galaxy copy over the files (they are large) but rather just setting up Galaxy so that it points to the relevant directory. We noticed that the file (160 GB in size) is taking a long time to import, considering that all it should be doing is creating a link. When we examined the running processes, we noticed that samtools is running. From searching around a bit, it seems that Galaxy does this in order to groom the BAM file (sort/index) and ensure that it is in the format necessary for Galaxy to be able to interpret it. Is there any way around this? We did the sorting and indexing prior to import, and it's taking quite a while to perform an unnecessary function.

Thanks.
Greg
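
For readers who want to check a pre-built index before linking a BAM file, here is a minimal sketch. It is not Galaxy's metadata code: it only checks that an index file exists and is at least as new as the BAM, and otherwise calls `samtools index`. The `file.bam.bai` naming convention and the helper names are assumptions for illustration, and a timestamp check does not prove the index is correct, which is the harder test described in the reply above.

```python
import os
import subprocess


def index_looks_fresh(bam_path, bai_path=None):
    """Return True if an index file exists and is not older than the BAM.

    This is a cheap timestamp heuristic, not a correctness check.
    """
    bai_path = bai_path or bam_path + ".bai"  # assumed naming convention
    if not os.path.exists(bai_path):
        return False
    return os.path.getmtime(bai_path) >= os.path.getmtime(bam_path)


def ensure_index(bam_path):
    """Run `samtools index` only when the index looks missing or stale.

    Returns True if indexing was run, False if the existing index was kept.
    """
    if index_looks_fresh(bam_path):
        return False
    # Requires samtools on PATH; by default it writes bam_path + ".bai".
    subprocess.check_call(["samtools", "index", bam_path])
    return True
```

Calling `ensure_index("/data/sample.bam")` (a hypothetical path) on a 160 GB file would still take a long time whenever the index has to be rebuilt; the sketch only avoids the work when a newer index is already in place.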
Re: [galaxy-user] Data upload...
Thanks again for the feedback. One final (hopefully) thing: as I mentioned in my first e-mail, we are trying to add a large (~170 GB) BAM file to a library with just a link to the file (no copying). After at least an hour of working, I get the error message "Unable to finish job, tool error". Any thoughts as to how I can fix this?

Thanks.
Greg

On 4/23/12, Greg Von Kuster g...@bx.psu.edu wrote:

Hi Greg,

Even though you are not copying the data into Galaxy's default data store, Galaxy determines and stores certain metadata for each of the data files to which you are linking. One of the types of metadata defined for the Bam datatype is its index, which is created by a call to samtools. Unfortunately, there is really no way around this, because Galaxy requires the index file to be in a correct state, and I believe the test to determine correctness is at least as intensive as generating the index in the first place.

It's been a while since I was involved in this (specifically, setting metadata for BAM files using samtools), so perhaps samtools has recently been improved in this regard. If so, I'll look to others to let me know that my understanding is now outdated. If we need to update the samtools used by the Galaxy code to take advantage of newer features, we can certainly do so.

Greg Von Kuster
Re: [galaxy-user] Data upload...
Is there something helpful in your paster log about the cause?

On Apr 23, 2012, at 4:34 PM, Gregory Miles wrote:

Thanks again for the feedback. One final (hopefully) thing: as I mentioned in my first e-mail, we are trying to add a large (~170 GB) BAM file to a library with just a link to the file (no copying). After at least an hour of working, I get the error message "Unable to finish job, tool error". Any thoughts as to how I can fix this?

Thanks.
Greg
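
One way to follow up on the paster log question in this message is to pull the error lines out of the server log. The sketch below is a generic log filter. It assumes the log is a plain-text file (called `paster.log` here, which may differ on your install) and that failures show up as lines containing `Traceback`, `Error`, or `Exception`; those patterns are guesses, not a documented Galaxy log format, and the sample lines are made up.

```python
import re

# Assumed markers of a failure; adjust to whatever your log actually contains.
ERROR_PATTERN = re.compile(r"Traceback|Error|Exception")


def find_error_lines(lines, context=2):
    """Return (line_number, text) pairs for matching lines plus trailing context."""
    hits = []
    remaining = 0
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            remaining = context + 1  # this line and `context` lines after it
        if remaining > 0:
            hits.append((number, line.rstrip("\n")))
            remaining -= 1
    return hits


def scan_log_file(path="paster.log"):
    """Print suspicious lines from a log file; the default name is an assumption."""
    with open(path) as handle:
        for number, text in find_error_lines(handle):
            print("%d: %s" % (number, text))


# Made-up sample lines, not real Galaxy output.
sample = [
    "INFO starting job\n",
    "Traceback (most recent call last):\n",
    '  File "example.py", line 1, in <module>\n',
    "OSError: [Errno 28] No space left on device\n",
    "INFO starting next job\n",
]
for number, text in find_error_lines(sample, context=1):
    print(number, text)
```

Running `scan_log_file()` from the directory that holds the log would print each matching line with two lines of trailing context, which is usually enough to see the exception type and message.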
Re: [galaxy-user] Data upload
Ateeq,

The preferred method for uploading large files (that aren't already hosted somewhere) is FTP. See the instructions here: http://wiki.g2.bx.psu.edu/Learn/Upload%20via%20FTP

We don't generally provide specific analysis pipelines, but rather the tools for composing them. You're welcome to look through the shared workflows, histories, and pages (see Shared Data in Galaxy) for examples, and perhaps someone else will chime in with their experiences analyzing bacterial transcriptomes.

Lastly, try not to piggyback on unrelated threads (as in your other email). It makes tracking email replies more difficult.

-Dannon

On Jan 9, 2012, at 10:30 AM, Ateequr Rehman wrote:

Dear Galaxy users,

I am very new to Galaxy and would be highly obliged if someone could help me find a way to analyse a bacterial transcriptome. In the very first step, I am having trouble uploading files. Does anyone know how to generate a URL to upload data? My fastq files are about 3 GB each. Is there any specific pipeline to analyse a bacterial transcriptome?

Thanking all of you,
Ateeq
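
For scripted uploads, the FTP route recommended in this thread can be sketched with Python's standard `ftplib`. The host name, credentials, and file name below are placeholders, not real values; use the server address and login described on the wiki page linked above.

```python
import ftplib
import os


def upload_file(ftp, local_path):
    """Upload one file in binary mode over an already-open FTP connection."""
    remote_name = os.path.basename(local_path)
    with open(local_path, "rb") as handle:
        # STOR streams the file in blocks, so a 3 GB file is never fully in memory.
        ftp.storbinary("STOR " + remote_name, handle)
    return remote_name


def main():
    # Placeholder host and credentials (hypothetical); substitute your own.
    ftp = ftplib.FTP("ftp.example.org")
    ftp.login("user@example.org", "password")
    try:
        upload_file(ftp, "reads_1.fastq")
    finally:
        ftp.quit()
```

Keeping `upload_file` separate from the connection setup means the same function can be called in a loop over several fastq files on one connection.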