[galaxy-user] Problem Loading History Pane
To whom it may concern: The History panel on the right side of the Galaxy page is taking a very long time to load. Also, when it does load, I have tried to save my .bam files and the transmission gets truncated to ~7000kb - 8000kb of data. All of my .bam files are several GB. Sometimes, when I retry the download, it succeeds, and other times it is again truncated. The size of the truncation may be different for the same file on the retry attempt. Is there a problem with Galaxy? Thanks, Mike
___ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
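For anyone hitting the same truncated .bam downloads, here is a minimal sketch of a local check to run before retrying. It assumes the file was written by a tool that appends the 28-byte BGZF end-of-file block described in the SAM/BAM specification (older writers may omit it, so a missing marker is a warning sign rather than proof). The function name looks_complete and the optional expected_size argument are made up for this illustration and are not part of Galaxy.

```python
import os

# 28-byte BGZF end-of-file block from the SAM/BAM specification.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)


def looks_complete(path, expected_size=None):
    """Return True if the file at `path` ends with the BGZF EOF block.

    `expected_size` (hypothetical, in bytes) can be supplied when the
    size of the dataset on the server is known.
    """
    size = os.path.getsize(path)
    if expected_size is not None and size != expected_size:
        return False
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the last 28 bytes instead of the whole multi-GB file.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

A download that stops at a few MB of a multi-GB file would normally fail this check, so it can serve as a quick test after each retry.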
[galaxy-user] Problem with BWA?
To whom it may concern: I have noticed that my workflow has been stuck on align with BWA for illumina for ~24hrs. Is there a problem? Thanks, Mike
Re: [galaxy-user] Not able to get data from UCSC table browser
Hi Galaxy team, I am also having a similar problem. I uploaded data by FTP, but when I try to load it into a work history, it is stuck as Job waiting to run. Mike
--- On Thu, 12/22/11, Peng Yu pengyu...@gmail.com wrote:
From: Peng Yu pengyu...@gmail.com
Subject: [galaxy-user] Not able to get data from UCSC table browser
To: galaxy-user@lists.bx.psu.edu
Date: Thursday, December 22, 2011, 12:06 AM
Hi, I'm trying to get data from the UCSC table browser. However, the browser always shows Waiting for main.g2.bx.psu.edu. Is there any problem with the Galaxy server? -- Regards, Peng
Re: [galaxy-user] public interface issues
Same here; I can't initiate Generate Pileup from Sam tools. When I select Execute, the browser just stalls and does not add the step to my workflow. Also, I can't download bam or bai files when I use right-click, Save Target As. I have tried Internet Explorer and Firefox.
--- On Mon, 12/5/11, Richard Mark White whit...@yahoo.com wrote:
From: Richard Mark White whit...@yahoo.com
Subject: Re: [galaxy-user] public interface issues
To: Cittaro Davide cittaro.dav...@hsr.it, galaxy-u...@bx.psu.edu galaxy-u...@bx.psu.edu
Date: Monday, December 5, 2011, 9:17 AM
Likewise. I can get to my Saved Histories, but when I click on one, very few items (if any) show up in the right-side panel. I've also tried multiple browsers, etc. rich
From: Cittaro Davide cittaro.dav...@hsr.it
To: galaxy-u...@bx.psu.edu galaxy-u...@bx.psu.edu
Sent: Monday, December 5, 2011 7:39 AM
Subject: [galaxy-user] public interface issues
Hi all, I can't use the public interface in an effective way: items in history (i.e. the green boxes) cannot be expanded. More important, the names of the items remind me of something to do with the tool/library path (e.g. 2010_03/pilot2/README_pilot2_snps). This happens on OS X and MS Windows systems, Firefox and Safari. Thanks d
/* Davide Cittaro, PhD Head of Bioinformatics Core Center for Translational Genomics and Bioinformatics San Raffaele Scientific Institute Via Olgettina 58 20132 Milano Italy Office: +39 02 26439140 Mail: cittaro.dav...@hsr.it Skype: daweonline */
Re: [galaxy-user] Galaxy test won't run a bwa job
Hi Carlos, You can always run your BWA job on the Main Galaxy and then import the data directly into the Test Galaxy using the http: address; there is no need to download to your local machine. The http: address can be found by selecting properties on the disk icon. Instead of downloading, you can just copy the http: address. I hope this helps, Mike
--- On Tue, 11/22/11, Carlos Borroto carlos.borr...@gmail.com wrote:
From: Carlos Borroto carlos.borr...@gmail.com
Subject: Re: [galaxy-user] Galaxy test won't run a bwa job
To: Jennifer Jackson j...@bx.psu.edu
Cc: galaxy-user@lists.bx.psu.edu
Date: Tuesday, November 22, 2011, 3:12 PM
Hi Jen, I did as you recommended and reran my job, but after 2hrs it is still waiting. This is my history if that helps: http://test.g2.bx.psu.edu/u/cjav/h/gatk---hg19---example Regards, Carlos
On Tue, Nov 22, 2011 at 12:30 PM, Jennifer Jackson j...@bx.psu.edu wrote:
Hello Carlos, The NGS cluster was down yesterday for maintenance. Restarting the BWA job should initiate the run. Hopefully this helps, Jen Galaxy team
On 11/22/11 7:56 AM, Carlos Borroto wrote:
Hi, I'm trying to test a workflow using tools only available on the test server; for this I have uploaded a limited subset of my data that should run fairly quickly. The first step is a BWA mapping, but the job has been in the queue since yesterday. Is it fine to run this kind of test there? Thanks, Carlos
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support
[galaxy-user] GATK wrapper in Galaxy
Hello all, I am trying to run through the GATK pipeline on the Galaxy Test page. I have a question related to the UnifiedGenotyper wrapper page. At the bottom of the page, I choose the Advanced tab for Basic or Advanced Analysis options. After the page automatically refreshed, I chose the Annotation Types that I was interested in. Below that are options for Annotation Interfaces/Groups. What are these options used for? I have looked all over the web, on the GATK wiki and at the GATK QA page (located at getsatisfaction.com) without any luck. Are these options important? The description located at the bottom of the page lists: Group - One or more classes/groups of annotations to apply to variant calls. What type of group or class is Standard, Experiments or WorkInProgress? Does class refer to a Java class in a similar way that annotations are arguments? Any clarification would be very helpful. Thanks, Mike
Re: [galaxy-user] Problem with bam and/or bai files
Hi all, I appreciate all of the discussion related to this issue. I still don't understand why I see this issue only when I choose the hg_g1k_v37 format and not when I choose the Hg_19 format. I realize that I would need to ensure that the BAM files are sorted correctly before I enter the GATK pipeline, but all of this is before that process. When my read files are processed through to .bam files using the hg_19 format, I can view them in IGV without a problem. It is only when I use the hg_g1k_v37 format that I receive an error from IGV. It seems to me that the process that I am using in Galaxy should be identical except for the reference genome format (i.e. hg_19 or hg_g1k_v37). I am at a loss as to how to proceed. Does anyone have ideas? Thanks, Mike
--- On Thu, 10/27/11, Jim Robinson jrobi...@broadinstitute.org wrote:
From: Jim Robinson jrobi...@broadinstitute.org
Subject: Re: [galaxy-user] Problem with bam and/or bai files
To: Peter Cock p.j.a.c...@googlemail.com
Cc: Galaxy Dev galaxy-...@bx.psu.edu, Mike Dufault dufau...@yahoo.com, galaxy-user galaxy-user@lists.bx.psu.edu
Date: Thursday, October 27, 2011, 9:58 AM
It's possible the sorting problem was specific to one version and now gives an error. The incorrect index caused by bad sequence lengths is a recurrent problem, but I do not know what tool produces such headers. Perhaps someone who has experienced this can chime in. I'm not a samtools expert, just sharing my experience of what has caused this error in the past. It does seem that, as a general rule, these index problems result in errors from Picard (which the GATK uses), while samtools can fail silently and sometimes give you an unrelated query region. Jim
Sending to galaxy-dev ...
On Thu, Oct 27, 2011 at 5:51 AM, Jim Robinson jrobi...@broadinstitute.org wrote:
Hi Mike, Someone from the Galaxy team can perhaps give some insight on what went wrong; I can comment on the error message from IGV. That error is thrown from Picard; in every case I've investigated so far it was traced to a problem with the index.
Useful background
re: Error reading bam file. This usually indicates a problem with the index (bai) file. ArrayIndexOutofBoundsException: 4682 (4682).
The most common causes are (1) a problem with the sequence dictionary in the BAM header itself, specifically incorrect sequence lengths,
Any idea what tools produce that kind of thing?
and (2) indexing an un-sorted BAM. Apparently samtools will make invalid indexes from such files without any complaints in both cases. You can even use samtools tview on such files; it will happily show you some random region when you query.
That is news to me - I recall samtools index being recommended as a way to determine if a BAM file was sorted or not (error on unsorted, you get an index if it was sorted) and again from memory this is what Galaxy uses internally as part of preparing BAM files on upload. Might this be tied to a specific version of samtools? e.g. a possible regression?
I don't see a Sort step in your workflow, maybe that's the problem? Please CC me on any reply, I might miss it in the list. Jim
Thanks, Peter
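The first cause mentioned above (a sequence dictionary with wrong names or lengths) can be checked by hand. The following is a rough sketch, not anything Galaxy, Picard or IGV actually runs: it assumes you have the BAM header as text (for example, the output of samtools view -H) and a dictionary of the sequence names and lengths you expect for the reference build. The function names and the toy lengths in the usage note are invented for illustration.

```python
def parse_sq_lengths(header_text):
    """Collect {sequence name: length} from the @SQ lines of a SAM header."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs, e.g. SN:1 and LN:1000.
        fields = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def compare_to_reference(header_lengths, reference_lengths):
    """List header sequences that are missing from, or disagree with, the reference."""
    problems = []
    for name, length in header_lengths.items():
        if name not in reference_lengths:
            problems.append(f"{name}: not in reference")
        elif reference_lengths[name] != length:
            problems.append(
                f"{name}: header says {length}, reference says {reference_lengths[name]}"
            )
    return problems
```

For instance, with a toy reference of {"1": 1000, "2": 800}, a header that lists chr2 or gives sequence 1 a length of 900 would be reported. A mismatch in naming style (1 versus chr1) between builds would show up the same way.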
[galaxy-user] Problem with bam and/or bai files
Hello Galaxy Team, I have been using Galaxy for SNP detection with great success. Basically, I followed the screencast from Anton without any problems. The only change was to use BWA instead of Bowtie. Until now, I have always assigned my raw read files to the hg19 format. Now I want to try the GATK pipeline to analyze my samples, but I am running into a problem with the bam/bai files. Here is what I did. I imported my Illumina paired-end reads into Galaxy and assigned them to the hg_g1k_v37 format instead of the Hg19 format. From there, I again followed the exact same process: FastQ Groomer, Summary Statistics, Boxplots, Align with BWA, filter on SAM, SAM-to-BAM, generate bai file. I made sure that hg_g1k_v37 was chosen as the format for all of the steps that required that information. Everything seemed to run successfully, as all of the boxes turned green. When I tried to view the bam file in IGV (as a QC step before the GATK pipeline), I received the following error: Error reading bam file. This usually indicates a problem with the index (bai) file. ArrayIndexOutofBoundsException: 4682 (4682). I did the exact same analysis using the Hg19 format and my bam/bai files worked perfectly fine in the IGV viewer. Can anyone tell me what the problem is and how to fix it? Thanks, Mike Dufault
[galaxy-user] Indel Extraction - Deletion Information
Hi All, Does anyone know if there is a way to get the deleted base information that corresponds to the Extract indels from SAM output? The output from this tool includes the bases for insertions, or - for deletions. I want to get the actual bases instead of the -. I assume there is a way to use the SAM file to extract the information, but if someone already has a nifty way to do it, that would be great. Thanks, Mike
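One possible route, sketched here under assumptions: deleted bases are not stored in the read sequence, but if the aligner wrote the optional MD tag into each SAM record, the deleted reference bases appear there after a ^ character. The snippet below is not part of the Extract indels tool; the function names are made up, and it only works for records that actually carry an MD:Z: field.

```python
import re

# An MD string is a run of: match counts, ^DELETED bases, or single mismatched bases.
MD_TOKEN = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def deletions_from_md(pos, md):
    """Return [(1-based reference position, deleted bases)] for one alignment.

    `pos` is the SAM POS column; `md` is the value of the MD:Z: tag.
    """
    ref_pos = pos
    found = []
    for matched, deleted, mismatch in MD_TOKEN.findall(md):
        if matched:
            ref_pos += int(matched)      # bases identical to the reference
        elif deleted:
            found.append((ref_pos, deleted))
            ref_pos += len(deleted)      # a deletion consumes reference bases
        else:
            ref_pos += 1                 # a single mismatched base
    return found


def deletions_from_sam_line(line):
    """Pull POS and the MD tag out of one tab-separated SAM record."""
    fields = line.rstrip("\n").split("\t")
    pos = int(fields[3])
    for tag in fields[11:]:
        if tag.startswith("MD:Z:"):
            return deletions_from_md(pos, tag[5:])
    return []  # no MD tag, so the deleted bases cannot be recovered this way
```

If the records have no MD tag, the same information would have to come from the reference sequence itself, using POS and the D operations in the CIGAR string.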
Re: [galaxy-user] How to load/import saved workflow .ga files into Galaxy?
Hi Dannon, That was too easy! I wish all issues were that easy to fix. Thanks, Mike
--- On Tue, 5/17/11, Dannon Baker dannonba...@me.com wrote:
From: Dannon Baker dannonba...@me.com
Subject: Re: [galaxy-user] How to load/import saved workflow .ga files into Galaxy?
To: Mike Dufault dufau...@yahoo.com
Cc: galaxy-user@lists.bx.psu.edu
Date: Tuesday, May 17, 2011, 1:26 PM
On the workflows landing page you should find an option to Upload or import workflow in the top right. Clicking that will allow you to supply a URL to a shared Galaxy workflow or, in your case, to paste the contents of the .ga file into the Encoded workflow box, and it should work. Direct file upload of the .ga export will be available at some point. -Dannon
On May 17, 2011, at 1:18 PM, Mike Dufault wrote:
Dear Galaxy Team, I have created a workflow that I would like to use frequently. I saved the workflow to my desktop as a .ga file and I would now like to import it into a new Cloudman instance. Using a new/blank history, I selected options and then selected import from file. The only option that I get is to Import History from an Archive, and it wants a URL. How can I import my workflow.ga file so that I can use it in future instances of Galaxy? Thanks, Mike
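Before pasting a .ga export into the Encoded workflow box, it can help to confirm that the file survived the download intact. The sketch below assumes the export is a JSON document with name and steps entries; that layout is an assumption about the export format, not something stated in this thread, and load_workflow_text is a made-up helper name.

```python
import json


def load_workflow_text(path):
    """Read a .ga export and return (text to paste, workflow name, step count).

    json.loads raises ValueError if the file is truncated or otherwise damaged.
    The 'name' and 'steps' keys are assumed, so missing keys are tolerated.
    """
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    workflow = json.loads(text)
    return text, workflow.get("name"), len(workflow.get("steps", {}))
```

The returned text is what would be pasted into the box; the name and step count are only a quick sanity check that the right file was picked.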
Re: [galaxy-user] Help!!!!!! with Galaxy Cloud!!!!!
Hello again, So I am able to see all of the .dat files in /mnt/galaxyData. What commands can I use to download a file to my HD? Also, what program should I use to open the .dat file? Thanks again, Mike
--- On Wed, 4/13/11, Enis Afgan eaf...@emory.edu wrote:
From: Enis Afgan eaf...@emory.edu
Subject: Re: [galaxy-user] Help!! with Galaxy Cloud!
To: Mike Dufault dufau...@yahoo.com
Cc: galaxy-u...@bx.psu.edu
Date: Wednesday, April 13, 2011, 11:15 PM
Hi Mike, Once the given EBS volume is attached and mounted, all of the data should be in /mnt/galaxyData/files/000/. This assumes the file system is mounted to /mnt/galaxyData, which is where it would get mounted automatically by cloudman on cluster instantiation. Enis
On Wed, Apr 13, 2011 at 9:00 PM, Mike Dufault dufau...@yahoo.com wrote:
Hi Enis, I started to use the terminal to check whether the job was running, but it finished successfully at the same time. Thanks again for helping me to complete the run. Now I have an additional issue. I wanted to save my BAM file, but I kept getting an error. I think the error was because the file was too large to send (4.1GB). So I saved what I could to my local HD and terminated the cluster. My EBS volume is 200GB and persisted after the cluster was terminated. I assume that my BAM file resides somewhere in the EBS volume. I started a new Unix cluster and attached the EBS volume to that cluster. I also established an ssh connection to the Unix cluster, but I do not know where to find the BAM file. Do you know how I can access the BAM file so that I can transfer it to my local HD? Thanks, Mike
--- On Wed, 4/13/11, Enis Afgan eaf...@emory.edu wrote:
From: Enis Afgan eaf...@emory.edu
Subject: Re: [galaxy-user] Help!! with Galaxy Cloud!
To: vasu punj pu...@yahoo.com
Cc: galaxy-u...@bx.psu.edu
Date: Wednesday, April 13, 2011, 10:01 AM
Hi Vasu, I am not sure I understand your question, but the general instructions on how to get started and use Galaxy on the cloud (i.e., Cloudman) are available at usegalaxy.org/cloud. Let us know if that page does not answer your questions, Enis
On Wed, Apr 13, 2011 at 9:40 AM, vasu punj pu...@yahoo.com wrote:
I was wondering if there are instructions on how I can run Galaxy on CloudConsole. Indeed, first I want to know how Galaxy is established on the console. Can someone direct me to the instructions please? Thanks.
--- On Tue, 4/12/11, Enis Afgan eaf...@emory.edu wrote:
From: Enis Afgan eaf...@emory.edu
Subject: Re: [galaxy-user] Help!! with Galaxy Cloud!
To: Mike Dufault dufau...@yahoo.com
Cc: galaxy-u...@bx.psu.edu
Date: Tuesday, April 12, 2011, 9:31 PM
Galaxy has the functionality to recover any jobs that were running after it is restarted, so it is quite possible for the job to still be running. In addition, from the cloudman console, it appears that at least one instance is pretty heavily loaded, so that can also mean that the job is still running. However, without actually accessing the instance through the command line and checking the status of the job queue, it is not possible to tell if the job is actually running. Do you know how to do that? It's just a few commands in the terminal:
- access the instance
  [local]$ ssh -i <path to the private key you downloaded from AWS when you created a key pair> ubuntu@<instance public DNS>
- become galaxy user
  [ec2]$ sudo su galaxy
- list any running jobs
  [ec2]$ qstat
If that command returns a list of jobs and the jobs are in state 'r' (running), the job is still running; otherwise, no. Let me know how it goes, Enis
On Tue, Apr 12, 2011 at 9:49 PM, Mike Dufault dufau...@yahoo.com wrote:
Hi Enis, THANK YOU!!! I see that my filter pileup on data step is running.
Is this the same analysis that was running before or did it relaunch when you restarted Galaxy? I just don't know if the analysis would be compromised. Thanks again to you and the whole Galaxy team. Best, Mike
--- On Tue, 4/12/11, Enis Afgan eaf...@emory.edu wrote:
From: Enis Afgan eaf...@emory.edu
Subject: Re: [galaxy-user] Help!! with Galaxy Cloud!
To: Mike Dufault dufau...@yahoo.com
Cc: Anton Nekrutenko an...@bx.psu.edu, galaxy-u...@bx.psu.edu
Date: Tuesday, April 12, 2011, 9:16 PM
Ahh, for some reason cloudman is thinking Galaxy is not 'running' but still 'starting' and has thus not enabled the given button. To access the analysis, in your browser, just delete the '/cloud' part of the URL and that should load Galaxy. Sorry about the confusion, Enis
On Tue, Apr 12, 2011 at 9:12 PM, Mike Dufault dufau...@yahoo.com wrote:
Hi Enis, Thanks for looking into this. From the Galaxy Cloudman Console, I can see from the log that it was restarted (thanks), but the Access Galaxy choice is still grayed out and I don't know how to access the Analysis window. Is there a way back into my analysis? Thanks, Mike
--- On Tue, 4/12/11, Enis Afgan eaf
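The qstat check described in this thread (jobs in state 'r' are running) can be automated once the qstat output has been captured as text. This is a small sketch that assumes the default Grid Engine listing, where each job row starts with a numeric job ID and the state is the fifth whitespace-separated column; job names containing spaces would break that assumption. The function names are invented for illustration.

```python
def job_states(qstat_output):
    """Return {job ID: state} for each job row in a default qstat listing."""
    states = {}
    for line in qstat_output.splitlines():
        fields = line.split()
        # Job rows start with a numeric job ID; the header and ruler lines do not.
        if len(fields) >= 5 and fields[0].isdigit():
            states[fields[0]] = fields[4]
    return states


def any_running(qstat_output):
    """True if at least one listed job is in state 'r' (running)."""
    return any(state == "r" for state in job_states(qstat_output).values())
```

An empty listing (qstat printing nothing) yields an empty dictionary, which matches the "otherwise, no" case in the instructions above.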
Re: [galaxy-user] Help!!!!!! with Galaxy Cloud!!!!!
Hello Galaxy Staff, My data has been running on Amazon EC2 for just over 24hrs. I have not closed any windows and my Exome analysis made it all the way through to filter on Pile up. I have two tabs for this instance. One is the Galaxy Cloudman Console and the other is the tab where I perform the analysis, load data, history etc. Anyway, I went to add a step to the workflow and got the Welcome Galaxy to the Cloud screen along with the message: There is no Galaxy instance running on this host, or the Galaxy instance is not responding. To manage Galaxy on this host, please use the Cloud Console. What happened??? When I go back to the Galaxy Cloudman Console, it shows that my instance is still running along with the four cores; the cluster log is below. AWS also shows that my instance is running. Will the workflow finish? Can I get my data? How? I tried to re-access the analysis page by selecting Access Galaxy from the Galaxy Cloudman Console, but it sends me to the same Welcome page. Is there a way to get back into the analysis page? Please help!!!
Thanks, Mike
The cluster log shows:
13:05:24 - Master starting
13:05:25 - Completed initial cluster configuration.
13:05:33 - Starting service 'SGE'
13:05:48 - Configuring SGE...
13:05:56 - Successfully setup SGE; configuring SGE
13:05:57 - Saved file 'persistent_data.yaml' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:05:57 - Saved file 'cm_boot.py' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:05:57 - Saved file 'cm.tar.gz' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:05:57 - Problem connecting to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3', attempt 1/5
13:05:59 - Saved file 'Fam122261.clusterName' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:06:24 - Initializing a 'Galaxy' cluster.
13:06:24 - Retrieved file 'snaps.yaml' from bucket 'cloudman' to 'cm_snaps.yaml'.
13:06:41 - Adding 3 instance(s)...
13:07:02 - Saved file 'persistent_data.yaml' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:07:38 - Saved file 'persistent_data.yaml' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:07:38 - Saved file 'universe_wsgi.ini.cloud' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:07:38 - Saved file 'tool_conf.xml.cloud' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:07:48 - Error mounting file system '/mnt/galaxyData' from '/dev/sdg3', running command '/bin/mount /dev/sdg3 /mnt/galaxyData' returned code '32' and following stderr: 'mount: you must specify the filesystem type '
13:07:52 - Saved file 'persistent_data.yaml' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:07:52 - Starting service 'Postgres'
13:07:52 - PostgreSQL data directory '/mnt/galaxyData/pgsql/data' does not exist (yet?)
13:07:52 - Configuring PostgreSQL with a database for Galaxy...
13:08:05 - Saved file 'persistent_data.yaml' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:08:05 - Starting service 'Galaxy'
13:08:05 - Galaxy daemon not running.
13:08:05 - Galaxy service state changed from 'Starting' to 'Error'
13:08:05 - Setting up Galaxy application
13:08:05 - Retrieved file 'universe_wsgi.ini.cloud' from bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3' to '/mnt/galaxyTools/galaxy-central/universe_wsgi.ini'.
13:08:05 - Retrieved file 'tool_conf.xml.cloud' from bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3' to '/mnt/galaxyTools/galaxy-central/tool_conf.xml'.
13:08:05 - Retrieved file 'tool_data_table_conf.xml.cloud' from bucket 'cloudman' to '/mnt/galaxyTools/galaxy-central/tool_data_table_conf.xml.cloud'.
13:08:05 - Starting Galaxy...
13:08:09 - Galaxy service state changed from 'Error' to 'Starting'
13:08:09 - Saved file 'persistent_data.yaml' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:08:09 - Saved file 'tool_data_table_conf.xml.cloud' to bucket 'cm-a42f040c55e7519eb63bbaf269fa78d3'
13:08:28 - Instance 'i-e46f0a8b' reported alive
13:08:28 - Successfully generated root user's public key.
13:08:28 - Sent master public key to worker instance 'i-e46f0a8b'.
13:08:28 - Instance 'i-e26f0a8d' reported alive
13:08:28 - Sent master public key to worker instance 'i-e26f0a8d'.
13:08:33 - Instance 'i-e06f0a8f' reported alive
13:08:33 - Sent master public key to worker instance 'i-e06f0a8f'.
13:08:33 - Adding instance i-e46f0a8b to SGE Execution Host list
13:08:44 - Successfully added instance 'i-e46f0a8b' to SGE
13:08:44 - Waiting on worker instance 'i-e46f0a8b' to configure itself...
13:08:44 - Instance 'i-e26f0a8d' already in SGE's @allhosts
13:08:44 - Waiting on worker instance 'i-e26f0a8d' to configure itself...
13:08:45 - Instance 'i-e06f0a8f' already in SGE's @allhosts
13:08:45 - Waiting on worker instance 'i-e06f0a8f' to configure itself...
13:08:50 - Instance 'i-e46f0a8b' ready
13:09:27 - Galaxy service state changed from 'Starting' to 'Running'
22:38:18 - Found '3' idle instances; trying to remove '2'
22:38:18 - Specific termination of instance 'i-e26f0a8d' requested.
22:38:18 - Removing instance 'i-e26f0a8d'
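When a CloudMan cluster log is copied out of the console, entries can end up run together, with each new entry beginning at an HH:MM:SS timestamp. As a rough aid for reading such a log, the sketch below splits the text at those timestamps and picks out entries that mention an error or a problem. It is only an illustration: the function names are made up, the keyword list is a guess at what is worth flagging, and the zero-width split needs Python 3.7 or later.

```python
import re

# Zero-width match just before each "HH:MM:SS - " timestamp.
ENTRY_START = re.compile(r"(?=\d{2}:\d{2}:\d{2} - )")


def split_log(raw):
    """Split pasted log text into one entry per timestamp."""
    return [entry.strip() for entry in ENTRY_START.split(raw) if entry.strip()]


def flag_problems(entries, keywords=("Error", "Problem")):
    """Return the entries that contain any of the given keywords."""
    return [entry for entry in entries if any(word in entry for word in keywords)]
```

Applied to the log in this message, the flagged entries would include the bucket connection retry, the failed mount of /mnt/galaxyData, and the Galaxy service state changes that mention 'Error'.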
Re: [galaxy-user] Help!!!!!! with Galaxy Cloud!!!!!
Hi Enis, Thanks for looking into this. From the Galaxy Cloudman Console, I can see from the log that it was restarted (thanks), but the Access Galaxy choice is still grayed out and I don't know how to access the Analysis window. Is there a way back into my analysis? Thanks, Mike
--- On Tue, 4/12/11, Enis Afgan eaf...@emory.edu wrote:
From: Enis Afgan eaf...@emory.edu
Subject: Re: [galaxy-user] Help!! with Galaxy Cloud!
To: Mike Dufault dufau...@yahoo.com
Cc: Anton Nekrutenko an...@bx.psu.edu, galaxy-u...@bx.psu.edu
Date: Tuesday, April 12, 2011, 8:55 PM
Hi Mike, Try accessing your Galaxy instance now. It should be ok. The link in your email contained the IP for your instance, so I took the liberty of restarting Galaxy and that brought it back up. There seems to have been an issue with Galaxy accessing its database and that resulted in Galaxy crashing. We'll look into why that happened in the first place, but it should be ok now. Let me know if you have any more trouble, Enis
On Tue, Apr 12, 2011 at 2:49 PM, Mike Dufault dufau...@yahoo.com wrote:
Hello Galaxy Staff, My data has been running on Amazon EC2 for just over 24hrs. I have not closed any windows and my Exome analysis made it all the way through to filter on Pile up. I have two tabs for this instance. One is the Galaxy Cloudman Console and the other is the tab where I perform the analysis, load data, history etc. Anyway, I went to add a step to the workflow and got the Welcome Galaxy to the Cloud screen along with the message: There is no Galaxy instance running on this host, or the Galaxy instance is not responding. To manage Galaxy on this host, please use the Cloud Console. What happened??? When I go back to the Galaxy Cloudman Console, it shows that my instance is still running along with the four cores; the cluster log is below. AWS also shows that my instance is running. Will the workflow finish? Can I get my data? How?
Re: [galaxy-user] Help!!!!!! with Galaxy Cloud!!!!!
Hi Enis, THANK YOU!!! I see that my filter pileup on data step is running. Is this the same analysis that was running before or did it relaunch when you restarted Galaxy? I just don't know if the analysis would be compromised. Thanks again to you and the whole Galaxy team. Best, Mike --- On Tue, 4/12/11, Enis Afgan eaf...@emory.edu wrote: From: Enis Afgan eaf...@emory.edu Subject: Re: [galaxy-user] Help!! with Galaxy Cloud! To: Mike Dufault dufau...@yahoo.com Cc: Anton Nekrutenko an...@bx.psu.edu, galaxy-u...@bx.psu.edu Date: Tuesday, April 12, 2011, 9:16 PM Ahh, for some reason cloudman is thinking Galaxy is not 'running' but still 'starting' and has thus not enabled the given button. To access the analysis, in your browser, just delete the '/cloud' part of the URL and that should load Galaxy. Sorry about the confusion, Enis
[galaxy-user] Indel work flow questions
Hi All, I have a question about NGS: Indel Analysis and SNP calling. Assuming I have loaded my paired-end reads, groomed them, and got all the way through to alignment with BWA, my question then becomes: does the work flow split at this point into indel analysis and SNP analysis? For SNP analysis, it seems that I need to filter on SAM, convert SAM-to-BAM, etc. For indel analysis, it seems that I should use the BWA output that is in SAM format. Are these two statements correct? I also have a question regarding the input for indel analysis. Should I use the BWA output directly (which is in SAM format), or should I first filter on SAM and use that output (which is also in SAM format)? I have tried the indel analysis using both filtered and unfiltered input and I get very similar results. It seems to me that I should use the filtered-on-SAM output, where I can indicate that the reads are paired=Yes, proper pairs=Yes, unmapped=No. Any thoughts or insight would be appreciated. Thanks in advance, Mike
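For anyone following this thread: the three filter settings mentioned above (paired=Yes, proper pairs=Yes, unmapped=No) correspond to three bits of the FLAG field in the SAM format (0x1, 0x2 and 0x4). The Python sketch below only illustrates those criteria on plain SAM text; it is not the Galaxy filter tool itself, and the function names keep_read and filter_sam as well as the sample records are made up for this example. It assumes tab-separated SAM lines with the FLAG in the second column.

```python
# SAM FLAG bits as defined in the SAM format specification.
PAIRED = 0x1       # read is paired in sequencing
PROPER_PAIR = 0x2  # read is mapped in a proper pair
UNMAPPED = 0x4     # read itself is unmapped


def keep_read(sam_line):
    """Return True if the read is paired, in a proper pair, and mapped."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # FLAG is the second column of an alignment line
    return (
        bool(flag & PAIRED)
        and bool(flag & PROPER_PAIR)
        and not (flag & UNMAPPED)
    )


def filter_sam(lines):
    """Yield header lines unchanged and only the alignment lines that pass."""
    for line in lines:
        if line.startswith("@") or keep_read(line):
            yield line


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    example = [
        "@HD\tVN:1.0\tSO:unsorted\n",
        "read1\t99\tchr1\t100\t60\t4M\t=\t300\t204\tACGT\tIIII\n",  # paired, proper pair, mapped
        "read2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",             # paired but unmapped
        "read3\t65\tchr1\t500\t60\t4M\tchr2\t900\t0\tACGT\tIIII\n", # paired, not a proper pair
    ]
    for kept in filter_sam(example):
        print(kept, end="")
```

With these settings only read1 (plus the header) survives. Whether the filtered or the unfiltered SAM is the better input for indel calling is a separate question that this sketch does not answer; it only shows what the filter removes.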
Re: [galaxy-user] Analyzing Targeted Resequencing data with Galaxy
Sean, Anton and Jen, Thanks for all of the suggestions (in separate replies) on how to better analyze my SureSelect captured Exome data. My original work-flow is below in the e-mail string. Based on the suggestions, I plan to change my work-flow by increasing my quality filter from 20 to 25-30 and increasing my minimum coverage from 3x to ~20x. I will use the Join function on the SNPs from two family members to narrow the list down to the SNPs they have in common, since I am looking for a hereditary disease. Then I will use the Join function again with the SNPs from build (131) to characterize the SNPs. Sean suggested realignment around indels and potentially quality score recalibration. Is that even possible with Galaxy at the moment? Where in the flow can I perform Indel analysis? Will I need to process my data separately for SNPs and Indel analysis, or can they be done sequentially in the same linear work-flow? I am still a little unsure of the best way to handle this. Please let me know if you have any more suggestions or comments before I re-launch the analysis later this evening. Once I get a flow that works, I hope to be able to publish it for everyone to benefit from. Thanks to the Galaxy team for an outstanding platform and support! Mike --- On Tue, 4/5/11, Sean Davis sdav...@mail.nih.gov wrote: From: Sean Davis sdav...@mail.nih.gov Subject: Re: [galaxy-user] Analyzing Targeted Resequencing data with Galaxy To: Mike Dufault dufau...@yahoo.com Cc: galaxy-user galaxy-user@lists.bx.psu.edu Date: Tuesday, April 5, 2011, 4:39 PM Hi, Mike. See my couple of comments below. Sean On Tue, Apr 5, 2011 at 2:22 PM, Mike Dufault dufau...@yahoo.com wrote: Hi all, Like many people on this e-mail chain, I have been looking for advice on how to process Exome data. Below, I have described in detail what I have done with the hope of getting some clarification. Hopefully it will be helpful to many of us! I have SureSelect Exome captured data.
The data was delivered to me as two separate files (/1) (/2). Each file has ~33 million reads; 7.2 GB each. I am looking for SNPs from a family with cancer. Eventually I plan to compare the data from multiple members of the same family to find a related disease SNP. Below is the workflow that I used to process my data. I adapted it from the screencast titled: Mapping Illumina Reads: Paired Ends Example. I used all of the same default parameters as in the screencast. At the end of step 13, I had ~4,700,000 SNPs. This seemed like a lot so in step 14, I filtered on column 7 (c7) which I believe is the Quality SNP value. I set the filter as C7=1 to remove all of the 0 (zero) values for Quality SNP. I figured that if they have a value of zero, they must not be real SNPs. This left me with ~180,000 SNPs.
1: Get Data: Illumina 1.3+ file (/1)
2: Get Data: Illumina 1.3+ file (/2)
3: FASTQ Groomer on data 1
4: FASTQ Groomer on data 2
5: FASTQ Summary Statistics on data 3
6: FASTQ Summary Statistics on data 4
7: Box plot on data 5
8: Box plot on data 6
9: Map with Bowtie for Illumina on data 4 and data 3: mapped reads
This might not be the best choice, as bowtie does not allow gapped alignment. See here for a discussion of indels and SNV calling: http://bioinformatics.oxfordjournals.org/content/26/6/722.long You will probably also want to consider local realignment around indels and potentially quality score recalibration.
10: Filter Sam on data 9
11: SAM-to-BAM on data 10: converted to BAM
12: Generate pileup on data 11: converted pileup
13: Filter pileup on data 12
14: Filter data on 13 (c7=1)
15: Sort on data 15 (C7; descending order)
First, if anyone has ideas on how to improve the workflow, I would be open to suggestions; especially from people experienced with Galaxy.
Second, I am concerned that many/most of the SNPs are known. Should I filter my data against the known SNPdb? If so, how can I do this in Galaxy (in Bowtie?)
Keep in mind that, depending on the version of dbSNP, there are many cancer-associated SNPs contaminating the database. Third, as suggested in the screencast, I did not trim or filter my FASTQ Groomed data because I was interested in SNPs and I could filter on Quality later in the workflow. Would implementing a filtering step on phred quality (~20) at this step save me the step of filtering later on? Currently it takes multiple hours (~16) to process the data from start to finish; would filtering at this step reduce the amount of time that it takes to process my data? Presumably, there would be less data to process. I do this on the AWS Cloud and time is money! Adding a gapped alignment algorithm, indel realignment, and quality recalibration can easily increase this time to a couple of days per sample. Fifth, when using Galaxy on the AWS cloud, does adding additional cores