Re: [galaxy-dev] Trackster and gff file with multiple chromosome annotations
I will modify the gff file as you mentioned and update Galaxy. Thanks a lot.

Yec'han

--
Yec'han LAIZET
Ingenieur Plateforme Genome Transcriptome
Tel: 05 57 12 27 75
INRA-UMR BIOGECO 1202 Equipe Genetique
69 route d'Arcachon
33612 CESTAS

On 29/10/2012 15:59, Jeremy Goecks wrote:

> Whatever the file type I set for the gff file (gff3, gff or gtf), I get the transcript_id error:
>
> Traceback (most recent call last):
>   File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 91, in <module>
>     main()
>   File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 30, in main
>     for feature in read_unordered_gtf( open( in_fname, 'r' ) ):
>   File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 375, in read_unordered_gtf
>     transcript_id = line_attrs[ 'transcript_id' ]
> KeyError: 'transcript_id'

This was due to an incomplete feature. Turns out that GFF support hadn't been included in feature search; I've added it in -central changeset fa045aad74e9:

https://bitbucket.org/galaxy/galaxy-central/changeset/fa045aad74e90f16995e0cbb670a59e6b9becbed

> Is the gff file not correct?

I believe there is an issue with your GFF: it is using non-standard identifiers in the attributes (last) column. To the best of my knowledge, 'name' is not a valid field for connecting features in GFF3 (which is my best guess for the file version), but your GFF uses this field anyway. To fix this issue, I replaced 'name' with 'ID' (which is compliant GFF3) from the command line:

--
% sed 's/name/ID/' ~/Downloads/test.gff > ~/Downloads/test_with_ids.gff
--

and this fixed the issue. Finally, there is a sed wrapper in the toolshed should you want to do this conversion in Galaxy:

http://toolshed.g2.bx.psu.edu/repository/browse_categories?sort=name&operation=view_or_manage_repository&f-deleted=False&f-free-text-search=sed&id=9652a50c5a932f3e

Best,
J.

___
Please keep all replies on the list by using reply all in your mail client.
To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
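The blanket sed substitution Jeremy describes will also replace the string 'name' anywhere else on a line, so for illustration here is a small Python sketch (the function name and the sample line are my own) that renames only the attribute key in the ninth (attributes) column of a GFF3 file:

```python
def rename_attr(line, old="name", new="ID"):
    """Rename an attribute key in the 9th (attributes) column of a GFF3 line.

    Comment lines and malformed lines are returned unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return line
    attrs = []
    for attr in fields[8].split(";"):
        key, sep, value = attr.strip().partition("=")
        attrs.append((new if key == old else key) + sep + value)
    fields[8] = ";".join(attrs)
    return "\t".join(fields) + "\n"

# Example: only the 'name' attribute key is rewritten; 'note' is untouched.
fixed = rename_attr("scaffold_1\tmaker\tgene\t1000\t2000\t.\t+\t.\tname=gene01;note=demo\n")
```

Applied over a whole file (one call per line), this gives the same result as the sed one-liner without the risk of touching a 'name' substring inside some other field.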
Re: [galaxy-dev] Galaxy local install
Hi Vladimir

> I contacted vendor tech support (Dell) with this question, but they could not answer (or did not want to) and directed me to the Galaxy developers. I am using RHEL 5.8 and SciLinux 5.5 and want to install a local instance of Galaxy. Both my systems are based on Python 2.4. Question: can I install Python 2.6/2.7 locally without messing up the system? I was advised earlier not to do a system install, but being unhealthily curious I did, and ended up reinstalling SciLinux 5.5 from scratch. How do I make sure 2.6/2.7 will not mess up the system's Python?

Just install Python 2.6 somewhere on your box (i.e. in parallel to the galaxy directory) and follow the steps described under "Check your Python version" on this wiki page:

http://wiki.g2.bx.psu.edu/Admin/Get%20Galaxy

I recently did this for one of our development boxes (which has Python 2.5) to allow Galaxy to run with Python 2.6.

Regards,
Hans-Rudolf

> Thanks, Vladimir
Re: [galaxy-dev] Parallelism tag and job splitter
On Wednesday, October 31, 2012, Edward Hills wrote:

> Thanks Peter. My next question is: I have found that VCF files don't get split properly, as the header is not included in the second file, as is usually required by tools (such as vcf-subset). I have read the code and am happy to implement this functionality but am not too sure where this would best be done. I see a class Text( data ) which every datatype seems to be sent to. Would it be best to implement a VCF class which is called when the datatype is VCF?
>
> Cheers,
> Ed

VCF is, I assume, defined as a subclass of Text, so it inherits the naive simple splitting implemented for text files (which doesn't know about headers). Have a look at the SAM splitting code (under lib/galaxy/datatypes/*.py) as an example where header-aware splitting was done. You'll probably need to implement something similar.

Peter
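The header-aware splitting Peter suggests amounts to something like the following sketch: copy every header line (those starting with '#') to the top of each chunk, then distribute the records. This is illustrative only, not Galaxy's actual split() API in the datatype classes:

```python
def split_vcf_lines(lines, chunk_size):
    """Split VCF lines into chunks of at most chunk_size records,
    repeating the header lines (## meta lines and the #CHROM line)
    at the top of every chunk so each chunk is a valid VCF."""
    header = [l for l in lines if l.startswith("#")]
    records = [l for l in lines if not l.startswith("#")]
    chunks = []
    for i in range(0, len(records), chunk_size):
        chunks.append(header + records[i:i + chunk_size])
    return chunks
```

The real implementation would stream the file rather than hold it in memory, but the key point is the same: every output part must carry the full header, which is what the naive Text splitter misses.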
[galaxy-dev] Resend: Unnamed histories proliferating, can't get to my data
Hi. Resending because I got no response. Can anybody suggest anything that might explain this, or tell me how I can troubleshoot? Where should I look in the Python code? Has anybody seen anything like this? Our beta tester can't actually test anything. This occurs whether he does the FTP-style upload or uploads through the browser.

Thanks,
-Amir Karger

On 10/23/12 2:42 PM, Karger, Amir amir_kar...@hms.harvard.edu wrote:

> I'm using Galaxy from June, 2012. (Sorry if there's already a fix.) We've got it working in production. We've gotten whole pipelines to run. However, we occasionally get situations where we upload a file (using the FTP mechanism), which seems to be fine, but then I can't get to the data. I went to Saved Histories and selected Switch, and it outlined the line in blue and wrote "current history" next to it. But the right pane still shows an Unnamed history with no data in it. Then if I go back to Saved Histories, I get one or two new Unnamed histories, created within the last few minutes. I just tried to View the history, which worked (in the middle pane), and clicked "import and start using history". This seemed to work, but I got three panes inside the middle pane! When I go back (again) to Saved Histories, there are 3 histories: the imported one with 2 steps, and two unnamed histories, all created 1 minute ago. We just asked a beta tester to play with things, and he uploaded two fastqs, but had what sounds like a similar problem. Any thoughts on what's happening?
>
> Thanks,
> -Amir Karger
> Research Computing
> Harvard Medical School
[galaxy-dev] Incorrect chain order for SSL certificates on Galaxy main
Hi all;
I ran into SSL certification errors when using Java to connect to Galaxy main via the API. My knowledge of this stuff is minimal, but I did some searching and discovered that the certificate chain on Galaxy main is a problem:

https://www.ssllabs.com/ssltest/analyze.html?d=main.g2.bx.psu.edu

Looking at the chain with openssl shows a swap of the AddTrust and Internet2 certificates:

$ openssl s_client -connect main.g2.bx.psu.edu:443
CONNECTED(0003)
depth=2 C = SE, O = AddTrust AB, OU = AddTrust External TTP Network, CN = AddTrust External CA Root
verify error:num=19:self signed certificate in certificate chain
verify return:0
---
Certificate chain
 0 s:/C=US/postalCode=16802/ST=PA/L=University Park/O=The Pennsylvania State University/OU=Center for Comparative Genomics and Bioinformatics/CN=bigsky.bx.psu.edu
   i:/C=US/O=Internet2/OU=InCommon/CN=InCommon Server CA
 1 s:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
   i:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
 2 s:/C=US/O=Internet2/OU=InCommon/CN=InCommon Server CA
   i:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
---

As a result, pickier verification mechanisms fail because of the self-signed certificate in the middle of the chain instead of as the root. It appears you can fix this by adjusting the order of certificates in nginx:

http://webmasters.stackexchange.com/questions/27842/how-to-prevent-ssl-certificate-chain-not-sorted/28074#28074
http://nginx.org/en/docs/http/configuring_https_servers.html#chains

Hope this helps,
Brad
Re: [galaxy-dev] Incorrect chain order for SSL certificates on Galaxy main
On Oct 31, 2012, at 8:55 AM, Brad Chapman wrote:

> Hi all;
> I ran into SSL certification errors when using Java to connect to Galaxy main via the API. [...] It appears you can fix this by adjusting the order of certificates in nginx. [...]
> Hope this helps,
> Brad

Hi Brad,

Thanks for catching this. It's been fixed.

--nate
[galaxy-dev] Amazon
Started up a cluster on Amazon using "Launch a Galaxy Cloud Instance" and got the following message. Since I don't have any control over where the instances are run, I'm not sure how I can control this. The last 4 or 5 times I have started up an existing instance, it has worked with no problem.

Messages (CRITICAL messages cannot be dismissed.)

1. [CRITICAL] Volume 'vol-f882ca85' is located in the wrong availability zone for this instance. You MUST terminate this instance and start a new one in zone 'us-east-1a'. (2012-10-31 14:25:20)
Re: [galaxy-dev] Amazon
For this instance, you'll need to restart using the old method of launching via the console, specifying zone us-east-1b. Detecting the zone that an existing cluster's volumes are in, and specifying it at launch, is on the short list of things coming up for cloud launch.

On Oct 31, 2012, at 10:50 AM, Scooter Willis hwil...@scripps.edu wrote:

> Tried it again and got the same error message. The volume was originally created in us-east-1b and newly created instances are being started in us-east-1a. Shouldn't the availability zone be set to us-east-1b when the instance is requested, or that info stored in the properties file in the S3 bucket? Any suggestions?
>
> From: Scooter Willis hwil...@scripps.edu
> Date: Wednesday, October 31, 2012 10:32 AM
> To: galaxy-dev@lists.bx.psu.edu
> Subject: Amazon
>
> [...]
[galaxy-dev] which .loc file for SAM to BAM?
Hi,

I'm still setting up a local Galaxy. Currently I'm testing the setup of NGS tools. If I try SAM to BAM for a BAM file that has hg18 set as its build, I get a message that "Sequences are not currently available for the specified build." I guess that I either have to edit one of the .loc files (but which one?) or have to download additional data from the rsync server. (I already have tool-data/shared/hg18 completely.)

regards, Andreas

Btw. do you have any plans to ease the pain of adding additional builds? Something simpler than having to add one line for each build*tool combo? These lines seem very redundant to me.

--
Andreas Kuntzagk
SystemAdministrator

Berlin Institute for Medical Systems Biology at the
Max-Delbrueck-Center for Molecular Medicine
Robert-Roessle-Str. 10, 13125 Berlin, Germany
http://www.mdc-berlin.de/en/bimsb/BIMSB_groups/Dieterich
Re: [galaxy-dev] which .loc file for SAM to BAM?
On Wed, Oct 31, 2012 at 11:30 AM, Andreas Kuntzagk andreas.kuntz...@mdc-berlin.de wrote:

> Hi, I'm still setting up a local Galaxy. Currently I'm testing the setup of NGS tools. If I try SAM to BAM for a BAM file that has hg18 set as build, I get a message that "Sequences are not currently available for the specified build." I guess that I either have to edit one of the .loc files (but which?) or have to download additional data from the rsync server. (I already have tool-data/shared/hg18 completely.)

The .loc file you want to modify is 'tool-data/sam_fa_indices.loc'. You can find information about this subject in the wiki [1]. The table there is not complete, though, so you can always find the right xml under 'tools' and poke inside to find a line like this one:

<validator type="dataset_metadata_in_file" filename="sam_fa_indices.loc" metadata_name="dbkey" metadata_column="1" message="Sequences are not currently available for the specified build." line_startswith="index" />

[1] http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup

And I agree, dealing with .loc files is quite cumbersome.

Hope it helps,
Carlos
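For reference, entries in sam_fa_indices.loc are tab-separated, with the literal word "index" in the first column, the build name in the second, and the path to the FASTA file in the third. A line for hg18 would look like the following sketch (the path is a placeholder for wherever your reference FASTA actually lives, and the columns must be separated by single TAB characters):

```
index	hg18	/path/to/tool-data/hg18/sam_index/hg18.fa
```

The line_startswith="index" attribute in the validator above is what ties the error message to lines of exactly this shape.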
[galaxy-dev] Error trying to run functional tests on a single tool
Hi,

I'm trying out the functional testing mechanism by running it on an existing Galaxy tool. First I ran ./run_functional_tests.sh -list, which produced a list of tools I can test. I chose 'vcf_annotate' and tested it as follows:

./run_functional_tests.sh -id vcf_annotate

This produced a lot of output which included an exception trace. The output was not conclusive as to whether the test ran or was successful. The output is too long for this mailing list but you can find it here:

https://gist.github.com/3988398

I am reluctant to try to excerpt the relevant bits because it's hard for me to know what is relevant and what is not. I am running the latest Galaxy (just did hg pull / hg update and migrated). This is on a Mac OS X 10.7.4 machine with Python 2.7. When I run the same command on a Linux machine, it works (though it took me a while to find the test output; it was buried in a lot of output that also contained (apparently irrelevant) stack traces). So perhaps there is something wrong with my configuration. I hope someone can help me out.

Also, a couple of newbie questions about the functional test framework:

1) Why does it use tool_conf.xml.sample instead of tool_conf.xml? Can I change it to use tool_conf.xml? This way I would not need to add tools in two places in order to test them. (Plus the name tool_conf.xml.sample indicates that it is just a demo file.)

2) run_functional_tests.sh -list lists tools (such as 'upload1') that do not have functional tests, so they cannot (if my understanding is correct) be tested with this script. Perhaps it would make more sense not to list these tools?

Thanks,
Dan
Re: [galaxy-dev] Accessing Galaxy API from Java
Scooter;
(cc'ing the dev list and updating the subject line in case others are interested)

> I have been looking for Java-related APIs to run workflows externally and haven't found anything searching message forums etc. I would like to automate data coming off a HiSeq being uploaded to Amazon S3, and then, programmatically from an external process, import the fastq files and kick off a workflow to process them. If you know of any docs or a Java API for doing this kind of external control, can you point me to it?

John Chilton has a Java library to access the API through Java:

https://github.com/jmchilton/blend4j

which should cover lots of this. If you're interested in other JVM languages, I built a small Clojure wrapper around this to simplify some tasks:

https://github.com/chapmanb/clj-blend

We'd definitely love to have more people involved, so if any functionality you need is missing please feel free to submit pull requests.

Brad
Re: [galaxy-dev] output name of downloaded datasets
Downloading data is handled in lib/galaxy/webapps/galaxy/controllers/dataset.py, method display(), which in turn calls this line:

--
return data.datatype.display_data(trans, data, preview, filename, to_ext, chunk, **kwd)
--

which, in most cases, calls display_data() in lib/galaxy/datatypes/data.py. In this method, you can see how the download name is created:

--
valid_chars = '.,^_-()[]0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
fname = ''.join(c in valid_chars and c or '_' for c in data.name)[0:150]
--

Best,
J.

On Oct 31, 2012, at 12:37 PM, julie dubois wrote:

> Hello,
> My goal is to introduce, in the xml file of one tool (MACS for example), a supplementary command to redirect the output into another directory (plus creating a link between this and the directory of Galaxy outputs). But I want to rename my output with the same name that the downloading tool creates, in this way: GALAXY-NumOfDatasetInHistory[NameOfInput].bed. And I can't find where this downloading tool is, so I can't find how to create this name. Thanks.
> julie
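To see what that sanitization does to a dataset name, here is the same expression pulled out into a standalone sketch (the function name and sample input are mine; the character set and the 150-character cap are from the Galaxy code quoted above, rewritten with a modern conditional expression):

```python
VALID_CHARS = '.,^_-()[]0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

def download_name(name):
    """Replace characters outside VALID_CHARS with '_' and cap at 150 chars,
    mirroring the fname construction in display_data()."""
    return ''.join(c if c in VALID_CHARS else '_' for c in name)[0:150]

# Example: spaces and shell-unfriendly characters become underscores.
safe = download_name("MACS on data 5: peaks.bed")
```

Note that spaces are not in the valid set, so every space in a history item's name turns into an underscore in the downloaded filename.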
Re: [galaxy-dev] Galaxy processing
> Where do I find info on whether the installed applications make use of multiple nodes via MPI (etc.), which would indicate the benefit of starting up X number of nodes for faster processing?

You'll need to look at the individual tool documentation. In general, many tools use multiple cores; few use MPI for multi-node computing.

> If a workflow has multiple initial inputs, say for processing NGS exome data from tumor and blood (which get compared later in the workflow), will each step get sent to a different node (absent a dependency), or will the entire workflow run on one node?

If you've set up Galaxy to use a job scheduler (e.g. SGE/PBS), multiple nodes can be used. Multiple nodes will be used on the cloud:

http://wiki.g2.bx.psu.edu/CloudMan

> If I have NGS data for 20 patients sitting in an S3 bucket and want a specific workflow run against each patient's data input(s), does this require manual selection of files by a user, or can the workflow be automated?

Automation via the API is possible; unfortunately, most API documentation is in the Py/Sphinx docs for now, so you'll have to dig and/or use the sample scripts in galaxy_dir/scripts/api

> Can I programmatically start a workflow remotely (via REST) where I have automated the process of uploading NGS data to S3 and know the input file(s) per workflow?

Yes.

> Is it possible to present credentials in a workflow for downloading a file via S3, where I require authentication before a file can be downloaded?

You can restrict dataset access using role-based security.

> Does a roadmap exist for what is planned in the future? For example, are any additional NGS tools like ABySS going to make it into the build?

The roadmap, at a very high level, is in this presentation:

http://wiki.g2.bx.psu.edu/Documents/Presentations/GCC2012?action=AttachFile&do=get&target=State.pdf

The framework is being separated from tools. The best place to look for tools is the toolshed, where there is an abyss wrapper:

http://toolshed.g2.bx.psu.edu/

> Interested in NGS software that handles the dynamics of cancer for gene fusion events, CNVs (etc.) when dealing with NGS data.

There is active work on cancer tools for Galaxy. Keeping an eye on the toolshed is a good idea here.

Best,
J.
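As a sketch of what "automation via the API" looks like for running a workflow: the sample scripts in scripts/api POST a JSON payload to /api/workflows containing a workflow id, a target history, and a ds_map pointing each input step at a dataset. The helper below just builds that payload; all ids are placeholders, and the exact field names should be checked against the sample scripts in your Galaxy checkout:

```python
import json

def workflow_payload(workflow_id, history_name, input_steps):
    """Build a JSON payload for POST /api/workflows.

    input_steps maps a workflow step id to an (src, dataset_id) pair,
    where src is 'hda' (history dataset) or 'ld' (library dataset).
    """
    return {
        "workflow_id": workflow_id,
        "history": history_name,
        "ds_map": {
            str(step): {"src": src, "id": dataset_id}
            for step, (src, dataset_id) in input_steps.items()
        },
    }

# Placeholder ids; a real client would POST json.dumps(payload), along
# with an API key, to http://yourserver/api/workflows.
payload = workflow_payload("f2db41e1fa331b3e", "patient-01",
                           {42: ("hda", "33b43b4e7093c91f")})
body = json.dumps(payload)
```

Looping this over the 20 patients' uploaded datasets is what turns the manual file selection into an automated pipeline.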
[galaxy-dev] user management problem
Hello,

I am trying to configure my Galaxy instance and I have two problems. The first one is that I cannot delete users. I created some users for testing and enabled the option in universe_wsgi.ini, and the button appears, but the users only get marked as deleted and don't disappear from the users list. Is that normal?

The second problem is that I am trying to set up an email confirmation to ensure that a user's email address exists; is there any way to do that? I have put the email information in the ini file, but I cannot see any other option for enabling this.

Thanks to everyone for your help,
Jordi
[galaxy-dev] Join version 1.0.0 error
Using a large Amazon instance. Trying to do an interval join of SNPs as output from pileup, 120,000 regions (5.5 MB), with snp135Common, 12,000,000 regions (425 MB), and getting the error below. The goal is to pick up rs ids for known SNPs in the list of SNPs. Is this a memory issue? I was able to do the operation against chr1 as a test. I thought about chaining the outputs and running it against a file for each chromosome to make smaller files, but then I have a mess where rs ids end up in different columns.

71: Join on data 38 and data 36
0 bytes
An error occurred running this job: /opt/sge/default/spool/execd/ip-10-191-53-90/job_scripts/14: line 13: 5517 Killed python /mnt/galaxyTools/galaxy-central/tools/new_operations/gops_join.py /mnt/galaxyData/files/000/dataset_75.dat /mnt/galaxyData/files/000/dataset_77.dat
Re: [galaxy-dev] Join version 1.0.0 error
Did a subtract first to get a known list of rs SNPs that will be found in the tumor SNPs. That ran without error. Then did a join of the subtracted list of rs SNPs and the tumor SNPs. So something is different in the join code than in the subtract code.

From: Scooter Willis hwil...@scripps.edu
Date: Wednesday, October 31, 2012 5:42 PM
To: galaxy-dev@lists.bx.psu.edu
Subject: Join version 1.0.0 error

> [...]
Re: [galaxy-dev] Empty TopHat output
We are still getting empty TopHat output files on our Galaxy instance on the cloud. We see that TopHat is generating data while the tool is running (by monitoring our disk usage on the Amazon cloud), but the output is empty files. Is anyone else having this issue? Does anyone have any suggestions? Many thanks in advance!

Cheers,
Mo Heydarian

On Mon, Oct 15, 2012 at 4:53 AM, Joachim Jacob joachim.ja...@vib.be wrote:

> The same here.
>
> Cheers,
> Joachim
>
> --
> Joachim Jacob, PhD
> Rijvisschestraat 120, 9052 Zwijnaarde
> Tel: +32 9 244.66.34
> Bioinformatics Training and Services (BITS)
> http://www.bits.vib.be
> @bitsatvib
Re: [galaxy-dev] Empty TopHat output
Given that this doesn't seem to be happening on our public server or on local instances, my best guess is that the issue is old code. Are you running the most recent dist?

J.

On Oct 31, 2012, at 7:37 PM, Mohammad Heydarian wrote:

> We are still getting empty TopHat output files on our Galaxy instance on the cloud. We see that TopHat is generating data while the tool is running (by monitoring our disk usage on the Amazon cloud), but the output is empty files. [...]
Re: [galaxy-dev] Empty TopHat output
In this case, it's useful to differentiate between (i) the AMI that Galaxy Cloud uses and (ii) the Galaxy code running on the cloud. I suspect that (ii) is out of date for you; this is not (yet) automatically updated, even when starting a new instance. Try using the admin console to update to the most recent Galaxy dist using this URL:

https://bitbucket.org/galaxy/galaxy-dist/

(not galaxy-central, as is the default)

Best,
J.

On Oct 31, 2012, at 8:36 PM, Mohammad Heydarian wrote:

> We are running galaxy-cloudman-2011-03-22 (ami-da58aab3). Our latest instance was loaded up just last week. Thanks!
>
> Cheers,
> Mo Heydarian
> PhD candidate
> The Johns Hopkins School of Medicine
> Department of Biological Chemistry
> 725 Wolfe Street
> 414 Hunterian
> Baltimore, MD 21205
>
> On Wed, Oct 31, 2012 at 8:30 PM, Jeremy Goecks jeremy.goe...@emory.edu wrote:
>
>> Given that this doesn't seem to be happening on our public server or on local instances, my best guess is that the issue is old code. Are you running the most recent dist? [...]
Re: [galaxy-dev] Parallelism tag and job splitter
Hi Peter, thanks again. It turns out that it has been implemented, by the looks of it, in lib/galaxy/datatypes/tabular.py under class Vcf. However, despite this, it is always the Text class in data.py that is loaded, and not the proper Vcf one. Can you point me in the direction of where the type is chosen?

Cheers,
Ed

On Wed, Oct 31, 2012 at 9:46 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

> VCF is, I assume, defined as a subclass of Text, so it inherits the naive simple splitting implemented for text files (which doesn't know about headers). Have a look at the SAM splitting code (under lib/galaxy/datatypes/*.py) as an example where header-aware splitting was done. You'll probably need to implement something similar.
>
> Peter
[galaxy-dev] (no subject)
Hello Everyone, I am about to write a Dropbox-like syncing tool for Galaxy in Python, with a progress bar. How do I integrate it with Galaxy? It would make it easy for clients to upload files using the syncing tool. Are there any syncing tools available for Galaxy? Thanks ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] Dataset upload fail
Local install of Galaxy on SciLinux55. It fails to upload a 5.2 GB fastq file from the local HD, while smaller fastq and fasta datasets (less than 1 GB) load normally. Chunks of 1.2 GB remain in */database/tmp, each representing the beginning of the file that fails to upload; after several upload attempts, several chunks of the same size are present. Can I just copy the dataset file into the database directory instead of uploading through the web interface? The following shows up when clicking the "Run this job again" button:

Exception: Failed to get job information for dataset hid 5
URL: http://127.0.0.1:8080/tool_runner/rerun?id=8
File '/home/yaximik/galaxy-dist/eggs/WebError-0.8a-py2.7.egg/weberror/evalexception/middleware.py', line 364 in respond
  app_iter = self.application(environ, detect_start_response)
File '/home/yaximik/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/debug/prints.py', line 98 in __call__
  environ, self.app)
File '/home/yaximik/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/wsgilib.py', line 539 in intercept_output
  app_iter = application(environ, replacement_start_response)
File '/home/yaximik/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/recursive.py', line 80 in __call__
  return self.application(environ, start_response)
File '/home/yaximik/galaxy-dist/eggs/Paste-1.6-py2.7.egg/paste/httpexceptions.py', line 632 in __call__
  return self.application(environ, start_response)
File '/home/yaximik/galaxy-dist/lib/galaxy/web/framework/base.py', line 160 in __call__
  body = method( trans, **kwargs )
File '/home/yaximik/galaxy-dist/lib/galaxy/webapps/galaxy/controllers/tool_runner.py', line 129 in rerun
  raise Exception("Failed to get job information for dataset hid %d" % data.hid)
Exception: Failed to get job information for dataset hid 5