[galaxy-dev] Copying galaxy folder from linux to windows
Hi there, Since manually deleting the files from the galaxy database is not advised, I thought of just copying the galaxy-dist folder from the Linux server to my Windows machine. I copied everything except the 000 folder, where the datasets are located. I also have Cygwin installed on Windows. I was wondering if Galaxy would still run even if I just copied the folder. About my dataset cleanup problem: I tried running the cleanup scripts, but since the disk is full I cannot make any changes to universe_wsgi.ini (I get this KeyError problem). Any suggestions on how I could free some space / delete datasets? Is manually deleting tmp files also not advised? I am not allowed to remove or alter any file aside from those within galaxy-dist. Any help would be greatly appreciated. Cheers, DM ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Problem with cleaning up galaxy datasets
Dear all, I have done what was suggested in this thread (remove the # sign), but somehow I still got the same error. Any suggestions on how to get past this? Cheers, Diana On Mon, Apr 16, 2012 at 12:51 AM, Jennifer Jackson j...@bx.psu.edu wrote: repost of bounced msg, see why below - Date: Sat, 14 Apr 2012 22:33:22 +0200 From: Klaus Metzeler m...@klaus-metzeler.de To: Michael Moore michaelglennmo...@gmail.com Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Problem with cleaning up galaxy datasets Message-ID: 4F89DF12.6010606@klaus-metzeler.de Content-Type: text/plain; charset=ISO-8859-1; format=flowed OK, this pointed me in the right direction. The universe_wsgi.ini file has a key that reads: database_connection = sqlite:///./database/universe.sqlite?isolation_level=IMMEDIATE This entry was commented out in my version of the file (i.e., had a # in front of it). While Galaxy itself uses this database connection by default, the cleanup script looks for this config file entry to locate the database. So, if anyone encounters the same problem, just un-comment this entry and all runs fine. Thanks for your help, Michael. -- Why did this bounce? Digest content not stripped (#2). If you reply to a thread in a digest, please 1. Change the subject of your response from Galaxy-dev Digest Vol ... to the original subject for the thread. 2. Strip out everything else in the digest that is not part of the thread you are responding to. Thanks Klaus for sending in the solution! - Galaxy team -- Jennifer Jackson http://galaxyproject.org
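For anyone hitting the same KeyError, the change Klaus describes looks like this in universe_wsgi.ini (the connection string is the SQLite default quoted above; adjust it if you use PostgreSQL or MySQL):

```ini
# Before: commented out, so the cleanup scripts cannot locate the database
#database_connection = sqlite:///./database/universe.sqlite?isolation_level=IMMEDIATE

# After: un-commented, which is the fix described above
database_connection = sqlite:///./database/universe.sqlite?isolation_level=IMMEDIATE
```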
Re: [galaxy-dev] Toolshed initial upload errors
[For those who came in late - I've installed a local toolshed, which allows me to create repositories, but every time I attempt to upload files, it errors out with TypeError: array item must be char. For those who come after me, here's what I've worked out thus far.] Greg asked: Since you've tried uploading various files with no success, the problem is likely to be caused by something specific to your environment - possibly the version of the mercurial package you're using. What version of Python are you running, and what version of the mercurial package do you have installed with it? Also, what version of Galaxy do you have, and what database / version are you using? We're CentOS, an older flavour (4), but my Mercurial is up to date (2.1.2). Python 2.6.4, Galaxy is 6799:40f1816d6857 (grabbed it fresh last week for testing), running it with sqlite. However, Mercurial is actually installed local to the account I'm using, so I wonder if the toolshed is getting confused with another version, although hg doesn't seem to be installed on the system. Further investigation reveals that the files appear to be in the repo (database/community_files). The error manifests in the middle of Mercurial, in manifest.py, where it attempts to coerce a Unicode string into a character array. (As there are some reported issues of Windows file names with Unicode under Mercurial, and I'm uploading from a Windows machine, I used a Mac to create a repo and add a file. Nope, same behaviour.) The Cistrome galaxy fork (https://bitbucket.org/cistrome/cistrome-harvard/src/e7e2fdd74496/lib/galaxy/webapps/community/controllers/upload.py) mentions occasional similar errors. I checked the Mercurial installation: % hg --version Mercurial Distributed SCM (version 2.1.2+10-4d875bb546dc) ... % hg debuginstall Checking encoding (UTF-8)... Checking installed modules (/home/f0/paul/Installed/lib/python2.6/site-packages/mercurial)... 
Checking templates (/home/f0/paul/Installed/lib/python2.6/site-packages/mercurial/templates)... Checking commit editor... Checking username... No problems detected (Actually, I was missing a username and a user ~/.hgrc file. But after making that, it passes. The error still persists.) Work continues. Paul Agapow (paul-michael.aga...@hpa.org.uk) Bioinformatics, Health Protection Agency
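A note on the error itself: in Python 2, an array with typecode 'c' only accepts byte strings, so a unicode filename reaching Mercurial's manifest code would raise exactly this TypeError. A minimal sketch of the type distinction involved (the filename is made up; this is not Mercurial's actual code):

```python
# Python 2's array('c', ...) only accepts byte strings; handing it a
# unicode string raises "TypeError: array item must be char".
# The usual remedy is to encode to bytes before the value reaches
# byte-oriented code:
filename = u"dataset_1.fastq"       # hypothetical unicode filename
encoded = filename.encode("utf-8")  # a byte string, safe for a char array
assert isinstance(encoded, bytes)
```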
Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY
On Thu, Apr 19, 2012 at 12:40 AM, JIE CHEN jiechenable1...@gmail.com wrote: The version I installed is: mira_3.4.0_prod_linux-gnu_x86_64_static OK, good. The other key question I asked was: did you get anything in the MIRA log file (it should be in your history as text data, even though it will be red as a failed job)? Peter
Re: [galaxy-dev] delete data library via API
Dear all, Haven't heard anything back yet, so giving it another try. Does anyone know if deleting datasets is supported by the API? Or is it a to-be-added feature? Thanks for any hint! Cheers, Leon On Thu, Apr 12, 2012 at 10:32 PM, Leon Mei hailiang@nbic.nl wrote: Hi guys, Is it possible to delete datasets in a shared data library via the API? Of course when the modify permission is granted. We see delete.py in the API script folder but can't figure out how to get it to work. Thanks! Leon -- Hailiang (Leon) Mei Netherlands Bioinformatics Center BioAssist NGS Taskforce - http://ngs.nbic.nl Skype: leon_mei Mobile: +31 6 41709231
[galaxy-dev] BAM to BigWig (and tool ID clashes)
Hi Brad & Lance, I've been using Brad's bam_to_bigwig tool in Galaxy but realized today (with a new dataset using a splice-aware mapper) that it doesn't seem to be ignoring CIGAR N operators where a read is split over an intron. Looking over Brad's Python script, which calculates the coverage to write an intermediate wiggle file, this is done with samtools via pysam. It is not obvious to me if this can be easily modified to ignore introns. Is this possible, Brad? I wasn't aware of Lance's rival bam_to_bigwig tool in the ToolShed till now, and that does talk about this issue. It has a boolean option to ignore gaps when computing coverage, recommended for RNA-Seq where reads are mapped across long splice junctions. Lance, from your tool's help it sounds like it needs a genome database build filled in. I don't understand this requirement - Brad's tool works just fine for standalone BAM files (for example, reads mapped to an in-house assembly). Is that not supported in your tool? Galaxy team - why does the ToolShed allow duplicate repository names (here bam_to_bigwig) AND duplicate tool IDs (again, here bam_to_bigwig)? Won't this cause chaos when sharing workflows? I would suggest checking this when a tool is uploaded and rejecting repository name or tool ID clashes. Regards, Peter P.S. Brad, your tool is missing an explicit requirements tag listing the UCSC binary wigToBigWig and the Python library pysam. Lance, your tool doesn't seem to include any author information like your name or email address. I'm inferring it is yours from the Galaxy tool shed user id, lparsons.
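For reference, the kind of requirements block Peter is asking for would look roughly like this in the tool XML (a sketch only - the type values are my assumption, so check the current tool config syntax before relying on them):

```xml
<requirements>
    <requirement type="binary">wigToBigWig</requirement>
    <requirement type="python-module">pysam</requirement>
</requirements>
```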
Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)
On Thu, Apr 19, 2012 at 1:55 PM, Peter Cock p.j.a.c...@googlemail.com wrote: Hi Brad & Lance, I've been using Brad's bam_to_bigwig tool in Galaxy but realized today (with a new dataset using a splice-aware mapper) that it doesn't seem to be ignoring CIGAR N operators where a read is split over an intron. Looking over Brad's Python script, which calculates the coverage to write an intermediate wiggle file, this is done with samtools via pysam. It is not obvious to me if this can be easily modified to ignore introns. Is this possible, Brad? Looking into this a bit more, perhaps 'samtools depth' might be useful (bam2depth.c); maybe we can use this code to update your python+pysam code? Peter
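To make the intron issue concrete, here is a rough sketch (not Brad's actual script) of computing per-base coverage from BAM-style CIGAR tuples as pysam reports them, where N (reference-skip) operators advance the reference position without contributing coverage:

```python
# BAM CIGAR op codes: 0=M, 1=I, 2=D, 3=N, 4=S, 5=H, 6=P, 7='=', 8=X.
def add_read_coverage(coverage, start, cigar):
    """Add one read's aligned bases to a position -> depth dict."""
    pos = start
    for op, length in cigar:
        if op in (0, 7, 8):        # M/=/X consume reference and add coverage
            for i in range(pos, pos + length):
                coverage[i] = coverage.get(i, 0) + 1
            pos += length
        elif op in (2, 3):         # D and N consume reference positions only;
            pos += length          # N is the intron skip that must NOT count
        # I/S/H/P consume no reference positions
    return coverage

# A spliced read 5M10N5M starting at 0 covers bases 0-4 and 15-19 only:
cov = add_read_coverage({}, 0, [(0, 5), (3, 10), (0, 5)])
```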
Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)
The tool shed forces unique repository names per user account, allowing for uniqueness with that combination. All tools uploaded into a tool shed repository are assigned a unique id called a guid, which is unique for all tools across all possible tool sheds. These guids follow a name-spacing convention that ensures that any tool installed into any Galaxy instance will be uniquely identified regardless of old tool ids or tool versions. For example, the guid for version 0.0.2 of Brad's tool is toolshed.g2.bx.psu.edu/repos/brad-chapman/bam_to_bigwig/bam_to_bigwig/0.0.2 while the guid for version 0.1 of Lance's tool is toolshed.g2.bx.psu.edu/repos/lparsons/bam_to_bigwig/bam_to_bigwig/0.1 This information can be seen when viewing the tool's metadata in the tool shed. When these tools are installed into a local Galaxy instance, this guid is the tool's id in Galaxy rather than the old id (e.g., tool id="bam_to_bigwig"). The old id is still important and must be included in the tool config as usual, but is not used to identify a tool that is installed in a repository from the tool shed. All of these details are explained in the tool shed wiki in the following section. http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance This section is also relevant to this discussion. http://wiki.g2.bx.psu.edu/Tool%20Shed#Galaxy_Tool_Versions On Apr 19, 2012, at 8:55 AM, Peter Cock wrote: Galaxy team - why does the ToolShed allow duplicate repository names (here bam_to_bigwig) AND duplicate tool IDs (again, here bam_to_bigwig)? Won't this cause chaos when sharing workflows? I would suggest checking this when a tool is uploaded and rejecting repository name or tool ID clashes. Regards, Peter
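The guid scheme Greg describes, reconstructed from the two examples above (a simple illustration, not tool shed code):

```python
# The guid joins shed host, owner, repository, tool id and version, so two
# repositories that are both named bam_to_bigwig still get distinct guids.
def make_guid(shed_host, owner, repository, tool_id, version):
    return "/".join([shed_host, "repos", owner, repository, tool_id, version])

brad = make_guid("toolshed.g2.bx.psu.edu", "brad-chapman",
                 "bam_to_bigwig", "bam_to_bigwig", "0.0.2")
lance = make_guid("toolshed.g2.bx.psu.edu", "lparsons",
                  "bam_to_bigwig", "bam_to_bigwig", "0.1")
# The owner component is what keeps the two guids distinct.
```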
Re: [galaxy-dev] JobManager object has no attribute dispatcher
On Apr 17, 2012, at 9:04 AM, Peter Cock wrote: Hi all, Does anyone know what might have introduced this problem running galaxy-dist when using the task splitting functionality? I'm using the latest code from the default branch, changeset: 7027:f6e790d94282 Hi Peter, This was resolved in changeset 5c93ac32ace1. Thanks for reporting it. --nate galaxy.jobs.manager DEBUG 2012-04-17 13:55:03,610 (4) Job assigned to handler 'main' 127.0.0.1 - - [17/Apr/2012:13:55:06 +0100] POST /root/history_item_updates HTTP/1.1 200 - http://127.0.0.1:8081/history; Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110622 CentOS/3.6-1.el5.centos Firefox/3.6.18 galaxy.jobs DEBUG 2012-04-17 13:55:08,710 (4) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/4 galaxy.jobs.handler DEBUG 2012-04-17 13:55:08,711 dispatching job 4 to tasks runner galaxy.jobs.handler INFO 2012-04-17 13:55:08,845 (4) Job dispatched galaxy.datatypes.sequence DEBUG 2012-04-17 13:55:09,470 Split /mnt/galaxy/galaxy-central/database/files/000/dataset_2.dat into 4 parts... 
galaxy.datatypes.sequence DEBUG 2012-04-17 13:55:09,470 Attemping to split FASTA file /mnt/galaxy/galaxy-central/database/files/000/dataset_2.dat into chunks of 1 sequences galaxy.datatypes.sequence DEBUG 2012-04-17 13:55:09,471 Writing /mnt/galaxy/galaxy-central/database/files/000/dataset_2.dat part to /mnt/galaxy/galaxy-central/database/job_working_directory/000/4/task_0/dataset_2.dat galaxy.datatypes.sequence DEBUG 2012-04-17 13:55:09,472 Writing /mnt/galaxy/galaxy-central/database/files/000/dataset_2.dat part to /mnt/galaxy/galaxy-central/database/job_working_directory/000/4/task_1/dataset_2.dat galaxy.jobs.splitters.multi DEBUG 2012-04-17 13:55:09,472 do_split created 2 parts galaxy.jobs DEBUG 2012-04-17 13:55:09,506 (4) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/4 galaxy.jobs.runners.tasks ERROR 2012-04-17 13:55:09,784 failure running job 4 Traceback (most recent call last): File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/tasks.py", line 86, in run_job self.app.job_manager.dispatcher.put(tw) AttributeError: 'JobManager' object has no attribute 'dispatcher' Thanks, Peter
Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)
On Thu, Apr 19, 2012 at 2:32 PM, Greg Von Kuster g...@bx.psu.edu wrote: The tool shed forces unique repository names per user account, allowing for uniqueness with that combination. All tools uploaded into a tool shed repository are assigned a unique id called a guid, which is unique for all tools across all possible tool sheds. These guids follow a name-spacing convention that ensures that any tool installed into any Galaxy instance will be uniquely identified regardless of old tool ids or tool versions. ... The old id is still important and must be included in the tool config as usual, but is not used to identify a tool that is installed in a repository from the tool shed. Ah - so the old tool ID clashes are only going to be a problem with Galaxy servers where the tools were installed 'the old fashioned way' (like ours). So there is still scope for clashes with shared workflows - but this will be less and less of a problem as local Galaxy installs switch to installing tools via the Tool Shed? What happens if (for example) Brad gives Lance commit rights to his repository (or the other way round)? Then you'd have a clash. All of these details are explained in the tool shed wiki in the following section. http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance This section is also relevant to this discussion. http://wiki.g2.bx.psu.edu/Tool%20Shed#Galaxy_Tool_Versions Thanks for the background. Peter
Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)
On Apr 19, 2012, at 10:04 AM, Peter Cock wrote: On Thu, Apr 19, 2012 at 2:32 PM, Greg Von Kuster g...@bx.psu.edu wrote: The tool shed forces unique repository names per user account, allowing for uniqueness with that combination. All tools uploaded into a tool shed repository are assigned a unique id called a guid, which is unique for all tools across all possible tool sheds. These guids follow a name-spacing convention that ensures that any tool installed into any Galaxy instance will be uniquely identified regardless of old tool ids or tool versions. ... The old id is still important and must be included in the tool config as usual, but is not used to identify a tool that is installed in a repository from the tool shed. Ah - so the old tool ID clashes are only going to be a problem with Galaxy servers where the tools were installed 'the old fashioned way' (like ours). Yes, it is highly recommended to install tool shed repositories using the installation process that has been implemented rather than downloading the repository contents as an archive and manually manipulating it to be incorporated into your Galaxy instance. Using the installation process includes many benefits in addition to eliminating the potential tool id clashes. Examples of benefits include not having to stop / restart your Galaxy server in order to use freshly installed tools, being able to deactivate / uninstall tools on-the-fly when finished with them, being able to run multiple versions of the same tool simultaneously in the same Galaxy instance, etc. So there is still scope for clashes with shared workflows - but this will be less and less of a problem as local Galaxy installs switch to installing tools via the Tool Shed? Correct - if you manually download the contents of a repository and install it into your local Galaxy instance, there is no way to eliminate the potential for tool id / version clashes. 
In fact, it may be beneficial to eliminate the feature enabling users to manually download repository contents, but we'll leave it there as long as the community wants it. What happens if (for example) Brad gives Lance commit rights to his repository (or the other way round)? Then you'd have a clash. Assuming automatic installation using the tool shed install process, no clashes will occur in this scenario, because no matter who pushes changes to the repository, it is still name spaced by the original owner, which can never change. The only part of the guid that could potentially change is the tool version component (e.g., toolshed.g2.bx.psu.edu/repos/brad-chapman/bam_to_bigwig/bam_to_bigwig/0.0.2 becomes toolshed.g2.bx.psu.edu/repos/brad-chapman/bam_to_bigwig/bam_to_bigwig/0.0.3 if Brad gives Lance the ability to push to his repository and Lance changes the tool version). All of these details are explained in the tool shed wiki in the following section. http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance This section is also relevant to this discussion. http://wiki.g2.bx.psu.edu/Tool%20Shed#Galaxy_Tool_Versions Thanks for the background. Peter
Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)
Hi Peter, Thanks for the thoughtful comments. I believe the requirement for the genome was imposed by the use of an underlying BedTools utility. I also think that in a newer version of that tool, the requirement was removed, since you correctly point out it is not really necessary. I will see if I can update the tool to remove that requirement and also see about changing the tool id. Sorry for the conflict, that was an oversight on my part, though it would be nice if the Tool Shed could check and warn when someone tries to create a new tool. I would suggest flagging the new repo as invalid until the id is updated instead of outright rejection. As for the author info, you're right, I should really add that as well. That tool was put together very quickly to meet the need of a customer and I didn't properly clean things up before I uploaded. I'll let you know once I get an update out. Of course, any patches etc. are welcome. ;-) Lance Peter Cock wrote: Hi Brad & Lance, I've been using Brad's bam_to_bigwig tool in Galaxy but realized today (with a new dataset using a splice-aware mapper) that it doesn't seem to be ignoring CIGAR N operators where a read is split over an intron. Looking over Brad's Python script, which calculates the coverage to write an intermediate wiggle file, this is done with samtools via pysam. It is not obvious to me if this can be easily modified to ignore introns. Is this possible, Brad? I wasn't aware of Lance's rival bam_to_bigwig tool in the ToolShed till now, and that does talk about this issue. It has a boolean option to ignore gaps when computing coverage, recommended for RNA-Seq where reads are mapped across long splice junctions. Lance, from your tool's help it sounds like it needs a genome database build filled in. I don't understand this requirement - Brad's tool works just fine for standalone BAM files (for example, reads mapped to an in-house assembly). Is that not supported in your tool? 
Galaxy team - why does the ToolShed allow duplicate repository names (here bam_to_bigwig) AND duplicate tool IDs (again, here bam_to_bigwig)? Won't this cause chaos when sharing workflows? I would suggest checking this when a tool is uploaded and rejecting repository name or tool ID clashes. Regards, Peter P.S. Brad, your tool is missing an explicit requirements tag listing the UCSC binary wigToBigWig and the Python library pysam. Lance, your tool doesn't seem to include any author information like your name or email address. I'm inferring it is yours from the Galaxy tool shed user id, lparsons. -- Lance Parsons - Scientific Programmer 134 Carl C. Icahn Laboratory Lewis-Sigler Institute for Integrative Genomics Princeton University
Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)
On Apr 19, 2012, at 10:37 AM, Lance Parsons wrote: and also see about changing the tool id. I would recommend NOT doing this - see the separate thread for this message that describes how this works in the tool shed. Sorry for the conflict, that was an oversight on my part, though it would be nice if the Tool Shed could check and warn when someone tries to create a new tool. I would suggest flagging the new repo as invalid until the id is updated instead of outright rejection. Again, see the separate thread for this message - the tool shed does correctly handle this when the automatic installation process is used.
Re: [galaxy-dev] JobManager object has no attribute dispatcher
On Thu, Apr 19, 2012 at 2:53 PM, Nate Coraor n...@bx.psu.edu wrote: On Apr 17, 2012, at 9:04 AM, Peter Cock wrote: Hi all, Does anyone know what might have introduced this problem running galaxy-dist when using the task splitting functionality? I'm using the latest code from the default branch, changeset: 7027:f6e790d94282 Hi Peter, This was resolved in changeset 5c93ac32ace1. Thanks for reporting it. --nate Are you sure? I've just updated to the tip and the same problem persists. Also, looking at that commit, it isn't obvious how it might be linked to this issue: https://bitbucket.org/galaxy/galaxy-central/changeset/5c93ac32ace1 Thanks, Peter
[galaxy-dev] Error running set_dataset_sizes.py
Hello, I'm seeing some discrepancies in total user usage versus what my histories actually total, so I wanted to run set_dataset_sizes.py and set_user_disk_usage.py. I am getting the following error. ./set_dataset_sizes.py Loading Galaxy model... Processing 77915 datasets... Completed 0% Traceback (most recent call last): File "./set_dataset_sizes.py", line 43, in <module> dataset.set_total_size() File "lib/galaxy/model/__init__.py", line 703, in set_total_size if self.object_store.exists(self, extra_dir=self._extra_files_path or "dataset_%d_files" % self.id, dir_only=True): AttributeError: 'NoneType' object has no attribute 'exists' Any help would be much appreciated. Thanks, Liisa
Re: [galaxy-dev] JobManager object has no attribute dispatcher
On Apr 19, 2012, at 10:44 AM, Peter Cock wrote: On Thu, Apr 19, 2012 at 2:53 PM, Nate Coraor n...@bx.psu.edu wrote: On Apr 17, 2012, at 9:04 AM, Peter Cock wrote: Hi all, Does anyone know what might have introduced this problem running galaxy-dist when using the task splitting functionality? I'm using the latest code from the default branch, changeset: 7027:f6e790d94282 Hi Peter, This was resolved in changeset 5c93ac32ace1. Thanks for reporting it. --nate Are you sure? I've just updated to the tip and the same problem persists. Also, looking at that commit, it isn't obvious how it might be linked to this issue: https://bitbucket.org/galaxy/galaxy-central/changeset/5c93ac32ace1 You're right, that's what I get for reading hastily. Fix coming shortly... Thanks, Peter
[galaxy-dev] run job as real user error: environment variables issue?
Hi everyone, I'm currently trying to set up our local Galaxy so it can run jobs as the real user. I followed the documentation and set the galaxy user as a sudoer. However, I get an error message whenever I try to run a job: galaxy.jobs.runners.drmaa ERROR 2012-04-19 14:57:48,376 Uncaught exception queueing job Traceback (most recent call last): File "/g/funcgen/galaxy-dev/lib/galaxy/jobs/runners/drmaa.py", line 133, in run_next self.queue_job( obj ) File "/g/funcgen/galaxy-dev/lib/galaxy/jobs/runners/drmaa.py", line 219, in queue_job job_id = self.external_runjob(filename, job_wrapper.user_system_pwent[2]).strip() File "/g/funcgen/galaxy-dev/lib/galaxy/jobs/runners/drmaa.py", line 427, in external_runjob raise RuntimeError(External_runjob failed (exit code %s)\nCalled from %s:%d\nChild process reported error:\n%s % (str(exitcode), __filename__(), __lineno__(), stderrdata)) RuntimeError: External_runjob failed (exit code 127) Called from /g/funcgen/galaxy-dev/lib/galaxy/jobs/runners/drmaa.py:427 Child process reported error: python: error while loading shared libraries: libpython2.6.so.1.0: cannot open shared object file: No such file or directory Looking closely, it's the non-root user it tries to switch to that doesn't have LD_LIBRARY_PATH properly set, so there seems to be an environment inheritance issue. However, I tried to print stuff from the scripts/drmaa_external_runner.py script in EVERY WAY I could think of, to no avail - as if it doesn't even run. Which is surprising, since root can run python properly, so it really does look like it's changing users. I really fail to see where the problem could come from, so if you have leads to suggest, I'll be forever grateful. Best, L-A
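One possible culprit (an assumption on my part, not a confirmed diagnosis): sudo resets the environment by default (env_reset), so an LD_LIBRARY_PATH set for the galaxy user is dropped before the child python starts, and a non-system libpython2.6 cannot be found. A sudoers fragment like the following would preserve the variable, though note that some sudo builds strip LD_* variables regardless for security reasons, so test this on your system:

```
# /etc/sudoers fragment (edit with visudo; shown for illustration only)
Defaults env_keep += "LD_LIBRARY_PATH"
```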
[galaxy-dev] Default annotation track
I needed to add a genome to the tracks in our local instance. However, the only available genome is a multi-fasta file of about 1800 supercontigs. To preserve the sanity of my clients, I concatenated the fasta file and provided both versions. The unfortunate part is that the contig annotation data is lost in that conversion. I wonder if there is a way to extract the contig data as annotation and provide it as a default track in the concatenated genome, or something like that. Any ideas? Thanks, Alex
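One way to keep the contig boundaries, sketched under the assumption that the concatenation is a plain join with no spacer sequence (the names here are made up): record each supercontig's offset while joining, and emit BED lines that can be loaded as a custom annotation track on the concatenated genome:

```python
def concat_with_bed(records, merged_name="merged"):
    """Concatenate (contig_name, sequence) pairs.

    Returns the merged sequence plus BED lines marking each contig's
    half-open interval in the merged coordinate system.
    """
    parts, bed, offset = [], [], 0
    for name, seq in records:
        bed.append("%s\t%d\t%d\t%s" % (merged_name, offset, offset + len(seq), name))
        parts.append(seq)
        offset += len(seq)
    return "".join(parts), bed

seq, bed = concat_with_bed([("contig_1", "ACGTACGT"), ("contig_2", "GGCC")])
```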
Re: [galaxy-dev] Error copying files from job_working_directory
On Apr 16, 2012, at 9:03 PM, Jose Navas wrote: Hi everybody, I was searching through the Galaxy code and I found the solution: the function responsible for copying files uses shutil.copy, which only copies regular files. I've modified this function to use shutil.copytree when file_name is a directory. I can send you the code if you email me. Also, I don't know if there is a good reason against this solution. If there is, I would be very grateful if somebody could explain it. Thanks, Jose Hi Jose, Are you using the output dataset path as a directory rather than a filename? Or is this with the output dataset's files_path/extra_files_path attribute? --nate From: josenavasmol...@hotmail.com To: galaxy-dev@lists.bx.psu.edu Date: Mon, 16 Apr 2012 17:34:29 + Subject: [galaxy-dev] Error copying files from job_working_directory Hello, I've integrated a tool into my Galaxy instance, but when I run the tool I get this error: galaxy.objectstore CRITICAL 2012-04-16 11:25:56,697 Error copying /home/galaxy/galaxy-dist/database/job_working_directory/000/307/dataset_431_files/unweighted_unifrac_2d_continuous to /home/galaxy/galaxy-dist/database/files/000/dataset_431_files/unweighted_unifrac_2d_continuous: [Errno 21] Is a directory: '/home/galaxy/galaxy-dist/database/job_working_directory/000/307/dataset_431_files/unweighted_unifrac_2d_continuous' As I can see, it fails because it is trying to copy a directory. Is this feature supported in Galaxy? If it is supported, what do I have to do to enable copying the directories from the job_working_directory? Thank you, Jose
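Jose's workaround, sketched from his description (a hypothetical helper, not the actual Galaxy object store code):

```python
import os
import shutil

def copy_path(src, dst):
    """Copy a regular file, or recursively copy a directory tree.

    shutil.copy() fails with "[Errno 21] Is a directory" when src is a
    directory, so fall back to shutil.copytree() in that case - the
    change Jose describes making in the copy routine.
    """
    if os.path.isdir(src):
        shutil.copytree(src, dst)
    else:
        shutil.copy(src, dst)
```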
Re: [galaxy-dev] Error copying files from job_working_directory
Hi Nate, I'm using the output dataset path to generate an html file, and in the output dataset's files_path/extra_files_path attribute I'm generating a set of files and directories. So the problem was with the output dataset's files_path/extra_files_path attribute. Thanks, Jose Subject: Re: [galaxy-dev] Error copying files from job_working_directory From: n...@bx.psu.edu Date: Thu, 19 Apr 2012 12:16:05 -0400 CC: galaxy-dev@lists.bx.psu.edu To: josenavasmol...@hotmail.com On Apr 16, 2012, at 9:03 PM, Jose Navas wrote: Hi everybody, I was searching through the Galaxy code and I found the solution: the function responsible for copying files uses shutil.copy, which can only copy individual files. I've modified this function to use shutil.copytree when file_name is a directory. I can send you the code if you email me. Also, I don't know if there is a good reason against this solution. If there is, I would be very grateful if somebody could explain it. Thanks, Jose Hi Jose, Are you using the output dataset path as a directory rather than a filename? Or is this with the output dataset's files_path/extra_files_path attribute? --nate From: josenavasmol...@hotmail.com To: galaxy-dev@lists.bx.psu.edu Date: Mon, 16 Apr 2012 17:34:29 + Subject: [galaxy-dev] Error copying files from job_working_directory Hello, I've integrated a tool into my Galaxy instance, but when I run the tool I get this error: galaxy.objectstore CRITICAL 2012-04-16 11:25:56,697 Error copying /home/galaxy/galaxy-dist/database/job_working_directory/000/307/dataset_431_files/unweighted_unifrac_2d_continuous to /home/galaxy/galaxy-dist/database/files/000/dataset_431_files/unweighted_unifrac_2d_continuous: [Errno 21] Is a directory: '/home/galaxy/galaxy-dist/database/job_working_directory/000/307/dataset_431_files/unweighted_unifrac_2d_continuous' As far as I can see, it fails because it is trying to copy a directory. Is this feature supported in Galaxy? If it is supported, what do I have to do to enable copying directories from the job_working_directory? Thank you, Jose ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] How to specify RUM output dir?
Thanks Jeremy, that's very helpful, and it's great to hear from a developer! I have been pursuing option (a), and I feel like I'm very close. The RUM tool runs, and the file sizes show up in the history, but the datasets show up as erroneous. When I click on the apparently-problematic dataset's bug icon, the error message shows the following two lines repeated over 100 times: yes: standard output: Broken pipe yes: write error I know what this means, generally, but not in the context of Galaxy. Is this a telltale symptom, or is it too generic to say? Under the additional output, it shows exactly the STDOUT the tool shows when it executes and terminates properly from the command line. So I know I'm close; I feel like I'm missing something small. When I click on the view dataset button, I see the data, and it's legit. When I click Edit Attributes, I see a message at the bottom of the Edit Attributes pane that says Required metadata values are missing. Some of these values may not be editable by the user. Selecting Auto-detect will attempt to fix these values. When I attempt to run the Auto-detect, this notification goes away. It seems like the only issue right now is getting rid of that broken pipe error message. Once that's gone, perhaps the datasets won't be flagged as erroneous and I can use them in downstream processes. If I can get this tool working perfectly, I'll definitely upload it to the Galaxy toolshed. Any tips you could provide would be greatly appreciated! Thanks, Dan You have two options: (a) you can set up the tool to report only a subset of outputs from the tool; or (b) you can use a composite datatype to store the complete directory: http://wiki.g2.bx.psu.edu/Admin/Datatypes/Composite%20Datatypes Best, J. ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] resubmit a job if the node fails
Hi, Can Galaxy resubmit a job if the node where the job is running fails? I know SGE can do that by using qsub -r. It would be very useful if Galaxy could do that. Thank you, Cai ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY
Hi Peter, Thank you for your patience. I checked the error messages in the history. They all give exactly the same error -- the one I gave in the first thread. I have rethought the procedure many times and it shouldn't have any problems. I don't know why. What I have done: 1. Installed the mira wrappers using the Galaxy web UI and checked that mira.py and mira.xml are under one of the directories of the shed_tools directory. 2. Installed the mira 3.4.0 binaries on my host. Thanks a lot. Tyler On Thu, Apr 19, 2012 at 2:11 AM, Peter Cock p.j.a.c...@googlemail.com wrote: On Thu, Apr 19, 2012 at 12:40 AM, JIE CHEN jiechenable1...@gmail.com wrote: The version I installed is: mira_3.4.0_prod_linux-gnu_x86_64_static OK, good. The other key question I asked was: did you get anything in the MIRA log file (it should be in your history with text data even though it will be red as a failed job)? Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY
On Thu, Apr 19, 2012 at 6:35 PM, JIE CHEN jiechenable1...@gmail.com wrote: Hi Peter, Thank you for your patience. I checked the error message in the history. They all give exactly the same error -- the one I gave in the first thread. Are you saying this is the entire contents of the MIRA log entry in the history? Return error code 1 from command: mira --job=denovo,genome,accurate SANGER_SETTINGS -LR:lsd=1:mxti=0:ft=fastq -FN:fqi=/media/partition2_/galaxydb_data/000/dataset_290.dat SOLEXA_SETTINGS -LR:lsd=1:ft=fastq -FN:fqi=/media/partition2_/galaxydb_data/000/dataset_290.dat COMMON_SETTINGS -OUT:orf=1:orc=1:ora=1:orw=1:orm=0:org=0:ors=0 -OUT:rrot=1:rtd=1 I'm pretty sure you are just telling me the error message. I would have expected more than that, e.g. a line MIRA took XXX minutes before that error message. To try to be even clearer:
1. Start your web browser and go to your Galaxy.
2. Upload/import the files.
3. Select the MIRA tool from the left hand pane.
4. Select input files and set parameters.
5. Click Execute.
6. Notice that six new history entries appear: MIRA contigs (FASTA), MIRA contigs (QUAL), MIRA contigs (CAF), MIRA contigs (ACE), MIRA coverage (Wiggle), MIRA log.
7. Wait for MIRA to fail and the six new history entries to go red.
8. Click on the eye icon for the red history item MIRA log.
9. Copy and paste the MIRA log contents into an email.
Also, and perhaps equally useful, can you access this server at the command line and try the exact same failing command (from a temp directory - it may create lots of files and folders)? Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY
Hi Peter, Here is the full log: This is MIRA V3.4.0 (production version). Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence Assembly Using Trace Signals and Additional Sequence Information. Computer Science and Biology: Proceedings of the German Conference on Bioinformatics (GCB) 99, pp. 45-56. To (un-)subscribe the MIRA mailing lists, see: http://www.chevreux.org/mira_mailinglists.html After subscribing, mail general questions to the MIRA talk mailing list: mira_t...@freelists.org To report bugs or ask for features, please use the new ticketing system at: http://sourceforge.net/apps/trac/mira-assembler/ This ensures that requests don't get lost. Compiled by: bach Sun Aug 21 17:50:30 CEST 2011 On: Linux arcadia 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux Compiled in boundtracking mode. Compiled in bugtracking mode. Compiled with ENABLE64 activated. Runtime settings (sorry, for debug): Size of size_t : 8 Size of uint32 : 4 Size of uint32_t: 4 Size of uint64 : 8 Size of uint64_t: 8 Current system: Linux whsiao-ubuntu 2.6.32-40-generic #87-Ubuntu SMP Tue Mar 6 00:56:56 UTC 2012 x86_64 GNU/Linux Parsing parameters: --job=denovo,genome,accurate SANGER_SETTINGS -LR:lsd=1:mxti=0:ft=fastq -FN:fqi=/media/partition2_/galaxydb_data/000/dataset_290.dat COMMON_SETTINGS -OUT:orf=1:orc=1:ora=1:orw=1:orm=0:org=0:ors=0 -OUT:rrot=1:rtd=1 Parameters parsed without error, perfect. -CL:pec and -CO:emeas1clpec are set, setting -CO:emea values to 1. -- Parameter settings seen for: Sanger data (also common parameters) Used parameter settings: General (-GE): Project name in (proin) : mira Project name out (proout) : mira Number of threads (not) : 2 Automatic memory management (amm) : yes Keep percent memory free (kpmf) : 15 Max. 
process size (mps) : 0 EST SNP pipeline step (esps): 0 Use template information (uti) : yes Template insert size minimum (tismin) : -1 Template insert size maximum (tismax) : -1 Template partner build direction (tpbd) : -1 Colour reads by hash frequency (crhf) : yes Load reads options (-LR): Load sequence data (lsd): yes File type (ft) : fastq External quality (eq) : from SCF (scf) Ext. qual. override (eqo) : no Discard reads on e.q. error (droeqe): no Solexa scores in qual file (ssiqf) : no FASTQ qual offset (fqqo): 0 Wants quality file (wqf): yes Read naming scheme (rns): [san] Sanger Institute (sanger) Merge with XML trace info (mxti): no Filecheck only (fo) : no Assembly options (-AS): Number of passes (nop) : 4 Skim each pass (sep): yes Maximum number of RMB break loops (rbl) : 2 Maximum contigs per pass (mcpp) : 0 Minimum read length (mrl) : 80 Minimum reads per contig (mrpc) : 2 Base default quality (bdq) : 10 Enforce presence of qualities (epoq): yes Automatic repeat detection (ard): yes Coverage threshold (ardct) : 2 Minimum length (ardml) : 400 Grace length (ardgl): 40 Use uniform read distribution (urd) : no Start in pass (urdsip): 3 Cutoff multiplier (urdcm) : 1.5 Keep long repeats separated (klrs) : no Spoiler detection (sd) : yes Last pass only (sdlpo) : yes Use genomic pathfinder (ugpf) : yes Use emergency search stop (uess): yes ESS partner depth (esspd) : 500 Use emergency blacklist (uebl) : yes Use max. contig build time (umcbt) : no Build time in seconds (bts) : 1 Strain and backbone options (-SB): Load straindata (lsd) : no Assign default strain (ads) : no Default strain name (dsn) : StrainX Load backbone (lb) : no Start backbone usage in pass (sbuip): 3 Backbone file type (bft): fasta
Re: [galaxy-dev] Default annotation track
Alex, It's not clear what the problem is. Trackster will handle an arbitrarily large number of contigs (albeit somewhat clumsily). What contig annotation data are you trying to preserve and/or provide access to? J. On Apr 19, 2012, at 11:30 AM, Oleksandr Moskalenko wrote: I needed to add a genome to the tracks in our local instance. However, the only available genome is a multi-fasta file of about 1800 supercontigs. To preserve the sanity of my clients, I concatenated the fasta file and provided both versions. The unfortunate part is that the contig annotation data is lost in that conversion. I wonder if there is a way to extract the contig data as annotation and provide it as a default track in the concatenated genome, or something like that. Any ideas? Thanks, Alex ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] How to specify RUM output dir?
yes: standard output: Broken pipe yes: write error I know what this means, generally, but not in the context of Galaxy. Is this a telltale symptom, or is it too generic to say? Hard to say, but it's coming from your script and is indeed causing your job to fail. You might find it easier to debug if you run your tool/wrapper script from the command line. Best, J. ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] How to specify RUM output dir?
Fantastic! Knowing that it's not coming from Galaxy is very helpful. Thanks again, Dan -Original Message- From: Jeremy Goecks [mailto:jeremy.goe...@emory.edu] Sent: Thursday, April 19, 2012 1:56 PM To: Dorset, Daniel C Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: How to specify RUM output dir? yes: standard output: Broken pipe yes: write error I know what this means, generally, but not in the context of Galaxy. Is this a telltale symptom, or is it too generic to say? Hard to say, but it's coming from your script and is indeed causing your job to fail. You might find it easier to debug if you run your tool/wrapper script from the command line. Best, J. ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] linking Galaxy and Integrated Genome Browser
Hiral, I've cc'd the Galaxy development mailing list, which includes folks with experience in all areas of Galaxy. Can you be clear about what you're trying to do and what approach you're taking? Once it's clear what the issue is, someone can chime in with suggestions. Best, J. On Apr 19, 2012, at 12:44 PM, Hiral Vora wrote: Hi Jeremy, I have a question regarding committing my changes. I need to make changes to datatypes_conf.xml. But I see that repository does not have that file. So should I commit my changes to datatypes_conf.xml.sample instead? Thanks, Hiral ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY
On Thu, Apr 19, 2012 at 7:13 PM, JIE CHEN jiechenable1...@gmail.com wrote: Hi Peter, Here is the full log: Excellent :) The good news is MIRA seems to be installed and running fine - it just didn't like your test data, and I understand why: ... Sanger will load 1 reads. Longest Sanger: 36 Longest 454: 0 Longest IonTor: 0 Longest PacBio: 0 Longest Solexa: 0 Longest Solid: 0 Longest overall: 36 Total reads to load: 1 ...
Sanger total bases: 36 used bases in used reads: 0
454 total bases: 0 used bases in used reads: 0
IonTor total bases: 0 used bases in used reads: 0
PacBio total bases: 0 used bases in used reads: 0
Solexa total bases: 0 used bases in used reads: 0
Solid total bases: 0 used bases in used reads: 0
.. Fatal error (may be due to problems of the input data or parameters): No read can be used for assembly. ... Then finally some information my wrapper script adds: MIRA took 0.00 minutes Return error code 1 from command: mira --job=denovo,genome,accurate SANGER_SETTINGS -LR:lsd=1:mxti=0:ft=fastq -FN:fqi=/media/partition2_/galaxydb_data/000/dataset_290.dat COMMON_SETTINGS -OUT:orf=1:orc=1:ora=1:orw=1:orm=0:org=0:ors=0 -OUT:rrot=1:rtd=1 It appears you are trying to run MIRA with a single 36bp read, telling MIRA this is a Sanger read. That is very odd (not least because a 36bp read sounds more likely to be an early Solexa/Illumina read from the length). Has something gone wrong with loading the data into Galaxy? Or did you just want to try a trivial test case? If so, it was too simple and MIRA has stopped because it thinks it is bad input. The MIRA output log file (which is actually written to stdout if you run MIRA yourself at the command line) is quite verbose, but it is incredibly useful for diagnosing problems. That is why I collect it as one of the output files in Galaxy. You should be able to try some larger realistic examples, e.g. a virus or a bacterial genome depending on your server's capabilities.
And if they fail, have a look through the log file for why MIRA said it failed. Also keep in mind that the Galaxy wrapper is deliberately a simplified front end - MIRA has dozens of command line options which are not available via my wrapper for simplicity. Regards, Peter ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Toolshed initial upload errors
Paul, Sorry to see you're still experiencing problems. Based on the issues you've encountered (as well as one or two others recently) I've spent some time re-working things to eliminate the need for installing mercurial to use a local tool shed. We will build eggs for the mercurial package for the various versions of Python supported by Galaxy and include them in the distribution. I'm pretty close to having this finished, so it is likely that this will be available early next week. I'm not sure if this will fix the problems you're seeing, but at least it will eliminate one of the variables. Greg Von Kuster On Apr 19, 2012, at 5:02 AM, Paul-Michael Agapow wrote: [For those who came in late - I've installed a local toolshed, which allows me to create repositories, but every time I attempt to upload files, it errors out with TypeError: array item must be char. For those who come after me, here's what I've worked out thus far.] Greg asked: Since you've tried uploading various files with no success, the problem is likely to be caused by something specific to your environment - possibly the version of the mercurial package you're using. What version of Python are you running, and what version of the mercurial package do you have installed with it? Also, what version of Galaxy do you have, and what database / version are you using? We're on CentOS, an older flavour (4), but my Mercurial is up to date (2.1.2). Python 2.6.4, Galaxy is 6799:40f1816d6857 (grabbed it fresh last week for testing), running it with sqlite. However, the Mercurial is actually installed locally to the account I'm using, so I wonder if the toolshed is getting confused with another version, although hg doesn't seem to be installed system-wide. Further investigations reveal that the files appear to be in the repo (database/community_files). The error manifests in the middle of Mercurial, in manifest.py, where it attempts to coerce a Unicode string into a character array.
(As there are some reported issues of Windows file names with Unicode under Mercurial, and I'm uploading from a Windows machine, I used a Mac to create a repo and add a file. Nope, same behaviour.) The Cistrome galaxy fork (https://bitbucket.org/cistrome/cistrome-harvard/src/e7e2fdd74496/lib/galaxy/webapps/community/controllers/upload.py) mentions occasional similar errors. I checked the Mercurial installation: % hg --version Mercurial Distributed SCM (version 2.1.2+10-4d875bb546dc) ... % hg debuginstall Checking encoding (UTF-8)... Checking installed modules (/home/f0/paul/Installed/lib/python2.6/site-packages/mercurial)... Checking templates (/home/f0/paul/Installed/lib/python2.6/site-packages/mercurial/templates)... Checking commit editor... Checking username... No problems detected (Actually, I was missing a username and a user ~/.hgrc file. But after making those, it passes. The error still persists.) Work continues. Paul Agapow (paul-michael.aga...@hpa.org.uk) Bioinformatics, Health Protection Agency - ** The information contained in the EMail and any attachments is confidential and intended solely and for the attention and use of the named addressee(s). It may not be disclosed to any other person without the express authority of the HPA, or the intended recipient, or both. If you are not the intended recipient, you must not disclose, copy, distribute or retain this message or any part of it. This footnote also confirms that this EMail has been swept for computer viruses, but please re-sweep any attachments before opening or saving. HTTP://www.HPA.org.uk ** ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)
Lance and Peter; Peter, thanks for noticing the problem and duplicate tools. Lance, I'm happy to merge these so there are not two different versions out there. I prefer your use of genomeCoverageBed over my custom hacks. That's a nice approach I totally missed. I avoid the need for the sam indexes by creating the file directly from the information in the BAM header. I don't think there is any way around creating it since it's required by the UCSC tools as well, but everything you need is in the BAM header. There might be a sneaky way to do this with samtools -H and awk but I'm not nearly skilled enough to pull that off. Let me know what you think. I can also update my python wrapper script to use the genomeCoverageBed approach instead if you think that's easier. Brad Hi Peter, Thanks for the thoughtful comments. I believe the requirement for the genome was imposed by the use of an underlying BedTools utility. I also think that in a newer version of that tool, the requirement was removed, since you correctly point out it is not really necessary. I will see if I can update the tool to remove that requirement and also see about changing the tool id. Sorry for the conflict, that was an oversight on my part, though it would be nice if the Tool Shed could check and warn when someone tries to create a new tool. I would suggest flagging the new repo as invalid until the id is updated instead of outright rejection. As for the author info, you're right, I should really add that as well. That tool was put together very quickly to meet the need of a customer and I didn't properly clean things up before I uploaded. I'll let you know once I get an update out. Of course, any patches etc. are welcome. ;-) Lance Peter Cock wrote: Hi Brad Lance, I've been using Brad's bam_to_bigwig tool in Galaxy but realized today (with a new dataset using a splice-aware mapper) that it doesn't seem to be ignoring CIGAR N operators where a read is split over an intron.
Looking over Brad's Python script, which calculates the coverage to write an intermediate wiggle file, this is done with samtools via pysam. It is not obvious to me if this can be easily modified to ignore introns. Is this possible, Brad? I wasn't aware of Lance's rival bam_to_bigwig tool in the ToolShed till now, and that does talk about this issue. It has a boolean option to ignore gaps when computing coverage, recommended for RNA-Seq where reads are mapped across long splice junctions. Lance, from your tool's help it sounds like it needs a genome database build filled in. I don't understand this requirement - Brad's tool works just fine for standalone BAM files (for example reads mapped to an in house assembly). Is that not supported in your tool? Galaxy team - why does the ToolShed allow duplicate repository names (here bam_to_bigwig) AND duplicate tool IDs (again, here bam_to_bigwig)? Won't this cause chaos when sharing workflows? I would suggest checking this when a tool is uploaded and rejecting repository name or tool ID clashes. Regards, Peter P.S. Brad, your tool is missing an explicit requirements tag listing the UCSC binary wigToBigWig and the Python library pysam. Lance, your tool doesn't seem to include any author information like your name or email address. I'm inferring it is yours from the Galaxy tool shed user id, lparsons. -- Lance Parsons - Scientific Programmer 134 Carl C. Icahn Laboratory Lewis-Sigler Institute for Integrative Genomics Princeton University ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] Manually removing datasets from database
Dear all, I was able to run the cleanup scripts in Galaxy, using the -r option with them. But why weren't the datasets removed from the disk? Can I now manually remove them? Cheers, Diana M. ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] Manually deleting datasets from server
Dear all, I was able to execute the cleanup scripts in galaxy, using the -r option with them. According to what I've read, that option removes the datasets from the disk. But why didn't that happen? Also, since I have 'deleted and purged' the files, can I now remove them manually from the server? Cheers, CLedero ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Manually deleting datasets from server
Just an additional problem. I retried running the scripts, and I got this error when I was doing the purging:
Removing disk, file
Error attempting to purge data file:
Traceback (most recent call last):
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 518, in <module>
    if __name__ == "__main__": main()
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 116, in main
    purge_datasets( app, cutoff_time, options.remove_from_disk, info_only = options.info_only, force_retry = options.force_retry )
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 353, in purge_datasets
    _purge_dataset( app, dataset, remove_from_disk, info_only = info_only )
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 478, in _purge_dataset
    print "Error attempting to purge data file: ", dataset.file_name, " error: ", str( exc )
  File "/home/applications/galaxy-dist/lib/galaxy/model/__init__.py", line 651, in get_file_name
    assert self.object_store is not None, "Object Store has not been initialized for dataset %s" % self.id
AssertionError: Object Store has not been initialized for dataset 1
Can you please enlighten me on this error? I am just new to Galaxy and Python, so I'm quite at a loss here. Thanks in advance for any help! On Fri, Apr 20, 2012 at 10:24 AM, Ciara Ledero lede...@gmail.com wrote: Dear all, I was able to execute the cleanup scripts in galaxy, using the -r option with them. According to what I've read, that option removes the datasets from the disk. But why didn't that happen? Also, since I have 'deleted and purged' the files, can I now remove them manually from the server? Cheers, CLedero ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Potential database corruption with local galaxy instance
To update our problems from the other day, it appears there were multiple issues, at least one of which involved the postgres sequences. There were already id values in several tables that matched those being assigned by the sequence, creating errors involving duplicate keys. Offsetting the next assigned value of the sequences +1 seems to have fixed at least some of these problems. Cheers, dave -- David O'Connor http://labs.pathology.wisc.edu/oconnor ph: 608-301-5710 On Wednesday, April 18, 2012 at 11:42 AM, Jeremy Goecks wrote: % sh manage_db.sh downgrade 92 % sh manage_db.sh upgrade The downgrade to 92 and upgrade created job.params. This is progress. You should be able to run jobs again, yes? Unfortunately, we are still getting errors about duplicate key values. The debug output when I try to export a history to a file is shown below my signature. Is there anything that was updated recently that would change primary keys? Primary keys are handled by SQLAlchemy, not Galaxy, so that's not the problem. I would guess the issue arose due to the missing 'params' column in job. This can likely be fixed by deleting all the rows in the job_history_export table: DELETE FROM job_export_history_archive; The only downside to this operation is that existing history archives won't be found and will have to be recreated. Best, J. ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
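For anyone else whose Postgres sequences have fallen behind the ids already present in a table, the usual fix is to re-sync each affected sequence from the table itself rather than offsetting it by hand. A sketch only - the table and sequence names below are illustrative (Galaxy's default naming follows the `table_id_seq` convention, but check your schema before running anything):

```sql
-- Re-sync a sequence with the highest id already in its table,
-- so the next INSERT no longer collides with an existing key.
-- 'job' / 'job_id_seq' are example names; substitute the affected table.
SELECT setval('job_id_seq', (SELECT MAX(id) FROM job));
```

Run against each table that reported duplicate-key errors; setval leaves the sequence so that the next generated value is MAX(id) + 1.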
Re: [galaxy-dev] linking Galaxy and Integrated Genome Browser
Hi, Thank you for getting back to me. I am a developer at the Loraine Lab and I am working on IGB (i.e. the Integrated Genome Browser). We would like to integrate IGB with Galaxy. So, as per James's instructions, I have cloned a copy of galaxy-central to make changes. I will then commit those changes in. I need to change a file named datatypes_conf.xml, but that file is not present there. So should I make my changes in datatypes_conf.xml.sample? I am attaching a copy of James's email for reference. -- Hiral Hi Ann, we'd be happy to integrate support for IGB. The normal way we do this is through a pull request on bitbucket. You basically make a clone of our development repository (galaxy-central), commit all the necessary changes, and we can pull it in. This retains commit history (including attributing the author of the code correctly in the history). I've copied Jeremy, who would probably be the person to handle the pull. Let us know if this all makes sense and seems like a way forward. Thanks! -- jt James Taylor, Assistant Professor, Biology / Computer Science, Emory University On Thursday, April 19, 2012 at 3:28 PM, Jeremy Goecks wrote: Hiral, I've cc'd the Galaxy development mailing list, which includes folks with experience in all areas of Galaxy. Can you be clear about what you're trying to do and what approach you're taking? Once it's clear what the issue is, someone can chime in with suggestions. Best, J. On Apr 19, 2012, at 12:44 PM, Hiral Vora wrote: Hi Jeremy, I have a question regarding committing my changes. I need to make changes to datatypes_conf.xml. But I see that repository does not have that file. So should I commit my changes to datatypes_conf.xml.sample instead? Thanks, Hiral ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] cleanup_datasets.py inquiry
Hi all, I have tried Nate's fix on this script, but I got an indentation error in line 31. Any ideas on how to fix this? I'm not familiar with python, so I'm quite at a loss here. Thanks! ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] Spaces in uploaded dataset filenames
Hi, I have a problem with a tool wrapper I wrote handling input files with spaces in their names. If a user uploads a dataset that has spaces in its file name, e.g. a 'foo bar.xml' file, then even though my tool (which uses the temp-directory method of handling multiple output files) is provided by Galaxy with a normal input filename like /galaxy/production/database/files/006/dataset_6667.dat, Galaxy renames the output files based on the metadata to names like 'foo bar.log' and 'foo bar.nxs', and those output files don't get copied into the history. I wonder if anyone has run into this issue before and how it was handled. Thanks, Alex ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/