[galaxy-dev] Copying galaxy folder from linux to windows

2012-04-19 Thread diana michelle magbanua
Hi there,

Since manually deleting the files from the galaxy database is not advised,
I thought of just copying the galaxy-dist folder from the linux server to
my Windows machine. I copied everything except the 000 folder, where the
datasets are located. I also have Cygwin installed on Windows. I was
wondering if galaxy would still run even if I just copied the folder.

About my dataset cleanup problem: I tried running the cleanup scripts, but
since the disk is full I cannot make any changes to universe_wsgi.ini (I
get this KeyError problem). Any suggestions on how I could free some
space/delete datasets? Is manually deleting tmp files also not advised? I
am not allowed to remove/alter any other file aside from those within
galaxy-dist.

Any help would be greatly appreciated.

Cheers,

DM
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Re: [galaxy-dev] Problem with cleaning up galaxy datasets

2012-04-19 Thread diana michelle magbanua
Dear all,

I have done what was suggested in this thread (remove the # sign), but
somehow I still got the same error. Any suggestions on how to get past this?

Cheers,

Diana

On Mon, Apr 16, 2012 at 12:51 AM, Jennifer Jackson j...@bx.psu.edu wrote:

 repost bounced msg, see why below

 -
 Date: Sat, 14 Apr 2012 22:33:22 +0200
 From: Klaus Metzeler m...@klaus-metzeler.de
 To: Michael Moore michaelglennmo...@gmail.com
 Cc: galaxy-dev@lists.bx.psu.edu
 Subject: Re: [galaxy-dev] Problem with cleaning up galaxy datasets
 Message-ID: 
 4F89DF12.6010606@klaus-metzeler.de
 
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed


 OK, this pointed me in the right direction.
 The universe_wsgi.ini file has a key that reads:

 database_connection =
 sqlite:///./database/universe.sqlite?isolation_level=IMMEDIATE

 This entry was commented out in my version of the file (ie, it had a # in
 front of it). While Galaxy itself uses this database connection by
 default, the cleanup script looks for this config file entry to
 locate the database. So, if anyone encounters the same problem, just
 un-comment this entry and everything runs fine.
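
For anyone hitting the same problem, the change Klaus describes is a sketch
of a one-character edit to universe_wsgi.ini (assuming the stock SQLite
setup he mentions; paths may differ in your instance):

```ini
# Before: the entry is commented out. Galaxy itself falls back to this
# default, but the cleanup script cannot locate the database.
#database_connection = sqlite:///./database/universe.sqlite?isolation_level=IMMEDIATE

# After: un-commented, so the cleanup script can find the database.
database_connection = sqlite:///./database/universe.sqlite?isolation_level=IMMEDIATE
```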

 Thanks for your help, Michael.
 --

 Why did this bounce? Digest content not stripped (#2).
 If you reply to a thread in a digest, please
 1. Change the subject of your response from Galaxy-dev Digest Vol ... to
 the original subject for the thread.
 2. Strip out everything else in the digest that is not part of the thread
 you are responding to.

 Thanks Klaus for sending in the solution!

 - Galaxy team

 --

 Jennifer Jackson
 http://galaxyproject.org


Re: [galaxy-dev] Toolshed initial upload errors

2012-04-19 Thread Paul-Michael Agapow
[For those who came in late - I've installed a local toolshed, which allows me 
to create repositories, but every time I attempt to upload files, it errors 
out with "TypeError: array item must be char". For those who come after me, 
here's what I worked out thus far.]

Greg asked:

 Since you've tried uploading various files with no success, the problem is
 likely to be caused by something specific to your environment - possibly the
 version of the mercurial package you're using.  What version of Python are
 you running, and what version of the mercurial package do you have installed
 with it?  Also, what version of Galaxy do you have, and what database /
 version are you using?

We're on CentOS, an older flavour (4), but my Mercurial is up to date (2.1.2). 
Python 2.6.4, Galaxy is 6799:40f1816d6857 (grabbed fresh last week for 
testing), running with sqlite. However, Mercurial is actually installed 
locally for the account I'm using, so I wonder if the toolshed is getting 
confused with another version, although hg doesn't seem to be installed 
system-wide.

Further investigations reveal that the files appear to be in the repo 
(database/community_files). The error manifests in the middle of Mercurial, in 
manifest.py where it attempts to coerce a Unicode string into a character 
array. (As there are some reported issues with Unicode in Windows file names 
under Mercurial, and I'm uploading from a Windows machine, I used a Mac to 
create a repo and add a file. Nope, same behaviour.) The Cistrome galaxy fork 
(https://bitbucket.org/cistrome/cistrome-harvard/src/e7e2fdd74496/lib/galaxy/webapps/community/controllers/upload.py)
 mentions occasional similar errors.

I checked the Mercurial installation:

% hg --version
Mercurial Distributed SCM (version 2.1.2+10-4d875bb546dc)
...
% hg debuginstall
Checking encoding (UTF-8)...
Checking installed modules 
(/home/f0/paul/Installed/lib/python2.6/site-packages/mercurial)...
Checking templates 
(/home/f0/paul/Installed/lib/python2.6/site-packages/mercurial/templates)...
Checking commit editor...
Checking username...
No problems detected

(Actually, I was missing a username and a user ~/.hgrc file. But after 
creating those, it passes. The error still persists.)

Work continues.


Paul Agapow (paul-michael.aga...@hpa.org.uk)
Bioinformatics, Health Protection Agency




Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY

2012-04-19 Thread Peter Cock
On Thu, Apr 19, 2012 at 12:40 AM, JIE CHEN jiechenable1...@gmail.com wrote:
 The version I installed is : mira_3.4.0_prod_linux-gnu_x86_64_static


OK, good.

The other key question I asked was: did you get anything in the MIRA
log file? It should be in your history as text data, even though it will be
shown red as a failed job.

Peter


Re: [galaxy-dev] delete data library via API

2012-04-19 Thread Leon Mei
Dear all,

Haven't heard anything back yet, so I'm giving it another try.

Does anyone know if deleting datasets is supported by the API? Or is it a
to-be-added feature?

Thanks for any hint!

Cheers,
Leon

On Thu, Apr 12, 2012 at 10:32 PM, Leon Mei hailiang@nbic.nl wrote:
 Hi guys,

 Is it possible to delete datasets in a shared data library via the API? Of
 course only when the modify permission is granted.

 We see delete.py in the API script folder but can't figure out how to get it
 to work.

 Thanks!

 Leon

 --
 Hailiang (Leon) Mei
 Netherlands Bioinformatics Center
 BioAssist NGS Taskforce
  - http://ngs.nbic.nl
 Skype: leon_mei    Mobile: +31 6 41709231



-- 
Hailiang (Leon) Mei
Netherlands Bioinformatics Center
BioAssist NGS Taskforce
 - http://ngs.nbic.nl
Skype: leon_mei    Mobile: +31 6 41709231



[galaxy-dev] BAM to BigWig (and tool ID clashes)

2012-04-19 Thread Peter Cock
Hi Brad  Lance,

I've been using Brad's bam_to_bigwig tool in Galaxy but realized
today (with a new dataset using a splice-aware mapper) that it
doesn't seem to be ignoring CIGAR N operators where a read is
split over an intron. Looking over Brad's Python script which
calculates the coverage to write an intermediate wiggle file, this is
done with samtools via pysam. It is not obvious to me whether this
can be easily modified to ignore introns. Is this possible, Brad?
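
To illustrate the change Peter is asking about, here is a small, pure-Python
sketch (not Brad's actual script; the function name and interface are
hypothetical) of computing reference-covered blocks from a CIGAR string while
skipping N (intron) operators:

```python
import re

# Parse CIGAR into (length, operator) pairs, e.g. "50M1000N50M".
_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def covered_blocks(start, cigar, skip_introns=True):
    """Return [(start, end), ...] reference intervals covered by one read.

    M/=/X (and D) consume reference and count as coverage; N also consumes
    reference but, for RNA-Seq, should just move the position forward
    without adding coverage when skip_introns is True.
    """
    blocks, pos = [], start
    for length, op in _CIGAR_RE.findall(cigar):
        length = int(length)
        if op in ("M", "=", "X"):
            blocks.append((pos, pos + length))
            pos += length
        elif op == "D" or (op == "N" and not skip_introns):
            blocks.append((pos, pos + length))
            pos += length
        elif op == "N":
            pos += length  # jump over the intron without adding coverage
        # I, S, H, P do not consume reference positions
    return blocks
```

With pysam, the same idea would apply per aligned read before accumulating
the per-base coverage written to the intermediate wiggle file.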

I wasn't aware of Lance's rival bam_to_bigwig tool in the ToolShed
till now, and that does talk about this issue. It has a boolean option
to ignore gaps when computing coverage, recommended for
RNA-Seq where reads are mapped across long splice junctions.

Lance, from your tool's help it sounds like it needs a genome
database build filled in. I don't understand this requirement - Brad's
tool works just fine for standalone BAM files (for example reads
mapped to an in house assembly). Is that not supported in your
tool?

Galaxy team - why does the ToolShed allow duplicate repository
names (here bam_to_bigwig) AND duplicate tool IDs (again, here
bam_to_bigwig)? Won't this cause chaos when sharing workflows?
I would suggest checking this when a tool is uploaded and rejecting
repository name or tool ID clashes.

Regards,

Peter

P.S.

Brad, your tool is missing an explicit requirements tag
listing the UCSC binary wigToBigWig, and the Python library
pysam.

Lance, your tool doesn't seem to include any author information
like your name or email address. I'm inferring it is yours from the
Galaxy tool shed user id, lparsons.


Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)

2012-04-19 Thread Peter Cock
On Thu, Apr 19, 2012 at 1:55 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi Brad  Lance,

 I've been using Brad's bam_to_bigwig tool in Galaxy but realized
 today (with a new dataset using a splice-aware mapper) that it
 doesn't seem to be ignoring CIGAR N operators where a read is
 split over an intron. Looking over Brad's Python script which
 calculates the coverage to write an intermediate wiggle file, this is
 done with the samtools via pysam. It is not obvious to me if this
 can be easily modified to ignore introns. Is this possible Brad?

Looking into this a bit more, perhaps 'samtools depth' (bam2depth.c) might
be useful; maybe we can use that code to update your python+pysam code?

Peter


Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)

2012-04-19 Thread Greg Von Kuster
The tool shed forces unique repository names per user account, allowing for 
uniqueness with that combination.  All tools uploaded into a tool shed 
repository are assigned a unique id called a guid, which is unique for all 
tools across all possible tool sheds.  These guids follow a namespacing 
convention that ensures that any tool installed into any Galaxy instance will 
be uniquely identified regardless of old tool ids or tool versions.

For example,  the guid for version 0.0.2 of Brad's tool is 

toolshed.g2.bx.psu.edu/repos/brad-chapman/bam_to_bigwig/bam_to_bigwig/0.0.2

while the guid for version 0.1 of Lance's tool is 

toolshed.g2.bx.psu.edu/repos/lparsons/bam_to_bigwig/bam_to_bigwig/0.1

This information can be seen when viewing the tool's metadata in the tool shed. 
When these tools are installed into a local Galaxy instance, this guid becomes 
the tool's id in Galaxy rather than the old id (e.g., tool id="bam_to_bigwig"). 
The old id is still important and must be included in the tool config as 
usual, but it is not used to identify a tool that is installed from a 
tool shed repository.
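
The guid scheme Greg describes can be illustrated with a tiny sketch (the
helper name is mine, not part of Galaxy; the components come from the two
example guids above):

```python
def tool_guid(shed_host, owner, repo_name, tool_id, version):
    """Build a tool shed guid: <shed>/repos/<owner>/<repo>/<tool id>/<version>."""
    return "/".join([shed_host, "repos", owner, repo_name, tool_id, version])

# Brad's and Lance's tools get distinct guids despite identical old tool
# ids, because the owner (and version) components differ:
brad = tool_guid("toolshed.g2.bx.psu.edu", "brad-chapman",
                 "bam_to_bigwig", "bam_to_bigwig", "0.0.2")
lance = tool_guid("toolshed.g2.bx.psu.edu", "lparsons",
                  "bam_to_bigwig", "bam_to_bigwig", "0.1")
```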

All of these details are explained in the tool shed wiki in the following 
section.

http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance

This section is also relevant to this discussion.

http://wiki.g2.bx.psu.edu/Tool%20Shed#Galaxy_Tool_Versions


On Apr 19, 2012, at 8:55 AM, Peter Cock wrote:

 
 Galaxy team - why does the ToolShed allow duplicate repository
 names (here bam_to_bigwig) AND duplicate tool IDs (again, here
 bam_to_bigwig)? Won't this cause chaos when sharing workflows?
 I would suggest checking this when a tool is uploaded and rejecting
 repository name or tool ID clashes.
 
 Regards,
 
 Peter


Re: [galaxy-dev] JobManager object has no attribute dispatcher

2012-04-19 Thread Nate Coraor
On Apr 17, 2012, at 9:04 AM, Peter Cock wrote:

 Hi all,
 
 Does anyone know what might have introduced this problem running
 galaxy-dist when using the task splitting functionality? I'm using the
 latest code from the default branch, changeset:   7027:f6e790d94282

Hi Peter,

This was resolved in changeset 5c93ac32ace1.  Thanks for reporting it.

--nate

 
 galaxy.jobs.manager DEBUG 2012-04-17 13:55:03,610 (4) Job assigned to
 handler 'main'
 127.0.0.1 - - [17/Apr/2012:13:55:06 +0100] POST
 /root/history_item_updates HTTP/1.1 200 -
 http://127.0.0.1:8081/history; Mozilla/5.0 (X11; U; Linux x86_64;
 en-US; rv:1.9.2.18) Gecko/20110622 CentOS/3.6-1.el5.centos
 Firefox/3.6.18
 galaxy.jobs DEBUG 2012-04-17 13:55:08,710 (4) Working directory for
 job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/4
 galaxy.jobs.handler DEBUG 2012-04-17 13:55:08,711 dispatching job 4 to
 tasks runner
 galaxy.jobs.handler INFO 2012-04-17 13:55:08,845 (4) Job dispatched
 galaxy.datatypes.sequence DEBUG 2012-04-17 13:55:09,470 Split
 /mnt/galaxy/galaxy-central/database/files/000/dataset_2.dat into 4
 parts...
 galaxy.datatypes.sequence DEBUG 2012-04-17 13:55:09,470 Attemping to
 split FASTA file
 /mnt/galaxy/galaxy-central/database/files/000/dataset_2.dat into
 chunks of 1 sequences
 galaxy.datatypes.sequence DEBUG 2012-04-17 13:55:09,471 Writing
 /mnt/galaxy/galaxy-central/database/files/000/dataset_2.dat part to
 /mnt/galaxy/galaxy-central/database/job_working_directory/000/4/task_0/dataset_2.dat
 galaxy.datatypes.sequence DEBUG 2012-04-17 13:55:09,472 Writing
 /mnt/galaxy/galaxy-central/database/files/000/dataset_2.dat part to
 /mnt/galaxy/galaxy-central/database/job_working_directory/000/4/task_1/dataset_2.dat
 galaxy.jobs.splitters.multi DEBUG 2012-04-17 13:55:09,472 do_split
 created 2 parts
 galaxy.jobs DEBUG 2012-04-17 13:55:09,506 (4) Working directory for
 job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/4
 galaxy.jobs.runners.tasks ERROR 2012-04-17 13:55:09,784 failure running job 4
 Traceback (most recent call last):
  File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/tasks.py,
 line 86, in run_job
self.app.job_manager.dispatcher.put(tw)
 AttributeError: 'JobManager' object has no attribute 'dispatcher'
 
 Thanks,
 
 Peter


Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)

2012-04-19 Thread Peter Cock
On Thu, Apr 19, 2012 at 2:32 PM, Greg Von Kuster g...@bx.psu.edu wrote:
 The tool shed forces unique repository names per user account, allowing for
 uniqueness with that combination.  All tools uploaded into a tool shed
 repository are assigned a unique id called a guid, which is unique for all
 tools across all possible tool sheds.  These guids follow a namespacing
 convention that ensures that any tool installed into any Galaxy instance
 will be uniquely identified regardless of old tool ids or tool versions.

 ... The old  id is still important and must
 be included in the tool config as usual, but is not used to identify a tool
 that is installed in a repository from the tool shed.

Ah - so the old tool ID clashes are only going to be a problem with
Galaxy servers where the tools were installed 'the old fashioned way'
(like ours). So there is still scope for clashes with shared workflows -
but this will be less and less of a problem as local Galaxy installs
switch to installing tools via the Tool Shed?

What happens if (for example) Brad gives Lance commit rights to
his repository (or the other way round)? Then you'd have a clash.

 All of these details are explained in the tool shed wiki in the following
 section.

 http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance

 This section is also relevant to this discussion.

 http://wiki.g2.bx.psu.edu/Tool%20Shed#Galaxy_Tool_Versions

Thanks for the background.

Peter



Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)

2012-04-19 Thread Greg Von Kuster

On Apr 19, 2012, at 10:04 AM, Peter Cock wrote:

 On Thu, Apr 19, 2012 at 2:32 PM, Greg Von Kuster g...@bx.psu.edu wrote:
 The tool shed forces unique repository names per user account, allowing for
 uniqueness with that combination.  All tools uploaded into a tool shed
 repository are assigned a unique id called a guid, which is unique for all
 tools across all possible tool sheds.  These guids follow a namespacing
 convention that ensures that any tool installed into any Galaxy instance
 will be uniquely identified regardless of old tool ids or tool versions.
 
 ... The old  id is still important and must
 be included in the tool config as usual, but is not used to identify a tool
 that is installed in a repository from the tool shed.
 
 Ah - so the old tool ID clashes are only going to be a problem with
 Galaxy servers where the tools were installed 'the old fashioned way'
 (like ours).

Yes, it is highly recommended to install tool shed repositories using the 
installation process that has been implemented rather than downloading the 
repository contents as an archive and manually manipulating it to be 
incorporated into your Galaxy instance.  Using the installation process 
provides many benefits in addition to eliminating the potential tool id 
clashes.  Examples include not having to stop / restart your Galaxy 
server in order to use freshly installed tools, being able to deactivate / 
uninstall tools on-the-fly when finished with them, being able to run multiple 
versions of the same tool simultaneously in the same Galaxy instance, etc.


 So there is still scope for clashes with shared workflows -
 but this will be less and less of a problem as local Galaxy installs
 switch to installing tools via the Tool Shed?

Correct - if you manually download the contents of a repository and install it 
into your local Galaxy instance, there is no way to eliminate the potential for 
tool id / version clashes.  In fact, it may be beneficial to eliminate the 
feature enabling users to manually download repository contents, but we'll 
leave it there as long as the community wants it.

 
 What happens if (for example) Brad gives Lance commit rights to
 his repository (or the other way round)? Then you'd have a clash.

Assuming automatic installation using the tool shed install process, no clashes 
will occur in this scenario, because no matter who pushes changes to the 
repository, it is still namespaced by the original owner, which can never 
change.  The only part of the guid that could potentially change is the tool 
version component (e.g., 
toolshed.g2.bx.psu.edu/repos/brad-chapman/bam_to_bigwig/bam_to_bigwig/0.0.2 
becomes 
toolshed.g2.bx.psu.edu/repos/brad-chapman/bam_to_bigwig/bam_to_bigwig/0.0.3 if 
Brad gives Lance the ability to push to his repository and Lance changes the 
tool version).

 
 All of these details are explained in the tool shed wiki in the following
 section.
 
 http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_installation_of_Galaxy_tool_shed_repository_tools_into_a_local_Galaxy_instance
 
 This section is also relevant to this discussion.
 
 http://wiki.g2.bx.psu.edu/Tool%20Shed#Galaxy_Tool_Versions
 
 Thanks for the background.
 
 Peter
 




Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)

2012-04-19 Thread Lance Parsons

Hi Peter,

Thanks for the thoughtful comments.  I believe the requirement for the 
genome was imposed by the use of an underlying BedTools utility. I also 
think that in a newer version of that tool the requirement was removed, 
since, as you correctly point out, it is not really necessary.


I will see if I can update the tool to remove that requirement and also 
see about changing the tool id.  Sorry for the conflict, that was an 
oversight on my part, though it would be nice if the Tool Shed could 
check and warn when someone tries to create a new tool. I would suggest 
flagging the new repo as invalid until the id is updated instead of 
outright rejection.


As for the author info, you're right, I should really add that as well.  
That tool was put together very quickly to meet the need of a customer 
and I didn't properly clean things up before I uploaded.  I'll let you 
know once I get an update out.  Of course, any patches etc. are 
welcome.  ;-)


Lance

Peter Cock wrote:

Hi Brad  Lance,

I've been using Brad's bam_to_bigwig tool in Galaxy but realized
today (with a new dataset using a splice-aware mapper) that it
doesn't seem to be ignoring CIGAR N operators where a read is
split over an intron. Looking over Brad's Python script which
calculates the coverage to write an intermediate wiggle file, this is
done with the samtools via pysam. It is not obvious to me if this
can be easily modified to ignore introns. Is this possible Brad?

I wasn't aware of Lance's rival bam_to_bigwig tool in the ToolShed
till now, and that does talk about this issue. It has a boolean option
to ignore gaps when computing coverage, recommended for
RNA-Seq where reads are mapped across long splice junctions.

Lance, from your tool's help it sounds like it needs a genome
database build filled in. I don't understand this requirement - Brad's
tool works just fine for standalone BAM files (for example reads
mapped to an in house assembly). Is that not supported in your
tool?

Galaxy team - why does the ToolShed allow duplicate repository
names (here bam_to_bigwig) AND duplicate tool IDs (again, here
bam_to_bigwig)? Won't this cause chaos when sharing workflows?
I would suggest checking this when a tool is uploaded and rejecting
repository name or tool ID clashes.

Regards,

Peter

P.S.

Brad, your tool is missing an explicit requirements tag
listing the UCSC binary wigToBigWig, and the Python library
pysam.

Lance, your tool doesn't seem to include any author information
like your name or email address. I'm inferring it is yours from the
Galaxy tool shed user id, lparsons.


--
Lance Parsons - Scientific Programmer
134 Carl C. Icahn Laboratory
Lewis-Sigler Institute for Integrative Genomics
Princeton University



Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)

2012-04-19 Thread Greg Von Kuster

On Apr 19, 2012, at 10:37 AM, Lance Parsons wrote:

  and also see about changing the tool id.  

I would recommend NOT doing this - see the separate thread for this message 
that describes how this works in the tool shed.

 Sorry for the conflict, that was an oversight on my part, though it would be 
 nice if the Tool Shed could check and warn when someone tries to create a new 
 tool. I would suggest flagging the new repo as invalid until the id is 
 updated instead of outright rejection.

Again, see the separate thread for this message - the tool shed does correctly 
handle this when the automatic installation process is used.

 


Re: [galaxy-dev] JobManager object has no attribute dispatcher

2012-04-19 Thread Peter Cock
On Thu, Apr 19, 2012 at 2:53 PM, Nate Coraor n...@bx.psu.edu wrote:
 On Apr 17, 2012, at 9:04 AM, Peter Cock wrote:

 Hi all,

 Does anyone know what might have introduced this problem running
 galaxy-dist when using the task splitting functionality? I'm using the
 latest code from the default branch, changeset:   7027:f6e790d94282

 Hi Peter,

 This was resolved in changeset 5c93ac32ace1.  Thanks for reporting it.

 --nate

Are you sure? I've just updated to the tip and the same problem
persists. Also, looking at that commit, it isn't obvious how it might
be linked to this issue:
https://bitbucket.org/galaxy/galaxy-central/changeset/5c93ac32ace1

Thanks,

Peter



[galaxy-dev] Error running set_dataset_sizes.py

2012-04-19 Thread Liisa Koski
Hello,
I'm seeing some discrepancies in total user usage versus what my histories 
actually total, so I wanted to run set_dataset_sizes.py and 
set_user_disk_usage.py.

I am getting the following error.

 ./set_dataset_sizes.py
Loading Galaxy model...
Processing 77915 datasets...
Completed 0%
Traceback (most recent call last):
  File "./set_dataset_sizes.py", line 43, in <module>
    dataset.set_total_size()
  File "lib/galaxy/model/__init__.py", line 703, in set_total_size
    if self.object_store.exists(self, extra_dir=self._extra_files_path or 
"dataset_%d_files" % self.id, dir_only=True):
AttributeError: 'NoneType' object has no attribute 'exists'


Any help would be much appreciated.

Thanks,
Liisa


Re: [galaxy-dev] JobManager object has no attribute dispatcher

2012-04-19 Thread Nate Coraor
On Apr 19, 2012, at 10:44 AM, Peter Cock wrote:

 On Thu, Apr 19, 2012 at 2:53 PM, Nate Coraor n...@bx.psu.edu wrote:
 On Apr 17, 2012, at 9:04 AM, Peter Cock wrote:
 
 Hi all,
 
 Does anyone know what might have introduced this problem running
 galaxy-dist when using the task splitting functionality? I'm using the
 latest code from the default branch, changeset:   7027:f6e790d94282
 
 Hi Peter,
 
 This was resolved in changeset 5c93ac32ace1.  Thanks for reporting it.
 
 --nate
 
 Are you sure? I've just updated to the tip and the same problem
 persists. Also looking at that commit it isn't obvious how it might
 be linked to this issue:
 https://bitbucket.org/galaxy/galaxy-central/changeset/5c93ac32ace1

You're right, that's what I get for reading hastily.  Fix coming shortly...

 
 Thanks,
 
 Peter
 



[galaxy-dev] run job as real user error: environment variables issue?

2012-04-19 Thread Louise-Amélie Schmitt

Hi everyone,

I'm currently trying to set up our local Galaxy so it can run jobs as 
the real user. I followed the documentation and set the galaxy user as a 
sudoer. However, I get an error message whenever I'm trying to run a job:


galaxy.jobs.runners.drmaa ERROR 2012-04-19 14:57:48,376 Uncaught 
exception queueing job

Traceback (most recent call last):
  File /g/funcgen/galaxy-dev/lib/galaxy/jobs/runners/drmaa.py, line 
133, in run_next

self.queue_job( obj )
  File /g/funcgen/galaxy-dev/lib/galaxy/jobs/runners/drmaa.py, line 
219, in queue_job
job_id = self.external_runjob(filename, 
job_wrapper.user_system_pwent[2]).strip()
  File /g/funcgen/galaxy-dev/lib/galaxy/jobs/runners/drmaa.py, line 
427, in external_runjob
 raise RuntimeError("External_runjob failed (exit code %s)\nCalled 
from %s:%d\nChild process reported error:\n%s" % (str(exitcode), 
__filename__(), __lineno__(), stderrdata))

RuntimeError: External_runjob failed (exit code 127)
Called from /g/funcgen/galaxy-dev/lib/galaxy/jobs/runners/drmaa.py:427
Child process reported error:
python: error while loading shared libraries: libpython2.6.so.1.0: 
cannot open shared object file: No such file or directory



Looking closely, it's the non-root user it switches to that doesn't have 
LD_LIBRARY_PATH properly set, so there seems to be an environment 
inheritance issue. However, I tried to print stuff from the 
scripts/drmaa_external_runner.py script in EVERY WAY I could think of, 
to no avail. As if it doesn't even run. Which is surprising, since root 
can run python properly, so it looks like it really is changing users.


I really fail to see where the problem could come from, so if you have 
leads to suggest, I'll be forever grateful.


Best,
L-A


[galaxy-dev] Default annotation track

2012-04-19 Thread Oleksandr Moskalenko
I needed to add a genome to the tracks in our local instance. However, the only 
available genome is a multi-fasta file of about 1800 supercontigs. To preserve 
the sanity of my clients I concatenated the fasta file and provided both 
versions. The unfortunate part is that the contig annotation data is lost in 
that conversion. I wonder if there is a way to extract the contig data as an 
annotation and provide it as a default track in the concatenated genome, or 
something like that. Any ideas?
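
One way to keep the contig data would be to record each contig's coordinates
while concatenating, as a BED-style track. A minimal sketch (the function
name and the simple (name, sequence) input format are my own assumptions,
not an existing Galaxy utility):

```python
def concat_with_contig_track(contigs, chrom_name="concat"):
    """Concatenate contig sequences; also return each contig's location in
    the concatenated genome as 0-based, half-open BED lines."""
    seq_parts, bed_lines, offset = [], [], 0
    for name, seq in contigs:
        # BED interval covering this contig within the concatenated sequence
        bed_lines.append("%s\t%d\t%d\t%s" % (chrom_name, offset,
                                             offset + len(seq), name))
        seq_parts.append(seq)
        offset += len(seq)
    return "".join(seq_parts), bed_lines
```

The BED output could then be uploaded as a default annotation track for the
concatenated genome; if spacer Ns are inserted between contigs, the offsets
would need to account for them.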

Thanks,

Alex


Re: [galaxy-dev] Error copying files from job_working_directory

2012-04-19 Thread Nate Coraor
On Apr 16, 2012, at 9:03 PM, Jose Navas wrote:

 Hi everybody,
 
 I was searching through the Galaxy code and I found the solution: the function 
 responsible for copying files uses shutil.copy, which only allows copying 
 files. I've modified this function to use shutil.copytree in case 
 file_name is a directory. I can send you the code if you email me.
 
 Also, I don't know if there is a good reason against this solution. If there 
 is, I will be very grateful if somebody can explain it.
 
 Thanks,
 
 Jose

Hi Jose,

Are you using the output dataset path as a directory rather than a filename?  
Or is this with the output dataset's files_path/extra_files_path attribute?

--nate

 
 From: josenavasmol...@hotmail.com
 To: galaxy-dev@lists.bx.psu.edu
 Date: Mon, 16 Apr 2012 17:34:29 +
 Subject: [galaxy-dev] Error copying files from job_working_directory
 
 Hello,
 
 I've integrated a tool into my Galaxy instance, but when I run the tool I get 
 this error:
 
 galaxy.objectstore CRITICAL 2012-04-16 11:25:56,697 Error copying 
 /home/galaxy/galaxy-dist/database/job_working_directory/000/307/dataset_431_files/unweighted_unifrac_2d_continuous
  to 
 /home/galaxy/galaxy-dist/database/files/000/dataset_431_files/unweighted_unifrac_2d_continuous:
  [Errno 21] Is a directory: 
 '/home/galaxy/galaxy-dist/database/job_working_directory/000/307/dataset_431_files/unweighted_unifrac_2d_continuous'
 
 As I can see, it fails because it is trying to copy a directory. Is this 
 feature supported in Galaxy? If it is supported, what do I have to do to 
 enable copying directories from the job_working_directory?
 
 Thank you,
 
 Jose
 




Re: [galaxy-dev] Error copying files from job_working_directory

2012-04-19 Thread Jose Navas

Hi Nate,
I'm using the output dataset path to generate an HTML file, and in the output 
dataset's files_path/extra_files_path attribute I'm generating a set of files 
and directories. So the problem was in the output dataset's 
files_path/extra_files_path attribute.

Thanks,
Jose
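The fix Jose describes -- dispatching on whether the path is a directory -- can 
be sketched like this in Python (the function name copy_out is hypothetical; 
this is not the actual Galaxy objectstore code):

```python
import os
import shutil

def copy_out(src, dst):
    """Copy a regular file, or recursively copy a directory tree.

    shutil.copy() fails with "[Errno 21] Is a directory" when src is a
    directory, so fall back to shutil.copytree() in that case -- the
    copy-or-copytree pattern described in the thread.
    """
    if os.path.isdir(src):
        shutil.copytree(src, dst)
    else:
        shutil.copy(src, dst)
```

Note that shutil.copytree requires that dst does not already exist.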

 Subject: Re: [galaxy-dev] Error copying files from job_working_directory
 From: n...@bx.psu.edu
 Date: Thu, 19 Apr 2012 12:16:05 -0400
 CC: galaxy-dev@lists.bx.psu.edu
 To: josenavasmol...@hotmail.com
 
 On Apr 16, 2012, at 9:03 PM, Jose Navas wrote:
 
  Hi everybody,
  
  I was searching through the Galaxy code and I found the solution: the 
  function responsible for copying files uses shutil.copy, which only handles 
  regular files. I've modified this function to use shutil.copytree when 
  file_name is a directory. I can send you the code if you email me.
  
  Also,  I don't know if there is a good reason against this solution. If it 
  is, I will be very grateful if somebody can explain it.
  
  Thanks,
  
  Jose
 
 Hi Jose,
 
 Are you using the output dataset path as a directory rather than a filename?  
 Or is this with the output dataset's files_path/extra_files_path attribute?
 
 --nate
 
  
  From: josenavasmol...@hotmail.com
  To: galaxy-dev@lists.bx.psu.edu
  Date: Mon, 16 Apr 2012 17:34:29 +
  Subject: [galaxy-dev] Error copying files from job_working_directory
  
  Hello,
  
  I've integrated a tool into my Galaxy instance, but when I run the tool I 
  get this error:
  
  galaxy.objectstore CRITICAL 2012-04-16 11:25:56,697 Error copying 
  /home/galaxy/galaxy-dist/database/job_working_directory/000/307/dataset_431_files/unweighted_unifrac_2d_continuous
   to 
  /home/galaxy/galaxy-dist/database/files/000/dataset_431_files/unweighted_unifrac_2d_continuous:
   [Errno 21] Is a directory: 
  '/home/galaxy/galaxy-dist/database/job_working_directory/000/307/dataset_431_files/unweighted_unifrac_2d_continuous'
  
  As I can see, it fails because it is trying to copy a directory. Is this 
  feature supported in Galaxy? If it is supported, what do I have to do to 
  enable copying directories from the job_working_directory?
  
  Thank you,
  
  Jose
  

Re: [galaxy-dev] How to specify RUM output dir?

2012-04-19 Thread Dorset, Daniel C
Thanks Jeremy, that's very helpful, and it's great to hear from a developer!

I have been pursuing option (a), and I feel like I'm very close. The RUM tool 
runs, and the filesizes show up in the history, but the datasets show up as 
erroneous. When I click on the apparently-problematic dataset's bug icon, the 
error message shows the following two lines repeated over 100 times:

yes: standard output: Broken pipe
yes: write error

I know what this means, generally, but not in the context of Galaxy. Is this a 
telltale symptom, or is it too generic to say?

Under the additional output, it shows exactly the STDOUT the tool shows when it 
executes and terminates properly from the command line. So I know I'm close; I 
feel like I'm missing something small. 

When I click on the view dataset button, I see the data, and it's legit. When 
I click "Edit Attributes", I see a message at the bottom of the Edit Attributes 
pane that says "Required metadata values are missing. Some of these values may 
not be editable by the user. Selecting 'Auto-detect' will attempt to fix these 
values." When I attempt to run the Auto-detect, this notification goes away.

It seems like the only issue right now is getting rid of that broken pipe 
error message. Once that's gone, perhaps the datasets won't be flagged as 
erroneous and I can use them in downstream processes.

If I can get this tool working perfectly, I'll definitely upload it to the 
Galaxy toolshed.

Any tips you could provide would be greatly appreciated!

Thanks,

Dan

You have two options:

(a) you can set up the tool to report only a subset of outputs from the tool;

or

(b) you can use a composite datatype to store the complete directory:

http://wiki.g2.bx.psu.edu/Admin/Datatypes/Composite%20Datatypes

Best,
J.




[galaxy-dev] resubmit a job if the node fails

2012-04-19 Thread zhengqiu cai
Hi,

Can Galaxy resubmit a job if the node where the job is running fails?
I know sge can do that by using qsub -r.

It should be very useful if Galaxy can do that.

Thank you,

Cai



Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY

2012-04-19 Thread JIE CHEN
Hi Peter,

Thank you for your patience. I checked the error message in the history.
They all give exactly the same error -- the one I gave in the first thread.
I have rethought the procedure many times and it shouldn't have any problem.
I don't know why.

What I have done:
1. Installed the mira wrappers using the Galaxy web UI and checked that
mira.py and mira.xml are under one of the directories of the shed_tools
directory.
2. Installed the mira 3.4.0 binaries on my host

Thanks a lot.

Tyler

On Thu, Apr 19, 2012 at 2:11 AM, Peter Cock p.j.a.c...@googlemail.comwrote:

 On Thu, Apr 19, 2012 at 12:40 AM, JIE CHEN jiechenable1...@gmail.com
 wrote:
  The version I installed is : mira_3.4.0_prod_linux-gnu_x86_64_static
 

 OK, good.

 The other key question I asked was did you get anything in the MIRA
 log file (it should be in your history with text data even though it will
 be
 red as a failed job)?

 Peter


Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY

2012-04-19 Thread Peter Cock
On Thu, Apr 19, 2012 at 6:35 PM, JIE CHEN jiechenable1...@gmail.com wrote:
 Hi Peter,

 Thank you for your patience. I checked the error message in the history.
 They all give exactly the same error-- the one i gave in the first thread.

Are you saying this is the entire contents of the MIRA log entry in the history?

Return error code 1 from command:
mira --job=denovo,genome,accurate SANGER_SETTINGS
-LR:lsd=1:mxti=0:ft=fastq
-FN:fqi=/media/partition2_/galaxydb_data/000/dataset_290.dat
SOLEXA_SETTINGS -LR:lsd=1:ft=fastq
-FN:fqi=/media/partition2_/galaxydb_data/000/dataset_290.dat
COMMON_SETTINGS -OUT:orf=1:orc=1:ora=1:orw=1:orm=0:org=0:ors=0
-OUT:rrot=1:rtd=1

I'm pretty sure you are just telling me the error message. I would have expected
more than that, e.g. a line "MIRA took XXX minutes" before that error message.

To try to be even clearer:

1. Start your web browser and go to your Galaxy
2. Upload/import the files
3. Select the MIRA tool from the left hand pane
4. Select input files and set parameters
5. Click "Execute"
6. Notice that six new history entries appear: MIRA contigs (FASTA),
MIRA contigs (QUAL), MIRA contigs (CAF), MIRA contigs (ACE), MIRA
coverage (Wiggle), MIRA log
7. Wait for MIRA to fail and the six new history entries to go red.
8. Click on the eye icon for the red history item "MIRA log".
9. Copy and paste the MIRA log contents into an email.

Also, and perhaps equally useful, can you access this server at the
command line and try the exact same failing command (from a temp
directory - it may create lots of files and folders)?

Peter


Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY

2012-04-19 Thread JIE CHEN
Hi Peter,

Here is the full log:


This is MIRA V3.4.0 (production version).

Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence
Assembly Using Trace Signals and Additional Sequence Information.
Computer Science and Biology: Proceedings of the German Conference on
Bioinformatics (GCB) 99, pp. 45-56.

To (un-)subscribe the MIRA mailing lists, see:
http://www.chevreux.org/mira_mailinglists.html

After subscribing, mail general questions to the MIRA talk mailing list:
mira_t...@freelists.org

To report bugs or ask for features, please use the new ticketing system at:
http://sourceforge.net/apps/trac/mira-assembler/
This ensures that requests don't get lost.


Compiled by: bach
Sun Aug 21 17:50:30 CEST 2011
On: Linux arcadia 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55
UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
Compiled in boundtracking mode.
Compiled in bugtracking mode.
Compiled with ENABLE64 activated.
Runtime settings (sorry, for debug):
Size of size_t  : 8
Size of uint32  : 4
Size of uint32_t: 4
Size of uint64  : 8
Size of uint64_t: 8
Current system: Linux whsiao-ubuntu 2.6.32-40-generic #87-Ubuntu SMP
Tue Mar 6 00:56:56 UTC 2012 x86_64 GNU/Linux



Parsing parameters: --job=denovo,genome,accurate SANGER_SETTINGS
-LR:lsd=1:mxti=0:ft=fastq
-FN:fqi=/media/partition2_/galaxydb_data/000/dataset_290.dat
COMMON_SETTINGS -OUT:orf=1:orc=1:ora=1:orw=1:orm=0:org=0:ors=0
-OUT:rrot=1:rtd=1




Parameters parsed without error, perfect.

-CL:pec and -CO:emeas1clpec are set, setting -CO:emea values to 1.
--
Parameter settings seen for:
Sanger data (also common parameters)

Used parameter settings:
  General (-GE):
Project name in (proin) : mira
Project name out (proout)   : mira
Number of threads (not) : 2
Automatic memory management (amm)   : yes
Keep percent memory free (kpmf) : 15
Max. process size (mps) : 0
EST SNP pipeline step (esps): 0
Use template information (uti)  : yes
Template insert size minimum (tismin)   : -1
Template insert size maximum (tismax)   : -1
Template partner build direction (tpbd) : -1
Colour reads by hash frequency (crhf)   : yes

  Load reads options (-LR):
Load sequence data (lsd): yes
File type (ft)  : fastq
External quality (eq)   : from SCF (scf)
Ext. qual. override (eqo)   : no
Discard reads on e.q. error (droeqe): no
Solexa scores in qual file (ssiqf)  : no
FASTQ qual offset (fqqo): 0

Wants quality file (wqf): yes

Read naming scheme (rns):  [san] Sanger Institute 
(sanger)

Merge with XML trace info (mxti): no

Filecheck only (fo) : no

  Assembly options (-AS):
Number of passes (nop)  : 4
Skim each pass (sep): yes
Maximum number of RMB break loops (rbl) : 2
Maximum contigs per pass (mcpp) : 0

Minimum read length (mrl)   : 80
Minimum reads per contig (mrpc) : 2
Base default quality (bdq)  : 10
Enforce presence of qualities (epoq): yes

Automatic repeat detection (ard): yes
Coverage threshold (ardct)  : 2
Minimum length (ardml)  : 400
Grace length (ardgl): 40
Use uniform read distribution (urd) : no
  Start in pass (urdsip): 3
  Cutoff multiplier (urdcm) : 1.5
Keep long repeats separated (klrs)  : no

Spoiler detection (sd)  : yes
Last pass only (sdlpo)  : yes

Use genomic pathfinder (ugpf)   : yes

Use emergency search stop (uess): yes
ESS partner depth (esspd)   : 500
Use emergency blacklist (uebl)  : yes
Use max. contig build time (umcbt)  : no
Build time in seconds (bts) : 1

  Strain and backbone options (-SB):
Load straindata (lsd)   : no
Assign default strain (ads) : no
Default strain name (dsn)   : StrainX
Load backbone (lb)  : no
Start backbone usage in pass (sbuip): 3
Backbone file type (bft): fasta
   

Re: [galaxy-dev] Default annotation track

2012-04-19 Thread Jeremy Goecks
Alex,

It's not clear what the problem is. Trackster will handle an arbitrarily large 
number of contigs (albeit somewhat clumsily). What contig annotation data are 
you trying to preserve and/or provide access to?

J.

On Apr 19, 2012, at 11:30 AM, Oleksandr Moskalenko wrote:

 I needed to add a genome to the tracks in our local instance. However, the 
 only available genome is a multi-fasta file of about 1800 supercontigs. To 
 preserve the sanity of my clients, I concatenated the fasta file and provided 
 both versions. The unfortunate part is that the contig annotation data is 
 lost in that conversion. I wonder if there is a way to extract the contig 
 data as annotation and provide it as a default track in the concatenated 
 genome, or something like that. Any ideas?
 
 Thanks,
 
 Alex


Re: [galaxy-dev] How to specify RUM output dir?

2012-04-19 Thread Jeremy Goecks
 yes: standard output: Broken pipe
 yes: write error
 
 I know what this means, generally, but not in the context of Galaxy. Is this 
 a telltale symptom, or is it too generic to say?

Hard to say, but it's coming from your script and is indeed causing your job to 
fail. You might find it easier to debug if you run your tool/wrapper script 
from the command line.
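For context on the generic mechanism (not Galaxy-specific): the message 
appears when a downstream reader closes the pipe while the writer is still 
writing, as in this self-contained Python sketch (assumes a POSIX shell with 
yes and head available):

```python
import subprocess

# "yes" writes lines forever; "head" exits after one line and closes the
# pipe, so "yes" hits a closed pipe -- the same class of failure behind
# "yes: standard output: Broken pipe" in the job's error output.
result = subprocess.run(
    "yes | head -n 1",
    shell=True,
    capture_output=True,
    text=True,
)
print(result.stdout)  # the single line head let through
```

The pipeline's exit status is head's, so the shell itself reports success even 
though yes died writing to a closed pipe.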

Best,
J.





Re: [galaxy-dev] How to specify RUM output dir?

2012-04-19 Thread Dorset, Daniel C
Fantastic! Knowing that it's not coming from Galaxy is very helpful.

Thanks again,

Dan

-Original Message-
From: Jeremy Goecks [mailto:jeremy.goe...@emory.edu] 
Sent: Thursday, April 19, 2012 1:56 PM
To: Dorset, Daniel C
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: How to specify RUM output dir?

 yes: standard output: Broken pipe
 yes: write error
 
 I know what this means, generally, but not in the context of Galaxy. Is this 
 a telltale symptom, or is it too generic to say?

Hard to say, but it's coming from your script and is indeed causing your job to 
fail. You might find it easier to debug if you run your tool/wrapper script 
from the command line.

Best,
J.







Re: [galaxy-dev] linking Galaxy and Integrated Genome Browser

2012-04-19 Thread Jeremy Goecks
Hiral,

I've cc'd the Galaxy development mailing list, which includes folks with 
experience in all areas of Galaxy. Can you be clear about what you're trying to 
do and what approach you're taking? Once it's clear what the issue is, someone 
can chime in with suggestions.

Best,
J.


On Apr 19, 2012, at 12:44 PM, Hiral Vora wrote:

 Hi Jeremy,
 
 I have a question regarding committing my changes. 
 I need to make changes to datatypes_conf.xml. But I see that the repository 
 does not have that file. So should I commit my changes to 
 datatypes_conf.xml.sample instead?
 
 Thanks,
 Hiral


Re: [galaxy-dev] Mira-Assembler: DOESN'T WORK ON GALAXY

2012-04-19 Thread Peter Cock
On Thu, Apr 19, 2012 at 7:13 PM, JIE CHEN jiechenable1...@gmail.com wrote:
 Hi Peter,

 Here is the full log:


Excellent :)

The good news is MIRA seems to be installed and running
fine - it just didn't like your test data, and I understand why:

 ...

 Sanger will load 1 reads.
 Longest Sanger: 36
 Longest 454: 0
 Longest IonTor: 0
 Longest PacBio: 0
 Longest Solexa: 0
 Longest Solid: 0
 Longest overall: 36
 Total reads to load: 1
 ...
 Sangertotal bases:36  used bases in used reads: 0
 454   total bases:0   used bases in used reads: 0
 IonTortotal bases:0   used bases in used reads: 0
 PacBiototal bases:0   used bases in used reads: 0
 Solexatotal bases:0   used bases in used reads: 0
 Solid total bases:0   used bases in used reads: 0

 ..

 Fatal error (may be due to problems of the input data or parameters):

 No read can be used for assembly.

 ...

Then finally some information my wrapper script adds:

 MIRA took 0.00 minutes
 Return error code 1 from command:
 mira --job=denovo,genome,accurate SANGER_SETTINGS -LR:lsd=1:mxti=0:ft=fastq
 -FN:fqi=/media/partition2_/galaxydb_data/000/dataset_290.dat COMMON_SETTINGS
 -OUT:orf=1:orc=1:ora=1:orw=1:orm=0:org=0:ors=0 -OUT:rrot=1:rtd=1

It appears you are trying to run MIRA with a single 36bp read,
telling MIRA this is a Sanger read. That is very odd (not least
because a 36bp read sounds more likely to be an early
Solexa/Illumina read from the length).

Has something gone wrong with loading the data into Galaxy?
Or did you just want to try a trivial test case? If so, it was too
simple and MIRA has stopped because it thinks it is bad input.

The MIRA output log file (which is actually written to stdout
if you run MIRA yourself at the command line) is quite verbose,
but it is incredibly useful for diagnosing problems. That is why
I collect it as one of the output files in Galaxy.

You should be able to try some larger realistic examples, e.g.
a virus or a bacterial genome depending on your server's
capabilities. And if they fail, have a look through the log file
for why MIRA said it failed.

Also keep in mind that the Galaxy wrapper is deliberately a
simplified front end - MIRA has dozens of command line
options which are not available via my wrapper for simplicity.

Regards,

Peter


Re: [galaxy-dev] Toolshed initial upload errors

2012-04-19 Thread Greg Von Kuster
Paul,

Sorry to see you're still experiencing problems.  Based on the issues you've 
encountered (as well as one or two others recently) I've spent some time 
re-working things to eliminate the need for installing mercurial to use a local 
tool shed.  We will build eggs for the mercurial package for the various 
versions of Python supported by Galaxy and include them in the distribution.  

I'm pretty close to having this finished, so it is likely that this will be 
available early next week.  

I'm not sure if this will fix the problems you're seeing, but at least it will 
eliminate one of the variables.

Greg Von Kuster

On Apr 19, 2012, at 5:02 AM, Paul-Michael Agapow wrote:

 [For those who came in late - I've installed a local toolshed, which allows 
  me to create repositories, but every time I attempt to upload files, it 
  errors out with "TypeError: array item must be char". For those who come 
 after me, here's what I worked out thus far.]
 
 Greg asked:
 
 Since you've tried uploading various files with no success, the problem is 
 likely
 to be caused by something specific to your environment - possibly the 
 version of
 the mercurial package you're using.  What version of Python are you running, 
 and
 what version of the mercurial package do you have installed with it?  Also, 
 what
 version of Galaxy do you have, and what database / version are you using?
 
  We're on CentOS, an older flavour (4), but my Mercurial is up to date (2.1.2). 
 Python 2.6.4, Galaxy is 6799:40f1816d6857 (grabbed it fresh last week for 
 testing), running it with sqlite. However, the Mercurial is actually 
 installed local to the account I'm using, so I wonder if the toolshed is 
 getting confused with another version, although hg doesn't seem to be 
 installed on the system.
 
  Further investigations reveal that the files appear to be in the repo 
  (database/community_files). The error manifests in the middle of Mercurial, in 
  manifest.py, where it attempts to coerce a Unicode string into a character 
 array. (As there are some reported issues of Windows file names with Unicode 
 under Mercurial, and I'm uploading from a Windows machine, I used a Mac to 
 create a repo and add a file. Nope, same behaviour.) The Cistrome galaxy fork 
 (https://bitbucket.org/cistrome/cistrome-harvard/src/e7e2fdd74496/lib/galaxy/webapps/community/controllers/upload.py)
  mentions occasional similar errors.
 
 I check the Mercurial installation:
 
 % hg --version
 Mercurial Distributed SCM (version 2.1.2+10-4d875bb546dc)
 ...
 % hg debuginstall
 Checking encoding (UTF-8)...
 Checking installed modules 
 (/home/f0/paul/Installed/lib/python2.6/site-packages/mercurial)...
 Checking templates 
 (/home/f0/paul/Installed/lib/python2.6/site-packages/mercurial/templates)...
 Checking commit editor...
 Checking username...
 No problems detected
 
 (Actually, I was missing a username and a user ~/.hgrc file. But making that, 
 it passes. Error still persists.)
 
 Work continues.
 
 
 Paul Agapow (paul-michael.aga...@hpa.org.uk)
 Bioinformatics, Health Protection Agency
 
 -
 **
 The information contained in the EMail and any attachments is
 confidential and intended solely and for the attention and use of
 the named addressee(s). It may not be disclosed to any other person
 without the express authority of the HPA, or the intended
 recipient, or both. If you are not the intended recipient, you must
 not disclose, copy, distribute or retain this message or any part
 of it. This footnote also confirms that this EMail has been swept
 for computer viruses, but please re-sweep any attachments before
 opening or saving. HTTP://www.HPA.org.uk
 **
 


Re: [galaxy-dev] BAM to BigWig (and tool ID clashes)

2012-04-19 Thread Brad Chapman

Lance and Peter;
Peter, thanks for noticing the problem and duplicate tools. Lance, I'm
happy to merge these so there are not two different versions out there.

I prefer your use for genomeCoverageBed over my custom hacks. That's a
nice approach I totally missed.

I avoid the need for the sam indexes by creating the file directly
from the information in the BAM header. I don't think there is any way
around creating it since it's required by the UCSC tools as well, but
everything you need is in the BAM header.

There might be a sneaky way to do this with samtools view -H and awk, but I'm
not nearly skilled enough to pull that off.
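The header-to-chrom.sizes step Brad mentions can also be sketched in a few 
lines of Python: this parses @SQ records from SAM header text (e.g. the output 
of samtools view -H) into the (name, length) pairs a wigToBigWig chrom.sizes 
file needs. The function is my illustration, not code from either wrapper:

```python
def chrom_sizes_from_header(header_text):
    """Extract (sequence name, length) pairs from the @SQ lines of a SAM
    header, as needed for a UCSC chrom.sizes file for wigToBigWig."""
    sizes = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # tab-separated TAG:VALUE fields follow the @SQ record type
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        sizes.append((fields["SN"], int(fields["LN"])))
    return sizes
```

Writing each pair out as name<TAB>length gives the chrom.sizes input that the 
UCSC tools expect.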

Let me know what you think. I can also update my python wrapper script
to use the genomeCoverageBed approach instead if you think that's
easier.

Brad

 Hi Peter,
 
 Thanks for the thoughtful comments.  I believe the requirement for the 
 genome was imposed by the use of an underlying BedTools utility. I also 
 think that in a newer version of that tool, the requirement was removed, 
 since you correctly point out it is not really necessary.
 
 I will see if I can update the tool to remove that requirement and also 
 see about changing the tool id.  Sorry for the conflict, that was an 
 oversight on my part, though it would be nice if the Tool Shed could 
 check and warn when someone tries to create a new tool. I would suggest 
 flagging the new repo as invalid until the id is updated instead of 
 outright rejection.
 
 As for the author info, you're right, I should really add that as well.  
 That tool was put together very quickly to meet the need of a customer 
 and I didn't properly clean things up before I uploaded.  I'll let you 
 know once I get an update out.  Of course, any patches etc. are 
 welcome.  ;-)
 
 Lance
 
 Peter Cock wrote:
  Hi Brad  Lance,
 
  I've been using Brad's bam_to_bigwig tool in Galaxy but realized
  today (with a new dataset using a splice-aware mapper) that it
  doesn't seem to be ignoring CIGAR N operators where a read is
  split over an intron. Looking over Brad's Python script which
  calculates the coverage to write an intermediate wiggle file, this is
  done with the samtools via pysam. It is not obvious to me if this
  can be easily modified to ignore introns. Is this possible Brad?
 
  I wasn't aware of Lance's rival bam_to_bigwig tool in the ToolShed
  till now, and that does talk about this issue. It has a boolean option
  to ignore gaps when computing coverage, recommended for
  RNA-Seq where reads are mapped across long splice junctions.
 
  Lance, from your tool's help it sounds like it needs a genome
  database build filled in. I don't understand this requirement - Brad's
  tool works just fine for standalone BAM files (for example reads
  mapped to an in house assembly). Is that not supported in your
  tool?
 
  Galaxy team - why does the ToolShed allow duplicate repository
  names (here bam_to_bigwig) AND duplicate tool IDs (again, here
  bam_to_bigwig)? Won't this cause chaos when sharing workflows?
  I would suggest checking this when a tool is uploaded and rejecting
  repository name or tool ID clashes.
 
  Regards,
 
  Peter
 
  P.S.
 
  Brad, your tool is missing an explicit <requirements> tag
  listing the UCSC binary wigToBigWig and the Python library
  pysam.
 
  Lance, your tool doesn't seem to include any author information
  like your name or email address. I'm inferring it is yours from the
  Galaxy tool shed user id, lparsons.
 
 -- 
 Lance Parsons - Scientific Programmer
 134 Carl C. Icahn Laboratory
 Lewis-Sigler Institute for Integrative Genomics
 Princeton University
 


[galaxy-dev] Manually removing datasets from database

2012-04-19 Thread Ciara Ledero
Dear all,

I was able to run the cleanup scripts in Galaxy, using the -r option with
them. But why weren't the datasets removed from the disk? Can I now
manually remove them?

Cheers,

Diana M.

[galaxy-dev] Manually deleting datasets from server

2012-04-19 Thread Ciara Ledero
Dear all,

I was able to execute the cleanup scripts in galaxy, using the -r option
with them. According to what I've read, that option removes the datasets
from the disk. But why didn't that happen? Also, since I have 'deleted and
purged' the files, can I now remove them manually from the server?

Cheers,

CLedero

Re: [galaxy-dev] Manually deleting datasets from server

2012-04-19 Thread Ciara Ledero
Just an additional problem. I retried running the scripts, and I got this
error when I was doing the purging:

Removing disk, file  Error attempting to purge data file:
Traceback (most recent call last):
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 518, in <module>
    if __name__ == "__main__": main()
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 116, in main
    purge_datasets( app, cutoff_time, options.remove_from_disk, info_only =
options.info_only, force_retry = options.force_retry )
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 353, in
purge_datasets
    _purge_dataset( app, dataset, remove_from_disk, info_only = info_only )
  File "scripts/cleanup_datasets/cleanup_datasets.py", line 478, in
_purge_dataset
    print "Error attempting to purge data file: ", dataset.file_name, "
error: ", str( exc )
  File "/home/applications/galaxy-dist/lib/galaxy/model/__init__.py", line
651, in get_file_name
    assert self.object_store is not None, "Object Store has not been
initialized for dataset %s" % self.id
AssertionError: Object Store has not been initialized for dataset 1


Can you please enlighten me on this error? I am just new to Galaxy and
Python, so I'm quite at a loss here.

Thanks in advance for any help!

On Fri, Apr 20, 2012 at 10:24 AM, Ciara Ledero lede...@gmail.com wrote:

 Dear all,

 I was able to execute the cleanup scripts in galaxy, using the -r option
 with them. According to what I've read, that option removes the datasets
 from the disk. But why didn't that happen? Also, since I have 'deleted and
 purged' the files, can I now remove them manually from the server?

 Cheers,

 CLedero


Re: [galaxy-dev] Potential database corruption with local galaxy instance

2012-04-19 Thread David O'Connor
To update our problems from the other day, it appears like there were multiple 
issues, at least one of which involved the postgres sequences. There were 
already id values in several tables that matched those that were being assigned 
by the sequence, creating errors involving duplicate keys. Offsetting the next 
assigned value of the sequences by +1 seems to have fixed at least some of these 
problems. 

Cheers,

dave


-- 
David O'Connor
http://labs.pathology.wisc.edu/oconnor
ph: 608-301-5710


On Wednesday, April 18, 2012 at 11:42 AM, Jeremy Goecks wrote:

   % sh manage_db.sh downgrade 92
   % sh manage_db.sh upgrade
   
   
   
  
  The downgrade to 92 and upgrade created job.params.  
  
 
 
 This is progress. You should be able to run jobs again, yes?
  Unfortunately, we are still getting errors about duplicate key values. The 
  debug output when I try to export a history to a file is shown below my 
  signature. Is there anything that was updated recently that would change 
  primary keys?
 
 Primary keys are handled by SQLAlchemy, not Galaxy, so that's not the 
 problem. I would guess the issue arose due to the missing 'params' column in 
 the job table. This can likely be fixed by deleting all the rows in the 
 job_export_history_archive table:
 
 DELETE FROM job_export_history_archive;
 
 The only downside to this operation is that existing history archives won't 
 be found and will have to be recreated.
 
 Best,
 J.
 


Re: [galaxy-dev] linking Galaxy and Integrated Genome Browser

2012-04-19 Thread Hiral Vora
Hi, 

Thank you for getting back to me. 

I am a developer at the Loraine Lab, working on IGB (the Integrated Genome 
Browser). We would like to integrate IGB with Galaxy. 

So, as per James's instructions, I have cloned a copy of galaxy-central to make 
my changes in, and I will then commit those changes. 

I need to change a file named datatypes_conf.xml, but that file is not present 
in the repository. Should I make my changes in datatypes_conf.xml.sample instead?

I am attaching a copy of James's email for reference.

--
Hiral

 Hi Ann, we'd be happy to integrate support for IGB. The normal way we do this 
 is through a pull request on Bitbucket. You basically make a clone of our 
 development repository (galaxy-central), commit all the necessary changes, and 
 we can pull it in. This retains commit history (including correctly attributing 
 the author of the code).
 
 I've copied Jeremy, who would probably be the person to handle the pull. Let 
 us know if this all makes sense and seems like a way forward. Thanks!
 
 -- jt
 
 James Taylor, Assistant Professor, Biology / Computer Science, Emory 
 University




On Thursday, April 19, 2012 at 3:28 PM, Jeremy Goecks wrote:

 Hiral,
 
 I've cc'd the Galaxy development mailing list, which includes folks with 
 experience in all areas of Galaxy. Can you be clear about what you're trying 
 to do and what approach you're taking? Once it's clear what the issue is, 
 someone can chime in with suggestions.
 
 Best,
 J.
 
 
 On Apr 19, 2012, at 12:44 PM, Hiral Vora wrote:
  Hi Jeremy,
  
  I have a question regarding committing my changes. 
  I need to make changes to datatypes_conf.xml. But I see that the repository 
  does not have that file. So should I commit my changes to 
  datatypes_conf.xml.sample instead?
  
  Thanks,
  Hiral
  
 
 
 


[galaxy-dev] cleanup_datasets.py inquiry

2012-04-19 Thread Ciara Ledero
Hi all,

I have tried Nate's fix on this script, but I got an indentation error in
line 31. Any ideas on how to fix this? I'm not familiar with python, so I'm
quite at a loss here.

Thanks!
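
Without seeing the exact edit, here is a minimal illustration (plain Python, not
Galaxy code) of what an IndentationError means and how to locate the offending
line; the snippet and file name are made up for the example:

```python
# A snippet whose third line is indented more deeply than its block allows.
bad = (
    "def purge():\n"
    "    print('purging')\n"
    "      print('oops')\n"  # unexpected extra indentation
)

try:
    compile(bad, "cleanup_datasets.py", "exec")
except IndentationError as exc:
    # exc.lineno points at the offending line, just like the error reported.
    print("IndentationError at line", exc.lineno)
```

The usual fix is to make the leading whitespace of the edited line match the
surrounding block exactly, using spaces only: mixing tabs and spaces in the same
block also raises this error.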

[galaxy-dev] Spaces in uploaded dataset filenames

2012-04-19 Thread Oleksandr Moskalenko
Hi,

I have a problem with a tool wrapper I wrote: it fails to handle input files with 
spaces in their names. If a user uploads a dataset whose file name contains spaces, 
e.g. "foo bar.xml", my tool (which uses the tmp-directory method of handling 
multiple output files) is still given a normal input path by Galaxy, such as 
/galaxy/production/database/files/006/dataset_6667.dat. However, Galaxy then 
renames the output files based on the metadata, producing names like "foo bar.log" 
and "foo bar.nxs", and those output files don't get copied into the history. I 
wonder if anyone has run into this issue before and how it was handled.

Thanks,

Alex
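
One workaround (a sketch of a common wrapper-side pattern, not how Galaxy resolves
this internally) is to sanitize the dataset name before using it to build output
file names; `sanitize_name` is a hypothetical helper, not a Galaxy function:

```python
import re

def sanitize_name(name):
    # Replace anything outside a conservative whitelist with '_',
    # so "foo bar.xml" becomes "foo_bar.xml".
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)

print(sanitize_name("foo bar.xml"))  # foo_bar.xml
```

Output names built this way contain no spaces, so downstream copying of the files
into the history is not affected by whitespace in the original upload name.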