[galaxy-dev] Running two Galaxy instances on the same Torque cluster

2013-04-05 Thread Josh Nielsen
Hello,

I have a question about running two Galaxy instances on separate hosts on
the same Torque cluster. For various reasons, including some recent changes
and/or removal of certain features (I am told BLAST was affected) in the
newer versions of Galaxy, I would like to keep our current older version of
Galaxy running while creating a separate Torque submit host to run the
latest version of Galaxy on it. I do not think that will pose any issues
for Torque since it will just see the new host as another submit host for
jobs, but I would like to know if this would cause any unforeseen issues
for either of the Galaxy instances.

They will both mount and store their data on the same network filesystem
but I will naturally have to create two separate directory trees for their
/basedir/database/files, pbs, job_working_directory, etc/ paths. I am
planning on making the local user the same on both submit nodes ('galaxy' -
we are not using LDAP on that cluster although we may in the future). Will
that cause any strange issues such as jobs being reported back to the wrong
galaxy instance? Will IP address or DNS name be a factor? Additionally I
hope there will not be an issue with the two instances both pointing to the
same FTP upload directory. The idea seems sound in my head but I want to
make sure I'm not excluding any critical considerations. Any suggestions or
insights would be appreciated.

Thanks,
Josh Nielsen
HudsonAlpha Institute for Biotechnology
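
A rough sketch of the separate directory trees described above. The instance names galaxy-old and galaxy-new, the helper name, and the scratch base directory are made up for illustration; only the subdirectory names (files, pbs, job_working_directory) come from the message.

```shell
#!/bin/bash
# Hypothetical helper: create one per-instance tree under a shared base directory.
make_instance_tree() {
  local base=$1 instance=$2 sub
  for sub in files pbs job_working_directory; do
    mkdir -p "$base/$instance/database/$sub"
  done
}

# Demo against a scratch directory standing in for the shared network filesystem.
shared_base=$(mktemp -d)
make_instance_tree "$shared_base" galaxy-old
make_instance_tree "$shared_base" galaxy-new

# List the per-instance data directories that were created.
find "$shared_base" -mindepth 3 -type d | sort
```

Each instance's own configuration would then point at its own tree, so neither instance writes into the other's files or job working directories.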
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

[galaxy-dev] Required Galaxy umask settings for HTML downloads?

2012-12-04 Thread Josh Nielsen
Hello all,

I am having issues downloading HTML files from Galaxy, the same as
described in this email chain:

http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-August/010965.html

I am getting the error (13)Permission denied: xsendfile: cannot open file:
/basedir/galaxy_data/database/tmp/tmp8iEccn/library_download.zip which is
indeed a basic filesystem permissions issue. The problem is that the
permissions created for that directory and every directory created in tmp/
look like this:

drwx------+   2 galaxy galaxy  3 Dec  4 09:23 tmp8iEccn

And I have placed the Apache user in the galaxy group, but as you can see
no group permissions ever get set by Galaxy on the directories that it
creates (it is getting a 700 permissions setting).

As Nate Coraor suggested in the message linked to above, I have tried
altering the default umask but I ran into issues with getting non-existent
results. I use sudo service galaxy start as the galaxy user each time to
start the server and a ps -ef | grep galaxy confirms that Galaxy is
running as the galaxy user. Since I use sudo though I changed the sudoers
file to include:

root    ALL=(ALL)   ALL
galaxy  ALL=(ALL)   ALL
Defaults umask_override
Defaults umask = 0002

This changed absolutely nothing. Then I started looking deeper into the PAM
configuration and added a umask directive to /etc/pam.d/sudo (and also
tried it in password-auth-ac and system-auth-ac) like this: session
optional pam_umask.so umask=0002. Still nothing changed in the permissions
in tmp/ when I tried to download an HTML file: no group permissions were
set. Then I dug deeper still and saw that sometimes if setting the mask in
/etc/pam.d/ config files is not enough that you can try to set a
system-wide mask in /etc/login.defs (following the suggestion here:
http://stackoverflow.com/questions/10220531/how-to-set-system-wide-umask).
Still no dice. I've pretty much exhausted my know-how in this department.
Any other suggestions of how to fix this or where the correct place to set
the umask is?

Thanks,
Josh Nielsen

Re: [galaxy-dev] Required Galaxy umask settings for HTML downloads?

2012-12-04 Thread Josh Nielsen
Hi Nate,

Thanks for the reply. No I hadn't thought to add anything to
/etc/init.d/galaxy itself. It is a short enough script that I can paste it
below. What would I need to do to edit it with umask settings?
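
One general point behind that question: a umask set in a script is inherited by every process the script launches afterwards, so setting it just before the daemon line is a reasonable first thing to try. The following is a minimal sketch, not tied to Galaxy or to RHEL's daemon helper (whether that helper preserves the value is something to check on the actual system). The value 027 and the scratch directory are only examples, and stat -c assumes GNU coreutils.

```shell
#!/bin/bash
# Set the umask in the parent script, as one would near the top of an init script.
umask 027

# A child shell reports the inherited value ...
child_umask=$(bash -c 'umask')

# ... and directories it creates with the default mode (0777) are masked by it.
scratch=$(mktemp -d)
bash -c 'mkdir "$1/made_by_child"' _ "$scratch"
child_dir_mode=$(stat -c '%a' "$scratch/made_by_child")

echo "child umask: $child_umask, directory mode: $child_dir_mode"
rm -rf "$scratch"
```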

Also I should note, changing the umask in the PAM files actually did change
the default permissions for the galaxy user when I did an su - galaxy in
a bash shell and then created or 'touch'-ed any files (which you could
logically expect). But for some reason it didn't seem to make a difference
with the directories created in that tmp/ directory even though the galaxy
user was given ownership. That made me wonder if something was going on
internal to Galaxy, or something else, that was overwriting/ignoring the
system umask settings (which actually work fine in a shell environment as
the user itself). Maybe I'll look into that ACL stuff Paul mentioned.

Here is my /etc/init.d/galaxy script:


. /etc/rc.d/init.d/functions

GALAXY_USER=galaxy
GALAXY_DIST_HOME=/home/galaxy/galaxy-dist
GALAXY_RUN=${GALAXY_DIST_HOME}/run.sh
GALAXY_PID=${GALAXY_DIST_HOME}/paster.pid

case "$1" in
start)
  echo -n "Starting galaxy services: "
  daemon --user $GALAXY_USER ${GALAXY_RUN} --daemon --pid-file=${GALAXY_PID}
  touch /var/lock/subsys/galaxy
;;
stop)
  echo -n "Shutting down galaxy services: "
  daemon --user $GALAXY_USER ${GALAXY_RUN} --stop-daemon
  rm -f /var/lock/subsys/galaxy
;;
status)
  daemon --user galaxy ${GALAXY_RUN} --status
;;
restart)
  $0 stop; $0 start
;;
reload)
  $0 stop; $0 start
;;
*)
  echo "Usage: galaxy {start|stop|status|reload|restart}"
;;
esac
--

Thanks!
Josh

On Tue, Dec 4, 2012 at 9:56 AM, Nate Coraor n...@bx.psu.edu wrote:

 Hi Josh,

 Thanks for doing such extensive tests.  Have you tried setting the umask
 in the init script itself?

 --nate

 

Re: [galaxy-dev] Required Galaxy umask settings for HTML downloads?

2012-12-04 Thread Josh Nielsen
Hi Paul,

Thanks for replying. Interestingly I've never dealt with filesystem ACLs
before and I didn't even know that ext3/4 systems had that feature.

Here is my output from those commands:

bash getfacl tmp8iEccn
# file: tmp8iEccn
# owner: galaxy
# group: galaxy
user::rwx
group::---
mask::rwx
other::---

bash getfacl tmp
# file: tmp
# owner: root
# group: galaxy
user::rwx
group::rwx
other::rwx

What enforces these ACLs/where can they be tweaked?
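
As a way of reading the getfacl listings above: under POSIX ACLs, the owning group's effective permissions are its group:: entry limited by the mask:: entry, when a mask is present. The helper below is a hypothetical illustration that only parses getfacl-style text; it does not query or change the filesystem.

```shell
#!/bin/bash
# Hypothetical helper: read getfacl-style text on stdin and print the owning
# group's effective permissions (group:: entry ANDed with mask::, if any).
effective_group_perms() {
  local group="" mask="" line out="" i
  while IFS= read -r line; do
    case "$line" in
      group::*) group=${line#group::} ;;
      mask::*)  mask=${line#mask::} ;;
    esac
  done
  [ -n "$mask" ] || mask=$group   # no mask entry: the group entry stands alone
  for i in 0 1 2; do
    if [ "${group:$i:1}" != "-" ] && [ "${mask:$i:1}" != "-" ]; then
      out+=${group:$i:1}
    else
      out+="-"
    fi
  done
  printf '%s\n' "$out"
}

# The tmp8iEccn listing from the message, fed in as text.
printf 'user::rwx\ngroup::---\nmask::rwx\nother::---\n' | effective_group_perms
```

For that listing, the empty group:: entry is what leaves members of the galaxy group (including the Apache user) without access; the mask itself is not restricting anything.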

P.S. The reason I run Galaxy with sudo is because if I try to do so as just
the Galaxy user it cannot create the process lock files: touch: cannot
touch `/var/lock/subsys/galaxy`. I suppose I could put the lock files
somewhere else or manually give galaxy group permission to /var/lock/subsys
(not so sure that's a good idea though), but sudo seemed to solve the
problem. You can see my init script in my reply to Nate.

Thanks,
Josh

On Tue, Dec 4, 2012 at 10:06 AM, Paul Boddie paul.bod...@biotek.uio.no wrote:

 On 04/12/12 16:52, Josh Nielsen wrote:

 I am getting the error (13)Permission denied: xsendfile: cannot open
 file: /basedir/galaxy_data/database/tmp/tmp8iEccn/library_download.zip
 which is indeed a basic filesystem permissions issue. The problem is that the
 permissions created for that directory and every directory created in tmp/
 look like this:

 drwx------+   2 galaxy galaxy  3 Dec  4 09:23 tmp8iEccn

 And I have placed the Apache user in the galaxy group, but as you can see
 no group permissions ever get set by Galaxy on the directories that it
 creates (it is getting a 700 permissions setting).


 Isn't the trailing + character an indication of ACLs being set on the
 directory? What do the following say...?

 getfacl /tmp/tmp8iEccn
 getfacl /tmp

 If you do have ACLs involved, it may be the case that various masks are
 being enforced via that mechanism.

 Paul

 P.S. I'm not sure that making the galaxy user a sudoer would have any
 effect unless the user was attempting to gain privileges, which would be a
 pretty scary way of running Galaxy, I would have thought.


Re: [galaxy-dev] Required Galaxy umask settings for HTML downloads?

2012-12-04 Thread Josh Nielsen
Great. I'll give those ideas a shot to see if it gets me anywhere.

P.S. You referenced in the email that I linked to a fix in the next release
of Galaxy. Is that out yet or still in development?

-Josh

On Tue, Dec 4, 2012 at 10:44 AM, Nate Coraor n...@bx.psu.edu wrote:

 On Dec 4, 2012, at 11:23 AM, Josh Nielsen wrote:

  Hi Nate,
 
  Thanks for the reply. No I hadn't thought to add anything to
 /etc/init.d/galaxy itself. It is a short enough script that I can paste it
 below. What would I need to do to edit it with umask settings?
 
  Also I should note, changing the umask in the PAM files actually did
 change the default permissions for the galaxy user when I did an su -
 galaxy in a bash shell and then created or 'touch'-ed any files (which you
 could logically expect). But for some reason it didn't seem to make a
 difference with the directories created in that tmp/ directory even though
 the galaxy user was given ownership. That made me wonder if something was
 going on internal to Galaxy, or something else, that was
 overwriting/ignoring the system umask settings (which actually work fine in
 a shell environment as the user itself). Maybe I'll look into that ACL
 stuff Paul mentioned.

 Paul's suggestions are worth looking into.  I'd be interested in knowing
 what the POSIX permissions are on /tmp itself, and what the ACLs are, if
 any.

 Those temporary files are created by creating a temporary directory using
 Python's tempfile.mkdtemp(), which creates them with a mode of 700, which
 is then masked by the current umask.  The change I added in the email you
 referenced in your original post changes the directory to mode 0777 masked
 by the umask (after it's created).

 Depending on how the pieces used in RHEL's startup() shell function handle
 the environment, you may be able to set it on the line above `daemon ...`
 inside the 'start)' branch of the case statement.  If that doesn't work,
 you may need to get more creative and do something like:

 daemon --user $GALAXY_USER "umask 027; ${GALAXY_RUN} --daemon --pid-file=${GALAXY_PID}"

 Alternatively, you can set it inside /home/galaxy/galaxy-dist/run.sh,
 since this startup script uses run.sh.

 --nate
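
The mkdtemp behaviour described above can be reproduced from a shell. In this sketch, mktemp -d stands in for Python's tempfile.mkdtemp() (both request mode 0700), and the chmod step imitates the described fix of re-applying 0777 masked by the umask. It is an illustration under those assumptions, not Galaxy's actual code; stat -c assumes GNU coreutils.

```shell
#!/bin/bash
# A permissive umask, like the 0002 that was set via sudoers and PAM.
umask 0002

# mktemp -d asks for mode 0700, so the umask cannot add group access.
demo_dir=$(mktemp -d)
mode_before=$(stat -c '%a' "$demo_dir")

# Imitate the described fix: after creation, set mode 0777 masked by the umask.
new_mode=$(printf '%o' $(( 0777 & ~0$(umask) )))
chmod "$new_mode" "$demo_dir"
mode_after=$(stat -c '%a' "$demo_dir")

echo "before=$mode_before after=$mode_after"
rm -rf "$demo_dir"
```

This is why changing the umask alone never produced group permissions on the tmp/ directories: a umask can only remove bits from the requested mode, and the requested mode already excludes group and other.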


Re: [galaxy-dev] Issues up/downloading datasets after file_path change

2012-10-04 Thread Josh Nielsen
Yes, I am. I recently thought to check my httpd.conf, and it still has an
XSendFilePath /panfs/galaxy_data setting, where /panfs is the obsolete
directory path. I figure that is the problem; I just haven't changed it
yet. Thanks for the suggestion to look! I'll let you know if changing that
doesn't fix it.

-Josh
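
A quick way to confirm this kind of mismatch is to compare a dataset path against the XSendFilePath prefixes in the Apache configuration. The helper below is a hypothetical sketch: it does a plain string-prefix comparison and does not resolve symlinks, and /newstorage is a made-up stand-in for the new mount point, which the message does not name.

```shell
#!/bin/bash
# Hypothetical check: is a file path underneath any XSendFilePath in a config file?
# $1 = Apache config file, $2 = absolute path of the file to be served.
xsendfile_allows() {
  local conf=$1 target=$2 prefix
  while read -r prefix; do
    prefix=${prefix%/}
    case "$target" in
      "$prefix"/*) return 0 ;;
    esac
  done < <(awk 'tolower($1) == "xsendfilepath" { print $2 }' "$conf")
  return 1
}

# Demo with a throwaway config file holding the stale setting from the message.
demo_conf=$(mktemp)
echo 'XSendFilePath /panfs/galaxy_data' > "$demo_conf"

if xsendfile_allows "$demo_conf" /newstorage/galaxy_data/database/files/000/dataset_1.dat; then
  echo "new path is covered"
else
  echo "new path is NOT covered; update XSendFilePath"
fi
```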

On Mon, Oct 1, 2012 at 11:33 AM, Nate Coraor n...@bx.psu.edu wrote:

 Hi Josh,

 Are you using XSendFile or X-Accel-Redirect, by any chance?

 --nate


[galaxy-dev] Issues up/downloading datasets after file_path change

2012-09-17 Thread Josh Nielsen
Hello all,

I recently migrated the location of the Galaxy
file_path directory (~/database/files/) along with the temp,
job_working_directory, and pbs directory locations (all under
~/database/) to a new storage system mount point and I properly updated the
change in the universe_wsgi.ini file but have run into some problems with
managing our datasets. I can run Galaxy jobs just fine but I cannot view or
download the datasets in the output of the jobs after they have run - when
it tries to open it from the new path - and I get an actual browser error
when trying to fetch the dataset. On a hunch I created a symbolic link
named after the old path in the / directory pointing to the new location
(and pointed file_path in the universe_wsgi.ini file back to it - which
looks exactly like the old directory path), and sure enough this works as a
(temporary) fix to view/download the files. However doing that seems to, in
turn, mess with the file uploading function which will fail every time with
an OSError: [Errno 2] No such file or directory message. So I'm caught
between two broken configurations depending on the path I point to (the new
path or the symlink alternative). I guess in the latter situation it does
not like traversing a symbolic link (which is the only explanation that I
can think of). The root of the problem needs to be fixed though in that
Galaxy seems to be remembering the old path from somewhere (the Galaxy
database?) such that it will not let me simply migrate the data to a new
location and change the path variables in universe_wsgi.ini. Any
suggestions on the proper changes that need to be made to fix this
permanently?

Thanks,
Josh Nielsen

Re: [galaxy-dev] Issues up/downloading datasets after file_path change

2012-09-17 Thread Josh Nielsen
Okay, I solved half of the problem. The upload job was consistently being
submitted to the one compute node in our cluster that I forgot to create
the symlink on that points to the new location. I can now upload files.
This means I am fully functional now, which is good, however I still prefer
to do away with the symlink approach altogether and get Galaxy to work
directly with the new directory path. Any suggestions for getting that to
work?

Thanks!


Re: [galaxy-dev] FastQC Tool Errors

2012-07-02 Thread Josh Nielsen
Sure enough, this fixed the problem. I just installed the latest Sun JRE
and it is working now. Thanks for the suggestion! I would have never
guessed.

-Josh

On Mon, Jun 25, 2012 at 11:34 AM, simon andrews 
simon.andr...@babraham.ac.uk wrote:

  Yes, that's the broken version of gcj.  I don't have a Centos machine
 here at the moment, but I think if you install OpenJDK and use the
 alternatives system to select that as the default JRE then that should fix
 things.

  Simon.


 On Sat Jun 23 04:20:53 EDT 2012, Simon Andrews
 simon.andr...@babraham.ac.uk wrote:

 Are you by any chance running an older version of gcj as your java version?  
 There is a known bug in some of these where they don't correctly configure 
 the headless environment, even if the correct parameters are passed.  This 
 causes exactly the kind of errors you're seeing.

 If this is the case you'll need to install a more recent JRE (or update your 
 path to point to one which is already present).

 Simon.


 On Sat, Jun 23, 2012 at 6:30 AM, Josh Nielsen jniel...@hudsonalpha.com
 wrote:
  Hello,
 
  I am having an issue with getting the FastQC tool to work with Galaxy on our
  server. I downloaded the FastQC files (version 0.8.0) and changed the
  directory that the wrapper script looks for the 'fastqc' executable in, but
  when we run a job with it we have been getting the following output:
 
  Started analysis of Clip
 
  Approx 5% complete for Clip
  Approx 10% complete for Clip
  ...
  ...
  Approx 95% complete for Clip
  Approx 100% complete for Clip
 
  Analysis complete for Clip
 
  (.:9754): Gtk-WARNING **: cannot open display:
 
  And then the job shows as failed in Galaxy. The output .dat file just has
  that same output/error message in it (though it seems to indicate it got to
  100%). Also when I try to execute the fastqc file directly (albeit with no
  arguments) I get this:
 
  Exception in thread "main" java.awt.HeadlessException:
  No X11 DISPLAY variable was set, but this program performed an operation
  which requires it.
  at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:173)
  at java.awt.Window.<init>(Window.java:437)
  at java.awt.Frame.<init>(Frame.java:419)
  at java.awt.Frame.<init>(Frame.java:384)
  at javax.swing.JFrame.<init>(JFrame.java:174)
  at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.<init>(FastQCApplication.java:271)
  at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:102)
 
  Both errors seem to have something to do with the graphical GUI component of
  FastQC (which I have seen some screenshots for on the FastQC webpage). If
  this application is GUI-driven how did the online PSU Galaxy get it to work
  with their wrapper script when the tools are run in a command-line
  environment with no X11 or Gtk? Essentially I'm just wondering what steps
  I'm missing here to getting this to work with our Galaxy mirror, other than
  just dropping the executable in place? Any suggestions?
 
  Thanks,
  Josh
 

Re: [galaxy-dev] FastQC Tool Errors

2012-06-25 Thread Josh Nielsen
We are not currently running anything like Xvfb, but if that is the only
way to get it to run I suppose I can try it. How does PSU's Galaxy handle
grabbing the results and outputting them to Galaxy without having X11
applications (useless/unneeded with Galaxy, since they are meant for manual
thick-client GUI interaction, not browsers) starting on the servers on their
end for every user job executed? And how do they kill or manage the opened
X11 sessions once started? The users aren't supposed to see the X11 / R
graphics, correct? They are only supposed to see whatever file output it
results in so that they can view it strictly through Galaxy, as per the
wrapper's text description: "The tool produces a single HTML output file
that contains all of the results." It is only the results summary in HTML
that Galaxy and the user are concerned about, as I understand it.

We run Galaxy on a Linux cluster headnode (with no monitor; it only boots to
init 3 [no GUI], with admin-only ssh access) and it submits all jobs to
the compute nodes and then returns the results. I installed Java on each
compute node, but do I now need to install that X11 virtual frame
buffer (Xvfb) on all of the compute nodes also?

Thanks!

On Fri, Jun 22, 2012 at 5:35 PM, Ross ross.laza...@gmail.com wrote:

 Do you run an X11 virtual frame buffer - eg Xvfb?
 Otherwise AFAIK R graphics and Java will complain on headless nodes.




 --
 Ross Lazarus MBBS MPH;
 Associate Professor, Harvard Medical School;
 Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;


Re: [galaxy-dev] FastQC Tool Errors

2012-06-25 Thread Josh Nielsen
Hi Simon,

I recently installed Java with the yum package manager on our compute
nodes, and our cluster is a CentOS 6 environment. Here is what
java -version returned on the compute nodes:

bash# java -version
java version 1.5.0
gij (GNU libgcj) version 4.4.4 20100726 (Red Hat 4.4.4-13)

Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Is this version of Java too old? Perhaps I need to install the JRE manually?
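
To run the same check across several compute nodes, the text printed by
java -version can be inspected by a script. The following is only a minimal
sketch: describe_java is a made-up helper name (not part of Galaxy or
FastQC), and it assumes the caller has already captured the output as a
string (many JVMs print it to stderr rather than stdout).

```python
import re


def describe_java(version_output):
    """Summarise the text printed by `java -version`.

    Hypothetical helper for illustration only.
    """
    # The version may or may not be wrapped in double quotes.
    match = re.search(r'java version "?([\d._]+)"?', version_output)
    return {
        "version": match.group(1) if match else None,
        # gij / libgcj identifies the GNU implementation discussed here.
        "is_gcj": bool(re.search(r"\bgij\b|libgcj", version_output)),
    }


sample = """java version 1.5.0
gij (GNU libgcj) version 4.4.4 20100726 (Red Hat 4.4.4-13)
"""
print(describe_java(sample))
```

A node for which is_gcj comes back true would be a candidate for installing
a different JRE.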

Thanks!

On Sat Jun 23 04:20:53 EDT 2012, Simon Andrews
simon.andr...@babraham.ac.uk wrote:

 Are you by any chance running an older version of gcj as your java version?  
 There is a known bug in some of these where they don't correctly configure 
 the headless environment, even if the correct parameters are passed.  This 
 causes exactly the kind of errors you're seeing.

 If this is the case you'll need to install a more recent JRE (or update your 
 path to point to one which is already present).

 Simon.



Re: [galaxy-dev] FastQC Tool Errors

2012-06-25 Thread Josh Nielsen
Thanks Simon! I'll definitely give that a shot.

-Josh

On Mon, Jun 25, 2012 at 11:34 AM, simon andrews 
simon.andr...@babraham.ac.uk wrote:

  Yes, that's the broken version of gcj.  I don't have a Centos machine
 here at the moment, but I think if you install OpenJDK and use the
 alternatives system to select that as the default JRE then that should fix
 things.

  Simon.


[galaxy-dev] FastQC Tool Errors

2012-06-22 Thread Josh Nielsen
Hello,

I am having an issue with getting the FastQC tool to work with Galaxy on
our server. I downloaded the FastQC files (version 0.8.0) and changed the
directory that the wrapper script looks for the 'fastqc' executable in, but
when we run a job with it we have been getting the following output:

Started analysis of Clip

Approx 5% complete for Clip
Approx 10% complete for Clip
...
...
Approx 95% complete for Clip
Approx 100% complete for Clip

Analysis complete for Clip

(.:9754): Gtk-WARNING **: cannot open display: 

And then the job shows as failed in Galaxy. The output .dat file just has
that same output/error message in it (though it seems to indicate it got to
100%). Also when I try to execute the fastqc file directly (albeit with no
arguments) I get this:

Exception in thread "main" java.awt.HeadlessException:
No X11 DISPLAY variable was set, but this program performed an operation
which requires it.
    at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:173)
    at java.awt.Window.<init>(Window.java:437)
    at java.awt.Frame.<init>(Frame.java:419)
    at java.awt.Frame.<init>(Frame.java:384)
    at javax.swing.JFrame.<init>(JFrame.java:174)
    at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.<init>(FastQCApplication.java:271)
    at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:102)

Both errors seem to be related to the GUI component of FastQC (I have
seen screenshots of it on the FastQC webpage).
If this application is GUI-driven, how did the online PSU Galaxy get it to
work with their wrapper script when the tools are run in a command-line
environment with no X11 or Gtk? Essentially I'm just wondering what steps
I'm missing to get this to work with our Galaxy mirror, other than
just dropping the executable in place. Any suggestions?

Thanks,
Josh

Re: [galaxy-dev] user data upload directory structure

2012-05-16 Thread Josh Nielsen
Ah, yes. This is what I was just requesting yesterday in the email
that I sent, although it was much more long-winded. I didn't see this
email chain from the day before. Having a user-representative
directory structure would be beneficial in my mind.

I followed/understood your suggested directory structure up until the
arrows. Are those supposed to be symlinks? If so, what do you have in
mind? I was thinking that just having those subdirectories by user id
under files/ would be enough (although I could see how you could
symlink them to some other arbitrary location if you so desired).

My desired application was so that I could set up an FTP share to the
files/ directory so that our users could copy their (processed) files
off of the Galaxy server to other servers in our environment as well
as one of our other clusters. Having the datasets segregated into the
user's/owner's subdirectories would make it easier to identify and
copy them off for that purpose.

-Josh

Nate-
I do know about the disk accounting/quota features of Galaxy.
As I alluded to in my previous email, it actually goes beyond accounting. I
wanted to be able to implement something like:
~/galaxy-dist/database/files/user_id_000 -> /one_data_pool_set/id_000
~/galaxy-dist/database/files/user_id_001 -> /another_data_pool_set/id_001
which would match the usual data placement from a scheduler perspective too.
I'll look at galaxy-dist/lib/galaxy/objectstore/__init__.py
Thanks a lot
JC

Re: [galaxy-dev] ? IOError: [Errno 13] Permission denied: u'/home/koala2/galaxy-central/database/compiled_templates/base_panels.mako.py'

2012-05-16 Thread Josh Nielsen
I have seen this error many times (for various mako.py files) and it is
often either when the user running it has insufficient privileges or the
file permissions on a folder (could be several directories upstream) or a
file have changed in some way. If you do a 'ps -ef | grep python' do you
see the paster.py process running with your expected user? Do you use sudo
to elevate the privileges when you run it?

I use a 'galaxy' user but I execute run.sh (which I have created a service
script for in /etc/init.d) with sudo while logged in as galaxy. Also you
might want to do an 'ls -l' in the compiled_templates/ directory and look
at the file permissions and file ownership.
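
Because the problem can sit several directories upstream, it may help to
print the owner and mode of the file and of every parent directory in one
go. The snippet below is a rough sketch of that idea; describe_path_chain
is a name invented for this example, and the path in the demo is a
temporary stand-in rather than a real Galaxy install.

```python
import os
import stat
import tempfile


def describe_path_chain(path):
    """Return (path, owner uid, mode string) for `path` and each parent."""
    rows = []
    current = os.path.abspath(path)
    while True:
        st = os.stat(current)
        rows.append((current, st.st_uid, stat.filemode(st.st_mode)))
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return rows


# Demo on a throwaway directory tree that mimics compiled_templates/.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "compiled_templates", "base_panels.mako.py")
    os.makedirs(os.path.dirname(target))
    open(target, "w").close()
    for row in describe_path_chain(target):
        print(row)
```

Any row whose uid is not the account running Galaxy, or whose mode string
lacks the needed r/w/x bits, is a place to look more closely.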

-Josh

Re: [galaxy-dev] Idea for user-based dataset subdirectories

2012-05-16 Thread Josh Nielsen
Hi David,

Actually that is an interesting idea to use a daemon to move the files into
associated user directories. Is that something that Galaxy Dev is
working/can work on, or was that just a suggestion? I'm not opposed to
doing any dev work of my own, but I don't really know Python that well and
I know most of the Galaxy code is Python.

I'm not sure that I follow what you are talking about with the joint
user/galaxy directory though. I'm of course wanting it to not be unified
(not all in the same directory) and rather be segregated by user into user
subdirectories, but I think you already caught that so I guess I just
didn't understand what you were getting at.

Josh Nielsen

--

How about if there were a completely separate daemon that monitored the galaxy 
database periodically to determine what datasets belong to which user(s).  Then
it would move the actual dataset to an area owned by the user and group 
accessible to galaxy, replacing the dataset with a symlink.  This would 
require no changes to the galaxy build, but it would require a constant 
monitoring system.
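
The move-and-symlink step of such a daemon could look roughly like the
sketch below. This is an illustration under stated assumptions, not Galaxy
code: relocate_dataset is a hypothetical function, and the mapping from a
dataset file to its owner's directory is assumed to have been looked up in
the Galaxy database by some other means.

```python
import os
import shutil
import tempfile


def relocate_dataset(dataset_path, user_dir):
    """Move a dataset into user_dir and leave a symlink at the old path."""
    if os.path.islink(dataset_path):
        # Already relocated on an earlier pass; nothing to do.
        return os.path.realpath(dataset_path)
    os.makedirs(user_dir, exist_ok=True)
    target = os.path.join(user_dir, os.path.basename(dataset_path))
    shutil.move(dataset_path, target)
    os.symlink(target, dataset_path)
    return target


# Demo in a temporary directory; the user name is made up.
base = tempfile.mkdtemp()
src = os.path.join(base, "files", "000", "dataset_18.dat")
os.makedirs(os.path.dirname(src))
with open(src, "w") as handle:
    handle.write("example")
moved_to = relocate_dataset(src, os.path.join(base, "users", "jsmith"))
print(moved_to, os.path.islink(src))
```

Note that the move and the symlink creation are two separate steps, so a
job reading the dataset at that instant could briefly find it missing; a
real daemon would need to avoid touching datasets that are in use.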

There is already a mechanism for users to move their files into a joint 
user/galaxy directory, but it is (as far as I know) only allowed for 
libraries, not
histories.  It would be better if there were a way for users to browse through 
their own directories as a tool, and be able to load files directly into their 
history.


David Hoover

Re: [galaxy-dev] Idea for user-based dataset subdirectories

2012-05-16 Thread Josh Nielsen
Thanks for breaking that down for me. We are trying to set up some dev
machines in our environment in a few weeks and I may create a clone of our
production Galaxy mirror and play around with that version to see if I can
get the functionality that I'm looking for. I'll take that idea about
having a daemon into consideration.

Regards,
Josh

On Wed, May 16, 2012 at 1:08 PM, David Hoover hoove...@helix.nih.gov wrote:

 No, this was all an idea I've had for a while, but never did anything
 about it.  I'm pretty sure the Galaxy developers are not interested in
 anything this locally-centric, and I don't blame them.  It ought to be
 something outside the Galaxy build completely, because Galaxy is meant to
 be system-independent.

 What I meant by 'joint user/galaxy directory' is a directory that is owned
 by a user, but that the galaxy user has read (and possibly write) access
 to.  This is entirely possible given either a well-informed user
 population, or an iron-clad suexec executable.

 The mechanism I alluded to is a feature by which a user can upload a
 directory of files all at once.  There is a configuration directive in
 universe_wsgi.ini, user_library_import_dir, that allows non-administrative
 users to upload an entire directory of files into a library.  The directive
 identifies the base directory, within which subdirectories named as the
 galaxy user login (email address) are searched.  The
 user_library_import_dir directory is owned by the galaxy user, and the
 subdirectories are owned by the user, but group owned by the galaxy user.
  A user will copy files to the subdirectory, login to galaxy, switch to
 their library, and upload all the files in the directory into a single
 library folder.

 There isn't much documentation about it in the main Galaxy wiki, so forget
 that.  I haven't enabled it in our local production site, and I haven't
 played with it in a long time.  I'm pretty sure that the files are not
 removed after uploading, and a user is free to re-upload the files again
 and again, so it's kind of quirky.  Also, if the files are not readable by
 the galaxy user, a bizarre and unhelpful error is thrown.
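
 Since unreadable files lead to an unhelpful error, one could pre-check a
 user's import subdirectory before uploading. The sketch below only tests
 the group-read mode bit, on the assumption (taken from the layout
 described above) that the galaxy user reaches the files through group
 ownership; not_group_readable is a made-up name, and the check does not
 verify group membership itself.

```python
import os
import stat
import tempfile


def not_group_readable(directory):
    """List files under `directory` whose mode lacks the group-read bit."""
    missing = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if not os.stat(path).st_mode & stat.S_IRGRP:
                missing.append(path)
    return sorted(missing)


# Demo with two throwaway files: one private, one group-readable.
demo_dir = tempfile.mkdtemp()
private = os.path.join(demo_dir, "a.fastq")
shared = os.path.join(demo_dir, "b.fastq")
for path in (private, shared):
    open(path, "w").close()
os.chmod(private, 0o600)
os.chmod(shared, 0o640)
print(not_group_readable(demo_dir))
```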

 If this functionality could be extended and elaborated, it could do what
 you want.  The user_library_import_dir requires that the user's login in
 Galaxy be identical to the user's login on the cluster, and that
 the permissions be kept correct.  Typically users have no idea what is
 going on with their permissions, so what are you going to do?

 David


[galaxy-dev] Idea for user-based dataset subdirectories

2012-05-15 Thread Josh Nielsen
Hello,

Please forgive the length of this proposition as I try to explain my
reasoning behind this. Let me say first of all that I understand that
Galaxy is not meant to be everything to everyone and that requests for
features may not suit everyone who uses Galaxy. That being said, I have an
idea or request that I think would be convenient for dealing with users'
datasets from a file-system perspective.

Galaxy has the obvious benefit and advantage (compared to manual
job-submission for tools on a cluster) of providing an interface for using
all the analysis tools, and the history of the operations done on your
data, all in one place. However I have found that putting all the output 
datasets in one directory (the files/000/ directory) on the file-system
causes a problem for the users if they specifically want to interact with
it *on the file-system*, and not just through the Web interface - for
whatever complicated or diverse reasons.

Since Galaxy runs on a cluster of its own in our environment, and we do not
allow users to connect to it remotely to submit manual jobs (and
individually write output to their separate home directories) as we do on
our main cluster, it is essentially a black box beyond the Galaxy web
interface. That is essentially what we want, except for how users can
interact with the output files.

The issue is that our users would like an easy means of copying their files
off of the Galaxy cluster to other servers from a command line (possibly
even automated by scripts). Even if we allow an FTP share of the output
directory for users to do that, the common
[galaxy-dist]/database/files/000/ directory lumps all of the files for all
users together in one directory and uses a sequential file-naming scheme
(dataset_N, with N incrementing) that gives no indication of who owns
each file.

Is there a way that the dataset output directory locations could be
designed (or set optionally?) like the FTP upload feature's expected
directory structure: where the files are dropped into the corresponding
subdirectory of the user who produced it? For example, having under
database/files/ subdirectories named according to the user's Galaxy account
id (like [galaxy-dist]/database/files/jsmith,
[galaxy-dist]/database/files/sparker, etc.). If they could be segregated by
user it would be much easier to keep
track of what datasets belong to whom on the file-system. Then I could
possibly set up a read-only FTP share to the files/ directory on the
cluster, from which the users could directly copy the files in their
personal subdirectory to other systems, and perhaps batch download them,
rather than having to rely solely on the Web interface.

I understand that the way Galaxy is currently designed is that the files
are just generically named (the behind-the-scenes handling of data is a
black box) and it is the database that keeps track of which files belong to
whom, and which has the metadata for more meaningful dataset/job names,
etc. But a file-system hierarchy alternative would also be welcome in a
heavily command-line oriented computational environment too.

Would setting up a more user-representative output directory hierarchy on
the file-system like that be possible?

Best Regards,
Josh Nielsen

[galaxy-dev] Migrate history datasets to new database

2012-05-07 Thread Josh Nielsen
Hello all,

I recently tried moving from a tarball install of Galaxy to a Mercurial
managed one, and in the process something went wrong with the database
upgrade. I had intended to install the Mercurial-based Galaxy separately
(though on the same machine) and then move it to production once it was
working but it installed in-place over the current database while it was
still completely active/up and running. That broke my existing Galaxy
install and I had to move to the new install immediately. I recall having
to run the Mercurial install's run.sh script multiple times, though, because
the upgrade sequence (it looked like 87-88, 88-89, etc. as it progressed)
did not complete all the way the first time. I also ran it as root when I
probably should have done it as our galaxy user. Long story short, I now
cannot log in to Galaxy even though Galaxy recognizes correct credentials
from the database. My debugging so far has not yielded any results.

At this point after a week of unsuccessful attempts to repair the existing
install I just want to create a fresh database and migrate over our users'
history and dataset (and possibly login credentials) information stored in
the database to the new one, if at all possible. Could someone give me any
guidance as to how to do that, and which table files (MYI, MYD, etc.) that
I should copy over into the new mysql database to make that happen?

P.S. I do have to thank Dannon Baker for helping me so far through private
email correspondence to try to figure out what went wrong with the current
install. However I'm not having any breakthroughs and our local Galaxy
mirror has been down for over a week now and I just want to start fresh and
migrate over critical data if possible.

Thanks for your help,
Josh Nielsen

[galaxy-dev] More meaningful dataset names/easier method of identifying?

2012-04-24 Thread Josh Nielsen
Hello,

For a while now with the Galaxy mirror that we have I have found on many
occasions a need to identify which dataset_*.dat files on the file system
(in the [galaxy_dist]/database/files/000/ directory) belong to which
user, and even for the same user to distinguish between their various
datasets. Files directly uploaded by the user will have a Galaxy job name
and dataset file name that match: a Galaxy job named "data 18" (for
example) corresponds to the file 'dataset_18.dat' on
the file system. However, any later analysis on that file that produces
another dataset gives you no clue to the corresponding file name.
For example, a "Clip on data 18" run some time later may be called
'dataset_44.dat' on the filesystem, and a "Map with Bowtie on data 18" that
runs on the clipped 'dataset_44.dat' may produce an output file of
'dataset_53.dat'.

When debugging failed jobs, and after the user has rerun them for the
umpteenth time, there may be dozens of identical or near-identical files to
weed through, and the generic naming scheme is not helpful even though it
is sequential (also not easy to keep track of/match up unless you are
watching the file writes in the directory live). The current implementation
makes sense for internal usage and the code that uses it, but it is
difficult for a human to distinguish which files match the jobs in Galaxy.

It would be useful to have more meaningful dataset file names or an easier
way to identify them (a record that matches the internal and external
names) for administrative maintenance reasons so that I can delete files,
or possibly even export those .dat files to a network share where our users
can perform manual analysis on them. Could anyone point me to where in the
code I could look to make the dataset names more meaningful? Or perhaps I
should request of the Galaxy developers (as a feature) a way for the users
themselves to see under the metadata name of their job (like Map with
Bowtie on data 18) in the right side pane the *actual* corresponding file
and location on the file system path to it (dataset_53.dat, for example).
Or if not for users at least something for Administrators. Even a database
that has four columns for the internal/filesystem dataset name, the job
metadata name, the Galaxy job number (that the user sees), and the user
that the dataset belongs to, would be helpful. A lot of our users are heavy
into informatics though and would probably prefer that the user be able to
see that information. Does anyone have any suggestions or thoughts about
this?
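
To make the four-column idea concrete, here is a small sketch of such a
lookup table using an in-memory SQLite database. Everything in it is
hypothetical: the table and column names are invented for illustration and
are not Galaxy's actual schema, and the job numbers and the owner "jsmith"
are made-up sample values (only the file and job names come from the
examples above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dataset_lookup (
           file_name  TEXT,     -- name on the file system
           job_name   TEXT,     -- name shown in the history pane
           job_number INTEGER,  -- number the user sees in Galaxy
           owner      TEXT      -- account that owns the dataset
       )"""
)
conn.executemany(
    "INSERT INTO dataset_lookup VALUES (?, ?, ?, ?)",
    [
        ("dataset_18.dat", "data 18", 18, "jsmith"),
        ("dataset_44.dat", "Clip on data 18", 19, "jsmith"),
        ("dataset_53.dat", "Map with Bowtie on data 18", 20, "jsmith"),
    ],
)


def files_for(owner):
    """Return (file_name, job_name) pairs for one owner, oldest first."""
    cursor = conn.execute(
        "SELECT file_name, job_name FROM dataset_lookup "
        "WHERE owner = ? ORDER BY job_number",
        (owner,),
    )
    return cursor.fetchall()


for file_name, job_name in files_for("jsmith"):
    print(file_name, "<-", job_name)
```

With a table like this an administrator could go from a history item to
its .dat file (or back) with a single query.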

Thanks,
Josh Nielsen

[galaxy-dev] Galaxy pbs scripts and job_working_directory files

2012-04-24 Thread Josh Nielsen
Hello again,

As I mentioned in a recent post, I often need to debug jobs running
from our local Galaxy mirror, and to do so I need to look at the
script and data files that the job is trying to use in order to figure out
what is causing a problem. The directories containing those files are
'[galaxy_dist]/database/pbs/' and
'[galaxy_dist]/database/job_working_directory/' for me. Each job that is
run gets a corresponding .sh file in the pbs/ directory (like 344.sh) which
will have the entire sequence of bash commands to execute the job with and
also a call to a wrapper script somewhere in the middle normally. That
script information is very useful, but the problem is that when a job fails
(often within the first 30 seconds of running it) the script is deleted and
there is no trace of it left in the directory. The same with the output or
job data files in job_working_directory/.

I have had to make do with coordinating with the
user when to (re)run their failed job and then quickly, within the 30 second
window, running 'cp -R script_I_care_about.sh copy_of_script.sh', so
that when the script is deleted I have a copy that I can examine. The same
goes for copying the job_working_directory/ files. I know that it would
get very cluttered in those directories if they were not automatically
cleaned/deleted but I find those files essential for debugging. Is there a
way to force Galaxy to retain those files (optionally) for debugging
purposes? Maybe make a new option in the universe.ini file for that purpose
that can be set for people who want it?
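
The manual copying described above could be automated with a small
polling script. What follows is only a sketch of that workaround:
snapshot_scripts is a hypothetical helper, the directory names are
placeholders, and it would have to be called repeatedly (for example once
a second) to catch scripts inside the short window before they are
deleted.

```python
import os
import shutil
import tempfile


def snapshot_scripts(pbs_dir, keep_dir, seen):
    """Copy any *.sh file in pbs_dir not yet in `seen` into keep_dir."""
    os.makedirs(keep_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(pbs_dir)):
        if name.endswith(".sh") and name not in seen:
            try:
                shutil.copy2(os.path.join(pbs_dir, name),
                             os.path.join(keep_dir, name))
            except FileNotFoundError:
                continue  # the script vanished before it could be copied
            seen.add(name)
            copied.append(name)
    return copied


# Demo against a temporary stand-in for database/pbs/.
base = tempfile.mkdtemp()
pbs = os.path.join(base, "pbs")
kept = os.path.join(base, "kept")
os.makedirs(pbs)
open(os.path.join(pbs, "344.sh"), "w").close()
seen = set()
print(snapshot_scripts(pbs, kept, seen))
```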

Thanks,
Josh Nielsen

Re: [galaxy-dev] Galaxy pbs scripts and job_working_directory files

2012-04-24 Thread Josh Nielsen
Ah, so this would just betray my ignorance of current features. :-) I'll
give that a try!

Thanks,
Josh

On Tue, Apr 24, 2012 at 4:09 PM, Dannon Baker dannonba...@me.com wrote:

 Josh,

 Check out the cleanup_job setting in universe_wsgi.ini (included
 below).  It sounds like 'cleanup_job = onsuccess' is exactly what you're
 looking for.

 -Dannon

 # Clean up various bits of jobs left on the filesystem after completion.  These
 # bits include the job working directory, external metadata temporary files,
 # and DRM stdout and stderr files (if using a DRM).  Possible values are:
 # always, onsuccess, never
 #cleanup_job = always
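
 To confirm which value an instance is actually using, the ini file can be
 read with Python's standard configparser. This is a rough sketch rather
 than anything Galaxy ships: cleanup_policy is a made-up function, the
 [app:main] section name is an assumption about how universe_wsgi.ini is
 laid out, and "always" is treated as the default because that is the
 value shown commented out above.

```python
import configparser

VALID_VALUES = {"always", "onsuccess", "never"}


def cleanup_policy(ini_text, section="app:main"):
    """Return the effective cleanup_job value from ini-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # A commented-out or missing setting falls back to the default.
    value = parser.get(section, "cleanup_job", fallback="always")
    if value not in VALID_VALUES:
        raise ValueError("unexpected cleanup_job value: %r" % value)
    return value


example = """[app:main]
# always, onsuccess, never
cleanup_job = onsuccess
"""
print(cleanup_policy(example))
```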


[galaxy-dev] Trouble linking to/displaying at local Genome Browser

2012-02-27 Thread Josh Nielsen
Hello all,

I am having some trouble getting a "display at" link to my local UCSC
Genome Browser mirror to show up under my Galaxy jobs for viewing, although
I can get the link to the online UCSC GB to display. I followed the
suggestions in the following thread, but they have not worked yet:
http://gmod.827538.n3.nabble.com/using-local-Genome-Browser-mirror-td1829290.html

Here's what I have tried:
- I edited the [galaxy-dir]/tool-data/shared/ucsc/ucsc_build_sites.txt file
and duplicated the 'main' UCSC entry, but changed the new entry's name to
'internalgb' and changed the URL to point to our local GB with
cgi-bin/hgTracks? at the end.
- I set ucsc_display_sites = internalgb in universe_wsgi.ini.
- I made a modified copy of
[galaxy-dir]/tools/data_source/ucsc_tablebrowser.xml (named
HAIB_ucsc_tablebrowser.xml) in the same directory as the other XML files
and created a tool link on the left side of the Galaxy page for it by
adding it to the tool_conf.xml file (which works fine: when I click on it,
it loads our local Genome Browser inside the central Galaxy window frame).
- I edited the HAIB_ucsc_tablebrowser.xml file according to the
recommendation in the thread above: "I had to change the name and id of
the tool in the new ucsc_tablebrowser.xml file, and keep the id the same
for the param.toold_id.value as well". So I gave it a unique id (changed
ucsc_table_direct1 to HAIB_table_direct1) but kept the tool_id value field
the same as the original/main UCSC tablebrowser (ucsc_table_direct1). Then
I put the new HAIB id for our local tablebrowser into universe_wsgi.ini
under the tool runners section as HAIB_table_direct1 = local:///
- And of course I have restarted Galaxy after each change.

Still, after all of this I cannot see a "display at" link for BAM & other
files from Galaxy for my local UCSC GB. When I change universe_wsgi.ini to
have ucsc_display_sites = internalgb,main set (for both browsers), the
online/main GB will have a "display at" link on Galaxy jobs but no link
shows for my local browser mirror. I'm also a little fuzzy on the relation
between the ucsc_build_sites.txt file and the *_ucsc_tablebrowser.xml
files (if any). Any tips as to what I might be missing here?

Thanks!
Josh

Re: [galaxy-dev] Trouble linking to/displaying at local Genome Browser

2012-02-27 Thread Josh Nielsen
Hi Dan,

Good catch sir! Genius! It is working now. The spacing looked exactly the
same to the eye in vi for all the entries, but once I did a ':set list' in
vi to display formatting characters I saw that 'main' had the tab (^I)
characters but my 'internalgb' entry didn't. This must have happened
because I did a visual copy with my cursor. Thank you.

-Josh
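
For anyone who wants to check for this without eyeballing ':set list' output, here is a minimal Python sketch that flags lines containing no tab-separated fields. The function name and the sample rows are made up for illustration; the assumed column layout (name, URL, builds) and the example.org URL are assumptions, not a copy of a real ucsc_build_sites.txt.

```python
def space_delimited_lines(text, min_fields=2):
    """Return 1-based numbers of data lines with fewer than min_fields tab-separated fields."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        if len(line.split("\t")) < min_fields:
            bad.append(number)
    return bad


# Hypothetical sample: the first row uses tabs, the second uses runs of spaces.
sample = (
    "main\thttp://genome.ucsc.edu/cgi-bin/hgTracks?\thg18,hg19\n"
    "internalgb    http://gb.example.org/cgi-bin/hgTracks?    hg18,hg19\n"
)
print(space_delimited_lines(sample))
```

To check a real file, read it with open(path).read() and pass the text in; any reported line number is a candidate for the copy-and-paste problem described above.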

On Mon, Feb 27, 2012 at 11:09 AM, Daniel Blankenberg d...@bx.psu.edu wrote:

 Hi Josh,

 It sounds like you are really close to getting this to work. My first
 guess would be that there is an issue with the modified
 ucsc_build_sites.txt file. Can you check that the line you copied is tab
 delimited and not space delimited?


 Thanks for using Galaxy,

 Dan


Re: [galaxy-dev] Trouble linking to/displaying at local Genome Browser

2012-02-27 Thread Josh Nielsen
One more thing while I'm at it. For clarification, would only the first two
steps I listed be sufficient to get the "display at" link in Galaxy? If so,
what is the benefit of the other steps (other than having an actual link to
your local genome browser mirror in Galaxy)? There's nothing that ties my
internalgb entry to the new tool_id I created, is there?



Re: [galaxy-dev] Issues displaying/downloading datasets

2012-01-18 Thread Josh Nielsen
I seem to be getting somewhere by monitoring the /var/log/httpd/error_log
file (I don't know why I didn't think of that before). For each dataset I
click I am seeing a corresponding message like this:

[Wed Jan 18 09:31:06 2012] [debug] mod_proxy_http.c(56): proxy: HTTP:
canonicalising URL //localhost:8080/datasets/b23533e4ff1bb7ec/display/

[Wed Jan 18 09:31:06 2012] [error] [client 172.26.14.93] (20023)The given
path was above the root path: xsendfile: unable to find file:
/panfs/galaxy_data/database/files/000/dataset_118.dat, referer:
http://galaxy-dev.haib.org/galaxy/history

[Wed Jan 18 09:31:06 2012] [debug] mod_proxy_http.c(1836): proxy: end body
send

So it seems that there is some issue with the xsendfile module, or with
Apache serving up the dataset files, because
'/panfs/galaxy_data/database/files/000/dataset_118.dat' is present. The
only thing that I recall changing recently was to move the
/panfs/galaxy_data/database/ directory from the local file system to an
NFS share with the exact same path (mounted on /panfs). I'm not sure how
that would make a difference, but I'm still looking into the xsendfile
error lead.

-Josh



Re: [galaxy-dev] Issues displaying/downloading datasets

2012-01-18 Thread Josh Nielsen
Ah, following the lead in the log paid off. I had to add the statement
"XSendFilePath /panfs/galaxy_data" to /etc/httpd/conf/httpd.conf.
Apparently, ever since I moved the dataset directory to /panfs/galaxy_data
by setting the file_path variable in universe_wsgi.ini, I had not attempted
to view a dataset, so I didn't notice that the functionality broke when I
moved it. I just needed to point XSendFilePath to the new directory and it
worked. I hope this helps someone else if they encounter the same problem.

Cheers,
Josh
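
The "path was above the root path" error reads like a whitelist check: the requested file has to sit under a directory the web server has been told it may serve. Below is a minimal Python sketch of that idea, not mod_xsendfile's actual implementation. The function name is hypothetical, and /home/galaxy/galaxy-dist stands in for whatever root was allowed before the move (an assumption). The check is purely textual and does not touch the filesystem.

```python
import posixpath


def is_under_allowed_root(path, allowed_roots):
    """Return True if path lies inside one of allowed_roots (string check only)."""
    path = posixpath.normpath(path)
    for root in allowed_roots:
        root = posixpath.normpath(root)
        # commonpath() returns the longest shared directory prefix of both paths.
        if posixpath.commonpath([path, root]) == root:
            return True
    return False


dataset = "/panfs/galaxy_data/database/files/000/dataset_118.dat"

# Before the fix: only a hypothetical old root is allowed, so the file is rejected.
print(is_under_allowed_root(dataset, ["/home/galaxy/galaxy-dist"]))

# After adding the new directory (the effect of "XSendFilePath /panfs/galaxy_data").
print(is_under_allowed_root(dataset, ["/home/galaxy/galaxy-dist", "/panfs/galaxy_data"]))
```

The first call reports False and the second True, which mirrors why the file existed on disk yet Apache refused to send it until the new directory was listed.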

[galaxy-dev] Issues displaying/downloading datasets

2012-01-17 Thread Josh Nielsen
Hello all,

I recently have been having problems viewing/displaying datasets (with the
eye icon) as well as downloading datasets in Galaxy which I have uploaded,
although I can point to those datasets as input to other tools; they show
up on the drop down menus and the tools run perfectly. Every time that
I click on the eye icon for a dataset in my history pane I get an Apache
error displayed in the window that says it cannot find
"/galaxy/datasets/X/display/?preview=True". I see a corresponding entry
like this in paster.log (for example):

[17/Jan/2012:09:53:03 -0500] "GET
/galaxy/datasets/92b83968e0b52980/display/?preview=True HTTP/1.1" 200 -
"http://galaxy-dev.haib.org/galaxy/history" "Mozilla/5.0 (Macintosh; Intel
Mac OS X 10.6; rv:8.0.1) Gecko/20100101 Firefox/8.0.1"

For downloads I get the same error except that the requested URL is: "GET
/galaxy/datasets/92b83968e0b52980/display?to_ext=txt"

The alphanumeric code is of course different for each dataset, but I am
puzzled at how to even debug this because I cannot find anywhere on the
file system or under the galaxy-dist directory any path that is named
"datasets" and has a "display" subfolder (so I assume it is an internal
URL/path notation). I looked at the Python code some and all I got was a
headache. I see that it uses a fetch url method to grab a specific URL for
each dataset but I'm not sure what it is actually looking for on the file
system, or if the URL is just an alias to something else. I thought to
check in the MySQL database but didn't see any corresponding values that
matched "datasets" or the alphanumeric code (which I still can't tell where
it is getting from). Everything else in Galaxy works fine except for
this. Could anyone please point me in the right direction about how to
debug this? It would be much appreciated!

Thanks,
Josh

Re: [galaxy-dev] Issues displaying/downloading datasets

2012-01-17 Thread Josh Nielsen
Also, I just uploaded a 1.3GB FASTQ file and the small preview box in the
history pane shows the first few lines, and when I click on the eye it
actually displays in the window with the message "This dataset is large and
only the first megabyte is shown below. Show all | Save" and it shows the
first megabyte with no problems, but if I click 'Show all' or 'Save' I get
the message "The requested URL /galaxy/datasets/2faba7054d92b2df/display/
was not found on this server" from Apache. So I'm having a specific problem
with displaying the whole dataset according to the URL it is trying to load.

-Josh


Re: [galaxy-dev] How and where to install tool dependencies

2011-12-12 Thread Josh Nielsen
Thanks Nate! Updated documentation is always welcome and useful. I
appreciate the clarifications.

-J

On Mon, Dec 12, 2011 at 10:18 AM, Nate Coraor n...@bx.psu.edu wrote:

 On Dec 9, 2011, at 4:34 PM, Josh Nielsen wrote:

  Hello,
 
  I have a question which I have not seen specifically addressed in the
 online Galaxy wiki documentation about how to integrate tools
 (dependencies) into Galaxy. I have implemented a locally managed instance
 of Galaxy that my business is using with our cluster and now have a freshly
 installed and configured instance of Galaxy running. It is bare-bones right
 now and I did not use mercurial to sync any existing files/directory
 structures. I have seen the page on external tool dependencies
 (http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Dependencies) needed for
 Galaxy, but I am somewhat unsure where to place the tools to utilize them
 as intended (other than through trial & error).
 
  It appears that there are shell directories for the tools under
 ~/galaxy-dist/tools/ with basic wrapper scripts but without the
 corresponding executables (very few that I've noticed have the tools
 already in them). Is the intent to download the dependency tools and
 (building from source if necessary) take the binaries in those directories
 and copy them to their corresponding directory under ~/galaxy-dist/tools/?
 This seems to have worked with an error I first got when clipping a FASTQ
 file which reported that fastx_clipper was not a recognized command. So I
 downloaded the FASTX Toolkit, compiled the binaries, and copied only the
 binaries into the corresponding fastx tools directory. Would I do the same
 thing for TopHat and Cufflinks by taking all their binaries (combined) and
 copying them into ~/galaxy-dist/tools/ngs_rna/?

 Hi Josh,

 There are two ways to do this.  The simplest is to place the binaries into
 a directory on the Galaxy user's $PATH.  The second is via the tool
 dependency system, which I need to write up documentation for to put in the
 wiki, which I'll do this week.

  Even if that is the case though, I have occasionally gotten errors about
 tools missing in completely different directories. One was for the FASTQ
 Groomer. One user saw this error in their browser (which for now is the
 only way I know to figure out where tools are *expected* to be):
 
  File "/home/galaxy/galaxy-dist/tools/rgenetics/rgFastQC.py", line 141, in
    assert os.path.isfile(opts.executable),'##rgFastQC.py error - cannot find executable %s' % opts.executable
  AssertionError: ##rgFastQC.py error - cannot find executable /home/galaxy/galaxy-dist/tool-data/shared/jars/FastQC/fastqc

 Java JARs are a special case, and FastQC has a unique way of locating its
 jars, which is why it is expected to be found in that directory.  This
 needs to be documented.

 
  To fix this I downloaded the FastQC tar file from its webpage, unzipped
 it, and copied the fastqc binary/script to the
 /home/galaxy/galaxy-dist/tool-data/shared/jars/FastQC/ directory. I also
 had to mkdir FastQC/ under jars/ to place it there since it didn't already
 exist. Had I not been told the specific directory by the error I'm not sure
 how I would have intuitively known to place the binary there (unless I'm
 overlooking some critical documentation). And how do I know that other
 similar things are not missing which should be there? Can anyone shed some
 light on this please? Adding a brief page on the Galaxy wiki site under the
 Admin section about this would really help, even if it only showed an
 example for one or two specific tools.

 The list of external dependencies by tool is maintained here:

 http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Dependencies

 I'll update this page with links to the new documentation when I write it.
  I should also add that work is under way to make it possible to
 automatically install these dependencies as needed.

 --nate

 
  Thanks,
  Josh




-- 
Josh Nielsen
Systems Administrator
HudsonAlpha Institute for Biotechnology
256-319-1485

Re: [galaxy-dev] How and where to install tool dependencies

2011-12-12 Thread Josh Nielsen
Thanks Jeremy,

That does clear quite a few things up. I will probably take the route of
just adding the directories that contain the tool executables to my path.
And I guess I'll just pay attention to which tools are Java based and copy
the executables under the jars directory. Thanks for looking on the wiki
too. If you find (or create) a suitable wiki page sometime soon would you
be so kind as to post a link to it here, for myself and posterity? That
would be great. Thanks for your help!

P.S. Sorry for the 'double post'. I don't use mailing lists often.

-J

On Sat, Dec 10, 2011 at 8:33 AM, Jeremy Goecks jeremy.goe...@emory.edu wrote:

 Josh,

 It appears that there are shell directories for the tools under
 ~/galaxy-dist/tools/ with basic wrapper scripts but without the
 corresponding executables (very few that I've noticed have the tools
 already in them). Is the intent to download the dependency tools and
 (building from source if necessary) take the binaries in those directories
 and copy them to their corresponding directory under ~/galaxy-dist/tools/?
 This seems to have worked with an error I first got when clipping a FASTQ
 file which reported that fastx_clipper was not a recognized command. So I
 downloaded the FASTX Toolkit, compiled the binaries, and copied only the
 binaries into the corresponding fastx tools directory. Would I do the same
 thing for TopHat and Cufflinks by taking all their binaries (combined) and
 copying them into ~/galaxy-dist/tools/ngs_rna/?


 You'll want to read about Galaxy Tool files a bit to understand the files
 in ~/galaxy-dist/tools:


 http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Config%20Syntax#Admin.2BAC8-Tools.2BAC8-Tool_Config_Syntax.Galaxy_Tool_XML_File

 These are not shell directories; instead, they include tool config files +
 additional wrapper scripts to run a tool in Galaxy.

 To answer your question, executables for tools need to be on your PATH but
 do not need to be in the config/wrapper directories. For example, on an SGE
 cluster, we suggest setting the PATH environment variable in ~/.sge_request.
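
A quick way to see whether the Galaxy user's environment can find the tools is to look them up on PATH before running any jobs. The sketch below uses Python's standard shutil.which for that; the helper name missing_executables is made up, and the tool list simply reuses the executables mentioned in this thread.

```python
import shutil


def missing_executables(names, path=None):
    """Return the names that cannot be found on PATH (or on the given search path)."""
    return [name for name in names if shutil.which(name, path=path) is None]


# Executables named in this thread; extend the list for other tools.
tools = ["fastx_clipper", "tophat", "cufflinks"]
print(missing_executables(tools))
```

Run it as the same user (and with the same environment) that Galaxy or the cluster job runs under, since PATH can differ between an interactive login and a batch job.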

 Even if that is the case though, I have occasionally gotten errors about
 tools missing in completely different directories. One was for the FASTQ
 Groomer. One user saw this error in their browser (which for now is the
 only way I know to figure out where tools are *expected* to be):

 File "/home/galaxy/galaxy-dist/tools/rgenetics/rgFastQC.py", line 141, in
   assert os.path.isfile(opts.executable),'##rgFastQC.py error - cannot find executable %s' % opts.executable
 AssertionError: ##rgFastQC.py error - cannot find executable /home/galaxy/galaxy-dist/tool-data/shared/jars/FastQC/fastqc


 The exception to the above is Java-based tools. For these tools, you'll
 need to use the ~/galaxy-dist/tool-data/shared/jars directory. This is a
 limitation of Galaxy that will likely be addressed in the future.

 Adding a brief page on the Galaxy wiki site under the Admin section about
 this would really help, even if it only showed an example for one or two
 specific tools.


 I looked a bit but couldn't find it; I suspect it is out on the wiki
 somewhere, though clearly it needs to be easier to find.

 Good luck,
 J.





-- 
Josh Nielsen
Systems Administrator
HudsonAlpha Institute for Biotechnology
256-319-1485

[galaxy-dev] How and where to install tool dependencies

2011-12-10 Thread Josh Nielsen
Hello,

I have a question which I have not seen specifically addressed in the
online Galaxy wiki documentation about how to integrate tools
(dependencies) into Galaxy. I have implemented a locally managed instance
of Galaxy that my business is using with our cluster and now have a freshly
installed and configured instance of Galaxy running. It is bare-bones right
now and I did not use mercurial to sync any existing files/directory
structures. I have seen the page on external tool dependencies
(http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Dependencies) needed for
Galaxy, but I am somewhat unsure where to place the tools to utilize them
as intended (other than through trial & error).

It appears that there are shell directories for the tools under
~/galaxy-dist/tools/ with basic wrapper scripts but without the
corresponding executables (very few that I've noticed have the tools
already in them). Is the intent to download the dependency tools and
(building from source if necessary) take the binaries in those directories
and copy them to their corresponding directory under ~/galaxy-dist/tools/?
This seems to have worked with an error I first got when clipping a FASTQ
file which reported that fastx_clipper was not a recognized command. So I
downloaded the FASTX Toolkit, compiled the binaries, and copied only the
binaries into the corresponding fastx tools directory. Would I do the same
thing for TopHat and Cufflinks by taking all their binaries (combined) and
copying them into ~/galaxy-dist/tools/ngs_rna/?

Even if that is the case though, I have occasionally gotten errors about
tools missing in completely different directories. One was for the FASTQ
Groomer. One user saw this error in their browser (which for now is the
only way I know to figure out where tools are *expected* to be):

File "/home/galaxy/galaxy-dist/tools/rgenetics/rgFastQC.py", line 141, in
  assert os.path.isfile(opts.executable),'##rgFastQC.py error - cannot find executable %s' % opts.executable
AssertionError: ##rgFastQC.py error - cannot find executable /home/galaxy/galaxy-dist/tool-data/shared/jars/FastQC/fastqc

To fix this I downloaded the FastQC tar file from its webpage, unzipped it,
and copied the fastqc binary/script to the
/home/galaxy/galaxy-dist/tool-data/shared/jars/FastQC/ directory. I also
had to mkdir FastQC/ under jars/ to place it there since it didn't already
exist. Had I not been told the specific directory by the error, I'm not
sure how I would
have intuitively known to place the binary there (unless I'm overlooking
some critical documentation). And how do I know that other similar things
are not missing which should be there? Can anyone shed some light on this
please? Adding a brief page on the Galaxy wiki site under the Admin section
about this would really help, even if it only showed an example for one or
two specific tools.
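
The assertion in the error above boils down to an os.path.isfile test on a fixed location, so the same test can be run by hand before a user hits it. This is a minimal sketch under that assumption; the helper names are made up, and the relative location is copied from the path shown in the error message rather than from any Galaxy documentation.

```python
import os.path


def expected_fastqc_path(galaxy_root):
    """Build the location reported in the rgFastQC.py error, relative to galaxy_root."""
    return os.path.join(galaxy_root, "tool-data", "shared", "jars", "FastQC", "fastqc")


def check_fastqc(galaxy_root):
    """Return (path, found), where found mirrors the wrapper's os.path.isfile test."""
    path = expected_fastqc_path(galaxy_root)
    return path, os.path.isfile(path)


path, found = check_fastqc("/home/galaxy/galaxy-dist")
print(path, "found" if found else "missing")
```

This only covers the one location named in the error; other Java-based tools may look elsewhere, so their wrapper scripts would need to be read to find the paths they expect.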

Thanks,
Josh Nielsen