Re: [galaxy-dev] Downloading UCSC complete database

2012-10-07 Thread Sean Davis
On Sun, Oct 7, 2012 at 10:38 AM, Perez, Ricardo ricky_...@neo.tamu.edu wrote:
 Dear all,

 I am currently working on downloading the genome data from the UCSC database.
 I have figured out how to obtain the genome of one species at a time,
 but doing that for every genome by hand would take quite a while.
 Is there a command that would download all the data from the UCSC
 databases?
 If not, how would I go about writing a script to do so?

We mirror directly out of the mysql data directory.  In this script,
the /var/local/mysql directory is where the actual server files are
kept.

https://gist.github.com/3848717

Note that this does not download the .txt and .sql files.  Instead,
it reads and writes the mysql server files directly, and it may break
if the server versions are too dissimilar.  Also, be sure to test it
a bit before trying it on your production database, to make sure that
it is working as expected.
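For anyone who wants the flavor of it without opening the gist, here
is a minimal sketch of the same idea, assuming rsync access to UCSC's
public mysql area (the hostname, database name, and local path below
are assumptions, and the actual script in the gist is more careful):

  #!/bin/bash
  # Sketch: mirror the raw mysql server files for one UCSC database.
  # The local mysql server should not be writing to these files while
  # the copy runs.
  DB=hg19   # example database; loop over others as needed
  rsync -avP rsync://hgdownload.cse.ucsc.edu/mysql/${DB}/ /var/local/mysql/${DB}/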

Sean
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/


Re: [galaxy-dev] Automatic citation list from a tool, workflow, or history

2011-12-15 Thread Sean Davis
On Thu, Dec 15, 2011 at 6:16 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 Dear all,

 It has become a convention that each tool/wrapper in
 Galaxy includes citation instructions in their help text
 (although not all the tools do this - I think they should).

 It occurred to me this could be formalised, with explicit
 markup in the tool XML file embedding the citation
 (at the very least with an identifier like the DOI or ISBN;
 there is probably a good existing XML standard
 that could be followed).

 Then, Galaxy would be able to automatically pull out
 a list of citations the tool authors have requested be
 cited, removing duplicates (e.g. matching DOI), from
 a history or a workflow.

 The aim of this is (a) to make it easier to write up your
 methods by supplying all the references, and (b) to help
 ensure tool authors get the acknowledgement they
 deserve.

 Does this sound like a good idea?


Hi, Peter.

I think this could be useful for tool authors, developers, and users.  As for
markup, BibTeX has a low barrier to entry, is a stable format, and could
easily be included in a citation tag as text and used semantically when
available.
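As a sketch only (the tag names and the entry below are invented for
illustration, not an existing Galaxy schema), the markup might look
something like:

  <citations>
    <citation type="bibtex">
      @article{doe2011tool,
        author  = {Doe, Jane},
        title   = {A hypothetical tool},
        journal = {Bioinformatics},
        year    = {2011},
        doi     = {10.1000/xyz123}
      }
    </citation>
  </citations>

Matching on the DOI would then be enough to de-duplicate citations
across a history or workflow.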

Sean

Re: [galaxy-dev] Staged Method for cluster running SGE?

2011-04-26 Thread Sean Davis
On Tue, Apr 26, 2011 at 5:11 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi all,

 So far we've been running our local Galaxy instance on
 a single machine, but I would like to be able to offload
 (some) jobs onto our local SGE cluster. I've been reading
 https://bitbucket.org/galaxy/galaxy-central/wiki/Config/Cluster

 Unfortunately in our setup the SGE cluster head node is
 a different machine to the Galaxy server, and they do not
 (currently) have a shared file system. Once on the cluster,
 the head node and the compute nodes do have a shared
 file system.

 Therefore we will need some way of copying input data
 from the Galaxy server to the cluster, running the job,
 and once the job is done, copying the results back to the
 Galaxy server.

 The Staged Method on the wiki sounds relevant, but
 appears to be for TORQUE only (via pbs_python), not
 any of the other back ends (via DRMAA).

 Have I overlooked anything on the Cluster wiki page?

 Has anyone attempted anything similar, and could you
 offer any guidance or tips?

Hi, Peter.

You might consider setting up a separate queue for SGE jobs.  Then
you could specify prolog and epilog scripts that copy files from
the galaxy machine onto the cluster (in the prolog) and back to galaxy
(in the epilog).  This assumes that there is a way to map paths from
one file system to the other, but for Galaxy that is probably the
case: galaxy's files on the galaxy server live under the galaxy
instance, and on the cluster the jobs will probably all run as a
single user, so the files can sit in that user's home directory.  I
have not done this myself, but the advantage of prolog and epilog
scripts is that galaxy jobs then need no special configuration; all
the work is done transparently by SGE.
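As a rough sketch (the queue name, script paths, and hostnames below
are assumptions, not a tested setup), the queue definition would point
at the two scripts, and each script would just rsync the Galaxy files
directory:

  # queue attributes, set via: qconf -mq galaxy.q
  prolog    /usr/local/sge/scripts/galaxy_prolog.sh
  epilog    /usr/local/sge/scripts/galaxy_epilog.sh

  #!/bin/bash
  # galaxy_prolog.sh -- pull Galaxy's datasets onto the cluster before the job runs
  rsync -a galaxyserver:/home/galaxy/galaxy-dist/database/files/ /cluster/galaxy/database/files/

  #!/bin/bash
  # galaxy_epilog.sh -- push results back to the Galaxy server after the job finishes
  rsync -a /cluster/galaxy/database/files/ galaxyserver:/home/galaxy/galaxy-dist/database/files/

A per-job rsync of the whole files directory is wasteful; a real setup
would restrict the copy to the job's own inputs and outputs.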

Sean


Re: [galaxy-dev] postgresql to galaxy

2011-04-12 Thread Sean Davis
Hi, Hari.

You should probably make sure that you can connect to postgres from
the command line before trying to connect using galaxy.  In
particular, it looks like you need to set up this file correctly:

http://www.postgresql.org/docs/8.2/static/auth-pg-hba-conf.html
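If it helps, the entries usually look something like this (a sketch;
the addresses and auth method are assumptions about your network, not
values to copy blindly):

  # pg_hba.conf
  # TYPE   DATABASE   USER     ADDRESS            METHOD
  local    galaxy     galaxy                      md5
  host     galaxy     galaxy   192.168.65.0/24    md5

The database_connection line in universe_wsgi.ini would then take the
usual SQLAlchemy URL form, e.g. (placeholder password):

  database_connection = postgres://galaxy:PASSWORD@192.168.65.8:5432/galaxy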

Sean


On Tue, Apr 12, 2011 at 7:50 AM, hari krishna pobbati.h...@gmail.com wrote:

 Hi,
   I am planning to change the database from sqlite to postgresql.
   For this I installed postgresql 8.1.2 and created a user and a database
 in my home location.
   From my home directory I am able to log in to that database:

   psql -d galaxy -U galaxy -h 192.168.65.8

 where the database and the user name are both galaxy, and that is the
 hostname.  I modified the universe_wsgi.ini file as:

 database_connection = postgres:///galaxy?user=galaxy&password=galaxy123!@#[?host=/var/run/postgresql]
 database_engine_option_strategy = threadlocal
 database_engine_option_server_side_cursors = True
 database_engine_option_pool_size = 5
 database_engine_option_max_overflow = 10

 After these modifications, when I ran the server I got an error like this:


 Traceback (most recent call last):
   File "/home/gridmon/hari/galaxy_new/galaxy-central/lib/galaxy/web/buildapp.py", line 82, in app_factory
     app = UniverseApplication( global_conf = global_conf, **kwargs )
   File "/home/gridmon/hari/galaxy_new/galaxy-central/lib/galaxy/app.py", line 30, in __init__
     create_or_verify_database( db_url, self.config.database_engine_options )
   File "/home/gridmon/hari/galaxy_new/galaxy-central/lib/galaxy/model/migrate/check.py", line 54, in create_or_verify_database
     dataset_table = Table( "dataset", meta, autoload=True )
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/schema.py", line 108, in __call__
     return type.__call__(self, name, metadata, *args, **kwargs)
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/schema.py", line 236, in __init__
     _bind_or_error(metadata).reflecttable(self, include_columns=include_columns)
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/engine/base.py", line 1261, in reflecttable
     conn = self.contextual_connect()
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/engine/threadlocal.py", line 194, in contextual_connect
     return self.session.get_connection(**kwargs)
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/engine/threadlocal.py", line 20, in get_connection
     return self.engine.TLConnection(self, self.engine.pool.connect(), close_with_result=close_with_result)
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/pool.py", line 151, in connect
     agent = _ConnectionFairy(self)
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/pool.py", line 304, in __init__
     rec = self._connection_record = pool.get()
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/pool.py", line 161, in get
     return self.do_get()
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/pool.py", line 639, in do_get
     con = self.create_connection()
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/pool.py", line 122, in create_connection
     return _ConnectionRecord(self)
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/pool.py", line 198, in __init__
     self.connection = self.__connect()
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/pool.py", line 261, in __connect
     connection = self.__pool._creator()
   File "/home/gridmon/hari/galaxy_new/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.5.egg/sqlalchemy/engine/strategies.py", line 80, in connect
     raise exc.DBAPIError.instance(None, None, e)
 OperationalError: (OperationalError) FATAL:  no pg_hba.conf entry for host "[local]", user "galaxy", database "galaxy", SSL off



 Can anyone help me with integrating postgresql into galaxy?
 Waiting for your kind reply.




 --
 Thanks & Regards,
 Hari Krishna .M




Re: [galaxy-dev] Recommended Specs for Production System

2011-04-08 Thread Sean Davis
On Fri, Apr 8, 2011 at 10:26 AM, Nate Coraor n...@bx.psu.edu wrote:
 Assaf Gordon wrote:

 Forgot to mention SGE/PBS: you definitely want to use them (even if you're 
 using a single machine),
 because the local job runner doesn't take into account multi-threaded 
 programs when scheduling jobs.
 So another core is needed for the SGE scheduler daemons (sge_qmaster and 
 sge_execd).

 I haven't tested, but it's entirely possible that the SGE daemons could
 happily share cores with other processes.  I'd be surprised if they
 spent a whole lot of time on-CPU.

We run SGE for NGS and do not find a need to set aside cores for the
daemons.  That said, if you do have an active cluster (more than a
couple of machines), the SGE master node does benefit from having a
core set aside.
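For what it's worth, the usual way to let SGE account for a
multi-threaded program is to request multiple slots through a parallel
environment at submission time; the PE name and script below are only
examples:

  # reserve 8 slots for an 8-thread tool ("smp" is a site-specific PE name)
  qsub -pe smp 8 -cwd run_tool.sh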

Sean

 A cluster runner is recommended for other reasons, too - restartability
 of the Galaxy process is one of the big ones.

 --nate



Re: [galaxy-dev] SGE and Galaxy (a different approach)

2011-04-05 Thread Sean Davis
On Tue, Apr 5, 2011 at 12:27 PM, andrew stewart
andrew.c.stew...@gmail.com wrote:
 I'm aware of how to configure Galaxy to use SGE in universe_wsgi.ini,
 however what I want to do is a little different.

Hi, Andrew.  Take a look at this page:

https://bitbucket.org/galaxy/galaxy-central/wiki/Config/Cluster

In particular, does the last section, Tool Configuration, describe
something like what you want to do?
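If that is the right section, the idea (sketched here from memory; the
tool id and queue name are placeholders) is a per-tool entry in
universe_wsgi.ini, so that only selected tools are sent to SGE while
everything else stays on the default runner:

  [galaxy:tool_runners]
  # jobs for this tool id go to SGE via DRMAA, with native qsub options
  my_heavy_tool = drmaa://-q galaxy.q/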

Sean


 Because I only want
 certain processes to be submitted to the queue, I'd rather control this at
 the tool configuration level (the xml wrapper).  For example:
 <command interpreter="bash">
     qsub myscript.sh
 </command>
 This will work, except that the status of the job (in Galaxy) shows as
 completed even though the job has simply been submitted to SGE.  Basically
 Galaxy 'loses track' of the process because the submission process
 (myscript.sh) has completed even if the actual job hasn't.
 Has anyone else tried anything like this before, or have anything helpful to
 suggest?  One thought is to somehow cause the myscript.sh process to pause
 until the SGE job has completed... somehow.
 Any advice appreciated.
 Thanks,
 Andrew
