[galaxy-dev] Dataset deletion / database update after user purge

2015-01-08 Thread Nicolas Lapalu

Hi Galaxy team,


We performed a database and dataset purge of our Galaxy server. We
decided to delete and purge a few user accounts. We assumed that
purging a user would trigger cascade deletions in the database, so we
used the web interface to delete and purge these accounts. Then we
launched cleanup_datasets.py in several ways:


python scripts/cleanup_datasets/cleanup_datasets.py universe_wsgi.ini -d 1 -1 -r > cleanup_1.log
python scripts/cleanup_datasets/cleanup_datasets.py universe_wsgi.ini -d 1 -2 -r > cleanup_2.log
python scripts/cleanup_datasets/cleanup_datasets.py universe_wsgi.ini -d 1 -3 -r > cleanup_3.log
python scripts/cleanup_datasets/cleanup_datasets.py universe_wsgi.ini -d 1 -5 -r > cleanup_5.log
python scripts/cleanup_datasets/cleanup_datasets.py universe_wsgi.ini -d 1 -4 -r > cleanup_4.log
python scripts/cleanup_datasets/cleanup_datasets.py universe_wsgi.ini -d 1 -6 -r > cleanup_6.log



After a simple check, I can see that some datasets belonging to (and
only to) purged user accounts are still present with the status
deleted = t and purged = f. Moreover, some datasets (not shared with
other users or linked to a library) have the status deleted = f and
purged = f. When I look at the status of the history associated with
such a dataset, I find deleted = t and purged = f. So purging a user
account doesn't change the status of that user's histories and
datasets.
Do you think I can change the history status (deleted = f to
deleted = t, and purged = f to purged = t) for all purged users? I
expect the cleanup script to launch the cascade purge!



Below is the SQL command:

-- all datasets belonging to at least one purged user (shared or not)
select D.id as dataset_id, U.email,
       D.deleted as dataset_deleted, D.purged as dataset_purged,
       H.deleted as history_deleted, H.purged as history_purged
from galaxy_user U, history H, history_dataset_association A, dataset D
where A.dataset_id = D.id and H.id = A.history_id and U.id = H.user_id
  and D.id in (select D.id
               from galaxy_user U, history H,
                    history_dataset_association A, dataset D
               where A.dataset_id = D.id and H.id = A.history_id
                 and U.id = H.user_id
                 and U.deleted = 't' and U.purged = 't')
order by D.id;
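
A narrower check, for datasets referenced only by histories of purged users and still not purged, might look like the sketch below. It is an illustration under stated assumptions, not a Galaxy tool: it runs against an in-memory SQLite toy database that reuses only the table and column names from the query above, stores booleans as 0/1 instead of 't'/'f', and ignores library associations. The helper names (build_toy_db, unpurged_datasets_of_purged_users) and the sample rows are hypothetical.

```python
import sqlite3

# Toy schema: only the columns used in the query above (assumption).
SCHEMA = """
create table galaxy_user (id integer primary key, email text,
                          deleted integer, purged integer);
create table history (id integer primary key, user_id integer,
                      deleted integer, purged integer);
create table dataset (id integer primary key,
                      deleted integer, purged integer);
create table history_dataset_association (id integer primary key,
                                          history_id integer,
                                          dataset_id integer);
"""

# Unpurged datasets whose every owning user is deleted AND purged.
# min(...) = 1 over 0/1 flags means "true for all joined rows".
QUERY = """
select D.id
from dataset D
join history_dataset_association A on A.dataset_id = D.id
join history H on H.id = A.history_id
join galaxy_user U on U.id = H.user_id
where D.purged = 0
group by D.id
having min(U.deleted) = 1 and min(U.purged) = 1
order by D.id
"""


def build_toy_db():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("insert into galaxy_user values (?, ?, ?, ?)",
                     [(1, "purged@example.org", 1, 1),
                      (2, "active@example.org", 0, 0)])
    conn.executemany("insert into history values (?, ?, ?, ?)",
                     [(1, 1, 1, 0), (2, 2, 0, 0)])
    conn.executemany("insert into dataset values (?, ?, ?)",
                     [(10, 1, 0),   # deleted, not purged
                      (11, 0, 0),   # shared with the active user
                      (12, 1, 1),   # already purged
                      (13, 0, 0)])  # neither deleted nor purged
    conn.executemany(
        "insert into history_dataset_association values (?, ?, ?)",
        [(1, 1, 10), (2, 1, 11), (3, 2, 11), (4, 1, 12), (5, 1, 13)])
    return conn


def unpurged_datasets_of_purged_users(conn):
    """Return ids of unpurged datasets owned only by purged users."""
    return [row[0] for row in conn.execute(QUERY)]
```

On the toy data this reports datasets 10 and 13; dataset 11 is skipped because an active user also references it. Against a real Galaxy database the boolean comparisons and any library links would need to be adapted first.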

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
 https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
 http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] Text file busy

2015-01-08 Thread John Chilton
Hey Evan,

  Galaxy should perhaps be able to retry submissions that fail,
especially if they fail quickly, and I have created a Trello card for
this here (https://trello.com/c/hxy2bcIb). Nate has added some
features for job state handling plugins
(https://bitbucket.org/galaxy/galaxy-central/commits/7b209e06ddb944e953d340754439f4e3e5dc339d)
and it may be possible to write a plugin to do this today, though
immediate submission failures should maybe be handled a level above
this by the framework... not sure.
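
The "retry if it fails quickly" idea could be sketched roughly as follows. This is not Galaxy's job state plugin interface; submit is a hypothetical zero-argument callable, and the sketch assumes the failure surfaces as an exception raised by that callable. The thresholds are arbitrary placeholders.

```python
import time


def submit_with_retry(submit, max_attempts=3, quick_failure_seconds=5.0,
                      delay_seconds=1.0, clock=time.monotonic,
                      sleep=time.sleep):
    """Call submit(); retry only when it fails quickly.

    `submit` is a hypothetical callable standing in for a job
    submission. Slow failures and the final attempt re-raise the
    original exception unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        started = clock()
        try:
            return submit()
        except Exception:
            failed_quickly = (clock() - started) < quick_failure_seconds
            if attempt == max_attempts or not failed_quickly:
                raise
            # Give the file system a moment before trying again.
            sleep(delay_seconds)
```

Passing clock and sleep as parameters keeps the function testable without real waiting.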

  I am not really sure this is the appropriate solution for this
particular problem though. This seems like an unfortunate interplay
between your file system and your cluster manager, and it would seem
that any script or platform that automates the creation and
submission of jobs would potentially be subject to the same problems.
Solving it in Galaxy would be an application-level solution to a
system-level configuration problem, in my opinion. Have you run this
problem by the systems staff? It seems like it should be possible to
delay each submission by half a second or change the flushing
settings of the file system.

  As you mentioned, a local workaround might be to `time.sleep(1)`
before `external_job_id = self.ds.runJob(jt)` in
lib/galaxy/jobs/runners/drmaa.py or the similar line in pbs.py. Do you
want to try that and let us know if it addresses the problem?
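
In isolation, the workaround amounts to something like the sketch below. It is not the code in drmaa.py: run_job_after_delay and FakeSession are hypothetical names, and the only thing taken from Galaxy is the shape of the runJob(jt) call quoted above.

```python
import time


def run_job_after_delay(session, job_template, delay_seconds=1.0,
                        sleep=time.sleep):
    """Pause so the job script can be flushed and closed, then submit.

    `session` is any object with a runJob(job_template) method.
    """
    sleep(delay_seconds)
    return session.runJob(job_template)


class FakeSession:
    """Stand-in for a cluster session, for demonstration only."""

    def __init__(self):
        self.submitted = []

    def runJob(self, job_template):
        # Record the template and hand back a made-up job id.
        self.submitted.append(job_template)
        return "job-%d" % len(self.submitted)
```

A fixed pause only makes the race less likely; it does not remove it, which is why the file system settings are still worth checking.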

  Finally, in terms of the workflow - if you rerun the failed step in
the GUI you should be given the option via a new checkbox on the tool
form to resume the workflow.

-John


On Mon, Jan 5, 2015 at 4:48 PM, Evan Bollig PhD boll0...@umn.edu wrote:
 I get this error occasionally:

 /bin/sh: 1: /opt/galaxy/web/database/job_working_directory/000/100/galaxy_100.sh: Text file busy

 When this occurs, the step fails outright. Resubmitting the step
 resolves the issue and things run no problem. If this error appears
 early in a long workflow, I have to manually resubmit ALL dependent
 steps... what a pain!

 Perhaps this is something the Galaxy job scheduler can look out for,
 flush() the system, sleep() a second or two to let the file write and
 close, and then rerun. A more fault-tolerant way of running workflows
 without unnecessary human intervention.

 Cheers,
 -Evan Bollig
 Research Associate | Application Developer | User Support Consultant
 Minnesota Supercomputing Institute
 599 Walter Library
 612 624 1447
 e...@msi.umn.edu
 boll0...@umn.edu

Re: [galaxy-dev] Mercurial is out of date for Toolshed

2015-01-08 Thread Ryan G
Yes, I will look at this this week.  I'm using a clean copy of galaxy-dist
and will apply against that to see how it works.  Thanks a bunch Dave B for
doing this!


Re: [galaxy-dev] Mercurial is out of date for Toolshed

2015-01-08 Thread John Chilton
Hey Ryan,

It looks like Dave B. has come to the rescue with an egg and a patch
that implements this:

https://bitbucket.org/galaxy/galaxy-central/pull-request/626/upgrade-mercurial-egg-to-324

It probably won't be back-ported to the forthcoming January release
but I would suspect the raw patch
(https://api.bitbucket.org/2.0/repositories/galaxy/galaxy-central/diff/davebgx/galaxy-central:625bf1fd88ee..63d901ca0e6e)
will apply cleanly against the last few releases. If you have some
time and want to apply it to your release and let us know if it fixes
your problems, that would probably help the pull request along.

-John

On Wed, Jan 7, 2015 at 10:56 AM, Ryan G ngsbioinformat...@gmail.com wrote:
 Yes, I think I found the root cause... it's because the Galaxy egg for
 mercurial is using an older version of Mercurial that has this known bug.
 The only real fix is to upgrade the Galaxy egg to mercurial 3.x.

 The only way I know of to test this is to build a new Galaxy egg, which
 I'm not familiar with.  But as I understand it, the new Mercurial
 version will also break Galaxy.


 On Mon, Jan 5, 2015 at 6:19 PM, Björn Grüning bjoern.gruen...@gmail.com
 wrote:

 Hi Ryan,

 unfortunately not (yet). But I'm really surprised you are getting this
 error. It is working for me, in our docker containers and in other
 deployments. Can we try to detect the root cause of this error? Do you
 have a conflicting mercurial version?

 Cheers,
 Bjoern

  Thanks.  Is there an alternative way I can install tools from the
  Toolshed?  This problem pretty much renders the toolshed unusable for
  me...
 
  On Mon, Jan 5, 2015 at 11:14 AM, John Chilton jmchil...@gmail.com
  wrote:
 
  It is certainly the case that Galaxy should be using the latest
  version of mercurial given a number of high profile bugs with older
  versions. Unfortunately it doesn't seem possible to just drop the new
  version in - there are some API changes that prevent Galaxy from
  loading when doing this and the number of people who can add new eggs
  to Galaxy is low.
 
  I have created a Trello card to track this issue:
 
  https://trello.com/c/9A9uIav0
 
  Let us know if you happen to find a workaround for this issue.
 
  -John
 
 
  On Mon, Dec 22, 2014 at 2:55 PM, Ryan G ngsbioinformat...@gmail.com
  wrote:
  I've been trying to track down why I can't get anything from the
  toolshed installed and finally have it figured out.

  Whenever I tried to install anything I always got an Error with no
  explanation of what the error was.  After enabling Debug messages in
  the log file, I see the error is:

  tool_shed.util.hg_util DEBUG 2014-12-22 14:47:48,910 Error cloning
  repository: httpsconnection instance has no attribute '_set_hostport'

  I googled around and found out this is a known bug/issue with older
  versions of Mercurial and was fixed in v3.

  I added a line to hg_util.py to see where it picks up hg.  It's using
  version 2.2.3.  Indeed, one of the eggs downloaded by Galaxy is
  mercurial-2.2.3.

  I have the newest version of mercurial installed in my site-packages
  folder but I guess that's not what Galaxy wants.  So my question is,
  how do I get Galaxy to use the latest version of Mercurial?  And why
  did it download an older version?
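
The "added a line to see where it picks up hg" step can be done generically, as in the sketch below. The exact line added to hg_util.py is not shown in the thread, so this is only an assumption about the approach; module_location is a hypothetical helper that reports which file Python would load for a given module name.

```python
import importlib


def module_location(name):
    """Return the file a module is loaded from, or None if unavailable.

    Useful for telling an egg on sys.path apart from a copy in
    site-packages: whichever comes first on sys.path wins.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__file__", None)
```

Calling module_location("mercurial") from inside the Galaxy process would show whether the bundled egg or the site-packages copy is the one being imported.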
 