Hi Ravi,

If you look at the dataset's entry in the history_dataset_association table, is it marked deleted? admin_cleanup_datasets.py only marks history_dataset_association rows as deleted, not the underlying datasets.
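To illustrate the distinction, here is a minimal sketch using an in-memory SQLite database. The two tables are heavily simplified stand-ins for Galaxy's history_dataset_association and dataset tables; the column set, the dataset id 42, and the helper names are assumptions made for the example, and I am assuming the four values in "ok-false-false-true" are state, deleted, purged and purgable. It is not Galaxy's own code.

```python
import sqlite3

# Hypothetical, simplified schema: NOT the real Galaxy schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset (
        id INTEGER PRIMARY KEY,
        state TEXT, deleted INTEGER, purged INTEGER, purgable INTEGER
    );
    CREATE TABLE history_dataset_association (
        id INTEGER PRIMARY KEY,
        dataset_id INTEGER REFERENCES dataset(id),
        name TEXT, deleted INTEGER
    );
    -- Illustrative rows: HDA 301 points at a made-up dataset id 42.
    INSERT INTO dataset VALUES (42, 'ok', 0, 0, 1);
    INSERT INTO history_dataset_association VALUES (301, 42, 'Small.fastq', 0);
""")


def mark_hda_deleted(conn, hda_id):
    """Mimic the admin script: flag only the association row."""
    conn.execute(
        "UPDATE history_dataset_association SET deleted = 1 WHERE id = ?",
        (hda_id,),
    )


def flags(conn, hda_id):
    """Return the HDA flag next to the flags of the dataset it points at."""
    row = conn.execute(
        """
        SELECT hda.deleted, d.state, d.deleted, d.purged, d.purgable
        FROM history_dataset_association AS hda
        JOIN dataset AS d ON d.id = hda.dataset_id
        WHERE hda.id = ?
        """,
        (hda_id,),
    ).fetchone()
    return {
        "hda_deleted": bool(row[0]),
        "dataset": (row[1], bool(row[2]), bool(row[3]), bool(row[4])),
    }


mark_hda_deleted(conn, 301)
print(flags(conn, 301))
```

In this toy setup the association row flips to deleted while the dataset row keeps its original flags, so a check against the dataset table alone would show no change.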
Running the cleanup_datasets.py flow with -d 0 should then have caused the dataset to be deleted and purged, but this may not happen if there is more than one instance of the dataset you are trying to purge (either another copy in a history somewhere, or in a library).

--nate

On Tue, Mar 25, 2014 at 5:12 PM, Sanka, Ravi <rsa...@jcvi.org> wrote:

> I have now been able to successfully remove datasets from disk. After
> deleting the dataset or history from the front-end interface (as the user),
> I then run the cleanup scripts as admin:
>
> python ./scripts/cleanup_datasets/cleanup_datasets.py ./universe_wsgi.ini -d 0 -1 $@ >> ./scripts/cleanup_datasets/delete_userless_histories.log
> python ./scripts/cleanup_datasets/cleanup_datasets.py ./universe_wsgi.ini -d 0 -2 -r $@ >> ./scripts/cleanup_datasets/purge_histories.log
> python ./scripts/cleanup_datasets/cleanup_datasets.py ./universe_wsgi.ini -d 0 -3 -r $@ >> ./scripts/cleanup_datasets/purge_datasets.log
> python ./scripts/cleanup_datasets/cleanup_datasets.py ./universe_wsgi.ini -d 0 -5 -r $@ >> ./scripts/cleanup_datasets/purge_folders.log
> python ./scripts/cleanup_datasets/cleanup_datasets.py ./universe_wsgi.ini -d 0 -4 -r $@ >> ./scripts/cleanup_datasets/purge_libraries.log
> python ./scripts/cleanup_datasets/cleanup_datasets.py ./universe_wsgi.ini -d 0 -6 -r $@ >> ./scripts/cleanup_datasets/delete_datasets.log
>
> However, my final goal is to have a process that can remove old datasets
> from disk regardless of whether or not the users have deleted them at the
> front-end (and then automate said process via cronjob). This will be
> essential in a situation where users are likely to leave datasets
> unattended and accumulating disk space.
>
> I found the following Galaxy thread:
>
> http://dev.list.galaxyproject.org/Re-Improving-Administrative-Data-Clean-Up-pgcleanup-py-vs-cleanup-datasets-py-td4659330.html
>
> And am trying to use the script it mentions:
>
> python ./scripts/cleanup_datasets/admin_cleanup_datasets.py universe_wsgi.ini -d 30 --smtp <smtp server> --fromaddr rsa...@jcvi.org
>
> I chose -d 30 to remove all datasets older than 30 days, which currently
> only targets one dataset. The resulting stdout indicates success:
>
> """""""""""""""""""""""""""""""""""""
> # 2014-03-25 16:27:47 - Handling stuff older than 30 days
> Marked HistoryDatasetAssociation id 301 as deleted
>
> From: rsa...@jcvi.org
> To: isi...@jcvi.org
> Subject: Galaxy Server Cleanup - 1 datasets DELETED
> ----------
> Galaxy Server Cleanup
> ---------------------
> The following datasets you own on Galaxy are older than 30 days and have
> been DELETED:
>
> "Small.fastq" in history "Unnamed history"
>
> You may be able to undelete them by logging into Galaxy, navigating to the
> appropriate history, selecting "Include Deleted Datasets" from the history
> options menu, and clicking on the link to undelete each dataset that you
> want to keep. You can then download the datasets. Thank you for your
> understanding and cooporation in this necessary cleanup in order to keep
> the Galaxy resource available. Please don't hesitate to contact us if you
> have any questions.
>
> -- Galaxy Administrators
>
> Marked 1 dataset instances as deleted
> """""""""""""""""""""""""""""""""""""
>
> But when I check the database, the status of dataset 301 is unchanged
> (ok-false-false-true).
>
> I then run the same cleanup_datasets.py routine from above (but with -d
> 30), but dataset 301 is still present. I tried a second time, this time
> using -d 0, but still no deletion (which is not surprising since the
> dataset's deleted status is still false).
>
> If I run admin_cleanup_datasets.py again with the same parameters, the
> stdout says no datasets matched the criteria, so it seems to remember its
> previous execution, but it's NOT actually updating the database.
>
> What am I doing wrong?
>
> ----------------------------------------------
> Ravi Sanka
> ICS - Sr. Bioinformatics Engineer
> J. Craig Venter Institute
> 301-795-7743
> ----------------------------------------------
>
> From: Carl Eberhard <carlfeberh...@gmail.com>
> Date: Tuesday, March 18, 2014 2:09 PM
> To: Peter Cock <p.j.a.c...@googlemail.com>
> Cc: Ravi Sanka <rsa...@jcvi.org>, "firstname.lastname@example.org" <email@example.com>
> Subject: [CONTENT] Re: [galaxy-dev] Re: Unable to remove old datasets
>
> The cleanup scripts enforce a sort of "lifetime" for the datasets.
>
> The first time they're run, they may mark a dataset as deleted and also
> reset the update time and you'll have to wait N days for the next stage of
> the lifetime.
>
> The next time they're run, or if a dataset has already been marked as
> deleted, the actual file removal happens and purged is set to true (if it
> wasn't already).
>
> You can manually pass in '-d 0' to force removal of datasets recently
> marked as deleted.
>
> The purge scripts do not check 'allow_user_dataset_purge', of course.
>
>
> On Tue, Mar 18, 2014 at 11:50 AM, Carl Eberhard <carlfeberh...@gmail.com> wrote:
>
>> I believe it's a (BAD) silent failure mode in the server code.
>>
>> If I understand correctly, the purge request isn't coughing an error when
>> it gets to the 'allow_user_dataset_purge' check and instead is silently
>> marking (or re-marking) the datasets as deleted.
>>
>> I would rather it fail with a 403 error if purge is explicitly requested.
>>
>> That said, it of course would be better to remove the purge operation
>> based on the configuration than to show an error after we've found you
>> can't do the operation.
>> The same holds true for the 'permanently remove this dataset' link in
>> deleted datasets.
>>
>> I'll see if I can find out the answer to your question on the cleanup
>> scripts.
>>
>>
>> On Tue, Mar 18, 2014 at 10:49 AM, Peter Cock <p.j.a.c...@googlemail.com> wrote:
>>
>>> On Tue, Mar 18, 2014 at 2:14 PM, Carl Eberhard <carlfeberh...@gmail.com>
>>> wrote:
>>> > Thanks, Ravi & Peter
>>> >
>>> > I've added a card to get the allow_user_dataset_purge options into the
>>> > client and to better show the viable options to the user:
>>> > https://trello.com/c/RCPZ9zMF
>>>
>>> Thanks Carl - so this was a user interface bug, showing the user
>>> non-functional permanent delete (purge) options. That's clearer now.
>>>
>>> In this situation can the user just 'delete', and wait N days for
>>> the cleanup scripts to actually purge the files and free the space?
>>> (It seems N=10 in scripts/cleanup/purge_*.sh at least, elsewhere
>>> like the underlying Python script the default looks like N=60).
>>>
>>> Regards,
>>>
>>> Peter
>>>
>>
>>
>
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client. To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
> http://lists.bx.psu.edu/
>
> To search Galaxy mailing lists use the unified search at:
> http://galaxyproject.org/search/mailinglists/
>
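For anyone following the thread, the "lifetime" Carl describes, combined with Nate's point about multiple instances, can be pictured with a small toy model. This is only a sketch of the behaviour as described in these emails, not Galaxy's actual cleanup code; the class names, fields, and the cleanup_pass function are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class Instance:
    """One reference to a dataset: a history copy or a library copy."""
    deleted: bool = False


@dataclass
class Dataset:
    update_time: datetime
    instances: List[Instance] = field(default_factory=list)
    deleted: bool = False
    purged: bool = False


def cleanup_pass(ds: Dataset, now: datetime, days: int) -> str:
    """One run of a hypothetical cleanup script with '-d <days>'."""
    cutoff = now - timedelta(days=days)
    if ds.purged:
        return "already purged"
    # Any copy that is still live blocks the whole dataset.
    if any(not inst.deleted for inst in ds.instances):
        return "skipped: live instance remains"
    # Only touch datasets whose last update is at or before the cutoff.
    if ds.update_time > cutoff:
        return "skipped: too recent"
    if not ds.deleted:
        # Stage one: mark deleted and reset the clock.
        ds.deleted = True
        ds.update_time = now
        return "marked deleted"
    # Stage two: this is where the file would be removed from disk.
    ds.purged = True
    ds.update_time = now
    return "purged"


now = datetime(2014, 3, 25, 16, 0)
ds = Dataset(
    update_time=now - timedelta(days=40),
    instances=[Instance(deleted=True), Instance(deleted=False)],
)
print(cleanup_pass(ds, now, days=30))  # second copy is still live
ds.instances[1].deleted = True
print(cleanup_pass(ds, now, days=30))  # stage one
print(cleanup_pass(ds, now, days=30))  # clock was reset, so it waits
print(cleanup_pass(ds, now, days=0))   # '-d 0' skips the wait
```

In this model a dataset is never purged while any copy is still live, and after stage one it only moves on once N days have passed or the script is run with a zero-day window.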