Thanks Nate, that makes sense. However, it seems I still have an issue:
```
select count(d.id)
from dataset d
join history_dataset_association hda on d.id = hda.dataset_id
where d.deleted = 't' and hda.purged = 'f';
count
-------
67464
(1 row)
```
Perhaps I'll need to write a script to check each of these to see whether
the data does, indeed, exist, and then set the flag appropriately... Hrmm.
Does anyone know what the `dataset.deleted` flag is used for? Is it
just supposed to be set when all `hda.purged` are `t`, sort of like a
shortcut for queries?
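A minimal sketch of the kind of check described above, in Python. It
assumes the dataset ids and their expected on-disk paths have already
been collected somehow (how Galaxy maps a dataset id to a file path is
not covered in this thread). The function name `split_by_presence` and
its input format are hypothetical, for illustration only.

```python
import os


def split_by_presence(paths_by_id):
    """Split dataset ids by whether their file is present on disk.

    paths_by_id: dict mapping a dataset id to the file path it is
    expected to live at (hypothetical input; build it however suits
    your instance).
    Returns (present_ids, missing_ids), each sorted.
    """
    present, missing = [], []
    for dataset_id, path in paths_by_id.items():
        # isfile() is False for missing paths and for directories.
        if os.path.isfile(path):
            present.append(dataset_id)
        else:
            missing.append(dataset_id)
    return sorted(present), sorted(missing)
```

Only the ids reported as present would be candidates for having the
flag reset, and the list is probably worth reviewing by hand before
any rows are updated.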
- Lance
Nate Coraor
December 2, 2016 at 11:15 AM
Lance,
usegalaxy.org has 4,652,912 such datasets. The
cause here is that deleting an entire history does not mark the HDAs
deleted (so that if you view a deleted history you can see what
datasets were deleted and which were not at the time of deletion).
There is a separate hda.purged column that indicates that an HDA is no
longer recoverable by the user. I have 699 datasets that are
d.deleted but not hda.purged; this number should be 0.
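To see how the mismatched rows split between these cases, a breakdown
query along the following lines might help. This is only a sketch run
against a toy in-memory SQLite database, not the real Galaxy schema:
the `history.deleted` and `hda.history_id` columns are assumptions not
confirmed in this thread, booleans are stored as 0/1 here rather than
`t`/`f`, and `build_toy_db` and `breakdown` are hypothetical helpers.

```python
import sqlite3

# Count deleted datasets per combination of history/HDA flags.
BREAKDOWN_SQL = """
SELECT h.deleted   AS history_deleted,
       hda.deleted AS hda_deleted,
       hda.purged  AS hda_purged,
       COUNT(*)    AS n
FROM dataset d
JOIN history_dataset_association hda ON d.id = hda.dataset_id
JOIN history h ON h.id = hda.history_id
WHERE d.deleted = 1
GROUP BY h.deleted, hda.deleted, hda.purged
ORDER BY h.deleted, hda.deleted, hda.purged
"""


def build_toy_db():
    """Create a tiny stand-in database (assumed columns, made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dataset (id INTEGER PRIMARY KEY, deleted INTEGER);
        CREATE TABLE history (id INTEGER PRIMARY KEY, deleted INTEGER);
        CREATE TABLE history_dataset_association (
            id INTEGER PRIMARY KEY, history_id INTEGER,
            dataset_id INTEGER, deleted INTEGER, purged INTEGER);
        INSERT INTO dataset VALUES (1, 1), (2, 1), (3, 0);
        INSERT INTO history VALUES (10, 1), (11, 0);
        -- HDA 100 sits in a deleted history but is not itself deleted.
        INSERT INTO history_dataset_association VALUES
            (100, 10, 1, 0, 0),
            (101, 11, 2, 1, 1),
            (102, 11, 3, 0, 0);
    """)
    return conn


def breakdown(conn):
    """Return (history_deleted, hda_deleted, hda_purged, n) rows."""
    return conn.execute(BREAKDOWN_SQL).fetchall()
```

Rows where the history is deleted but the HDA is not would correspond
to the whole-history deletion case described above; rows where nothing
is purged are the ones that should not exist.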
--nate
Lance Parsons
November 30, 2016 at 2:20 PM
I've run into issues over the past year where some jobs would
occasionally fail to start (stuck in a `new` state). I tracked them
down to a situation where `dataset.deleted` is set to `t` yet
`history_dataset_association.deleted` is `f`. Simply setting
`dataset.deleted` to `f` in those instances resolved the issue and the
jobs ran. The datasets were all still on disk.
Since this is a pretty annoying situation, I thought I'd check to see
if there are other datasets with this problem. Shockingly, I found
many thousands of such datasets:
```
select count(d.id)
from dataset d
join history_dataset_association hda on d.id = hda.dataset_id
where d.deleted = 't' and hda.deleted = 'f';
count
-------
76977
(1 row)
```
I'm hesitant to update so many rows in my database so I thought I'd
put this out there for comment. What do others see when running the
above query? Has anyone run into this or a similar issue? Thanks.
--
Lance Parsons - Scientific Programmer
Carl C. Icahn Laboratory - Room 136
Lewis-Sigler Institute for Integrative Genomics
Princeton University