Hello all,

Greg Von Kuster wrote, On 03/14/2011 11:09 AM:
> Sebastian and Assaf, 
> 
> For the above, determining whether a dataset is shared is currently 
> available, but what is your definition of a "published dataset"?
> 
We use the following query to see if a dataset is shared between more than one 
history (a count greater than 1 means it is):
==
select
  count(*)
from
  history_dataset_association hda
where
  hda.dataset_id = NNNN
==

And the following to see if it's shared between more than one user:
==
select
  count ( distinct history.user_id )
from
  history,
  dataset,
  history_dataset_association hda
where
  dataset.id = hda.dataset_id
 AND
  history.id = hda.history_id
 AND
  dataset.id = NNNN
==
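
For anyone who wants to try this outside a production database, here is a
minimal sketch that folds both checks into a single query and runs it against
a toy in-memory SQLite database. The table and column names are the ones used
in the queries above; the cut-down schema, the sample rows, and the helper
names (sharing_counts, build_demo_db) are assumptions made up for the example,
not Galaxy code. Note that it counts distinct histories rather than count(*),
so two copies of a dataset inside the same history count once.

```python
import sqlite3

# Toy schema: only the columns the two queries touch. This is an assumption
# for illustration, not the full Galaxy schema.
SCHEMA = """
CREATE TABLE history (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE dataset (id INTEGER PRIMARY KEY);
CREATE TABLE history_dataset_association (
    id INTEGER PRIMARY KEY,
    history_id INTEGER REFERENCES history(id),
    dataset_id INTEGER REFERENCES dataset(id)
);
"""

# Both checks in one pass. COUNT(DISTINCT ...) skips NULLs, so histories
# without a user_id do not add to n_users.
SHARING_QUERY = """
SELECT COUNT(DISTINCT hda.history_id)  AS n_histories,
       COUNT(DISTINCT history.user_id) AS n_users
FROM history_dataset_association hda
JOIN history ON history.id = hda.history_id
WHERE hda.dataset_id = ?
"""


def sharing_counts(conn, dataset_id):
    """Return (n_histories, n_users) referencing the given dataset id."""
    return conn.execute(SHARING_QUERY, (dataset_id,)).fetchone()


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Histories 1 and 2 belong to user 10, history 3 to user 20.
    conn.executemany("INSERT INTO history VALUES (?, ?)",
                     [(1, 10), (2, 10), (3, 20)])
    conn.executemany("INSERT INTO dataset VALUES (?)", [(100,), (200,)])
    # Dataset 100 sits in all three histories, dataset 200 in one only.
    conn.executemany(
        "INSERT INTO history_dataset_association (history_id, dataset_id) "
        "VALUES (?, ?)",
        [(1, 100), (2, 100), (3, 100), (1, 200)],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for dataset_id in (100, 200):
        n_histories, n_users = sharing_counts(demo, dataset_id)
        print(dataset_id, "histories:", n_histories, "users:", n_users)
```

A dataset would be "shared between histories" when n_histories > 1 and
"shared between users" when n_users > 1.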

Are there other types of sharing (between users)?

> 
>> 
>> It would be great if there was a way for users to see the datasets 
>> (and the jobs, parameters, etc.) of all their datasets (ever),
>> even if I deleted the underlying physical file.
> 
> 
> Where are you talking about "seeing the datasets, jobs, parameters, 
> etc" whose underlying file has been removed from disk?  Would this
> be in the history, where you can currently see "deleted" datasets,
> but not "purged" datasets?
> 
> 
I'm not sure what the best UI for displaying those purged datasets would be,
but something like this: show them just like the current green rectangles 
(or the way deleted datasets are displayed, with a warning), so the user can
still see what the analysis was, which tools were used, and with which
parameters. If the user clicks on the "eye" icon or the "download" icon, they
get a message saying "this dataset has been purged".

But they would still be able to "re-run" the job or view the tool's parameters.

The rationale behind this:
I'm in the same situation as Sebastian - I have users running big jobs, on many 
FASTQ files, on crazy datasets (example: Genomics Interval's "Join" on two 
300MB BED files, producing a 70GB file, then filtering the results with GREP, 
or 8 x PE100 FASTQ files that go through the same workflow). Many times they 
don't need the intermediate files, but never bother to delete them (this was 
before the workflows had an option to delete intermediate files, and even now 
not many people are using this feature - and I can't force them).

I want to delete those intermediate files, but once I delete+purge them - all 
records of the jobs/datasets are gone.
Users (looking at their histories) can't tell or remember how they got from the 
first FASTQ file to the final file.
They don't always use workflows, so "just take a look at the workflow" is not a 
good solution.

Having a way to display the meta-data of a purged dataset (especially since 
each dataset has a "peek" and "info" data) would be very helpful (helpful - but 
not top priority - I don't want to create the wrong impression).

-gordon
