It is configured to be on the same disk.

Also, yes, some of them are hard links, namely those in workspace/mediapackage.

[root@worker-qa workspace]# du -hs *
11G     collection
296K    http_worker-qa.media.berkeley.edu
15G     mediapackage


There is 11G in workspace/collection that is not hard links. The space calculation in my previous message excluded the hard-linked files (i.e. it did not count them twice) from the overall total.
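
To make the "count hard links only once" point concrete, here is a minimal Python sketch of that accounting. It is only an illustration: disk_usage is a hypothetical helper, not part of Matterhorn, and it sums file sizes rather than allocated blocks, so its numbers will not match du exactly.

```python
import os
import stat


def disk_usage(root):
    """Return (apparent_bytes, unique_bytes) for regular files under root.

    apparent_bytes adds up every directory entry, so a file that is
    hard-linked in two places is counted twice. unique_bytes counts each
    (device, inode) pair once, so hard-linked files are counted only once.
    Sizes come from st_size, not allocated blocks, so they differ from du.
    """
    seen = set()
    apparent = 0
    unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and other special files
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique
```

Running it on the directory that holds both files and workspace would show how much of the apparent total is the same data reached through a second hard link.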

No, there is no cleanup routine defined in the workflow, at least I don't think so. I'm using the default workflows that ship with Matterhorn. The problem is that I don't know what to clean up. Also, how do I add a cleanup routine?

Also, does downloads contain copies of the original data plus the encoded data?

--
Jon

On 9/12/12 2:34 PM, Tobias Wunden wrote:
Jonathan,

We have done 15 SD recordings of approximately 1 hr each.  A couple are 3 hrs
in length and a couple are a few minutes in length.  Total time is
approximately 15 hrs.

This is what the space looks like:

[root@worker-qa opencast]# du -hs *
324K    archive
4.0K    archive-temp
6.5G    downloads
14G     files
4.0K    ingest
416K    schedulerindex
332K    searchindex
340K    seriesindex
776K    workflow
11G     workspace

I have not put in a ticket, because it's not clear this is a bug. However,
surely 15 hours of SD recording does not require over 30 GB of storage space.
The questions I am raising and the discussion I'm trying to have are about what
this storage is being used for.  Once a good understanding of this is achieved
on list, an appropriate wiki article can be written for it and adopters will be
able to make intelligent storage decisions.

Did you configure the working file repository and the workspace to live on the
same disk or share? If so, all of the 11 GB in workspace should be hard links
to corresponding files in /files.
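
A rough way to check both points is sketched below in Python. The helper names same_filesystem and unlinked_files are made up for illustration, and the directory names in the usage note are assumed from the du listing above. Note that a link count above 1 only shows that a file has another hard link somewhere, not that the other link lives in the working file repository.

```python
import os
import stat


def same_filesystem(path_a, path_b):
    """True if both paths are on the same device (hard links need this)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def unlinked_files(root):
    """Yield regular files under root whose link count is 1.

    These files have no hard-linked twin anywhere, so they occupy space
    of their own rather than sharing it with another directory.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_nlink == 1:
                yield path
```

Run from the opencast directory, same_filesystem("files", "workspace") should return True if hard linking is possible at all, and list(unlinked_files("workspace")) shows which workspace files are real copies.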


Second, do you have the "cleanup" operation in your workflow? This should clear
a workflow's temporary files from both the workspace and the working file
repository, but it will also prevent you from going back to a recording and
doing things like re-encoding, because you are removing everything but the
distribution artifacts.

This will mostly leave you with 6.5 GB of distribution material.

Tobias

_______________________________________________
Matterhorn-users mailing list
[email protected]
http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
