This machine is not clustered, so it isn't that. There has been no
manual intervention either.
What sort of configuration issue would cause this?
Basically, workspace/collection holds 11G and files/collection holds
42M. They are not duplicates of each other. Pretty much all of
the extra space in workspace/collection is in ingest-temp.
I'll ask our developers here to take a look at the workflows. A casual
glance though does show that the cleanup operation is in there.
The cleanup operation in both our repository and the MH main one is
instructing it to:
<configuration key="preserve-flavors">*/source,dublincore/*</configuration>
That might explain all the stuff in the mediapackage directory.
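To make that concrete, here is a rough Python sketch of how I read the
preserve-flavors value. It assumes each pattern is type/subtype with "*" as a
wildcard on either side. This is my reading of the setting, not Matterhorn's
actual implementation, and the flavor names in the usage note are only
illustrative.

```python
def flavor_matches(pattern, flavor):
    """Return True if a 'type/subtype' flavor matches a 'type/subtype' pattern.

    Assumption: '*' on either side of the slash matches anything.
    """
    p_type, p_sub = pattern.split("/", 1)
    f_type, f_sub = flavor.split("/", 1)
    return p_type in ("*", f_type) and p_sub in ("*", f_sub)


def preserved(flavors, preserve_flavors):
    """Return the flavors that a comma-separated pattern list would keep."""
    patterns = [p.strip() for p in preserve_flavors.split(",") if p.strip()]
    return [f for f in flavors if any(flavor_matches(p, f) for p in patterns)]
```

Calling preserved(["presenter/source", "presenter/work", "dublincore/episode"],
"*/source,dublincore/*") keeps the source track and the Dublin Core catalog and
drops presenter/work, which is the behaviour I would expect from the cleanup
step if the assumption above holds.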
--
Jon
On 9/13/12 1:05 AM, Tobias Wunden wrote:
Hi Jonathan,
It is configured to be on the same disk.
Also, yes, some of them are hard links, namely workspace/mediapackage.
[root@worker-qa workspace]# du -hs *
11G collection
296K http_worker-qa.media.berkeley.edu
15G mediapackage
There is 11G in workspace/collection that is not hard links. The space
calculation in my previous message excluded the hard linked files (i.e. did not
count them twice) from the overall calculation.
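For anyone who wants to repeat that kind of calculation, here is a minimal
Python sketch. It counts each inode once and separately totals the files whose
link count is 1, i.e. the ones that are not hard linked anywhere. The function
name is made up for illustration, and it sums apparent file sizes (st_size), so
the numbers will not match du exactly, since du reports allocated disk blocks.

```python
import os
import stat


def disk_usage_by_link_count(root):
    """Return (unique_bytes, unlinked_bytes) for regular files under root.

    unique_bytes counts every inode once, so hard-linked copies are not
    counted twice. unlinked_bytes counts only files with a link count of 1.
    """
    seen = set()
    unique_bytes = 0
    unlinked_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and other special files
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another name for an inode we already counted
            seen.add(key)
            unique_bytes += st.st_size
            if st.st_nlink == 1:
                unlinked_bytes += st.st_size
    return unique_bytes, unlinked_bytes
```

Running it on workspace/collection and comparing the two numbers shows how much
of the directory is made up of files that exist only there.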
From my understanding, if there are files in the workspace that are not hard
linked, this points to either a) a configuration issue where the workspace and
working file repository are not linked appropriately, or b) manual
intervention. When checking the configuration, make sure that *all* of the
machines in your cluster use *the same* URL for the working file repository:
not a default value like localhost:8008, but a URL pointing to *one* of your
machines. This should be documented on the page that Chris pointed you at
earlier. If that documentation isn't clear enough, please consider patching it.
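A check along those lines could be sketched in Python as below. The property
key org.opencastproject.file.repo.url is my assumption about where the working
file repository URL is set, so adjust it to whatever your config.properties
actually uses. The function names are made up for illustration, and values
that are property references such as ${...} are not resolved.

```python
from urllib.parse import urlparse

# Assumed property key; change it if your installation names it differently.
REPO_URL_KEY = "org.opencastproject.file.repo.url"
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_repo_urls(configs, key=REPO_URL_KEY):
    """configs maps machine name -> properties text; returns a list of problems."""
    problems = []
    urls = {}
    for machine, text in configs.items():
        url = parse_properties(text).get(key)
        if url is None:
            problems.append(f"{machine}: {key} is not set")
            continue
        urls[machine] = url
        if urlparse(url).hostname in LOCAL_HOSTS:
            problems.append(f"{machine}: {url} points at the local host")
    if len(set(urls.values())) > 1:
        listing = ", ".join(f"{m}={u}" for m, u in sorted(urls.items()))
        problems.append(f"machines disagree: {listing}")
    return problems
```

An empty result means every machine names the same non-local URL; anything else
lists the machines worth looking at first.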
No, there is not a cleanup routine defined in the workflow... at least I don't
think so. I'm using the default workflows that ship with Matterhorn. The problem
is I don't know what to clean up. Also, how do I add a cleanup routine?
The default workflow does ship with a "cleanup" operation at the very end, so
you may want to do a fresh checkout from the official Matterhorn SVN; you may be
holding on to a modified copy in the UC Berkeley msub.
Also, does downloads contain copies of the original data plus the encoded data?
No, just the distribution versions + metadata needed by Engage.
Tobias
_______________________________________________
Matterhorn-users mailing list
http://lists.opencastproject.org/mailman/listinfo/matterhorn-users