2012/6/20 Tobias Wunden <[email protected]>

> Ruben,
>
> > The issue is that the "cleanup" operation is not aware of what's been
> archived, distributed or published. And it's not aware because they're
> effectively different mediapackages in different places of Matterhorn
> which are actually the same recording but in different contexts. It all
> comes down to what Olli said:
> >
> > In my opinion it is _crucial_ that MH keeps track of all the metadata
> and that the media files are handled by MH
> >
> > But it doesn't (or at least it doesn't do it well), and the archived
> copy gets out of sync, etc. So what's the point of archiving one copy
> while working on *another* copy of the mediapackage, anyway? We are
> working on the same resources but from different (and unsynchronized)
> views.
>
> as I explained before multiple times: the archive *has not been
> implemented yet*. As long as there is no archive which is able to store a
> copy of the files, the working file repository is the only place where
> files live while being processed by Matterhorn. This issue will soon be
> resolved as the archive implementation gets finished, but it's not yet the
> case.


Another copy of the files? So now we've got the distribution copies, in
their own mediapackage; the working file repository copy, in its own
mediapackage too; and the episode service copy, also in its own
mediapackage. And we are going to add *yet another* copy for the archive
service. That makes four *different* mediapackages referring to the very
same recording! No wonder it's so difficult to keep track of the files.

On the other hand, the "archive" operation name is misleading: if there is
no archiving service in place, how come we can already "archive" things?
Moreover, why isn't the episode service's "archiving" sufficient? Why don't
we keep things in sync with the "episode" copy? In other words, what's the
point of "archive" being an optional operation that you may or may not
include in your workflow? Why wouldn't anyone want to keep their recordings
archived and ready to be processed again for whatever reason may come up in
the future?


> > Also, when someone is testing Matterhorn, disk consumption is also a
> factor to consider. If we keep all the "garbage" generated in the
> workflows, we'll bias adopters' estimates of how much disk space
> Matterhorn requires. So I disagree with Tobias: this matter is not only
> relevant to those who are already designing their pilots, but also to
> those who are considering whether or not they are interested in deploying
> such pilots.

>
> > There's also a solution that Tobias already hinted at in his mail: as
> long as the "cleanup" operation comes *before* the "archive" and "publish"
> operations, we won't end up with broken references. Therefore, I'm afraid
> I'm voting -1 against this proposal, and I #propose changing it to "place
> the operations in the correct order; i.e. cleanup first and then archive
> and publish". Of course, that probably means the arguments to the cleanup
> operation will have to change, so that all the files intended to be
> published and archived are kept (and it has just occurred to me that if
> you change those arguments accordingly, it doesn't matter where you put
> the operation, since those files will be saved anyway).
>
> Please send an official counter proposal then that people can vote on.
>

Will do so. Thanks for pointing that out! (I thought stating it in this
thread would suffice.)
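For the record, the counter proposal amounts to reordering the workflow
definition roughly as follows. This is only a sketch of how I understand it
could look: the "preserve-flavors" key and the flavor values are my
assumption about how the cleanup operation would need to be configured, so
please correct me if the actual keys differ:

```xml
<!-- Hypothetical excerpt of a workflow definition: cleanup runs first,
     keeping every flavor that the later archive/publish steps consume. -->
<operations>
  <operation id="cleanup" description="Remove temporary processing artifacts">
    <configurations>
      <!-- Assumed configuration key: flavors to keep -->
      <configuration key="preserve-flavors">*/source,*/trimmed,*/delivery</configuration>
    </configurations>
  </operation>
  <operation id="archive" description="Archive the preserved media"/>
  <operation id="publish" description="Distribute and publish the preserved media"/>
</operations>
```

And note the parenthetical point above: once "preserve-flavors" covers
everything that is to be archived and published, those files survive the
cleanup no matter where the operation sits in the workflow.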


>
> Tobias
> _______________________________________________
> Matterhorn mailing list
> [email protected]
> http://lists.opencastproject.org/mailman/listinfo/matterhorn
>
>
> To unsubscribe please email
> [email protected]
> _______________________________________________
>