Hi Jon,

From a bird's-eye view, all you need is for your workers and admin hosts to have access to the same shared storage (with the right permissions and properly configured, of course). That storage, let's say an NFS mount, needs to support hard links.
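If you want to check a mount yourself before starting Matterhorn, here is a minimal Python sketch. It is not Matterhorn's own detection code, only an illustration of the same idea: create a throwaway file in one directory and try to hard link it into another. The function name and the two directory arguments are hypothetical; pass whichever two directories on your shared storage you care about.

```python
import os
import tempfile


def hard_links_supported(source_dir, target_dir):
    """Return True if a file in source_dir can be hard linked into target_dir.

    Hypothetical helper for illustration; not part of Matterhorn.
    """
    # Create a throwaway file in the source directory.
    fd, source_path = tempfile.mkstemp(dir=source_dir, prefix="hardlink-test-")
    os.close(fd)
    target_path = os.path.join(target_dir, os.path.basename(source_path) + ".link")
    try:
        os.link(source_path, target_path)
        # Both names should now refer to the same file on the same device.
        return os.path.samefile(source_path, target_path)
    except OSError:
        # Raised, for example, when the two directories live on different
        # file systems or the file system does not support hard links.
        return False
    finally:
        # Clean up whatever was created, whether or not the link succeeded.
        for path in (target_path, source_path):
            if os.path.exists(path):
                os.remove(path)
```

Run it on each worker and on the admin host, since every node has to see the same result. A False here usually means the two directories are not on the same volume.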
When Matterhorn initializes, it tries to determine whether the relevant storage supports hard links. If it does, the logs will show a line like this:

" INFO (WorkspaceImpl:163) - Hard links between the working file repository and the workspace enabled"

In a properly configured shared storage setup with hard links detected, there is no copying, processing, and copying back; it all happens "in place".

Now, when it comes to configuration, these are the keys in config.properties that need to point at shared, hard-link-enabled storage:

# The path to the repository of files used during media processing.
org.opencastproject.file.repo.path=/nfs/mount/work/shared/files

# The path to the working files (recommend using fast, transient storage)
org.opencastproject.workspace.rootdir=/nfs/mount/work/shared/workspace

In addition, a couple of other keys should point at shared storage:

# The directory where the matterhorn streaming app for Red5 stores the streams
org.opencastproject.streaming.directory=/nfs/mount/distribution/streams

# The directory to store media, metadata, and attachments for download from the engage tool
org.opencastproject.download.directory=/nfs/mount/distribution/downloads

All this and more was documented by Tobias Wunden (Entwine CTO) here:

http://opencast.jira.com/wiki/display/MH/Sample+Distributed+Installation
http://opencast.jira.com/wiki/display/MH/SAMPLE+Customization

"[...] The workspace directory (org.opencastproject.workspace.rootdir) will ideally be shared amongst the nodes of a system. Any time one of the Matterhorn services needs to work on a certain piece of media, the service will first download the file to the workspace and then start processing. Now as a recording travels through the system to be processed, each media track and metadata file will be touched multiple times by different services. If the workspace is shared, download occurs only once instead of multiple times.
The biggest performance gain can be achieved by putting both the working file repository's storage directory and the shared workspace on one single network volume. This means that there will be no downloading from the working file repository to the workspace but hard linking, which can be done in a blink of an eye. [...]"

*************
Jaime Gago
Systems Engineer
[email protected]
@JaimeGagoTech

Entwine - Knowledge In Motion
www.entwinemedia.com
@entwinemedia

On Sep 4, 2012, at 1:50 PM, Jonathan Felder wrote:

> Do the workers pull the files off the nfs mount, do their thing, and then
> place the completed files back on the mount? If not, how is the latency?
> I'd expect there to be significant performance considerations after the
> number of workers increases.
>
> How is all of this configured?
>
> --
> Jon
>
> On 9/4/12 1:28 PM, Christopher Brooks wrote:
>> Jon,
>>
>> I assume Adam can answer this in more detail from our end, but we do
>> multiple workers with a single admin. The workers can request all of
>> the files from the admin and hand files back, but even better is to
>> have them all use the same shared storage. Thus a request for a file
>> is dealt with by having the worker just look on the NFS mount instead
>> of looking at a REST endpoint.
>>
>> Chris
>>
>> On Tue, 4 Sep 2012 13:06:57 -0700
>> Jonathan Felder <[email protected]> wrote:
>>
>>> Has anyone tried a configuration using multiple worker machines?
>>>
>>> How do you handle the file management? Do all of the workers utilize
>>> shared storage with the admin server or can the admin server hand
>>> files to the workers and the workers send back completed workflows?
>>>
>>> --
>>> Jon
>>> _______________________________________________
>>> Matterhorn-users mailing list
>>> [email protected]
>>> http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
