Hi Steven,

You'll have incoming network traffic while files are ingested to your
admin. You'll have outgoing network traffic as the admin stores these
files on a SAN.  Then the workers will hit the SAN to download the
files to work on, then push those files back to the SAN after they have
finished working on them so that engage can serve them up to the world.
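
In short, the flow looks like:

  ingest -> admin -> SAN -> workers (pull, process, push) -> SAN
    -> engage -> viewers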

The big bottleneck for us was the traffic between the admin and worker
machines, since the files being pushed to engage are smaller
(processed) videos.

If you'd rather avoid NFS in favour of direct storage, have you
considered merging your admin and worker machines?  If you did this you
could dramatically cut network traffic, but you'd need to make sure
your workers are set up so that they don't max out the available cores
and thrash the admin.
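
For example (just a sketch -- the exact encoding-profile syntax
depends on your Matterhorn version), you could cap each encode at a
couple of cores with ffmpeg's -threads flag, leaving the rest free for
the admin:

  ffmpeg -threads 2 -i input.mp4 [your usual encoding options] output.mp4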

In our case, disks attached to our VMs are on the same SAN as
everything else, so we don't bother with that and do everything over a
network share.  For production we were looking at RHEL GFS2, but this
cluster filesystem doesn't seem to cope well under high network load
(e.g. we can cause it to hang a machine).

Regards,

Chris

On Thu, 01 Sep 2011 08:37:29 -0500
Walter Schwarz <[email protected]> wrote:

> The long version:
> It might seem counterintuitive but I've seen MORE shared disk I/O
> when workers used local storage.  I believe this is because an
> additional copy of the workspace needs to be maintained for each
> worker on their local drive.  Peak disk I/O comes when a media
> package is first accepted and ingest starts: the zip file is unpacked
> and the pieces are put in their various areas to be processed.  After
> that point, most of what the workers are doing must be in memory,
> with minimal file reads and writes, as I see comparatively very
> little disk I/O.
> 
> The short version:
> Using local disk for workers seems to require more disk activity
> during the times when disk I/O is already in high demand.
> 
> When I configured workers to use their own disk I deployed them with
> the -Pworker,serviceregistry,workspace-stub profile and configured 
> org.opencastproject.workspace.rootdir to be local disk, like 
> /mnt/encoding.
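> 
> Putting that together, a sketch of a worker's setup (assuming the
> key lives in your config.properties):
> 
>   org.opencastproject.workspace.rootdir=/mnt/encoding
> 
> with the worker built using something like:
> 
>   mvn install -Pworker,serviceregistry,workspace-stub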
> 
> 
> 
> > Steven M Lichti <[email protected]> wrote:
> > 
> > I want to do the encoding on my worker VMs on local drives. I 
> > received a 50GB /dev/sdb1 device on each of two workers, mounted at 
> > /mnt/encoding.
> > 
> > I'm going through my configuration files now, and I would like to 
> > have the encoding workspace point to /mnt/encoding. However, since 
> > only the workers appear to need this (and PLEASE correct me if I'm 
> > wrong), only the workers will need to have this setting:
> > org.opencastproject.storage.dir=/opt/matterhorn/content
> > 
> > /opt/matterhorn/content is a NFS link to shared storage, but it's 
> > crazy-slow to encode across the network.
> > 
> > My goal is to do the encoding locally (on the workers) in
> > /mnt/encoding, then publish to /opt/matterhorn/content, where the
> > distribution/engage server can get at the files.
> > 
> > There is also the setting:
> > org.opencastproject.download.directory=${org.opencastproject.storage.dir}/downloads
> > 
> > If I change the org.opencastproject.storage.dir variable on the
> > workers to /mnt/encoding, then set the downloads directory to
> > /opt/matterhorn/content/downloads, would that do the trick?
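> > 
> > In other words, each worker's config would read (a sketch of the
> > proposal, not tested):
> > 
> >   org.opencastproject.storage.dir=/mnt/encoding
> >   org.opencastproject.download.directory=/opt/matterhorn/content/downloads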
> > 
> > Please let me know what you all think…
> > 
> > Thank you!
> > 
> > --Steven.



-- 
Christopher Brooks, BSc, MSc
ARIES Laboratory, University of Saskatchewan

Web: http://www.cs.usask.ca/~cab938
Phone: 1.306.966.1442
Mail: Advanced Research in Intelligent Educational Systems Laboratory
     Department of Computer Science
     University of Saskatchewan
     176 Thorvaldson Building
     110 Science Place
     Saskatoon, SK
     S7N 5C9
_______________________________________________
Matterhorn-users mailing list
[email protected]
http://lists.opencastproject.org/mailman/listinfo/matterhorn-users