Hi Steven,

You'll have incoming network traffic while files are ingested to your admin node. You'll have outgoing network traffic as the admin stores those files on the SAN. The workers then hit the SAN to download the files, work on them, and push them back to the SAN so that engage can serve them up to the world.
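To make the round trip concrete, here is a minimal sketch of that file flow using stand-in local directories in /tmp for the admin inbox, the SAN share, and a worker scratch area. All paths and file names are hypothetical; in a real deployment the "SAN" directory would be an NFS (or GFS2) mount shared by admin, workers, and engage.

```shell
#!/bin/sh
set -e
# Stand-in directories (hypothetical; a real SAN share would be a network mount).
SAN=/tmp/demo-san; ADMIN=/tmp/demo-admin; WORKER=/tmp/demo-worker
mkdir -p "$SAN" "$ADMIN" "$WORKER"

echo "raw video" > "$ADMIN/lecture.raw"        # 1. ingest lands on the admin
cp "$ADMIN/lecture.raw" "$SAN/"                # 2. admin stores it on the SAN
cp "$SAN/lecture.raw" "$WORKER/"               # 3. worker pulls it down to work on
echo "encoded video" > "$WORKER/lecture.mp4"   # 4. worker encodes locally
cp "$WORKER/lecture.mp4" "$SAN/"               # 5. push the result back for engage
ls "$SAN"
```

Steps 3 and 5 are the admin/worker hops discussed below; the engage-bound file in step 5 is the smaller, processed one.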
The big bottleneck for us was that hop between the admin and worker machines, since the files being pushed to engage are smaller (processed) videos. If you're not using NFS but you want direct storage, have you considered merging your admin and worker machines? If you did this you could dramatically cut network traffic, but you need to make sure that your workers are set up so that they don't max out the number of available cores and thrash the admin. In our case, the disks attached to our VMs are on the same SAN as everything else, so we don't bother with that and do everything over a network share. For production we were looking at RHEL GFS2, but this distributed filesystem doesn't seem suitable under high network load (e.g. we can cause it to hang a machine).

Regards,
Chris

On Thu, 01 Sep 2011 08:37:29 -0500 Walter Schwarz <[email protected]> wrote:

> The long version:
> It might seem counterintuitive, but I've seen MORE shared disk I/O
> when workers used local storage. I believe this is because an
> additional copy of the workspace needs to be maintained for each
> worker on its local drive. I see peak disk I/O when a media
> package is first accepted and ingest starts: the zip file is unpacked
> and the pieces are put in their various areas to be processed. After
> that point most of what the workers are doing must be in memory, with
> minimal file reads and writes, as I see comparatively very little disk
> I/O.
>
> The short version:
> Using local disk for workers seems to require more disk activity
> during the times when disk I/O is already in high demand.
>
> When I configured workers to use their own disk, I deployed them with
> the -Pworker,serviceregistry,workspace-stub profile and configured
> org.opencastproject.workspace.rootdir to be local disk, like
> /mnt/encoding.
>
> Steven M Lichti <[email protected]> wrote:
>
> > I want to do the encoding on my worker VMs on local drives.
> > I received a 50GB /dev/sdb1 device on each of two workers, mounted at
> > /mnt/encoding.
> >
> > I'm going through my configuration files now, and I would like to
> > have the encoding workspace point to /mnt/encoding. However, since
> > only the workers appear to need this (and PLEASE correct me if I'm
> > wrong), only the workers will need to have this setting:
> > org.opencastproject.storage.dir=/opt/matterhorn/content
> >
> > /opt/matterhorn/content is an NFS link to shared storage, but it's
> > crazy-slow to encode across the network.
> >
> > My goal is to do the encoding locally (on the workers) in
> > /mnt/encoding, then publish to /opt/matterhorn/content, where the
> > distribution/engage server can get at the files.
> >
> > There is also the setting:
> > org.opencastproject.download.directory=${org.opencastproject.storage.dir}/downloads
> >
> > If I change the org.opencastproject.storage.dir variable on the
> > workers to /mnt/encoding, then set the downloads directory to
> > /opt/matterhorn/content/downloads, would that do the trick?
> >
> > Please let me know what you all think…
> >
> > Thank you!
> >
> > --Steven.

--
Christopher Brooks, BSc, MSc
ARIES Laboratory, University of Saskatchewan
Web: http://www.cs.usask.ca/~cab938
Phone: 1.306.966.1442
Mail: Advanced Research in Intelligent Educational Systems Laboratory
      Department of Computer Science
      University of Saskatchewan
      176 Thorvaldson Building
      110 Science Place
      Saskatoon, SK S7N 5C9

_______________________________________________
Matterhorn-users mailing list
[email protected]
http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
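Putting Walter's answer together with Steven's question, a worker-side configuration along his lines might look like the sketch below. The property names and the deployment profile are the ones quoted in this thread; the exact paths (/mnt/encoding, /opt/matterhorn/content) are Steven's, and whether storage.dir itself should also point at local disk (Steven's proposal) or stay on the share (Walter's setup) is worth verifying against your own config before deploying.

```properties
# Worker node config sketch (property names as quoted in this thread).
# Deployed with the -Pworker,serviceregistry,workspace-stub profile.

# Scratch workspace on fast local disk (Walter's approach):
org.opencastproject.workspace.rootdir=/mnt/encoding

# Shared storage on the NFS mount, where engage can reach the results:
org.opencastproject.storage.dir=/opt/matterhorn/content
org.opencastproject.download.directory=${org.opencastproject.storage.dir}/downloads
```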
