Thanks Chris and Walter, I made the changes to see what would happen. At the moment, file processing appears to be speedier than before, so only the overall time and network bandwidth numbers will tell the difference.
This shouldn't be a consideration once our new encoding server is available, but we need to run these experiments now to figure all of this out for the coming weeks. Thanks, both!

--Steven.

--
Steven Lichti
Academic Technologies
Northwestern University
[email protected]
(847) 467-7805

On 9/1/11 10:11 AM, "Christopher Brooks" <[email protected]> wrote:

>Hi Steven,
>
>You'll have incoming network traffic while files are ingested to your
>admin. You'll have outgoing network traffic as the admin stores these
>files on a SAN. Then the workers will hit the SAN to download the
>files to work on, then push those files back to the SAN after they have
>finished working on them, so that engage can serve them up to the world.
>
>The big bottleneck for us was that hit between the admin and worker
>machines, since the files being pushed to engage are smaller (processed)
>videos.
>
>If you're not using NFS but you want direct storage, have you
>considered merging your admin and worker machines? If you did this you
>could dramatically cut network traffic, but you would need to make sure
>that your workers are set up so that they don't max out the number of
>cores available and thrash the admin.
>
>In our case, the disks attached to our VMs are on the same SAN as
>everything else, so we don't bother with that and do everything over a
>network share. For production we were looking at RHEL GFS2, but this
>distributed filesystem doesn't seem suitable for times of high network
>load (e.g. we can cause it to hang a machine).
>
>Regards,
>
>Chris
>
>On Thu, 01 Sep 2011 08:37:29 -0500
>Walter Schwarz <[email protected]> wrote:
>
>> The long version:
>> It might seem counterintuitive, but I've seen MORE shared disk I/O
>> when workers used local storage. I believe this is because an
>> additional copy of the workspace needs to be maintained for each
>> worker on its local drive.
>> I see that peak disk I/O comes when a media
>> package is first accepted and ingest starts: the zip file is unpacked
>> and the pieces are put in their various areas to be processed. After
>> that point, most of what the workers are doing must be in memory, with
>> minimal file reads and writes, since I see comparatively very little
>> disk I/O.
>>
>> The short version:
>> Using local disk for workers seems to require more disk activity
>> during the times when disk I/O is already in high demand.
>>
>> When I configured workers to use their own disk, I deployed them with
>> the -Pworker,serviceregistry,workspace-stub profile and configured
>> org.opencastproject.workspace.rootdir to be local disk, like
>> /mnt/encoding.
>>
>> > Steven M Lichti <[email protected]> wrote:
>> >
>> > I want to do the encoding on my worker VMs on local drives. I
>> > received a 50GB /dev/sdb1 device on each of two workers, mounted at
>> > /mnt/encoding.
>> >
>> > I'm going through my configuration files now, and I would like to
>> > have the encoding workspace point to /mnt/encoding. However, since
>> > only the workers appear to need this (and PLEASE correct me if I'm
>> > wrong), only the workers will need to have this setting:
>> > org.opencastproject.storage.dir=/opt/matterhorn/content
>> >
>> > /opt/matterhorn/content is an NFS link to shared storage, but it's
>> > crazy-slow to encode across the network.
>> >
>> > My goal is to do the encoding locally (on the workers) in
>> > /mnt/encoding, then publish to /opt/matterhorn/content, where the
>> > distribution/engage server can get at the files.
>> >
>> > There is also this setting:
>> > org.opencastproject.download.directory=${org.opencastproject.storage.dir}/downloads
>> >
>> > If I change the org.opencastproject.storage.dir variable on the
>> > workers to /mnt/encoding, then set the downloads directory to
>> > /opt/matterhorn/content/downloads, would that do the trick?
>> >
>> > Please let me know what you all think…
>> >
>> > Thank you!
>> >
>> > --Steven.
>
>
>--
>Christopher Brooks, BSc, MSc
>ARIES Laboratory, University of Saskatchewan
>
>Web: http://www.cs.usask.ca/~cab938
>Phone: 1.306.966.1442
>Mail: Advanced Research in Intelligent Educational Systems Laboratory
>      Department of Computer Science
>      University of Saskatchewan
>      176 Thorvaldson Building
>      110 Science Place
>      Saskatoon, SK
>      S7N 5C9
>_______________________________________________
>Matterhorn-users mailing list
>[email protected]
>http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
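Pulling the thread together, a worker node's configuration under Walter's approach might look roughly like the fragment below. This is only a sketch: the property keys and paths are the ones quoted in the messages above, but whether the download directory can safely point back at the NFS share while the workspace stays on local disk is exactly what the experiments in this thread are meant to settle.

```properties
# Worker node config sketch (keys as discussed above; values are examples).
# Deploy the worker with the profile Walter mentions:
#   -Pworker,serviceregistry,workspace-stub

# Keep the working copies on the worker's local disk so encoding I/O
# stays off the network:
org.opencastproject.workspace.rootdir=/mnt/encoding

# Shared storage remains on the NFS mount so the distribution/engage
# server can reach published files:
org.opencastproject.storage.dir=/opt/matterhorn/content
org.opencastproject.download.directory=${org.opencastproject.storage.dir}/downloads
```

Note that Walter's setup changes only workspace.rootdir and leaves storage.dir pointing at shared storage, which differs from Steven's proposal of pointing storage.dir itself at /mnt/encoding.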
