Ruben,

I am very sorry for the cock-up, but I was talking to my colleague Robin while I was writing :)

Leslaw

Begin forwarded message:

> From: Dr Leslaw Zieleznik <[email protected]>
> Date: May 2, 2012 4:12:21 PM GMT+01:00
> To: Matterhorn Users <[email protected]>
> Subject: Re: [Matterhorn-users] Disk storage on multiple server installation
>
> Robin,
>
>> I think the standard HD size in our VMs is 8GB. As all the files relevant to
>> MH are in the shared volume, the local HD in the machines is rather
>> irrelevant.
>
> Is that enough, including the MySQL server installation on the Admin server?
>
>> Regarding the .jdbc.url setting, in our case it's set to
>> jdbc:mysql://<admin_url>/matterhorn .
>
> Yes, sure; I just put it too briefly.
>
> And what about the RAM: what, in your opinion, is the recommended size per
> server, and also the number of vCPUs?
>
> As for the storage (permanent and temporary), I checked my storage and found
> that, for 'one unit' of recorded media size, the usage is as follows:
>
>   /streams                        2x (i.e. twice the unit) -- I don't
>                                      understand why twice (with Wowza)?
>   /downloads                      1x (one unit)
>   /workspace/mediapackage         1x
>   /workspace/<server_name>/static 1x -- what is the static for?
>   /files/mediapackage             1x
>
> So for the permanent storage the required space is 2x or 1x of the recorded
> unit, and the other 3 additional units in workspace and files can be cleaned
> up.
>
> This means that, for a projected size of recordings of, say, n TB, the
> shared volume size should be:
>
> i)  with the streaming server: at least 2*(n TB) plus
>     3*max_size_of_single_recording*100 (assuming that 100 recordings will
>     stay on temporary storage until deleted)
> ii) with progressive download: at least 1*(n TB) plus the same
>     3*max_size_of_single_recording*100 allowance
>
> Is this more or less correct?
>
> Thanks,
> Leslaw
>
> On May 2, 2012, at 3:18 PM, Rubén Pérez wrote:
>
>> Leslaw,
>>
>> I think the standard HD size in our VMs is 8GB.
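Leslaw's two sizing cases can be written as a quick back-of-the-envelope calculation (a rough sketch only; the 2x/1x multipliers and the backlog of 100 undeleted recordings are the assumptions stated in his mail, not measured values):

```python
def shared_volume_tb(projected_tb, max_recording_tb,
                     pending=100, streaming=True):
    """Rough shared-volume size per the rule of thumb in this thread:
    streaming (Wowza) keeps ~2 units per recording, progressive
    download keeps 1, and about 3 temporary units per recording sit
    in /workspace and /files until the "cleanup" workflow operation
    removes them."""
    permanent = (2 if streaming else 1) * projected_tb
    temporary = 3 * max_recording_tb * pending
    return permanent + temporary

# e.g. 10 TB of projected recordings, 2 GB (0.002 TB) per recording:
print(shared_volume_tb(10, 0.002))                   # streaming: ~20.6 TB
print(shared_volume_tb(10, 0.002, streaming=False))  # download:  ~10.6 TB
```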
>> As all the files relevant to
>> MH are in the shared volume, the local HD in the machines is rather
>> irrelevant.
>>
>> Regarding the .jdbc.url setting, in our case it's set to
>> jdbc:mysql://<admin_url>/matterhorn . The "jdbc:mysql" part is important,
>> and should (I guess) be adapted to the database engine used (MySQL in our
>> case).
>>
>> Regards
>> Rubén
>>
>> 2012/5/2 Dr Leslaw Zieleznik <[email protected]>
>>
>> Ruben,
>>
>> Thank you very much for your extensive explanation; things are getting much
>> clearer for me now :)
>>
>> To summarize what you are saying (please let me know if I am wrong): not
>> only the Workspace but also the storage can be on the same (large) shared
>> volume accessible by all three servers.
>> In fact, because the Workspace is part of the 'opencast' folder (as in the
>> single-server installation), we can have only one main folder on the shared
>> volume, according to:
>>
>>   org.opencastproject.storage.dir={Volume_name}/opt/matterhorn/felix/work/opencast
>>
>> If my understanding is correct, the installation is very simple.
>>
>> My new question is then: how much disk space do you allocate in the VMs per
>> server, I mean for Admin, Worker and Engage?
>>
>> And as for your answer to question 4:
>>
>>> In practice, if MH was compiled with the profile "service-registry", then
>>> you don't need to set this parameter, as the DB will be accessed with the
>>> credentials specified in the keys ".db.user", ".db.password" and so on. If,
>>> in turn, MH was compiled with the profile "service-registry-stub", then you
>>> need to set up this URL to one of the other servers that was compiled with
>>> the "service-registry" profile, and thus have direct access to the database.
>>
>> So the link between the servers is via credentials, with the .jdbc.url set
>> to //admin_IP_address/matterhorn, when MySQL is installed on the Admin
>> server?
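For reference, the database-related keys Rubén alludes to would look roughly like this in config.properties (a sketch only; the host name and password are placeholders, and the exact key names should be checked against the config.properties shipped with the release):

```properties
# Shared storage root (same NFS-mounted path on every server)
org.opencastproject.storage.dir=/mnt/matterhorn-share/opencast

# Database access (MySQL running on the admin server)
org.opencastproject.db.vendor=MySQL
org.opencastproject.db.jdbc.driver=com.mysql.jdbc.Driver
org.opencastproject.db.jdbc.url=jdbc:mysql://admin.example.org/matterhorn
org.opencastproject.db.jdbc.user=matterhorn
org.opencastproject.db.jdbc.pass=CHANGE_ME
```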
>>
>> Thanks again,
>> Leslaw
>>
>> On May 2, 2012, at 12:34 PM, Rubén Pérez wrote:
>>
>>> Leslaw,
>>>
>>> I'll try to answer your questions inline.
>>>
>>> Regards
>>>
>>> 2012/5/2 Leslaw Zieleznik <[email protected]>
>>>
>>> I am trying to work out the disk storage for the multiple-server
>>> installation, and I am unclear in a few places of the 'Install Across
>>> Multiple Servers V.1.3' document.
>>>
>>> 1. I understand the Workspace should be installed on a shared volume, with
>>>    org.opencastproject.workspace.rootdir pointing to this volume.
>>>    The question is, what is the recommended size of this volume in
>>>    proportion to the projected size of the media storage?
>>>    I understand that the Workspace is temporary storage?
>>>
>>> The workspace is (as its name points out) the "space" where all the "work"
>>> takes place; in practice, the location where the files to process are
>>> copied and where the resulting files are created before being copied to
>>> their final locations. It is "temporary" because the files there are
>>> supposed to be disposable once they have been copied to their final
>>> destinations, but that's not done automatically by each service -- you
>>> have to explicitly delete the unneeded files using the "cleanup" operation
>>> in the workflow.
>>> It should be on a shared volume if possible, because otherwise the files
>>> are copied via HTTP, which is slower than using a distributed file system
>>> such as NFS.
>>> I cannot tell you how big that storage should be in comparison with the
>>> media storage, because we have been using the same volume for the
>>> workspace *and* the storage, i.e. we don't mount the workspace and the
>>> shared storage separately. However, I guess it depends on how many
>>> simultaneous jobs run in the installation, and on the size of the files
>>> involved.
>>> As the files are not erased till the end of the workflow, you
>>> need to account for the whole size of all the source files, the space for
>>> all the resulting files (including the intermediate steps), and perhaps a
>>> little extra space for temporary files that ffmpeg may need (I'm not so
>>> sure about this).
>>>
>>> 2. The document suggests that the storage dir location should be on every
>>>    server:
>>>
>>>    org.opencastproject.storage.dir=/opt/matterhorn/felix/work/opencast
>>>
>>>    As I understand it, the subfolders /downloads and /streams are created
>>>    in the final stage of processing; will these therefore be created on
>>>    the Engage server's storage disk only?
>>>    If so, should the largest disk for media storage be allocated on the
>>>    Engage server?
>>>    Or perhaps org.opencastproject.storage.dir should be on the shared
>>>    volume?
>>>
>>> I understand this should be on a shared volume, because all the relevant
>>> directories are subdirectories of this one by default. That way, those
>>> folders can be accessed from all the machines in the installation, even
>>> though they are only relevant for the Engage player.
>>> Those folders are in the shared volume in Vigo, but this has to do with
>>> the fact that we never create VMs with large amounts of disk. Instead, we
>>> have one or more dedicated storage servers, and the machines that need
>>> extra space mount a volume from them. Then, instead of assigning separate
>>> volumes to each Matterhorn server, we assign one very big volume to
>>> contain all the MH-relevant folders. Other scenarios, though, may call for
>>> different approaches.
>>>
>>> 3. Where should the repository file used during media processing be
>>>    allocated?
>>>
>>>    org.opencastproject.file.repo.path=${org.opencastproject.storage.dir}/files
>>>
>>>    Should this be on every server or on the shared volume?
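The "one very big shared volume" layout Rubén describes can be sketched as a single NFS mount on every node, with all the MH-relevant directories hanging under it (the server name, export path and mount point here are made up for illustration; only the file.repo.path default is taken from the thread itself):

```properties
# /etc/fstab on each Matterhorn node (admin, worker, engage) -- one big
# shared volume holds the storage dir, workspace and file repository:
#
#   storage.example.org:/export/matterhorn  /mnt/matterhorn-share  nfs  rw,hard  0 0
#
# and in config.properties everything hangs under that mount:
org.opencastproject.storage.dir=/mnt/matterhorn-share/opencast
org.opencastproject.workspace.rootdir=${org.opencastproject.storage.dir}/workspace
org.opencastproject.file.repo.path=${org.opencastproject.storage.dir}/files
```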
>>>
>>> In my understanding, this is not only used during media processing; it is
>>> more like the "library" where all the content that has been processed is
>>> stored (others, please correct me if I'm wrong). I think this should be on
>>> the shared volume, because its contents are relevant for all the different
>>> machines of the installation. If it is not shared, the files are retrieved
>>> via HTTP, which is slower.
>>>
>>> 4. To which server should the service registry be allocated (this is
>>>    commented out in the single-server installation)?
>>>
>>>    org.opencastproject.serviceregistry.url=${org.opencastproject.server.url}/serviceregistry
>>>
>>> The service registry has to do not with the storage but with the database.
>>> One of the servers in the installation has to run a database, and the
>>> others need to access that DB remotely, because different services require
>>> that. I don't know the specific cases, but in the case of the service
>>> registration, you can still have a server in your installation without
>>> granting it access to the database. Instead, it can use a REST service for
>>> that, but it needs to know the URL of one of the servers in the
>>> installation that has direct access to the database. In practice, if MH
>>> was compiled with the profile "service-registry", then you don't need to
>>> set this parameter, as the DB will be accessed with the credentials
>>> specified in the keys ".db.user", ".db.password" and so on. If, in turn,
>>> MH was compiled with the profile "service-registry-stub", then you need to
>>> set this URL to one of the other servers that was compiled with the
>>> "service-registry" profile and thus has direct access to the database.
>>> I know this explanation is a bit obscure, but I hope you understood me.
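In config.properties terms, Rubén's two cases would boil down to something like the following (a sketch; admin.example.org stands in for whichever server has direct DB access, and the port and credentials are placeholders):

```properties
# On a node built with the "service-registry-stub" profile (no direct DB
# access): point the service registry at a node that does have DB access.
org.opencastproject.serviceregistry.url=http://admin.example.org:8080/serviceregistry

# On a node built with the "service-registry" profile, leave the key above
# commented out and set the DB credentials instead, e.g.:
#org.opencastproject.db.jdbc.url=jdbc:mysql://admin.example.org/matterhorn
#org.opencastproject.db.jdbc.user=matterhorn
#org.opencastproject.db.jdbc.pass=CHANGE_ME
```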
>>>
>>> Many thanks in advance,
>>> Leslaw
>>>
_______________________________________________
Matterhorn-users mailing list
[email protected]
http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
