Ruben,

> Robin,
> 
> Ruben,

I am very sorry for the mix-up; I was talking to my colleague Robin while I 
was writing :)

Leslaw




Begin forwarded message:

> From: Dr Leslaw Zieleznik <[email protected]>
> Date: May 2, 2012 4:12:21 PM GMT+01:00
> To: Matterhorn Users <[email protected]>
> Subject: Re: [Matterhorn-users] Disk storage on multiple server installation
> 
> 
> Robin,
> 
> Ruben,
> 
>> I think the standard HD size in our VMs is 8GB. As all the files relevant to 
>> MH are in the shared volume, the local HD in the machines is rather 
>> irrelevant.
> Is that enough, including the MySQL server installation on the Admin server?
> 
>> Regarding the .jdbc.url setting, in our case it's set to 
>> jdbc:mysql://<admin_url>/matterhorn .
> Yes, sure; I just phrased it too briefly.
> 
> And what about RAM: what, in your opinion, is the recommended size per 
> server, and how many vCPUs?
> 
> As for the storage (permanent and temporary), I checked my storage and found 
> that, for 'one unit' of recorded media size, the usage is as follows:
> 
>   /streams                         2x   (i.e. twice the unit -- why twice, with Wowza?)
>   /downloads                       1x   (one unit)
>   /workspace/mediapackage          1x
>   /workspace/<server_name>/static  1x   (what is static for?)
>   /files/mediapackage              1x
> 
> So for permanent storage the required space is 2x or 1x of the recorded 
> unit, and the other 3 units in workspace and files can be cleaned up.
> 
> This means that, for a projected size of recordings of say n TB, the shared 
> volume size should be at least:
>   i)  with a streaming server:      2*(n TB) + 3*max_size_of_single_recording*100
>  ii)  with progressive download:    1*(n TB) + 3*max_size_of_single_recording*100
> (in both cases assuming that 100 recordings will stay on temporary storage 
> until deleted)
> 
> Is this more or less correct? 
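If it helps, here is a back-of-the-envelope version of that arithmetic (a sketch only; the 2x/1x multipliers and the 100-recording temporary buffer are the assumptions stated above, not confirmed Matterhorn figures):

```python
# Rough shared-volume sizing based on the multipliers observed above:
# permanent storage is ~2x the recorded size with a streaming server
# (~1x with progressive download), plus ~3 temporary copies for each
# recording still awaiting cleanup.

def shared_volume_tb(projected_tb, max_recording_gb, pending=100, streaming=True):
    """Estimated minimum shared volume size, in TB."""
    permanent_tb = (2 if streaming else 1) * projected_tb
    temporary_tb = 3 * max_recording_gb * pending / 1024.0
    return permanent_tb + temporary_tb

# e.g. 10 TB of projected recordings, 2 GB max per recording:
streaming_size = shared_volume_tb(10, 2)                  # 2*10 + 3*2*100/1024
download_size = shared_volume_tb(10, 2, streaming=False)  # 1*10 + 3*2*100/1024
```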
> 
> Thanks,
> Leslaw
> 
> On May 2, 2012, at 3:18 PM, Rubén Pérez wrote:
> 
>> Leslaw,
>> 
>> I think the standard HD size in our VMs is 8GB. As all the files relevant to 
>> MH are in the shared volume, the local HD in the machines is rather 
>> irrelevant.
>> 
>> Regarding the .jdbc.url setting, in our case it's set to 
>> jdbc:mysql://<admin_url>/matterhorn . The "jdbc:mysql" part is important, 
>> and should (I guess) be adapted to the database engine used (mysql in our 
>> case). 
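For reference, the database block in config.properties would then look roughly like this (a sketch only: the host and credentials are placeholders, and the key names are one reading of the abbreviated ".jdbc.url" / ".db.user" keys mentioned in this thread -- check your own config.properties for the exact spelling):

```
org.opencastproject.db.jdbc.url=jdbc:mysql://<admin_host>/matterhorn
org.opencastproject.db.user=matterhorn
org.opencastproject.db.password=<password>
```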
>> 
>> Regards
>> Rubén
>> 
>> 2012/5/2 Dr Leslaw Zieleznik <[email protected]>
>> 
>> Ruben,
>> 
>> Thank you very much for your extensive explanation; things are getting much 
>> clearer for me now :)
>> 
>> To summarize what you are saying (please let me know if I am wrong): not 
>> only the Workspace but also the storage can be on the same (large) shared 
>> volume, accessible by all three servers.
>> In fact, because the Workspace is part of the 'opencast' folder (as in the 
>> single-server installation), we can have just one main folder on the shared 
>> volume, according to:
>>   org.opencastproject.storage.dir= 
>> {Volume_name}/opt/matterhorn/felix/work/opencast
>> 
>> If my understanding is correct, the installation is very simple.
>> 
>> My next question, then, is how much disk space you allocate in the VMs per 
>> server, I mean for Admin, Worker and Engage?
>> 
>> 
>> And as for your answer to question 4:
>> 
>>> In practice, if MH was compiled with the "service-registry" profile, then 
>>> you don't need to set this parameter, as the DB will be accessed with the 
>>> credentials specified in the keys ".db.user", ".db.password" and so on. If, 
>>> in turn, MH was compiled with the "service-registry-stub" profile, then you 
>>> need to set this URL to one of the other servers that was compiled with 
>>> the "service-registry" profile and thus has direct access to the database.
>> 
>> 
>> So the link between the servers is via credentials, with the .jdbc.url set 
>> to //admin_IP_address/matterhorn when MySQL is installed on the Admin server?
>> 
>> Thanks again,
>> Leslaw
>> 
>> 
>> 
>> On May 2, 2012, at 12:34 PM, Rubén Pérez wrote:
>> 
>>> Leslaw,
>>> 
>>> I'll try to answer your questions inline. 
>>> 
>>> Regards
>>> 
>>> 2012/5/2 Leslaw Zieleznik <[email protected]>
>>> 
>>> I am trying to work out the disk storage for the multiple-server 
>>> installation, and I am unclear about a few places in the 'Install Across 
>>> Multiple Servers V.1.3' document.
>>> 
>>> 1. I understand the Workspace should be installed on a shared volume, with
>>>    org.opencastproject.workspace.rootdir pointing to this volume.
>>>    The question is: what is the recommended size of this volume in 
>>>    proportion to the projected size of the media storage?
>>>    I understand the Workspace is temporary storage?
>>> 
>>> The workspace is (as its name suggests) the "space" where all the "work" 
>>> takes place; in practice, it is the location where the files to process are 
>>> copied and where the resulting files are created before being moved to 
>>> their final locations. It is "temporary" because the files there are 
>>> supposed to be disposable once they have been copied to their final 
>>> destinations, but that is not done automatically by each service -- you 
>>> have to explicitly delete the unneeded files using the "cleanup" operation 
>>> in the workflow. 
>>> It should be on a shared volume if possible, because otherwise the files 
>>> are copied via HTTP, which is slower than using a distributed file system 
>>> such as NFS.
>>> I cannot tell you how big that storage should be in comparison with the 
>>> media storage, because we have been using the same volume for the workspace 
>>> *and* the storage, i.e. we don't mount the workspace and the shared storage 
>>> separately. However, I guess it depends on how many simultaneous jobs run 
>>> in the installation and on the size of the files involved. As the files are 
>>> not erased until the end of the workflow, you need to account for the full 
>>> size of all the source files, the space for all the resulting files 
>>> (including intermediate steps), and perhaps a little extra space for 
>>> temporary files that ffmpeg may need (I'm not so sure of this). 
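As an aside, the "cleanup" operation mentioned above goes at the end of a workflow definition; a rough sketch (the attribute and configuration names here are illustrative -- check the cleanup operation handler's documentation for your Matterhorn version):

```
<operation id="cleanup"
           fail-on-error="false"
           description="Remove temporary processing artifacts">
  <configurations>
    <!-- illustrative: flavors to keep; everything else is deleted -->
    <configuration key="preserve-flavors">security/*</configuration>
  </configurations>
</operation>
```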
>>>  
>>> 2. The document suggests that the storage dir location should be on every 
>>>    server:
>>>      org.opencastproject.storage.dir=/opt/matterhorn/felix/work/opencast
>>> 
>>>    As I understand it, the subfolders /downloads and /streams are created 
>>>    in the final stage of processing, so these will be created on the 
>>>    Engage server's storage disk only?
>>>    If that is right, should the largest disk for media storage then be 
>>>    allocated on the Engage server?
>>>    Or should org.opencastproject.storage.dir perhaps be on the shared 
>>>    volume?
>>> 
>>> I understand this should be on a shared volume, because all the relevant 
>>> directories are subdirectories of this one by default. That way, these 
>>> folders can be accessed from all the machines in the installation, even 
>>> though they are only relevant for the Engage player.
>>> Those folders are on the shared volume in Vigo, but this has to do with the 
>>> fact that we never create VMs with large amounts of disk. Instead, we have 
>>> one or more dedicated storage servers, and the machines that need extra 
>>> space mount a volume from them. So, instead of assigning separate volumes 
>>> to each Matterhorn server, we assign one very big volume to contain all 
>>> the MH-relevant folders. Other scenarios, though, may call for different 
>>> approaches. 
>>>  
>>> 3. Where should the file repository used during media processing be 
>>>    located? That is, for
>>>      org.opencastproject.file.repo.path=${org.opencastproject.storage.dir}/files
>>>    should this be on every server or on the shared volume?
>>> 
>>> In my understanding, this is not only used during media processing; it is 
>>> more like the "library" where all the processed content is stored (others, 
>>> please correct me if I'm wrong). I think this should be on the shared 
>>> volume, because its contents are relevant for all the machines in the 
>>> installation. If it is not shared, the files are retrieved via HTTP, which 
>>> is slower. 
>>>  
>>> 4. To which server should the service registry be allocated (this is 
>>>    commented out in the single-server installation)?
>>>     
>>> org.opencastproject.serviceregistry.url=${org.opencastproject.server.url}/serviceregistry
>>> 
>>> The service registry doesn't have to do with storage but with the database. 
>>> One of the servers in the installation has to run a database, and the 
>>> others need to access that DB remotely, because different services require 
>>> it. I don't know all the specific cases, but in the case of service 
>>> registration you can still have a server in your installation without 
>>> granting it direct access to the database. Instead, it can use a REST 
>>> service for that, but it needs to know the URL of one of the servers in 
>>> the installation that does have direct access to the database. In practice, 
>>> if MH was compiled with the "service-registry" profile, then you don't 
>>> need to set this parameter, as the DB will be accessed with the credentials 
>>> specified in the keys ".db.user", ".db.password" and so on. If, in turn, 
>>> MH was compiled with the "service-registry-stub" profile, then you need to 
>>> set this URL to one of the other servers that was compiled with the 
>>> "service-registry" profile and thus has direct access to the database.
>>> I know this explanation is a bit obscure, but I hope you understood me. 
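A compact restatement of the two cases (my summary of the explanation above, not official documentation): a node built with the "service-registry-stub" profile needs this key uncommented and pointed at a node that does have direct DB access, e.g.

```
org.opencastproject.serviceregistry.url=http://<admin_host>:8080/serviceregistry
```

whereas a node built with the "service-registry" profile leaves it commented out and uses the ".db.user" / ".db.password" credentials directly.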
>>>  
>>> Many thanks in advance,
>>> Leslaw
>>> _______________________________________________
>>> Matterhorn-users mailing list
>>> [email protected]
>>> http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
> 

+44 (0)1865 483973
Fax: +44 (0)1865 483073
======================
