Adding restrictions on output files would mean the user might not be able to 
access the output files even though they were able to create a new experiment 
with input files. In other words, the user would wait for the experiment’s 
output only to find out they don’t have enough storage space. That is why I did 
not add any validation for output data. Is this a valid use case? Also, is 
there any way to know how much space an experiment’s output will need before 
launching the experiment?
After the previous discussion, I assumed that once GroupResourceProfile and 
UserStoragePreference (which is apparently about to be deprecated) are in play, 
Airavata would be able to choose which StoragePreference to apply, which is why 
I went with a mechanism where the API server validates the storage limit. With 
the current architecture, if it is not necessary for Airavata to track user 
storage (MFT in the future), then it probably makes more sense for the gateway 
to enforce the storage quota.

Will raise the PR for this soon.

Regards,
Vivek.

From: "Christie, Marcus Aaron" <[email protected]>
Reply-To: <[email protected]>
Date: Tuesday, August 4, 2020 at 6:34 PM
To: "[email protected]" <[email protected]>
Cc: "Wannipurage, Dimuthu Upeksha" <[email protected]>
Subject: Re: Validating user storage quota

Hi Vivek,

Yes, since the Django Portal is the user's data store, it should enforce the 
policy. If I think about the Airavata API in other contexts, I'm not sure it 
makes sense for the API client to inform the API server of how much space is 
being used. Also, this mechanism is insufficient because it doesn't take into 
account output files that get added to the data store when the experiment is 
executed.

I think it would be fine if the API provided information about the quota 
limits. Then the Django portal can query those limits and apply them.
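
A minimal sketch of that split, in Python since the portal is a Django 
application. The names QuotaExceeded, check_upload_allowed and quota_bytes are 
hypothetical and not part of the Airavata API; how the limit is fetched from 
the API is deliberately left out.

```python
class QuotaExceeded(Exception):
    """Raised when a write would take the user past their storage limit."""


def check_upload_allowed(used_bytes, incoming_bytes, quota_bytes):
    """Reject a write that would push total usage over the quota.

    quota_bytes stands for whatever limit the API reports (an assumption);
    None is treated here as "no limit configured".
    """
    if quota_bytes is None:
        return
    if used_bytes + incoming_bytes > quota_bytes:
        raise QuotaExceeded(
            f"writing {incoming_bytes} bytes would exceed the quota "
            f"({used_bytes} of {quota_bytes} bytes already used)"
        )
```

Because the data store runs the check, the same function could be called when 
output files arrive as well as when input files are uploaded.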



On Aug 3, 2020, at 6:32 PM, Bandaru, Vivek Shresta 
<[email protected]> wrote:

Hi Marcus,

Thanks for the reply.

"I think it should be up to the data store to calculate the amount of storage 
space used and apply the quota. Currently, that's the Django portal. However, 
in the future it will likely be MFT."
-> Do you mean that the validation should be done in the gateway rather than in 
Airavata? In my current implementation, Django-Portal calculates the amount of 
data used by a user and sends this to Airavata (through a new Airavata API) 
when an experiment is being created, and the validation happens there. Once the 
user reaches the storage limit (validated against the new entry, 
UserStorageQuota, in StoragePreferences), Airavata throws an exception, which 
cancels the new experiment. Is this approach fine?
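
The flow described above could be sketched roughly as follows. This is only an 
illustration under stated assumptions: directory_size stands in for the 
portal's existing size lookup, and validate_storage_quota and 
StorageQuotaExceededError are hypothetical names, not the actual Airavata API.

```python
import os


class StorageQuotaExceededError(Exception):
    """Hypothetical error raised when the user has reached the storage limit."""


def directory_size(path):
    """Gateway side: total size in bytes of the regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            if not os.path.islink(full_path):  # skip symlinks
                total += os.path.getsize(full_path)
    return total


def validate_storage_quota(used_bytes, user_storage_quota_bytes):
    """Server side: run when an experiment is created, before it is saved."""
    if used_bytes >= user_storage_quota_bytes:
        raise StorageQuotaExceededError(
            f"{used_bytes} bytes used, limit is {user_storage_quota_bytes} bytes"
        )
```

Note that this check only sees what is already in the user's directory at 
creation time; it says nothing about files written later.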

Regards,
Vivek.

From: "Christie, Marcus Aaron" <[email protected]>
Reply-To: <[email protected]>
Date: Monday, August 3, 2020 at 1:43 PM
To: Airavata Dev <[email protected]>
Cc: "Wannipurage, Dimuthu Upeksha" <[email protected]>
Subject: Re: Validating user storage quota

Hi Vivek,



On Jul 30, 2020, at 9:31 PM, Bandaru, Vivek Shresta 
<[email protected]> wrote:

Hi All,

I’ve been working on various approaches to validate the storage quota for a 
gateway user. Though the storage for experiments is taken care of by the 
gateway, the storage quota validation needs to be done in Airavata. This way, 
the gateways need not develop their own mechanisms to track the quotas for 
every user.

As far as I know, the only ways for a user to add files to the storage are 
through the Storage page and the Create a New Experiment page, where they can 
add input files.

Also, Airavata deposits output files into the experimentDataDir and these will 
count against the user's quota.

Utilizing the existing datastore.size('directory_path') API in the 
Django-Portal, the amount of space used by the user can be tracked, and when 
the gateway sends this data to Airavata through a new API, Airavata validates 
the space utilized against the storage quota specified in the Storage 
Preference.
Currently, whenever a user tries to create a new experiment, I’ve added the 
validation before the input files are added to the experiment’s directory, and 
when the storage limit is exceeded, this is how the new experiment page is 
rendered:

<image001.png>

Questions:


  1.  Since the UserStoragePreference feature isn’t being used, I’m not really 
sure if I need to use the StorageResource mentioned in UserStoragePreference. 
Once the gateways start using this (not sure if it’s only Django-Portal which 
isn’t using UserStoragePreference), the necessary code can be added.
I would like to know the team’s thoughts on this.

I wouldn't worry about UserStoragePreference. That is somewhat deprecated, to 
be replaced with a personal GroupResourceProfile.




  2.  I feel that a better way to approach this problem might be for Airavata 
to track the size of every file going into the user’s directory, instead of the 
gateway reporting the size of the user’s directory. The amount of space used by 
an individual user could be tracked in the USERS table through a new column. 
This is where I’m currently stuck.
Airavata currently uses SCPFileTransferWrapper.java for uploading the input 
files onto the compute resources. There is no means of knowing the size of the 
file being transferred.
Using the current implementation, one possible approach that I could think of 
is to download the file from the gateway to a temporary directory where 
Airavata is deployed and retrieve the file size through that URI. But this 
would involve downloading every input file of every user on every gateway to a 
temporary location on Airavata and then deleting it.
I’m currently looking for a better alternative to find the file size given a 
URI.
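
For reference, the shape of such a lookup might be sketched as below. This is 
only a sketch: size_from_uri is a hypothetical helper, it handles local paths 
and file:// URIs only, and every other scheme is left unimplemented. A remote 
scheme would need a metadata-only request (SFTP, for example, defines a stat 
request that returns the size without transferring the contents); whether 
that fits the current SCP-based transfer code is not established here.

```python
import os
from urllib.parse import unquote, urlparse


def size_from_uri(uri):
    """Return the size in bytes of the file a URI points to, without copying it.

    Only plain paths and file:// URIs are handled in this sketch.
    """
    parsed = urlparse(uri)
    if parsed.scheme in ("", "file"):
        return os.path.getsize(unquote(parsed.path))
    raise NotImplementedError(
        f"no metadata lookup implemented for scheme {parsed.scheme!r}"
    )
```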

I think it should be up to the data store to calculate the amount of storage 
space used and apply the quota. Currently, that's the Django portal. However, 
in the future it will likely be MFT.




  3.  Once MFT is integrated into Airavata, Airavata will not use the existing 
file transfer protocols. So, does it make sense to develop this tracking 
mechanism on MFT and, for now, use the above-mentioned validation mechanism 
(the gateway gets the size of the user directory)?

I think understanding how MFT factors into the future plans would definitely 
be good. Maybe Dimuthu can chime in here.





Any pointers are appreciated.
Thanks for reading.

Regards,
Vivek.
