Using IDs is challenging in case of migration to new versions. Experiments and client APIs (with hardcoded IDs) may still be using old IDs, so if the hosted service needs to migrate to a newer database or structure for some reason, we will either see unnecessary failures or have to migrate the old database every time. I am still working through the App Catalog API to provide better insight, but I definitely feel the need for a step-by-step guide to registering an application. RegisterSampleApplications has a lot of applications together, which makes the class bulky and can be confusing for users.

Thanks
Raminder

On Jul 25, 2014, at 11:36 AM, Lahiru Gunathilake <[email protected]> wrote:

> Hi Suresh,
>
> I think both the points below is not something I suggested to change. IMHO
> ComputingResources,Application Interfaces, Deployment Descriptors are nice
> and includes everything and associated nicely but not exposed to use in a
> easy way.
>
> I think my concerns are simply about unwanted Module, API exposing Ids, and
> no easy way to get above three documents rather than traversing one by one.
>
> Regards
> Lahiru
>
> I was looking through the API and here are some more thoughts on this:
>
> AppCatalog is designed treating Airavata as a shared resource among a common
> set of users. For instance, you go onto a supercomputer you see a list of
> applications. You leave to users how to use them. Similar is our deployment
> vs interface. A typical scientific computing application will have dozens of
> parameters, any given interface will only expose a subset of them or all
> them. But it will be bad to enforce for every deployment there is an
> interface. Similar to when a application is deployed, you do no anticipate
> how it will be used, we do not want to tightly bound deployments and
> interfaces.
>
> I think you are critiquing in the right direction but your perspective is a
> single user single application (one deployment set per interface). I agree,
> the design did not take into consideration how to do the hello world easily.
> So step back, think about how the design should be when you get dozen’s of
> applications and a hand full for interfaces for one deployment and so forth.
> If you still think the design is complex to handle such scenarios, lets
> revisit. I am not talking about corner cases, I am talking about real-world
> production usage vs ease of use for hello world developer testing. A good way
> for you to understand is, login to any HPC machine and see how the system is
> described (once on a semi-static web page (compute resource description),
> uniquely identified modules (deployment description) and leave the interfaces
> to users (interface descriptions).
> Suresh
>
> On Jul 25, 2014, at 8:37 AM, Suresh Marru <[email protected]> wrote:
>
> > Lahiru,
> >
> > Glad that you took time to sink in the design and provide suggestions. I
> > see that all of them are hitting the similar issue, so let me respond a
> > summary one:
> >
> > The suggestions you have are not really the design but usability and ease
> > of debugging. The Id’s and modules are there so we disambiguate usage. In
> > the previous approaches (blame me) I exactly always advocated for what you
> > ask below and we have seen over time, serious production usage suffers from
> > it. For instance in a ParamChem workflow where Gaussian is wrapped in 10
> > different ways, its very error prone. We should not push the burden onto
> > users to do the right thing, but the system should enforce it. As an
> > example, if google docs uses names, imagine the commonly used terms like
> > “meeting notes” or resume. There are ID’s for a reason. But as a user you
> > never deal with ID’s. But if you use google docs API, you very well needs
> > to work with ID’s.
> >
> > Same here, I like the other suggestion you made in a different context,
> > lets develop tooling to deal with debugging. And if there are UI’s to
> > describe documents, you never work with ID’s. Take
> > RegisterSampleApplications as an example, I do not think you want simper
> > than that. There will be additional steps. But app catalog will be register
> > once use 1000’s of times. For this, I think its ok to spend extra 5 minutes
> > to register an application to make sure users use exactly what they intend
> > to instead of some wild card guessing.
> >
> > Suresh
> >
> > On Jul 25, 2014, at 3:43 AM, Lahiru Gunathilake <[email protected]> wrote:
> >
> >> Hi All,
> >>
> >> I think app-catalog design is a well-thought comprehensive design and I
> >> would like to propose following suggestions. Please correct me if I am
> >> wrong.
> >>
> >> I think we have to minimize the complexity in app-catalog model if some of
> >> the components are not really necessary in 99% of our scenarios. If the
> >> corner cases are complex we shouldn't make things complex for our most
> >> frequent scenarios.
> >>
> >> Example: I think we do not need a layer of module and I think its not
> >> worth api dealing with whole module layer. We can simply achieve the
> >> module layer information by giving some meaningful name for the
> >> application Deployment document.
> >> Ex:Amber-1.4,Amber-1.3
> >>
> >> I think we should remove all the Id based query because its very difficult
> >> to program against Ids, IMHO exposing the Ids in the API is a bad design.
> >> If someone try to create same named document in same level again we can
> >> simply give an error without exposing Ids to the API. To achieve this we
> >> should not allow users to create Application deployment documents before
> >> they create its Application interface and parsing along the application
> >> interface name.
> >>
> >> And in practice allowing users to give same names is a bad design (with Id
> >> model we allow users to give same name) because ultimately what users will
> >> see is set of names not the Ids, so I think we have to enforce that in the
> >> API. I think good API should always lead users to do minimum errors.
> >>
> >> For me this is more like a tree structure and we do not have scenario of
> >> random access. If we know all the names we should be able to get it in a
> >> single call with all the objects in it as we do in a normal tree where
> >> data will be stored in the leaf nodes. Currently we do lot of traversing
> >> among different documents, these methods will be useful during scheduling
> >> but in the case of using Experiment configuration where we specify
> >> everything precisely we should be able to get it right away.
> >>
> >> Regards
> >> Lahiru
> >> --
> >> System Analyst Programmer
> >> PTI Lab
> >> Indiana University
> >
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
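
To make the trade-off discussed in this thread concrete, here is a minimal sketch of a catalog that keeps opaque IDs internally, rejects duplicate names at the same level, and returns the interface, deployment and compute resource documents in a single call when all three names are known. It is only an illustration: the Catalog class, its methods, the dictionary-based documents and the "example-hpc" resource name are all hypothetical and are not the Airavata App Catalog API. Only the Amber-1.4 deployment name is taken from the thread.

```python
import uuid
from dataclasses import dataclass, field


class DuplicateNameError(ValueError):
    """Raised when a name is already registered for the same kind of document."""


@dataclass
class Catalog:
    # Hypothetical in-memory store, not the real App Catalog.
    docs: dict = field(default_factory=dict)   # id -> document
    names: dict = field(default_factory=dict)  # (kind, name) -> id

    def register(self, kind, name, **fields):
        """Store a document under a fresh opaque ID; names are unique per kind."""
        key = (kind, name)
        if key in self.names:
            raise DuplicateNameError(f"{kind} named {name!r} already exists")
        doc_id = uuid.uuid4().hex
        self.docs[doc_id] = {"id": doc_id, "kind": kind, "name": name, **fields}
        self.names[key] = doc_id
        return doc_id

    def rename(self, kind, old, new):
        """Change the display name; the ID, and anything that stored it, is unaffected."""
        if (kind, new) in self.names:
            raise DuplicateNameError(f"{kind} named {new!r} already exists")
        doc_id = self.names.pop((kind, old))
        self.docs[doc_id]["name"] = new
        self.names[(kind, new)] = doc_id

    def get_by_name(self, kind, name):
        """Look up one document by kind and name."""
        return self.docs[self.names[(kind, name)]]

    def resolve(self, interface, deployment, resource):
        """Return all three documents in one call instead of traversing one by one."""
        return {
            "interface": self.get_by_name("interface", interface),
            "deployment": self.get_by_name("deployment", deployment),
            "resource": self.get_by_name("resource", resource),
        }


# Usage: register once, then fetch everything by name in a single call.
catalog = Catalog()
resource_id = catalog.register("resource", "example-hpc")  # hypothetical name
deployment_id = catalog.register("deployment", "Amber-1.4", resource=resource_id)
catalog.register("interface", "Amber", deployments=[deployment_id])
bundle = catalog.resolve("Amber", "Amber-1.4", "example-hpc")
```

In this sketch, callers who only know names never handle an ID, while a client that stored deployment_id keeps working after a rename. It does not address the migration concern raised above: if the IDs themselves are regenerated during a database migration, hardcoded IDs would still break unless the old IDs are carried over.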
