Hey Suresh,

Yep. Depending on Airavata's repository search needs, we can also pull in Apache Lucene and Solr as we need to. I'm very familiar with those technologies and am a former member of the Lucene PMC, so I know those guys and their technology well.

Cheers,
Chris

On Aug 9, 2011, at 7:28 PM, Suresh Marru wrote:

>
> On Aug 9, 2011, at 1:25 PM, Mattmann, Chris A (388J) wrote:
>
>>> It indeed looks like a very active project and the reference implementation
>>> for JCR, thanks for the pointer. I was poking through the documentation, but
>>> did not yet get my hands dirty. It might be quick to ask you: do you
>>> know how easy it will be to add custom schemas and make the content of the
>>> document searchable? For example, can I add a WSDL or a BPEL document and
>>> find out across the repository which of the application service WSDLs
>>> wrap the Gaussian molecular chemistry model? This is just an illustrative
>>> example, but I am curious how the indexes will be built for content and how
>>> bad the performance will be if we make a lot of content searchable.
>>
>> I definitely think you can do this, as you can define user-tags on the
>> content items at each node in the repository and then search for those nodes
>> later on. It's probably best to sign up to [email protected] and
>> ask there, but that's based on my limited understanding of the system.
>
> Thanks Chris for this additional information.
>
> I will create some JIRA tasks so we can try out JCR and Jackrabbit for some
> simple repository tasks in gfac and xbaya. I think Airavata will have more
> complicated repository tasks, but to start with we can try simple examples.
> As a long-term task, I think it will be better if we consolidate all Airavata
> repository needs so we can create interfaces and try out different
> implementations before we agree upon one.
>
> Suresh
>
>
>> Thanks,
>> Chris
>>
>>>
>>> Thanks for your insights,
>>> Suresh
>>>
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> On Aug 9, 2011, at 9:55 AM, Suresh Marru wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> We are stalled on this thread, so how about getting to a consensus. Since
>>>>> I did not see any further discussion on the use of schemas, should we
>>>>> assume we want to retain XML Schemas and add simplified beans to easily
>>>>> work with instead of generated xmlbeans? The schemas for reference are at
>>>>> [1]. Also, as Patanachai explained in the original message below, there
>>>>> are three types of schema documents for GFAC to describe the
>>>>> computational host, application deployment description and finally
>>>>> service interface. Using these three descriptions, an application service
>>>>> WSDL is generated and GFAC manages the deployed application on various
>>>>> computational resources. There is a mapping between these deployment
>>>>> descriptions. I am reading the JCR API document [2] and am intrigued by
>>>>> its relevance. But my inference is from a theoretical standpoint, and I
>>>>> am wondering if anyone on the list has experience, good or bad, working
>>>>> against the JCR spec.
>>>>>
>>>>> Suresh
>>>>>
>>>>> [1] -
>>>>> https://svn.apache.org/repos/asf/incubator/airavata/trunk/modules/commons/gfac-schema/schemas/
>>>>> [2] - http://jcp.org/en/jsr/detail?id=283
>>>>>
>>>>> On Aug 1, 2011, at 12:07 AM, Suresh Marru wrote:
>>>>>
>>>>>> Hi Patanachai,
>>>>>>
>>>>>> Thanks for explaining the issue in detail. In simple terms, we need
>>>>>> multiple client components to register a description about an application
>>>>>> and store it in a registry. GFac will need to pull the registered
>>>>>> description document and execute and manage the compute job. Along with
>>>>>> XBaya as the client which registers the document, there are other
>>>>>> clients including a gadget interface.
>>>>>>
>>>>>> I agree that the current scheme has to be revisited (and minor issues
>>>>>> fixed, like the one you mention about the GridFTP tags). But moving from
>>>>>> xmlschema to a lightweight option is a bigger question. With a proper bean
>>>>>> generation library and serializing/deserializing methods I personally
>>>>>> favor xml schema but I do not want to be biased either. I am -1 for POJO
>>>>>> simply because it will limit non-Java-based clients like a simple PHP
>>>>>> web form. JSON in general sounds like a good alternative, but I do not
>>>>>> have experience with it in a validation and schema sense.
>>>>>>
>>>>>> I will wait for others to chime in; if no better alternatives are
>>>>>> suggested, I will import the missing GFac schema from code donation
>>>>>> into a commons area -
>>>>>> https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/
>>>>>>
>>>>>> Cheers,
>>>>>> Suresh
>>>>>>
>>>>>> On Jul 29, 2011, at 2:09 PM, [email protected] wrote:
>>>>>>
>>>>>>> Hi devs,
>>>>>>>
>>>>>>> I want to discuss the type system in GFAC-Core.
>>>>>>>
>>>>>>> Currently, the GFAC module reads and writes the necessary information
>>>>>>> based on an XML schema (called GFAC-Schema) as a definition. The
>>>>>>> GFAC-Schema library is generated with XMLBeans
>>>>>>> (http://xmlbeans.apache.org/) and is referenced in the project.
>>>>>>>
>>>>>>> Examples of GFAC-Schema are:
>>>>>>> HostTypeDescription, which describes an environment for a host such as
>>>>>>> Java version, Temp directory, GridFTP endpoint etc.
>>>>>>> ServiceTypeDescription, which describes a service such as parameters,
>>>>>>> service name, etc.
>>>>>>> GFAC-SimpleType, which defines a simple parameter type to the service
>>>>>>> such as Boolean, Double, Integer, etc.
>>>>>>>
>>>>>>> This is roughly how the system works:
>>>>>>> After deploying their software on a computing host, users will register
>>>>>>> their host, application, service description via XBaya-GUI (Java Swing).
>>>>>>> This registration information will be saved to XRegistry as an XML
>>>>>>> string according to the XML schema.
>>>>>>> When users invoke a (Web) service, GFAC will load the necessary
>>>>>>> information (host, application directory, parameters, etc.) and execute
>>>>>>> the deployed software.
>>>>>>> Then, GFAC parses the output from the software, wraps it, and sends it
>>>>>>> out in the appropriate parameter type format.
>>>>>>>
>>>>>>>
>>>>>>> So, the question is: do we want to continue using XML-Schema?
>>>>>>> If we agree to use XML-Schema, we should import some initial schemas
>>>>>>> from OGCE GFAC as a new module in Airavata. Also, we need to redesign
>>>>>>> some schemas.
>>>>>>> For instance, the current HostType schema requires a GridFTP Endpoint
>>>>>>> element, which is not necessary if a computing host doesn't have GridFTP.
>>>>>>>
>>>>>>> Otherwise, what do you propose? POJO, JSON, etc.
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> Patanachai Tangchaisin
>>>>>>
>>>>>
>>>>
>>>>
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Chris Mattmann, Ph.D.
>>>> Senior Computer Scientist
>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>>> Office: 171-266B, Mailstop: 171-246
>>>> Email: [email protected]
>>>> WWW: http://sunset.usc.edu/~mattmann/
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Adjunct Assistant Professor, Computer Science Department
>>>> University of Southern California, Los Angeles, CA 90089 USA
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
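
For anyone following the thread, the "simplified beans on top of XML" idea and the optional GridFTP endpoint can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the actual GFAC-Schema: the class name `HostDescription`, its fields, and the element names (`hostDescription`, `javaVersion`, `tmpDir`, `gridFTPEndpoint`) are all hypothetical, and it uses Python's standard library rather than XMLBeans.

```python
from dataclasses import dataclass
from typing import Optional
import xml.etree.ElementTree as ET


@dataclass
class HostDescription:
    """Hypothetical simplified bean; names are illustrative, not GFAC-Schema."""
    name: str
    java_version: str
    tmp_dir: str
    # Optional on purpose: not every computing host runs GridFTP.
    gridftp_endpoint: Optional[str] = None

    def to_xml(self) -> str:
        """Serialize to an XML string, omitting the endpoint when unset."""
        root = ET.Element("hostDescription", {"name": self.name})
        ET.SubElement(root, "javaVersion").text = self.java_version
        ET.SubElement(root, "tmpDir").text = self.tmp_dir
        if self.gridftp_endpoint is not None:
            ET.SubElement(root, "gridFTPEndpoint").text = self.gridftp_endpoint
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, xml_text: str) -> "HostDescription":
        """Parse the XML string back into a bean."""
        root = ET.fromstring(xml_text)
        return cls(
            name=root.get("name"),
            java_version=root.findtext("javaVersion"),
            tmp_dir=root.findtext("tmpDir"),
            # findtext returns None when the element is absent.
            gridftp_endpoint=root.findtext("gridFTPEndpoint"),
        )


# A host without GridFTP round-trips without a placeholder endpoint.
local_host = HostDescription("localhost", "1.6", "/tmp")
local_xml = local_host.to_xml()
restored_local = HostDescription.from_xml(local_xml)

# A host with GridFTP keeps its endpoint (the URL here is made up).
grid_host = HostDescription("cluster", "1.6", "/scratch", "gsiftp://cluster.example.org:2811")
restored_grid = HostDescription.from_xml(grid_host.to_xml())
```

Because the stored form is still plain XML, a non-Java client (such as the PHP web form mentioned in the thread) could in principle read or write the same document, which is the main argument made above against a POJO-only approach.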
