Hi! I am back after some time in the countryside.
Today I discovered the committer meeting transcript. First, I am "Christophe" and not "Dupriez"! I hope to meet all of you in Göteborg to "break the ice"!

[4:07pm] bollini: we have received many comments from Dupriez and documentation about a different approach

My implemented specs (http://www.windmusic.org/dspace) did not receive any approval from the committers: the reaction was Larry and Andrea starting the same work!

[4:08pm] bollini: but we have not received any codes and (as far as I know) the Dupriez code works on an old dspace 1.4 or less

My work is based on DSpace 1.4.2.

[4:09pm] bollini: with the Larry model and few changes that I have made we can cover the functional requirements of Dupriez

I do not see how you can support indexing and retrieval by translations, synonyms and broader (or more specific) concepts. Furthermore, if concepts are prefixed by qualifiers (like MeSH as used in PubMed) and suffixed by quantities (like "top" in PubMed or instrument quantities in WindMusic), the Larry model is not enough either.

[4:10pm] stuartlewis: Excellent - sounds good. Are you going to liaise with Christophe to check that his functional requirements are met?
[4:10pm] bollini: the only think that remains out is the out-of-box SKOS "integration" of the Dupriez model
[4:10pm] bollini: no Stuart, I haven't

Data structures are VERY important, especially this one, which will impact BlackLight integration (my next endeavour: http://projectblacklight.org/). They should be discussed before final development and integration, on the basis of what can be measured on some prototypes. WindMusic is there to test (it is operational from the customer's point of view). It will be demonstrated at DSUG 2009 (Göteborg): in 20 minutes!!!

To start integrating my code into the 1.6 fork that Stuart and Mark created for me, I would need at least approval of my proposed data structure. It is explained below...
[4:11pm] bollini: I have check it by myself because there are the same requirements that I need for an us customer
[4:11pm] stuartlewis: Do you think it can all be integrated in time for 1.6? (would be another great feature to be able to have)
[4:12pm] bollini: I can prepare a patch for the 14th September
[4:13pm] stuartlewis: Excellent
[4:13pm] bollini: we need to test it and make some changes also to the xmlui
[4:13pm] bollini: I'm working only on JSPUI and postgres
[4:13pm] stuartlewis: And Larry is working on Oracle and xmlui?
[4:13pm] bollini: yes
[4:14pm] bradmc: Anyone disagree with that path?

I disagree with the idea of not completing the reflection on the design of the data structure. If we agree on data storage, then contributing code becomes much easier. For my work, I rejected Larry's idea because it did not provide all the functionality my customer needed (he wants even more now that he uses the system intensively: support of reflexive "see also" relations, auto-complete based on a key more sophisticated than just the title...)

I am not a committer (or a release coordinator). I cannot afford to make a patch against a moving target (1.6 or so) and have it rejected without discussion. I want to have approval on the specs, then on the data structures, then on the software architecture, and then on independent contributions.

So the main principle I need to have accepted now is below:

Anywhere some text may be entered as data, one (and possibly more) "indexing string" may be specified (helped by a menu or an auto-complete when possible). This "indexing string" is composed of:

* an optional prefix which specifies the ROLE (example: does the following person act as an author? a composer? an illustrator?). This prefix code ends with an underscore ("_").
  - When displayed, prefixes are translated depending on the user's language.
  - When updated, allowed prefixes are presented in a menu.
** the prefix is followed by some spacing
* an "entry code" with
** a "scheme code" (authority list identifier) followed by an underscore; the scheme code can be a "notation list identifier" to use an alternate coding scheme for a given list (CAS and PubChemId for instance)
** the entry code within the scheme
* an optional suffix which specifies the QUANTITY (example: is the preceding person the main author?). This suffix code begins with some spacing and an underscore ("_").
  - When displayed, suffixes are translated depending on the user's language.
  - When updated, allowed suffixes are presented in a menu.

Example:

    illustrator_ person_1234 _top

which means "this contributor is the person with code 1234, acts as an illustrator and is the "top" contributor".

Lucene is able to retrieve this kind of string. A search can be made for:

    "illustrator_ person_1234 _top"
    "illustrator_ person_1234"
    "person_1234 _top"

** A search can also be made on any synonyms, translations or broader terms of "person_1234".
** An EOL distance is specified to Lucene to separate occurrences of the same field.

The underscore is chosen because:
* it is rarely used in real-world texts
* it is acceptable for tying parts together within a single word to be indexed by Lucene.

Lucene tokenizers are modified to skip words with underscores (no stemming of codes!). So a reference to a precise entry within a precise authority list is a single word for Lucene, for precise searches and counting (WindMusic provides links with counters next to every concept). This reference is enriched at indexing time with all translations, synonyms and broader terms.

This design makes it easy to embed "concept references" in the text fields of many software packages: DSpace, JSPWiki, SolR+BlackLight, etc...

Authority control is VERY important because it is the integration with the Semantic Web (Web 3.0): you leave plain text search and enter the world of structured relations...

Please do not rule on this "on the corner of a table"... I am hoping for constructive discussions...
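To make the "indexing string" format described above concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not the WindMusic or DSpace code: the names IndexingString, parse_indexing_string and search_phrases are hypothetical, the scheme code is assumed to be everything before the first underscore of the entry code, and no Lucene API is called. The bare reference (last phrase returned) is added on the assumption that a reference alone is also searchable as a single word.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IndexingString:
    """Hypothetical container for one parsed indexing string."""
    role: Optional[str]      # prefix without its trailing underscore
    scheme: str              # authority list (or notation list) identifier
    entry: str               # entry code within the scheme
    quantity: Optional[str]  # suffix without its leading underscore

    @property
    def reference(self) -> str:
        # The single "word" identifying the entry, e.g. "person_1234".
        return f"{self.scheme}_{self.entry}"


def parse_indexing_string(text: str) -> IndexingString:
    """Split '[role_ ]scheme_entry[ _quantity]' into its parts."""
    role = quantity = reference = None
    for token in text.split():
        if (token.endswith("_") and len(token) > 1
                and reference is None and role is None):
            role = token[:-1]                      # e.g. "illustrator_"
        elif (token.startswith("_") and len(token) > 1
                and reference is not None and quantity is None):
            quantity = token[1:]                   # e.g. "_top"
        elif reference is None and "_" in token.strip("_"):
            reference = token                      # e.g. "person_1234"
        else:
            raise ValueError(f"unexpected token: {token!r}")
    if reference is None:
        raise ValueError("missing entry code")
    # Assumption: the scheme code itself contains no underscore.
    scheme, _, entry = reference.partition("_")
    return IndexingString(role, scheme, entry, quantity)


def search_phrases(item: IndexingString) -> List[str]:
    """Phrases that should match this item, most specific first."""
    ref = item.reference
    phrases = []
    if item.role and item.quantity:
        phrases.append(f"{item.role}_ {ref} _{item.quantity}")
    if item.role:
        phrases.append(f"{item.role}_ {ref}")
    if item.quantity:
        phrases.append(f"{ref} _{item.quantity}")
    phrases.append(ref)  # assumption: the bare reference is searchable too
    return phrases


item = parse_indexing_string("illustrator_ person_1234 _top")
for phrase in search_phrases(item):
    print(phrase)
```

Applied to the example string, the sketch yields the three phrase searches listed above plus the bare reference "person_1234". Enrichment with translations, synonyms and broader terms would happen at indexing time and is not shown here.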
Have a nice day!
Christophe