[CODE4LIB] 34th ELAG conference, 9-11th June 2010, Helsinki, Finland
Call for papers for the 34th ELAG conference, "Meeting New User Expectations", 9-11th June 2010, Helsinki, Finland

The ELAG (European Library Automation Group) Conference is Europe's premier conference for library and information management technology. The meetings aim at in-depth discussion of particular library automation topics and at promoting the informal exchange of ideas and experience. The topics covered are technical and meant for participants with an appropriate technical background.

We invite you to submit a paper on this year's main topic, Meeting New User Expectations. You will find more information about ELAG, the topic and its sub-themes in the attached document. Information about the conference can be found at: http://elag2010.nationallibrary.fi/

There will be a one-day pre-conference meeting, unconference style, on using Solr to index your bibliographic data. It will be held on the 8th June 2010. More information will be available later at the website mentioned above.

Peter

Attachment: Call for papers for the 34th ELAG conference.rtf
Re: [CODE4LIB] Q: what is the best open source native XML database
I've had the best experience (query speed, primarily) with BaseX. This was mostly for large XML document processing, so I'm not sure how well it will satisfy your transactional needs. I was initially using eXist, and then switched over to BaseX because the speed gains were very noticeable.

-Sean

On Jan 16, 2010, at 11:15 AM, Godmar Back wrote:

Hi,

we're currently looking for an XML database to store a variety of small-to-medium sized XML documents. The XML documents are unstructured in the sense that they do not follow a schema or DTD, and that their structure will be changing over time. We'll need to do efficient searching based on elements, attributes, and full text within text content.

More importantly, the documents are mutable. We'd like to bring documents or fragments into memory in a DOM representation, manipulate them, then put them back into the database. Ideally, this should be done in a transaction-like manner.

We need to efficiently serve document fragments over HTTP, ideally in a manner that allows for scaling through replication. We would prefer strong support for Java integration, but it's not a must.

Have others encountered similar problems, and what have you been using?

So far, we're researching: eXist-DB (http://exist.sourceforge.net/), BaseX (http://www.basex.org/), MonetDB/XQuery (http://www.monetdb.nl/XQuery/), Sedna (http://modis.ispras.ru/sedna/index.html). Wikipedia lists a few others here: http://en.wikipedia.org/wiki/XML_database

I'm wondering to what extent systems such as Lucene, or even digital object repositories such as Fedora, could be coaxed into this usage scenario.

Thanks for any insight you have or experience you can share.

- Godmar
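The read-modify-write cycle Godmar describes (bring a fragment into a DOM, manipulate it, put it back) might look roughly like this on the client side. This is only a sketch using Python's standard xml.dom.minidom; fetching from and storing to the database is left out, and the function name and sample document are hypothetical, not taken from any of the systems mentioned in the thread.

```python
from xml.dom.minidom import parseString


def set_attribute_on_all(xml_text, tag, name, value):
    """Parse xml_text into a DOM, set name=value on every <tag> element,
    and return the modified document serialized back to a string."""
    dom = parseString(xml_text)
    for element in dom.getElementsByTagName(tag):
        element.setAttribute(name, value)
    # Serialize from the root element so no XML declaration is prepended.
    return dom.documentElement.toxml()


# Hypothetical sample fragment; in practice this would come from the database.
sample = "<record><title>Old</title><title>Other</title></record>"
updated = set_attribute_on_all(sample, "title", "lang", "en")
```

The string in `updated` is what would be written back to the store. Whether the fetch and the write can be wrapped in something transaction-like depends entirely on the database chosen.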
Re: [CODE4LIB] Q: what is the best open source native XML database
On Tue, Jan 19, 2010 at 10:09 AM, Sean Hannan shan...@jhu.edu wrote:

I've had the best experience (query speed, primarily) with BaseX. This was primarily for large XML document processing, so I'm not sure how much it will satisfy your transactional needs. I was initially using eXist, and then switched over to BaseX because the speed gains were very noticeable.

What about the relative maturity and functionality of eXist vs. BaseX? I'm a bit reluctant to put my eggs in the basket of a university project that isn't backed by a continuous revenue stream (... did I just say that out loud?)

- Godmar
Re: [CODE4LIB] Q: what is the best open source native XML database
Godmar,

We're using eXist for a couple of apps here, and like it quite a bit. The full text search extensions in the 1.4 release are backed by Lucene, and it's pretty quick once you've tuned it and set up the indexing properly (try some searches here: http://diglib.princeton.edu/ead/ -- this is running on a beta of 1.4). Performance will not be good until you've configured some indexes and tweaked the JVM settings. There is a bit of a learning curve involved here, but the documentation is decent, and the community and developers are quite active and accessible.

You can GET, PUT and DELETE documents very easily, or POST XQueries to get fragments. You can also GET fragments or documents by supplying parameters to an XQuery stored in the database; they call this their REST-style API[1]. There are a few other ways to get content in and out[2], and Java integration isn't a problem via the XML:DB API[3]. You can also write extension modules in Java.

-Jon

1. http://exist.sourceforge.net/devguide_rest.html
2. http://exist.sourceforge.net/devguide.html
3. http://exist.sourceforge.net/devguide_xmldb.html

--
Jon Stroop
Metadata Analyst
C-17-D2 Firestone Library
Princeton University
Princeton, NJ 08544
Email: jstr...@princeton.edu
Phone: (609)258-0059
Fax: (609)258-0441
http://diglib.princeton.edu
http://diglib.princeton.edu/ead
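For anyone wanting to try the GET/PUT/DELETE style of access Jon describes, the requests could be put together along these lines. This is a rough sketch with Python's standard urllib, not a tested client. The base URL (http://localhost:8080/exist/rest) and the `_query` and `_howmany` parameter names are assumptions about a default eXist install and should be checked against the REST guide in [1]; the collection and document paths are made up. The functions only build the requests and send nothing.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumed base URL of the REST interface on a default local install.
BASE = "http://localhost:8080/exist/rest"


def doc_url(path):
    """Map a database path such as /db/ead/a.xml to a REST URL."""
    return BASE + quote(path)


def get_document(path):
    """Build a GET request that retrieves a stored document."""
    return Request(doc_url(path), method="GET")


def put_document(path, xml_bytes):
    """Build a PUT request that stores (or replaces) a document."""
    return Request(
        doc_url(path),
        data=xml_bytes,
        headers={"Content-Type": "application/xml"},
        method="PUT",
    )


def delete_document(path):
    """Build a DELETE request that removes a document."""
    return Request(doc_url(path), method="DELETE")


def query_fragment(path, xquery, howmany=10):
    """Build a GET request that runs a query against a collection.

    The _query and _howmany parameter names are assumptions; verify them
    against the REST documentation before relying on them.
    """
    params = urlencode({"_query": xquery, "_howmany": howmany})
    return Request(doc_url(path) + "?" + params, method="GET")
```

Passing any of these to `urllib.request.urlopen` would actually send it, e.g. `urlopen(query_fragment("/db/ead", "//title"))`, provided a server is listening at the assumed address.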