I wouldn't say this proposal is about speed.  I'd say it is about
flexibility.

If all of your Fedora data is stored in a relational database, rather
than just bits and pieces of it, you can access it from any system
that can talk to an RDBMS.  Today you have to write single-purpose
client code that talks to Fedora's REST or SOAP APIs, or parses FOXML
files directly, or talks to Mulgara, or to GSearch, or to the parts
that are already stored in the RDBMS, or, more realistically, some
horrible combination of the above.  Instead, you could simply talk
directly to the RDBMS and get any and all of the data you need.
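
To make that concrete, here is a minimal sketch of what "talking
directly to the RDBMS" could look like from Python.  It assumes a table
shaped like the easyFiles example quoted further down (a pid column
plus a filename column).  The in-memory SQLite database, the sample
rows, and the filenames_for helper are stand-ins invented for
illustration, not anything Fedora ships.

```python
import sqlite3

# In-memory SQLite stands in for whatever RDBMS Fedora is configured
# to use; the table layout is an assumption based on the easyFiles
# example quoted later in this thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE easyFiles ("
    " pid VARCHAR(64) NOT NULL,"
    " filename VARCHAR(256) NOT NULL)"
)

# Made-up sample rows, as if Fedora had synchronized three files.
conn.executemany(
    "INSERT INTO easyFiles (pid, filename) VALUES (?, ?)",
    [
        ("demo:1", "report.pdf"),
        ("demo:1", "data.csv"),
        ("demo:2", "image.tif"),
    ],
)


def filenames_for(pid):
    """Return the filenames recorded for one object, sorted by name."""
    rows = conn.execute(
        "SELECT filename FROM easyFiles WHERE pid = ? ORDER BY filename",
        (pid,),
    )
    return [name for (name,) in rows]


print(filenames_for("demo:1"))  # prints ['data.csv', 'report.pdf']
```

Nothing here goes through a Fedora client library.  Any tool that can
issue SQL could do the same.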

Honestly, this would make Fedora almost instantly accessible from any
number of web frameworks that have an ORM (e.g. Django, Rails), vastly
widening your potential developer base.  Instead of having to learn
some cumbersome client library that breaks every time the core
developers change their API, a developer can just point their
framework of choice at the DB and away they go.  (Yes, that's a
comment on the REST API's lack of stability so far.  I know it has
been in beta, but it has still been a point of pain.)  Big win all
around from my perspective.

Regards,

---Peter

On Oct 26, 2009, at 8:14 AM, Gert Schmeltz Pedersen wrote:

> Have you tried to use Fedora GSearch? I do not think that either a
> relational database search or an XML database search performs better
> than GSearch with Lucene or Solr.
>
> Cheers,
> Gert
>
>> -----Original Message-----
>> From: Asger Askov Blekinge [mailto:a...@statsbiblioteket.dk]
>> Sent: Monday, October 26, 2009 12:47 PM
>> To: Lodewijk Bogaards
>> Cc: Fedora commons developers
>> Subject: Re: [Fedora-commons-developers] Custom database module for
>> Fedora
>>
>> Hi
>>
>> 10 days and no replies. That's not nice of people. So here I go.
>>
>> I think I can follow the design you propose, even though I am not
>> really into the database code part of Fedora.
>> To retell it, so you can check my understanding: there is some config
>> in DefaultDOManager.dbspec that determines which parts of a Fedora
>> object are cached in the database. You amend that config so that the
>> user can provide a config file and have additional content cached.
>> That's all there is, right?
>>
>> I am not against the idea, but I consider it a stopgap measure.
>> The problem you outline is that querying the FOXML files directly is
>> too slow in the Fedora design. You want a faster way to access the
>> contents, and thus you propose to store them in a database. So far I
>> agree: the Fedora backend is not fast for small queries (as the
>> entire object is parsed for any query), and some indexed frontend is
>> sometimes required.
>> Now, I do not know the performance of the various open source XML
>> databases, but it sounds radically simpler to store or back up the
>> FOXML objects in an XML database than to write complex expressions
>> for mapping selected parts to a relational database.
>>
>> Having such a database, which could be either a cache of the FOXML
>> files or the primary store for them, would allow fast queries about
>> properties of the objects or datastreams. This should probably be
>> the design we work towards, but your idea could easily serve as a
>> way of doing database integration for now, while we have no XML
>> database.
>>
>> Regards
>>
>>
>> On Fri, 2009-10-16 at 20:32 +0200, Lodewijk Bogaards wrote:
>>> Hi,
>>>
>>> For speed reasons we wanted a database that contains the same
>>> information Fedora contains. I have emailed before (subject:
>>> gDatabase) that I figured that Fedora already has a feature to do
>>> so for the Dublin Core and some other digital object properties,
>>> and that with some work Fedora can be made to keep the database
>>> synchronized for its user-made XML data as well.
>>> Currently I have this working within Fedora.
>>>
>>> I am sending you the source, which was made on top of the Fedora
>>> 3.2.1 source release, along with an example FOXML file and a
>>> database schema.
>>>
>>> The idea is that DefaultDOManager.dbspec is extended with this line:
>>>
>>>    <include href="server/config/custom-db.xml" />
>>>
>>> Then in that file, under the Fedora home dir, you can put your own
>>> database schema, which is an extension of the database schema used
>>> in the dbspec file.
>>>
>>> Columns get their data from value getters. Currently I have
>>> implemented one value getter, which uses an XPath query to get a
>>> value. This value-getting code does not necessarily run for all
>>> digital objects. It is possible to choose a content model and/or
>>> datastream id that must be present for the tables to be updated by
>>> the digital object. Here is an example of a table with a column:
>>>
>>> <table name="easyFiles"
>>>        contentModel="info:fedora/fedora-system:easyfile"
>>>        datastreamId="file">
>>>
>>>   <column name="filename" type="varchar(256)" notNull="true"
>>>           index="filename" default="-">
>>>     <value delimiterType="row" delimiter=",">
>>>       <valuegetter type="xPath" xPath="//easyfile:filename"
>>>                    nsPrefix="easyfile"
>>>                    nsUri="http://easy.dans.knaw.nl/files"
>>>                    delimiterType="normal" delimiter="," />
>>>     </value>
>>>   </column>
>>>
>>> </table>
>>>
>>> An XPath query may return several values. For that, two kinds of
>>> delimiters may be used: a row delimiter (meaning a separate row is
>>> created for each value) and a normal delimiter (meaning a string
>>> value is inserted after every row). Also, a value tag may contain
>>> several valuegetter tags, which can be delimited in the same two
>>> ways.
>>> If two columns return two rows, those two rows are added together
>>> as one row. Also, a default value for a second valuegetter may be
>>> used, thus creating the possibility of composing rows almost any
>>> way one wants based on Fedora data.
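
As a rough illustration of the two delimiter types described above,
here is a small sketch in Python (not the Java of the actual patch).
It runs the XPath query from the example against a made-up datastream
and then builds rows both ways.  The sample XML, the function names,
and the reading of "normal" as "join all values into one delimited
string" are assumptions made for illustration.  The patch itself may
behave differently.

```python
import xml.etree.ElementTree as ET

# Made-up datastream content; only the namespace URI and the element
# name are taken from the example configuration above.
XML = """
<easyfile:files xmlns:easyfile="http://easy.dans.knaw.nl/files">
  <easyfile:filename>a.txt</easyfile:filename>
  <easyfile:filename>b.txt</easyfile:filename>
</easyfile:files>
"""
NS = {"easyfile": "http://easy.dans.knaw.nl/files"}


def get_values(xml_text):
    """Collect the text of every easyfile:filename element."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//easyfile:filename", NS)]


def to_rows(pid, values, delimiter_type, delimiter=","):
    """Turn extracted values into (pid, filename) rows.

    "row" gives one row per value.  "normal" joins all values into a
    single delimited string in one row (an assumed interpretation).
    """
    if delimiter_type == "row":
        return [(pid, value) for value in values]
    if delimiter_type == "normal":
        return [(pid, delimiter.join(values))]
    raise ValueError("unknown delimiter type: " + delimiter_type)


values = get_values(XML)
print(to_rows("demo:1", values, "row"))
print(to_rows("demo:1", values, "normal"))
```

With the sample data, the "row" call yields two rows for demo:1 and
the "normal" call yields a single row holding "a.txt,b.txt".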
>>>
>>> A pid must always be present, but does not need to be the primary
>>> key (primaryKey attribute of the table). It is thus up to the user
>>> how the data is composed into tables. If the user makes a mistake,
>>> an SQLException is thrown and the digital object is not ingested or
>>> updated. This forms another kind of safety net, one that does not
>>> necessarily work so well if the database is filled from within the
>>> user's application.
>>>
>>> With this simple system it is possible to do almost any kind of
>>> database synchronization based on Fedora data. I have seen many
>>> projects based on Fedora that employ a database alongside Fedora in
>>> order to speed up the querying process. I therefore think this
>>> might be useful for many.
>>>
>>> Of course the search interface that comes with Fedora may also be
>>> extended to make use of this new feature, but since that is not a
>>> need for our project at the moment I have not taken the time to do
>>> so.
>>>
>>> I would be very pleased if this could become part of subsequent
>>> Fedora releases. Hopefully others think so too.
>>>
>>> Kind regards,
>>>
>>> Lodewijk Bogaards
>>>
>>>
>>
>>
>> -----------------------------------------------------------------------
>> -------
>> Come build with us! The BlackBerry(R) Developer Conference in SF, CA
>> is the only developer event you need to attend this year. Jumpstart
>> your
>> developing skills, take BlackBerry mobile applications to market and
>> stay
>> ahead of the curve. Join us from November 9 - 12, 2009. Register now!
>> http://p.sf.net/sfu/devconference
>> _______________________________________________
>> Fedora-commons-developers mailing list
>> Fedora-commons-developers@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/fedora-commons- 
>> developers
>


