We have agreement, then. I prefer to not store large binary stuff in
Directory Server, but if the binaries never change, it may not be such
a big deal. Of course, I would choose MySQL for the database :)
On Jan 14, 2009, at 8:53 AM, Adam Tauno Williams wrote:
I think most people who have looked into this would agree with Terry. I
think that if you choose option 1, you will find that your directory
software is designed to return relatively small amounts of data and is
just not efficient at moving large blobs of data like the documents that
you are thinking of storing. You will want to do proof-of-concept
performance testing before committing to this approach to make sure the
delivered system would have adequate response time under load.
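A rough sketch of what such a proof-of-concept load test could look like,
in Python. Everything named here is hypothetical: fake_fetch stands in for
whatever call reads one document from your directory, and the request and
worker counts are placeholders to tune for your expected load.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def time_call(fetch):
    """Return the wall-clock seconds taken by a single fetch() call."""
    start = time.perf_counter()
    fetch()
    return time.perf_counter() - start


def measure_latency(fetch, requests=100, workers=10):
    """Call fetch() `requests` times across `workers` threads.

    Returns the per-call latencies in seconds, sorted ascending.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(lambda _: time_call(fetch), range(requests)))
    return sorted(latencies)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(len(sorted_values) * pct / 100))
    return sorted_values[rank - 1]


def fake_fetch():
    # Placeholder (assumption): replace with a real read of one document
    # from the directory, using whatever client library you have.
    time.sleep(0.001)


if __name__ == "__main__":
    results = measure_latency(fake_fetch, requests=200, workers=20)
    print("median seconds:", percentile(results, 50))
    print("95th percentile seconds:", percentile(results, 95))
```

Looking at the 95th percentile rather than the average is a deliberate
choice here: slow outliers are what users notice under load.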
We store some BLOBs in LDAP (such as a user's desktop wallpaper). If
they are of "reasonable" size it works very well. When I tested (which
was some time and versions ago) it was loading/updating the BLOBs that
hurt performance and ballooned the logs. I think it works well for
items that are read-mostly; I wouldn't put BLOBs in the DIT that are
frequently changed.
In option 2 it is true that you will have to maintain two repositories,
and it will be difficult for you to keep them consistent. Many kinds of
system bugs and failures will cause an update to be completed on one
repository and not the other. If you choose this approach, be sure to
develop a utility which will check consistency between the two
repositories.
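A minimal sketch of such a consistency checker, in Python. It assumes
(this is not something stated above) that each directory entry records the
document's ID and a SHA-256 checksum, and that you can dump both
repositories into plain dictionaries keyed by document ID. The function
and key names are made up for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 hex digest of a document's bytes."""
    return hashlib.sha256(data).hexdigest()


def check_consistency(directory_refs, store_checksums):
    """Compare the two repositories and report every discrepancy.

    directory_refs:  doc ID -> checksum recorded in the directory entry
    store_checksums: doc ID -> checksum computed from the stored document
    """
    dir_ids = set(directory_refs)
    store_ids = set(store_checksums)
    return {
        # Referenced by the directory but absent from the document store.
        "missing_from_store": sorted(dir_ids - store_ids),
        # Present in the document store but unknown to the directory.
        "orphaned_in_store": sorted(store_ids - dir_ids),
        # Present in both, but the contents differ.
        "mismatched": sorted(
            doc_id
            for doc_id in dir_ids & store_ids
            if directory_refs[doc_id] != store_checksums[doc_id]
        ),
    }


if __name__ == "__main__":
    directory = {"doc1": checksum(b"alpha"), "doc2": checksum(b"beta")}
    store = {"doc2": checksum(b"beta, edited"), "doc3": checksum(b"gamma")}
    print(check_consistency(directory, store))
```

Run from cron or similar, a report like this catches the half-completed
updates described above before users do.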
Agree. I wonder why you'd want to build a document repository on LDAP
at all? I'm a fan of LDAP but it seems, IMO, ill suited for that
purpose.
Option 3 attracted a lot of interest in the 90's when database companies
like Informix and Oracle were positioning their DBMS products as the
place to store all of your data, in whatever form. I believe that there
were a number of success stories in that area. There seems to be less
interest now. I gather it is just very difficult to create one DBMS
product that can efficiently support many concurrent updates (as a DBMS
must), many concurrent queries (as a DBMS must) and also serve big blobs
of read-only data (like documents).
Speaking as an Informix shop, I think the loss of interest is just
because it is now commonplace and barely worth mentioning. Again, if the
BLOBs are read-mostly, performance is very good and a modern RDBMS can
feed them to a client very efficiently. However, you do have to take
BLOBs into account in your configuration; Informix (and other) RDBMSs
allow [and recommend] that you create separate partitions (or whatever
specific term the RDBMS in question uses) where the BLOBs are stored
apart from transactional data.
The first two capabilities add a lot of system overhead that works
against the third capability. On the plus side, a DBMS will help you a
lot in keeping its repository consistent with the directory repository.
It may be expensive though.
I am writing of enterprise-level DBMSs like Oracle, DB2, etc. that
I'd recommend DB2, which has a free version with no limit on
connections, for doing this kind of work if you need a free (as in beer)
RDBMS.
--
Adam Tauno Williams, Network & Systems Administrator
Consultant - http://www.whitemiceconsulting.com
Developer - http://www.opengroupware.org