At 6:59 AM +0530 12/18/07, Yuvaraj Athur Raghuvir wrote:
Thanks for the interesting discussion. What I got so far is summarized
below:
1) Row based versus Column based storage is an implementation detail.
2) SQL used for access is independent of storage mechanism adopted.
3) Row based storage with indices on all columns reaches read performance of
column based storage.
4) Creating and updating indices quickly using new algorithms is a direction
of improvement for SQLite.

This difference is an implementation detail mainly in the sense that your database schema and the DBMS API can be used unchanged with both. However, the two have different performance characteristics, which is why one would pick one over the other.
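
To make the point above concrete, here is a minimal Python sketch, not taken from SQLite or any real DBMS. The class names RowStore and ColumnStore and their methods are hypothetical, invented only for illustration. Both expose the same interface and return the same answers; only the internal layout, and therefore the work done per operation, differs.

```python
from typing import Any, Dict, List


class RowStore:
    """Hypothetical row-oriented layout: each record is kept together."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = columns
        self.rows: List[tuple] = []

    def insert(self, record: Dict[str, Any]) -> None:
        # One append per record, cheap for OLTP-style writes.
        self.rows.append(tuple(record[c] for c in self.columns))

    def fetch(self, i: int) -> Dict[str, Any]:
        # The whole record sits in one place.
        return dict(zip(self.columns, self.rows[i]))

    def column_sum(self, name: str) -> float:
        # Has to walk every full row to read a single field.
        idx = self.columns.index(name)
        return sum(row[idx] for row in self.rows)


class ColumnStore:
    """Hypothetical column-oriented layout: each column is kept together."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = columns
        self.data: Dict[str, List[Any]] = {c: [] for c in columns}

    def insert(self, record: Dict[str, Any]) -> None:
        # One append per column, so a write touches every column list.
        for c in self.columns:
            self.data[c].append(record[c])

    def fetch(self, i: int) -> Dict[str, Any]:
        # The record has to be reassembled from every column.
        return {c: self.data[c][i] for c in self.columns}

    def column_sum(self, name: str) -> float:
        # Reads only the one column it needs.
        return sum(self.data[name])
```

Code written against insert, fetch and column_sum runs unchanged on either class, which is the sense in which the layout is an implementation detail; the comments mark where the costs diverge.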

If a DBMS is smart enough, it can automatically pick the best storage method for performance and you don't have to think about it.

However, many DBMSs are not that smart, so users typically find themselves making explicit changes to their schemas, or specifying the storage method explicitly, to compensate or to give the DBMS hints. In these typical situations, what should be an implementation detail can have a lot of impact on your schema design.

Now, if the storage is an implementation detail, can the following scenario
be realized?
a) Given: a distributed, highly available system that is implemented by
maintaining replicas of the data.
b) The replicas use different storage mechanisms, and the mechanism used by
each replica is recorded in the (distributed) database coordinator.
c) This would, in essence, be a hybrid database - hybrid in the sense of
using different data storage strategies (row-based / column-based) in the
replicas.

This would allow the database coordinator to respond intelligently to
the various operations on the database by redirecting the original request
to the appropriate replica. The cost would come when the data changes and each
of the replicas has to be brought into sync. Here again, the intelligence
should be such that the storage scheme that achieves the best performance
for that SQL statement is used, and the sync can happen in the
background.
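
The routing idea above could be sketched roughly as follows. This is only an illustration under loud assumptions: the function choose_replica, the class Coordinator and the keyword heuristic are all hypothetical, and the "replicas" are plain lists of applied statements rather than real databases. Aggregate reads go to the column-based replica, everything else goes to the row-based one, and writes are queued so the column replica can catch up in the background.

```python
import re
from collections import deque

# Crude assumption: a SELECT that calls an aggregate function is analytical.
AGGREGATE = re.compile(r"\b(sum|avg|count|min|max)\s*\(", re.IGNORECASE)


def is_select(sql: str) -> bool:
    return sql.lstrip().lower().startswith("select")


def choose_replica(sql: str) -> str:
    """Return 'column' for aggregate reads, 'row' for everything else."""
    if is_select(sql) and AGGREGATE.search(sql):
        return "column"
    return "row"


class Coordinator:
    """Hypothetical coordinator that tracks what each replica has applied."""

    def __init__(self) -> None:
        self.applied = {"row": [], "column": []}
        self.backlog = deque()  # writes the column replica has not seen yet

    def execute(self, sql: str) -> str:
        """Route one statement and return the name of the replica chosen."""
        target = choose_replica(sql)
        if not is_select(sql):
            # Writes land on the row replica first and are queued for later.
            self.applied["row"].append(sql)
            self.backlog.append(sql)
        return target

    def sync(self) -> int:
        """Apply queued writes to the column replica; return how many."""
        count = len(self.backlog)
        while self.backlog:
            self.applied["column"].append(self.backlog.popleft())
        return count
```

Until sync runs, a read routed to the column replica would see stale data, which is the synchronization cost mentioned above. A real coordinator would also need a far better classifier than a keyword match.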

My perspective is that data storage (implementation) strategies will
progressively play a more important role, given that the line between OLTP
and OLAP requirements is getting blurred.

That could all be made to work, but I don't know if anyone actually has implemented this yet ... or maybe that was your intention.

-- Darren Duncan
