+0.75 to Roland for the idea of splitting the data, and another +1 for
reiterating what the LITE in SQLite actually means.

Splitting the data into chunks would help keep a database small, since the
engine can write the raw data into pages that were previously marked as
deleted (the free list) instead of growing the file.  I think raw blob
data is already stored as a separate entity within the DB, but I'm only
about 5% through reading the 4 MB amalgamation source code.  (Yes, I'm
reading it line by line, manually.  Fortunately I've written a "book"
program that reads text files and remembers what page I'm on. ;))
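
You can actually watch that free-list reuse from the outside.  Here's a
throwaway C sketch of mine (the file and table names are made up) that
prints PRAGMA page_count and PRAGMA freelist_count around a big delete
and a second insert:

/* Throwaway sketch: watch SQLite reuse free-list pages instead of
 * growing the file.  Build with: gcc demo.c -lsqlite3
 * "demo.db" and table "t" are invented names. */
#include <sqlite3.h>
#include <stdio.h>

static int pragma_int(sqlite3 *db, const char *sql)
{
    sqlite3_stmt *st = 0;
    int v = -1;
    if (sqlite3_prepare_v2(db, sql, -1, &st, 0) == SQLITE_OK &&
        sqlite3_step(st) == SQLITE_ROW)
        v = sqlite3_column_int(st, 0);
    sqlite3_finalize(st);
    return v;
}

int main(void)
{
    sqlite3 *db;
    sqlite3_open("demo.db", &db);
    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS t(x BLOB)", 0, 0, 0);

    /* Insert ~1 MB of zeros, then delete it: the pages move to the
     * free list, but the file does not shrink (absent a VACUUM). */
    sqlite3_exec(db, "INSERT INTO t VALUES (zeroblob(1000000))", 0, 0, 0);
    sqlite3_exec(db, "DELETE FROM t", 0, 0, 0);
    printf("after delete: page_count=%d freelist_count=%d\n",
           pragma_int(db, "PRAGMA page_count"),
           pragma_int(db, "PRAGMA freelist_count"));

    /* A second insert of similar size should be satisfied largely
     * from the free list, so page_count should stay roughly flat. */
    sqlite3_exec(db, "INSERT INTO t VALUES (zeroblob(1000000))", 0, 0, 0);
    printf("after insert: page_count=%d freelist_count=%d\n",
           pragma_int(db, "PRAGMA page_count"),
           pragma_int(db, "PRAGMA freelist_count"));

    sqlite3_close(db);
    return 0;
}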

The loss of 0.25 comes from the contradiction between "lite" and the
increased workload.  Streaming a chunk of data as one huge "thing" is
going to be faster for both writing and reading: you write the data first
byte to last, stopping only at EOF or on an I/O error, and you read it
back with the same thought process.  Almost no CPU processing is
involved.  Once you start splitting the data, you add processing overhead
to track unused records, pages, and whatnot.  On write the DBMS has to
figure out what isn't used, check whether a page has enough free space,
and so on.  That takes away from the speed part of Lite.  Reading will be
less intensive, but more bookkeeping has to happen to properly reassemble
the data.
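
For what it's worth, SQLite already exposes a streaming-style read over a
stored blob through its incremental blob I/O API (sqlite3_blob_open and
friends).  A minimal sketch, with table "t", column "x", and rowid 1 as
placeholder names:

/* Stream a stored blob out in fixed-size reads using SQLite's
 * incremental blob I/O API.  "t", "x", and rowid 1 are placeholders. */
#include <sqlite3.h>

int stream_blob(sqlite3 *db)
{
    sqlite3_blob *blob;
    /* Read-only handle (last flag 0) on t.x, rowid 1, in "main". */
    int rc = sqlite3_blob_open(db, "main", "t", "x", 1, 0, &blob);
    if (rc != SQLITE_OK) return rc;

    char buf[4096];
    int total = sqlite3_blob_bytes(blob);

    /* First byte to last, stopping only at the end or on an I/O
     * error -- the exact read pattern described above. */
    for (int off = 0; off < total && rc == SQLITE_OK; off += sizeof buf) {
        int n = total - off < (int)sizeof buf ? total - off : (int)sizeof buf;
        rc = sqlite3_blob_read(blob, buf, n, off);
        /* ... hand buf/n to whatever consumes the stream ... */
    }
    sqlite3_blob_close(blob);
    return rc;
}

Even then, as I understand it, the engine underneath is still walking
whatever overflow-page layout the blob was originally written with.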

So which camp wins: the streaming group or the partitioning group?  Both
approaches can work, but only streaming maintains backwards compatibility
for reading.  Older versions of SQLite3 won't know what is going on in a
database written with partitioned blobs, so reading one will break.

Assuming (since I'm not sure) that streaming is what the current
implementation of SQLite3 does, that particular code base stays in place.
When a database is created, a PRAGMA could be set PRIOR to the first
write of any BLOB data to indicate how the database writes blobs.
Resetting this pragma would have no effect after the first BLOB is
written, even if all BLOB data is later removed.  By default, the old
method would be used so older versions of SQLite can handle the data.
This could potentially be extended to a per-table setting for future
endeavors.
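
To be clear, no such pragma exists today; the following is purely a
sketch of how it might look from the application side, with blob_storage
as a made-up name:

/* Hypothetical sketch only: "PRAGMA blob_storage" is NOT a real
 * SQLite pragma; it stands in for the proposed write-mode switch. */
#include <sqlite3.h>

void create_chunked_db(sqlite3 *db)
{
    /* Must be issued before the first BLOB is ever written; omit it
     * and you get the legacy layout older SQLite3 versions can read. */
    sqlite3_exec(db, "PRAGMA blob_storage = chunked", 0, 0, 0);

    /* From the first BLOB write onward the mode is frozen, even if
     * every BLOB is later deleted. */
    sqlite3_exec(db,
        "CREATE TABLE docs(id INTEGER PRIMARY KEY, body BLOB)", 0, 0, 0);
}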

So now we're looking at one extra check to decide how the data should be
written and read, which takes away from speed (albeit an extremely small
slice of time), but both camps win.

ALL THAT SAID, I doubt it'd get implemented due to



On Fri, Apr 26, 2013 at 11:30 AM, Roland Hughes <rol...@logikalsolutions.com> wrote:

> Speaking as an IT professional with 20+ years in the field, I would have
> to say adding any kind of "file system" support to SQLite would be a
> horrible thing.  Yes, I've used Oracle.  I've also used the only real
> product Oracle has, RDB on OpenVMS.  I've written books covering MySQL,
> PostgreSQL, and Object databases like POET.
>
> Lite is called lite for a reason.  Bare functionality with a lot of
> speed.
>
> The architects for this product need to take a lesson from the old DOS
> xBase systems.  Blobs should not be handled as one big unit.   They need
> to be given unique IDs and stored in fixed-size chunks off in a hidden
> table much like a "memo" field was back in the day.  The "hidden" or
> child table supporting the blob column would have a key of ID +
> sequence.  The actual row should be ID, Sequence, BytesUsed, Chunk.
> The BytesUsed allows you to keep exact byte sizes.  All Chunk data
> types should be a raw byte data type.  There should be multiple chunk
> column types:  chunk24, chunkM, and chunkXM for 1024 bytes, 1Meg, and
> 10Meg chunk column types.
>
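
Worth noting: Roland's layout can already be approximated by hand at the
application layer, with no engine changes.  A rough C sketch, with
invented table/column names and a single 1024-byte chunk size standing in
for his chunk24/chunkM/chunkXM tiers:

/* Application-level take on the proposed layout: key = (id, seq),
 * row = (id, seq, bytes_used, chunk).  Names are illustrative only. */
#include <sqlite3.h>

#define CHUNK 1024  /* roughly the "chunk24" tier */

int write_chunked(sqlite3 *db, sqlite3_int64 id,
                  const unsigned char *data, int len)
{
    sqlite3_stmt *st = 0;
    sqlite3_exec(db,
        "CREATE TABLE IF NOT EXISTS blob_chunks("
        " id INTEGER, seq INTEGER, bytes_used INTEGER, chunk BLOB,"
        " PRIMARY KEY(id, seq))", 0, 0, 0);

    int rc = sqlite3_prepare_v2(db,
        "INSERT INTO blob_chunks VALUES (?,?,?,?)", -1, &st, 0);
    if (rc != SQLITE_OK) return rc;

    for (int seq = 0; seq * CHUNK < len; seq++) {
        int used = len - seq * CHUNK;
        if (used > CHUNK) used = CHUNK;   /* last chunk may be short */
        sqlite3_bind_int64(st, 1, id);
        sqlite3_bind_int(st, 2, seq);
        sqlite3_bind_int(st, 3, used);    /* the BytesUsed column */
        sqlite3_bind_blob(st, 4, data + seq * CHUNK, used, SQLITE_STATIC);
        if (sqlite3_step(st) != SQLITE_DONE) break;
        sqlite3_reset(st);
        sqlite3_clear_bindings(st);
    }
    return sqlite3_finalize(st);
}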
