Re: [sqlite] pragma vs select for introspection

Wols Lists Wed, 15 Dec 2010 16:33:35 -0800

On 15/12/10 02:47, Darren Duncan wrote:
> Wols Lists wrote:
>> On 15/12/10 00:18, Darren Duncan wrote:
>> The point I'm making is that a list doesn't contain any ordering *data*
>> - it's inherent in the fact of a list. A list is an abstract concept. In
>> Pick, I can store a data structure that IS an abstract list. In an rdbms
>> I can't.
>>
>> Put another way, in Pick the function "storelistindatabase()" and
>> "getlistfromdatabase()" are, at a fundamental level, direct inverses -
>> there's a one-to-one mapping.
>>
>> In an rdbms, the function "storelistindatabase()" has an inverse
>> "getdatafromdatabase()" which returns something completely different
>> from what went in.
>
> I would expect that any RDBMS which has a "storelistindatabase()"
> would also have a "getlistfromdatabase()".  Sure, it may fail if you
> call the latter for something which isn't a list, but then I would
> expect the same in Pick, unless everything in Pick is a list.


Hold onto that thought!

I think I botched my wording - In Pick, getlistfromdatabase() and
getdatafromdatabase() would be the same function. In an RDBMS, because
the index is data, they're not.
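
To make that concrete on the relational side, here is a minimal sketch in Python using the standard sqlite3 module. The table list_items and the functions store_list(), get_data() and get_list() are names I have made up for illustration; they are not part of SQLite or Pick. The point is only that the position has to be written out as a column, and that getting a list back means sorting on it.

```python
import sqlite3

def store_list(conn, list_id, items):
    # The position must be stored explicitly: the index becomes data.
    conn.executemany(
        "INSERT INTO list_items (list_id, pos, value) VALUES (?, ?, ?)",
        [(list_id, pos, value) for pos, value in enumerate(items)],
    )

def get_data(conn, list_id):
    # Without ORDER BY the rows are a set; the order returned is unspecified.
    return conn.execute(
        "SELECT pos, value FROM list_items WHERE list_id = ?", (list_id,)
    ).fetchall()

def get_list(conn, list_id):
    # Rebuilding the list means sorting on the stored position, then dropping it.
    rows = conn.execute(
        "SELECT value FROM list_items WHERE list_id = ? ORDER BY pos", (list_id,)
    ).fetchall()
    return [value for (value,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list_items ("
    "list_id TEXT, pos INTEGER, value TEXT, PRIMARY KEY (list_id, pos))"
)
toppings = ["ham", "pineapple", "cheese"]
store_list(conn, "hawaiian", toppings)
```

Here get_data() hands back (pos, value) rows, which is not what went in; only get_list(), which knows that pos is really ordering, returns the original list.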

But back to that thought, you're almost spot on :-) The database
structure consists of FILEs (tables in relational terminology) which
consist of - to use a mac term - two "forks". The DATA fork and the
DICTionary fork. These are structurally identical, so much so that the
master dictionary only has one physical fork, which is logically both
forks, and is therefore self-describing :-) Each RECORD (relational row)
in a fork consists of a key-list pair - those in the DICTionary
describing the FIELDs (columns), and those in the DATA instancing the
cells described in the columns. So, at this level, each fork is a set -
we have a bunch of items all with a unique primary key, and a
database-defined order that is pseudo-random. (Going back to the real
world, this pseudo-random order is why Pick guarantees to retrieve the
sought-after data from disk at a 99% first-attempt success rate :-)

Now if the column is the x-axis, and the row is the y-axis, each cell
can itself be a list in the z-axis! And so on. (Yes, some people do
complain Pick has its rows and columns the wrong way round from sensible :-)
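
A rough way to picture this, as a sketch only: the Python below models one FILE as two dictionaries. The names pizza_file and read_field() and the "attribute" position key are my own inventions for illustration, not a real Pick API or the actual on-disk layout.

```python
# Hypothetical in-memory model of one Pick FILE; not a real Pick API.
pizza_file = {
    "DICT": {
        # Each DICT record describes a FIELD by its position in a DATA record.
        "NAME": {"attribute": 0},
        "TOPPINGS": {"attribute": 1},
    },
    "DATA": {
        # key -> list of cells; a cell may itself be a list (the z-axis).
        # The third cell has no DICT entry: it exists but is undescribed.
        "P1": ["Hawaiian", ["ham", "pineapple"], "9.50"],
        "P2": ["Margherita", ["tomato", "mozzarella", "basil"], "8.00"],
    },
}

def read_field(file, key, field_name):
    """Fetch the record by key, then use the DICT description to pick the cell."""
    position = file["DICT"][field_name]["attribute"]
    return file["DATA"][key][position]
```

Asking for read_field(pizza_file, "P1", "TOPPINGS") gives back the nested list as a list, with no separate ordering column anywhere.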

In *practice* all Pick implementations effectively stop at the next
axis, the t-axis. But there's no theoretical reason why they should.
It's just that, at this point, the programmer's brain explodes trying to
cope with all the dimensions. (And don't say an rdbms is easier to
cope with - it's actually more complicated, because the programmer has
to remember which tables are nested, rather than the database being "in
your face" about it.)

And pretty much every Pick database actually has three more dimensions
available after this, they're just not used because of exactly that
reason :-)

>
>>> If Pick has any understanding of the data itself which is higher
>>> level, other than external metadata which is also bit strings, then it
>>> would be doing modeling in order to do this, such as to treat text in
>>> text-specific ways.
>>
>> Here again, we come to a fundamental mis-match between the relational
>> view of things, and the Pick view. In the relational view, if the table
>> does not have a column definition, there is no column. The definition,
>> by definition, defines the column :-)
>>
>> In Pick, the DICTionary de*scribes* the column. If there's no
>> definition, the column can still exist. You just don't know what's in it
>> :-) Pick uses the description to understand the data, relational uses
>> the definition to define the data.
>>
>> Without a definition, you can't model. So Pick doesn't. It understands,
>> instead.
>
> From my perspective at least, a relational database works more like a
> Pick database than you think; and this is reflected in Muldis D.  I
> recognize that some other people see things in a way that are more
> different, and SQL reflects this.

But I personally focus on the guarantees that Pick gives about response
times, I can calculate that "in a perfect world it cannot be less than x
seconds, in the real world it will be about y seconds" (and x and y are
usually about the same). Relational merely says "I can guarantee that
there is an answer, and that I will find it eventually".

>
> A primary difference as I see it is that tuple + relation + scalar
> values are conceptually the basic building blocks of a relational
> database while Pick uses other things.  Obviously, if what you want to
> store is exactly like a basic building block, then doing so will be
> simpler.

As I said, the idea of enforcing good design is totally alien to Pick
:-) but ...

The basic building block should be the (real world) atom. Let's say I'm
represented by my NI number. That's my primary key. Without that there
is no name, no age/d-o-b, no residence, no nothing. And it naturally
belongs in a set. So we stick it in a FILE. Along with *all* the
associated, tightly bound, attributes. And any Pick programmer who is
negligent and doesn't de*scribe* those attributes in normal form should
imho be shot. :-)

And this is where things get tricky :-) Is my wife's key an attribute of
mine? Or are both our keys an attribute of the atom of marriage? Pick
*needs* an arbitrary answer. By not recognising the concept of "atom",
relational analyses the problem out of existence :-)

>
> In Muldis D, you can work with any arbitrarily complex value, a
> relation or otherwise, without first declaring a type for it.  The
> *only* purpose of declaring a type in Muldis D is for defining a
> constraint on a variable or a parameter; it also helps with
> optimization since the DBMS can then better predict what is going to
> be used where.
>
> For example, you can simply say:
>
>   @:{ { pizza_name => 'Hawaiian', toppings => { 'ham', 'pineapple' } } }
>
> ... without declaring anything first, and what you have there is a
> binary relation value literal consisting of a single tuple of 2
> attributes, and one of those attributes' values is a set of 2 elements.
>
> You could also take any value and introspect it, whereby you can be
> given back a type definition that *describes* the value.
>
> "the database" in Muldis D is in the general case simply a non-lexical
> variable whose type is, loosely, "any tuple whose attribute values are
> relations".  You can declare that the type of "the database" is more
> specific, such as with specific columns and such, but that is optional
> (though commonly done).
>
> So in Muldis D, you can simply say "store this X" and it will, without
> you having to define columns or whatever first.  And I consider this
> to be completely valid for a relational database.
>
> This sounds like how you describe Pick.

I'll have to investigate Muldis D :-)

>
> Now SQL can't do this on the other hand, but that's a limitation of SQL.
>
> (As a tangent that is more on-topic, the Muldis D approach is more in
> common with SQLite than by many other SQL DBMSs in that a SQLite row
> column value can be of any (scalar) type, and you don't have to
> declare a column to be of a particular type in order to store a value
> there; if you do then that is just a local constraint rather than a
> fundamental limitation.)

Which is great when you're trying to store infinity or NaN in a numeric
field (both valid values, imho :-)
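
For what it's worth, here is a small sketch of what that looks like from Python's standard sqlite3 module. The table t and the sample values are made up for illustration. As far as I know, infinity is kept as a REAL, but a NaN bound as a parameter is stored as NULL, so the example checks for that rather than assuming NaN survives.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# The declared type is only an affinity; it does not restrict what is stored.
conn.execute("CREATE TABLE t (label TEXT, n NUMERIC)")
samples = [
    ("int", 42),
    ("text", "not a number"),
    ("inf", float("inf")),
    ("nan", float("nan")),
]
conn.executemany("INSERT INTO t VALUES (?, ?)", samples)

# Read back each value together with the storage class SQLite actually used.
stored = {
    label: (value, kind)
    for label, value, kind in conn.execute("SELECT label, n, typeof(n) FROM t")
}
```

The stored dictionary then shows an integer, a text value and a real infinity all sitting in the same "numeric" column, with the NaN row reading back as NULL.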

>
>>> Atomicity is just an abstraction for certain kinds of error detection
>>> and correction.  Pick can't be truly atomic, but only provide an
>>> illusion of such, and so can other DBMSs, including relational ones,
>>> as the implementations provide.  (And even then, operating systems are
>>> known to lie about whether data has been physically written to disk
>>> when you fsync.)
>>>
>> You're wrong there. Pick IS truly atomic. Yep, OSes can lie, and if Pick
>> accepts that lie then carnage will occur, but the word "atom" is greek
>> for "indivisible". Let's take my pizza for example. "Hawaiian = ham,
>> pineapple". That is an atom. Take away any part of it, and it's no
>> longer a hawaiian pizza. And as far as Pick is concerned (if properly
>> programmed :-) that will remain, for ever and always, an atom. It comes
>> in as an atom. It passes through as an atom. And it's fed out to the OS
>> to put on disk as an atom. Pick is truly atomic
>
> Is this meant to say that Pick is not designed to look at parts of
> things it is fed, but rather just takes what it is input as
> indivisible and can only store/fetch it as a whole?

At the data *store* level, yes. At the data access/query level, you can
say "get me anthony's age", and it will get my record from the DATA
fork, "age" from the DICTionary fork, and then interpret and present me
with the data I asked for  (and you can define age in terms of d-o-b and
date() so Pick will calculate it for you on request). RDBMSs, on being
asked to get one attribute, try to guess which other attributes are
worth fetching at the same time. A properly designed Pick app just gets
them by default :-)
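
A sketch of that lookup, again in Python and again with invented names (data, dictionary, query() and years_between() are illustrative, and the date of birth is made up): the record comes out of the DATA fork whole, and the DICTionary entry says how to interpret it, including a calculated field like age.

```python
from datetime import date

# Hypothetical DATA record: key -> list of stored cells (name, d-o-b).
data = {"anthony": ["Anthony", date(1960, 5, 17)]}

def years_between(born, today):
    # Subtract one if this year's birthday has not happened yet.
    not_yet = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - not_yet

# Hypothetical DICT entries: each one describes how to read a field.
dictionary = {
    "name": lambda record, today: record[0],
    "dob": lambda record, today: record[1],
    # A calculated field: not stored, derived from d-o-b and the date.
    "age": lambda record, today: years_between(record[1], today),
}

def query(key, field, today=None):
    record = data[key]            # one fetch brings in the whole record
    describe = dictionary[field]  # the DICT entry says how to interpret it
    return describe(record, today or date.today())
```

Calling query("anthony", "age") works the age out on request; passing today explicitly just makes the result repeatable.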

Oh - another little difference - "Structured *Query* Language" is a
misnomer, it updates as well. Pick has no equivalent :-) It's the job of
the app to tell Pick what to store, but the query language (called
ENGLISH, because it is, actually, quite close to English) will return
whatever piece of information is asked for - provided, of course, the
programmer has provided the description Pick needs to find it! :-)

>
>> (even if the combo "Pick
>> on a computer" isn't).
>
> Are you trying to say that Pick can exist in some form other than
> "Pick on a computer"?  Is Pick a specification or a DBMS implementation?

I'm just separating the layers. If I design my Pick app properly, and
pass Pick a chunk of data that represents a real-world atom, at the
database level it stays an atom. A true relational database makes no
such guarantee - "the app has no need to know, therefore it is not
allowed to know".

But to actually answer your question, Pick is a family of database
implementations. I know from personal experience two distinct variants
(one in two sub-variants). And I could at a stretch probably name near
enough ten more. And again, speaking from personal experience, porting
between them is a cinch - easier, apparently, than porting a relational
database between different RDBMSs. So yes, it is an (informal) spec.
But adhered to much more rigorously in practice than the relational
spec. Oh - and the original Pick implementations were mostly (all?)
operating systems. Bit like AS/OS/400.

And Pick was a specific implementation as well. Not the original, but
named after one of the designers. That particular implementation was
originally an OS, and incidentally was the first commercial database I
know of to be ported to Linux (Oracle would like to claim that crown,
but Pick pre-dates it :-)

>
> -- Darren Duncan
>
Cheers,
Wol

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
