RE: Queries as Structures

Rick Osborne Thu, 14 Sep 2000 21:40:14 -0700
[Steve: I'm cc-ing the list just in case others might be curious ...]

From the (admittedly minimal) research I've done on how CF stores queries
internally, it appears that queries are stored as a structure of arrays.
That is, you have a base structure (myQuery=StructNew()) and then each of
the columns is an array in that structure (myQuery.UniqueID=ArrayNew(1)).
My main observation that leads me to this conclusion is the ways that
queries are accessed, which is by column name then by row number
(myQuery.UniqueID[10]).
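
For anyone who wants to poke at that layout outside of CF, here is a rough
Python stand-in. It is only an analogy: a dict plays the CF structure, lists
play the CF arrays, and the cell() helper and the sample values are made up
for illustration. It says nothing about how CF is actually implemented.

```python
# Structure of arrays: one entry per column, each holding a list of cells.
# The column names and values are invented sample data.
my_query = {
    "UniqueID": [101, 102, 103],
    "Name": ["Rick", "Steve", "Pat"],
}


def cell(query, column, row):
    """Return one cell, CF-style: column name first, then a 1-based row."""
    return query[column][row - 1]


print(cell(my_query, "Name", 1))
```

The access order is the point: column first, then row, the same as
myQuery.UniqueID[10].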

The alternative (structure) layout we were talking about earlier in the
thread starts off the same: with a structure at the top level
(myQuery=StructNew()).  Each row would be another structure under that
(myQuery[42]=StructNew()) using the primary key for the row as the key to
the top structure.  Column names would then become keys for the
row/sub-structure, with cell values as the value (myQuery[42].Name="Rick").
Even though the pseudo-query uses integers for keys, it isn't actually in
any semblance of order.  This is because you may have primary keys on the
order of 82374, but you may only have 100 rows, so obviously you don't want
an array with 82374 elements.  That is why it is a structure of structures
instead of an array of structures.  This is also why it is actually less
efficient than the first method.  Remember that (myQuery[42].Name IS
myQuery[42]["Name"]).  That is, for each row (primary key/sub-structure) in
the query, you are not only defining your values ("Rick"), but you are also
redefining your column names ("Name") each time as well.  In English, if
your query has 100 rows, then you have the string for each column name
("Name") in memory 100 times.  If you have 5 columns, then you have 500
strings.  Not very efficient.  If you think it would help to limit the
object to an array of structures, the reverse of the first method, think
again.  You'd still have the column-name-duplicating problem, but then
you've also lost the ability to access rows by the primary key.
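
The column-name arithmetic above can be checked with a small Python sketch.
Again this is an analogy, not CF: dicts stand in for structures, the helper
names and sample data are invented, and the code counts key entries rather
than bytes (Python may share the string objects behind those keys, so the
memory picture differs from what is described above).

```python
def make_rows(n_rows, first_key=82275):
    """Invented sample data: 5 columns, with sparse integer primary keys."""
    return [
        {"UniqueID": first_key + i, "Name": "Rick", "City": "Orlando",
         "State": "FL", "Zip": "32801"}
        for i in range(n_rows)
    ]


def as_columns(rows):
    """Structure of arrays: each column name is stored once."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def as_keyed_rows(rows, key_column):
    """Structure of structures: one sub-structure per row, keyed by primary key."""
    return {row[key_column]: dict(row) for row in rows}


def key_entries_in_columns(columns):
    """Number of column-name keys held by the structure of arrays."""
    return len(columns)


def key_entries_in_rows(keyed_rows):
    """Number of column-name keys held across all row sub-structures."""
    return sum(len(row) for row in keyed_rows.values())


rows = make_rows(100)
by_column = as_columns(rows)
by_key = as_keyed_rows(rows, "UniqueID")
print(key_entries_in_columns(by_column), key_entries_in_rows(by_key))
```

With 100 rows and 5 columns, the first layout holds 5 column-name keys and
the second holds 500, which is the duplication described above.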

Speed-wise, I think the first method would also come out ahead.  In the first model,
you have one structure lookup and one array lookup.  In the second model you
have two structure lookups.  We know that structure lookups are slower than
array lookups.  This is because an array lookup will use constant time,
while a structure lookup has to find the key first.  At best that means
hashing the key string; at worst, if the keys are searched one by one, it
takes linear time.  That is, even if you have a million array entries,
looking up one will always be the same speed as another.  But looking up
structure entries costs more each time, and in the worst case it gets
slower with each key that you add.  So, two structure lookups is going to
be slower than one array
plus one structure.  Hence, method one will be faster than method two,
especially for large datasets.  (I'm not going to write any code to back
that up, I'll just leave it as an exercise to the reader.  I'd eat my hat if
I've guessed wrong, tho.  Then I'd fly up to Boston and make them fix it.)
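
For readers who want to try that exercise, here is one way a timing harness
might be set up. It is written in Python rather than CF, so the numbers it
prints only describe Python's dicts and lists, not CF structures and arrays.
The build() and time_lookups() names, the row counts, and the sample data are
all assumptions made for this sketch.

```python
import timeit


def build(n_rows):
    """Build the same invented data in both layouts."""
    columns = {"UniqueID": list(range(n_rows)), "Name": ["Rick"] * n_rows}
    keyed = {i: {"UniqueID": i, "Name": "Rick"} for i in range(n_rows)}
    return columns, keyed


def time_lookups(n_rows, repeats=100_000):
    """Time `repeats` reads of the last row's Name in each layout."""
    columns, keyed = build(n_rows)
    last = n_rows - 1
    # Method one: one structure lookup, then one array lookup.
    t_columns = timeit.timeit(lambda: columns["Name"][last], number=repeats)
    # Method two: two structure lookups.
    t_keyed = timeit.timeit(lambda: keyed[last]["Name"], number=repeats)
    return t_columns, t_keyed


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        t_columns, t_keyed = time_lookups(n)
        print(f"{n:>7} rows  columns: {t_columns:.4f}s  keyed: {t_keyed:.4f}s")
```

Running the same comparison in CF itself, with CFLOOP and GetTickCount(),
would be the real test of the claim.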

Even if you went back to the original design and made your own structure of
arrays, I doubt you'd be saving much overhead.  You'd have to keep your own
variables for the intrinsic .RecordCount, .ColumnList, and .CurrentRow
properties, as well as doing your own synchronization between array lengths.
(If your arrays/columns were of differing lengths and you tried to loop
through them, you'd have problems.)  Also, sorting the entire dataset would
be a complete pain (no CFX_QUERYSORT, and CFX access to arrays is mired in a
rolling mass of lameness), and you'd lose simple functions like
QueryAddRow().  You could emulate ValueList() with ArrayToList(), and it
would be interesting to see speed comparisons.  All in all, though, I really
don't think you'd gain anything but a headache.
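
To give a feel for the bookkeeping involved, here is a minimal Python sketch
of a home-grown structure of arrays. The PseudoQuery class and its method
names are hypothetical, loosely modelled on .RecordCount, .ColumnList,
QueryAddRow() and ValueList(); it is a sketch of the chores you would take
on, not a description of how CF does any of this.

```python
class PseudoQuery:
    """Hand-rolled structure of arrays that keeps its own bookkeeping."""

    def __init__(self, column_list):
        self.column_list = list(column_list)
        self.columns = {name: [] for name in self.column_list}
        self.record_count = 0

    def add_row(self, **values):
        """Append one cell to every column so the arrays stay the same length."""
        unknown = set(values) - set(self.column_list)
        if unknown:
            raise KeyError(f"unknown columns: {sorted(unknown)}")
        for name in self.column_list:
            # Columns not supplied get None so no array falls behind.
            self.columns[name].append(values.get(name))
        self.record_count += 1

    def value_list(self, column, delimiter=","):
        """Join one column into a delimited string."""
        return delimiter.join(str(value) for value in self.columns[column])


q = PseudoQuery(["UniqueID", "Name"])
q.add_row(UniqueID=42, Name="Rick")
q.add_row(UniqueID=7)
print(q.record_count, q.value_list("UniqueID"))
```

Even this much leaves out sorting and any notion of a current row, which is
where the headache really starts.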

I imagine that the Allaire developers had this same discussion, what, 6 years
ago?  It explains the backwards non-intuitive query access (q.c[i] instead
of q[i].c, even though I still think they could have obfuscated this enough
to let us do it) quite well, I think.

Did that answer your question?  :)

-Ramblin' Rick

-----Original Message-----
From: Steve Bernard
Sent: Thursday, September 14, 2000 7:37 PM
To: [EMAIL PROTECTED]
Subject: RE: Queries as Structures


Out of curiosity, why do you feel a query dataset would use less memory than
a comparable structure?

Steve
