Re: [sqlite] Table within a table??

2009-11-04 Thread CityDev


Darren Duncan wrote:
> 
>   Being that arrays *are* relations, you can use all the relational
> operators on them.  
> 

Just to be totally clear - an array is not a relation. An array has fixed
order of each dimension (eg columns and rows), and you address it by
position. A relation is unordered (although you can use a query language
like SQL to produce ordered tables), and you address it by data values.

The rationale was that a user of the information wouldn't have to dig around
to find out that Field 1 is Q1 sales, Field 2 is Q2 sales, Field 5 is Actual
Sales to Date and Field 6 is Last Year Total Sales etc. Q2 Sales would have
to have a name, although there's nothing to stop someone calling it
SXVAT_Q2_PreCV2. It's better than nothing.

Nothing's perfect and it seems no-one can say what a domain should contain
but that's no excuse to throw the baby out with the bath water. Stick as
close to relational thinking as you can, and your design will be that much
more effective.  
-- 
View this message in context: 
http://old.nabble.com/Table-within-a-table---tp26125451p26199506.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-11-03 Thread John Crenshaw

> Jay A. Kreibich wrote:
>> On Tue, Nov 03, 2009 at 02:51:23AM -0800, CityDev scratched on the
wall:
>>> That just seems so contrary to the original idea of the relational
>>> model that you shouldn't have any data whose meaning is not defined
>>> by data (in the case of an array you need to understand the
>>> significance of relative position - remember relations have no row
>>> or column order to stop you playing that game). 
>
> So what is a character string then?  An *ordered* sequence of
characters.  And 
> yet this coexists just fine with the relational model.  An "array" is
just a 
> generalization of this concept.

Except that the characters in a string lose all meaning when used
individually. Arrays on the other hand, while ordered, generally hold
sequences of data such that each element has substantial meaning
individually, and may need to be queried against. In fact, even storing
strings tends to cause problems, because often people want to query only
a slice of a string and this often requires a full table scan.

If you absolutely must store arrays, you can do this by storing them in
blobs. If you need to query against the array elements, use a custom
function (though, if you need to query against the elements in the
array, you should REALLY be using a relational table.)

As far as order goes, it is reasonably easy to order the rows using a
field for that purpose.

John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-11-03 Thread Darren Duncan
Jay A. Kreibich wrote:
> On Tue, Nov 03, 2009 at 02:51:23AM -0800, CityDev scratched on the wall:
>> That just seems so contrary to the original idea of the relational
>> model that you shouldn't have any data whose meaning is not defined
>> by data (in the case of an array you need to understand the
>> significance of relative position - remember relations have no row
>> or column order to stop you playing that game). 

So what is a character string then?  An *ordered* sequence of characters.  And 
yet this coexists just fine with the relational model.  An "array" is just a 
generalization of this concept.

>   Yes, EXACTLY.
> 
>   And so it is for nearly any other compound datatype.
> 
>   Arrays have implied ordering.  You can't JOIN against an array
>   without extracting meta-data like that.  Even if you do that, you
>   have to pair that data together and carry it through any manipulation.
>   If the data gets separated or lost, both sets of data becomes
>   worthless.  Not Relational.  Same is true of any other container
>   with an inherent ordering or listing (which is most of them).

I see no problem supporting arrays in the relational model because you can 
define an array in terms of a relation.  For example, an array can be any 
relation that has an explicit attribute, say call it "index", which defines the 
conceptual order of the elements, which are defined using the remaining 
relation 
attributes.  Being that arrays *are* relations, you can use all the relational 
operators on them.  And so an array-valued attribute *is* an RVA.

Of course, you'll want to be careful in how you use the array-relations so that 
you keep the index-defining attribute where you need to.  But this is the same 
as with any other important attributes, such as person name or person id or SIN 
whatever you need in the context.

(Similarly, you can implement a bag/multiset over a relation by just having an 
explicit attribute to count instances, say call it "count".)

Using the above methodology, it is perfectly valid to have an "order by" in the 
relational model; the result of an order-by operation is a relation that is 
like 
its main input relation but with an extra indexing attribute added.  Note that 
the RANK operator one can find in math or SQL is essentially the same thing. 
And "limit" is just a slice/restriction/semijoin on a relation with an ordering 
attribute, that filters on the value of the indexing attribute being in a 
certain range.

Such is one way that my Muldis D language supports the actual relational model 
while at the same time supporting all the useful things programmers want to do, 
and that SQL can do.

>> So I still go back to point i) - you don't need to do this.
> 
>   I hope not, as my main point was that I feel they aren't allowed
>   under the Relational Model.  I agree with your thoughts, I just think
>   there is a more rigid way of showing it to be true.
> 
>   But that doesn't mean people don't want them.

Indeed.  There is a lot of data where it makes sense to be ordered.  But then, 
doing it properly in the relational model is encoding that desire as *data*, 
such as my above example does.

>   The point you're getting at is still valid, however.  One aspect of
>   the Model is that a system can *physically* store the data in all
>   kinds of interesting ways because all the transforms are 100%
>   reversible.  You don't really need to store the table, as defined, in
>   any physical way as long as you can compute it from the data you've
>   stored.  RVAs, as a data modeling idea, fit right into that because
>   the only real difference between an RVA and a non-RVA is how you
>   write the data out on the display.

Absolutely, and that's one of the best features of the relational model.  That 
it gives users a lot of flexibility while implementations have a lot of 
flexibility to optimize behind the scenes, while they can be confident this 
optimization won't change the results users get.  Not so much with SQL.

-- Darren Duncan
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-11-03 Thread Jay A. Kreibich
On Tue, Nov 03, 2009 at 02:51:23AM -0800, CityDev scratched on the wall:
> 
> Whilst it's true that SQL isn't essential for a relational database

  More to the point, any database that supports SQL does not, and
  cannot, support the data typing and data manipulation rules set out
  by the Relational Model.

  The Relational Model defines a set of structure and manipulation
  rules.  It is a mathematical system.  It does not define any kind
  of query language or symbolic expression system-- that is outside
  of the scope of the Model.  In fact, almost by definition a query
  language has to be outside of the scope of the Model in order to be
  practical (for example, some kind of ORDER BY option in a query).

  The SQL standard defines not only the language syntax, but also
  defines it's own data typing and data manipulation rules.  It defines
  it's own mathematical system-- kind of.  The data manipulation
  environment defined by SQL is not really rigid or self-consistent
  enough to call a "mathematical system."  The SQL system is strongly
  based off the Relational Model, but it is fundamentally incompatible
  in many small but important ways.

  So even if we ignore "SQL the language", what we have left in "SQL
  the data manipulation environment" is not the Relational Model.
  Siblings, perhaps, but not the same.

  So when you said, "SQL is a first-order language so it can't easily
  handle substructure," as a point of argument, my feeling is that
  A) that might be true, B) but it doesn't matter, C) and it really
  doesn't matter.

  A) Sure, why not.  SQL can't easily handle a lot of the things it
  tries to do.  But since the computational environment is not defined
  by the language, it B) doesn't really matter.  Just because my fancy
  calculator doesn't have a way to express and generate 4D graphs 
  doesn't mean 4D geometry doesn't exist as a full and mature system,
  and isn't useful for somethings.  It only means I need a better
  calculator (with a *way* better display...) or I need to do it "by
  hand" using the rules of the environment directly.
  
  But, C) it really doesn't matter, because the context of the discussion
  was the Relational Model, and you can't really make Relational Model
  arguments when thinking in SQL.  Like one of my computational
  professors used to say, "When you argue in English, you're wrong."
  It's kind of the same thing.  You can't really make Relational Model
  arguments in SQL.

> I'm interested in your remark that relational databases now cope with
> 'arrays'.

  Didn't say that.  I said several SQL databases do.  
  
  Although looking through some documentation, it looks like that
  number is a bit less than I first thought.

> That just seems so contrary to the original idea of the relational
> model that you shouldn't have any data whose meaning is not defined
> by data (in the case of an array you need to understand the
> significance of relative position - remember relations have no row
> or column order to stop you playing that game). 

  Yes, EXACTLY.

  And so it is for nearly any other compound datatype.

  Arrays have implied ordering.  You can't JOIN against an array
  without extracting meta-data like that.  Even if you do that, you
  have to pair that data together and carry it through any manipulation.
  If the data gets separated or lost, both sets of data becomes
  worthless.  Not Relational.  Same is true of any other container
  with an inherent ordering or listing (which is most of them).

  What about sets?  The Relational Model has a lot of set theory in it,
  but it isn't "pure" set theory.  For example, many "true" set
  operations will result in sets of sets.  Those operations in the
  Relational Model (applied to the sub-sections of relations) will
  always be "flattened" to just sets of elements.  So sets don't work
  within the Relational Model unless you modify their behavior.

  If you do that, you no longer have "real" sets.  You basically
  have... a relation.  Or something that acts like a relation in pretty
  much every way that counts.  So now you have a RVA, and we know those
  don't count because they can be seamlessly transformed into a non-RVA
  representation without losing data or meaning (or generating
  meta-data, which is what I meant by "meaning").



  And, BTW, SQL does consider table/result-set columns to be ordered.
  One more fundamental difference that causes all kinds of headaches
  for query optimizers, because of issues like technically A NATURAL
  JOIN B is NOT equal to B NATURAL JOIN A.

> So I still go back to point i) - you don't need to do this.

  I hope not, as my main point was that I feel they aren't allowed
  under the Relational Model.  I agree with your thoughts, I just think
  there is a more rigid way of showing it to be true.

  But that doesn't mean people don't want them.

  So why are they there?  Most SQL people aren't all that aware of 
  the theory behind database 

Re: [sqlite] Table within a table??

2009-11-03 Thread P Kishor
On Tue, Nov 3, 2009 at 4:51 AM, CityDev  wrote:
> I'm interested in your remark that relational databases now cope with
> 'arrays'. Personally I've never seen that in DB2, Jet or SQLite. That just
> seems so contrary to the original idea of the relational model that you
> shouldn't have any data whose meaning is not defined by data (in the case of
> an array you need to understand the significance of relative position -
> remember relations have no row or column order to stop you playing that
> game).

http://www.postgresql.org/docs/8.2/interactive/arrays.html



-- 
Puneet Kishor http://www.punkish.org
Carbon Model http://carbonmodel.org
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
Nelson Institute, UW-Madison http://www.nelson.wisc.edu
---
Assertions are politics; backing up assertions with evidence is science
===
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-11-01 Thread Darren Duncan
Thank you for this response, Jay, and your other long one.  They seem very well 
informed and helpful to the general practitioner.  And "Database in Depth" is 
indeed a great book, and should be read by anyone wanting to understand 
databases; despite the name, it is actually quite approachable.

Regarding "atomic" et al, my own impression of what Date says is slightly 
different.  My perception is not that he tries to define "atomic" and then 
gives 
up, but rather that he emphasizes from the start that practically speaking 
there 
is no such thing.  That is ...

One main point in support of relation-valued attributes (in terms of which all 
other kinds of collections can be represented, such as arrays) is that 
rejecting 
them on the basis of only wanting atomic or scalar values doesn't make sense, 
because any value can be represented as a collection of other values.  For 
example, a string can be a collection of characters, or a number can be a 
string 
of digits, etc.  And so, forbidding collection values is making a very 
arbitrary 
limitation.  This means that the simplest or more pure solution is to allow any 
value at all as an attribute value.

I won't repeat what you said about relations without RVAs being losslessly 
convertible to relations with them, and back again likewise (it is like taking 
the math expression "5x + 10y" and changing it to "5 * (x + 2y)", then changing 
it back).  But this is what I was referring to in my original message in the 
thread where I said using the 2 tables was logically equivalent to using one 
and 
making work_history an RVA.

In regards to 1NF, as you say, relations are already in 1NF by definition, and 
only SQL tables are capable of not being in 1NF, since in the general case they 
are not relations.  As far as I know the main distinction here is that SQL 
allows duplicate rows in a table, and so a SQL table is like a bag, while a 
relation by definition is a set and never has duplicate tuples (rows).  In that 
respect, having a 1NF table in SQL is as simple as having a primary key or 
unique key on that table.  But even then, some SQL rowsets may have duplicate 
rows depending on how a SELECT is built, while relations never would.

Now, even if SQL isn't the same as the relational model, it has a lot in 
common, 
and to a large degree if one uses SQL the right way its behavior can be quite 
close to the relational model, and more correct and trouble-free.

I highly recommend Chris Date's newer book, "SQL and Relational Theory: How to 
Write Accurate SQL Code" (2009), http://oreilly.com/catalog/9780596523084/ , 
which talks about how to use the SQL DBMSs you already have in the most 
effective way possible, in regards to getting the behavior you want error free.

And yes, that is very applicable to SQLite specifically.

-- Darren Duncan

Jay A. Kreibich wrote:
> On Sun, Nov 01, 2009 at 01:23:52PM -0800, CityDev scratched on the wall:
>> Darren Duncan wrote:
>>>   Or at least it is in the version of the relational model 
>>> that allows non-scalar attribute values, but that is the one that Chris
>>> Date et al, as well as myself ascribe to.
>> I didn't read this through but I recall Chris Date defining a relational
>> database as formed from relations normalised to 1NF. First Normal Form
>> basically means no substructure ie each column is a single value from a
>> scalar domain.
> 
>> To say it would be nice to have structured data is ignoring
>> two things: i) we've got this for just fine without it, and ii) SQL is a
>> first-order language so it can't easily handle substructure. 
> 
>   Let me address these two points first.  As for "i", then please
>   explain to my why nearly every SQL database supports array values?
>   I'm not sure it is in the standard, but it is so common it might
>   as well be (for those wondering, no, SQLite does not support arrays).
>   The answer for "ii" is easy: SQL is not the Relational Model, so the
>   only point you might be making is that SQL makes for a poor relational
>   language.
> 
>   Now back to your first paragraph.  This is the more interesting one.
> 
>  Executive Summary:  Multi-valued attributes (column types)
>  are not allowed in the Relational Model.  Relational-Valued
>  Attributes (tables in tables) are OK, however, because they
>  don't count.
> 
>   Now a few pages on why
> 
>   The term used by Date (and Codd) is "atomic", not scalar.  Of course,
>   this is generally meaningless, as it is very hard to define what
>   makes a value "atomic."  Are text values atomic?  The SQL standard
>   includes a number of functions to extract, match, and manipulate
>   sub-strings.  What about integers?  SQLite includes bit operations
>   that can be applied to integer values.  Does that make them non-atomic?
> 
>   In "Database In Depth", Date eventually just gives up trying to
>   define "atomic" in a formal sense, says it cannot and should not be
>   done.  In his writings, Date points out 

Re: [sqlite] Table within a table??

2009-11-01 Thread Jay A. Kreibich
On Sun, Nov 01, 2009 at 01:23:52PM -0800, CityDev scratched on the wall:
> 
> Darren Duncan wrote:
> > 
> >   Or at least it is in the version of the relational model 
> > that allows non-scalar attribute values, but that is the one that Chris
> > Date et al, as well as myself ascribe to.
> 
> I didn't read this through but I recall Chris Date defining a relational
> database as formed from relations normalised to 1NF. First Normal Form
> basically means no substructure ie each column is a single value from a
> scalar domain.

> To say it would be nice to have structured data is ignoring
> two things: i) we've got this for just fine without it, and ii) SQL is a
> first-order language so it can't easily handle substructure. 

  Let me address these two points first.  As for "i", then please
  explain to my why nearly every SQL database supports array values?
  I'm not sure it is in the standard, but it is so common it might
  as well be (for those wondering, no, SQLite does not support arrays).
  The answer for "ii" is easy: SQL is not the Relational Model, so the
  only point you might be making is that SQL makes for a poor relational
  language.

  Now back to your first paragraph.  This is the more interesting one.

 Executive Summary:  Multi-valued attributes (column types)
 are not allowed in the Relational Model.  Relational-Valued
 Attributes (tables in tables) are OK, however, because they
 don't count.

  Now a few pages on why


  The term used by Date (and Codd) is "atomic", not scalar.  Of course,
  this is generally meaningless, as it is very hard to define what
  makes a value "atomic."  Are text values atomic?  The SQL standard
  includes a number of functions to extract, match, and manipulate
  sub-strings.  What about integers?  SQLite includes bit operations
  that can be applied to integer values.  Does that make them non-atomic?

  In "Database In Depth", Date eventually just gives up trying to
  define "atomic" in a formal sense, says it cannot and should not be
  done.  In his writings, Date points out that even atoms are not
  "atomic," and can be broken down into quarks-- but that the idea of
  atomic values is still a good and powerful one, even if it is inexact.

  Given all that, it might come as a surprise that Codd and Date *do*
  support the idea that RVA (Relation-Valued Attribute) fit into the
  Relational Model.  A few points on that.  First, the whole concept
  of 1NF is from the SQL world.  To put a table into 1NF is basically
  to make it mimic the requirements of a relation.  Under the
  Relational Model, it is basically impossible to have a valid
  Relation that is not in 1NF.  If it isn't in 1NF, it isn't a relation.

  So why allow RVAs?  Well, to start with, under the Relational Model
  relations are atomic.  Yes, really.  Under the formal Model you're
  not allowed to change or alter any sub-value of a relation.  A
  relational value is what it is, and the whole state of the relation
  is what represents its value.  By taking the product of the column
  type domains, you can compute a known, finite domain for every possible
  value for a given relation (relations don't allow duplicate rows,
  so the number of rows is bound).  That's typically a *very* large
  domain, but since it is known and finite, it is a valid type domain.

  As an extension of that idea, when you alter a column or row under
  the Model you're not changing a sub-part of the relation, you're
  replacing the whole relational value with a new value.  INSERT
  doesn't add a new row to an existing relation, it computes a new
  relation value that is the UNION of the old relation and the new row.
  Similar manipulations can apply the UPDATE and DELETE, but in all
  cases you're not altering a relation, you're computing a new one.

  So, in terms of the model that defines both what a relation is and
  what it can contain, relations are atomic-- and from that RVAs are
  naturally allowed.

  At this point, if you're raising an eyebrow and thinking that sounds
  like a pretty questionable slight of hand in the formal definition of
  atomic values, I'd be quick to agree.  But here's another point that
  is, perhaps, more important:  Any relation with a RVA can be
  decomposed into a non-RVA relation.  And, just as importantly, can
  be re-composed back into the original RVA.  That is, you can "round
  trip" in and out of a RVA representation without any
  loss of data or meaning.  One can make a strong
  argument that RVAs are just a different way to organize the data
  "on paper", but in the end the addition (or removal) of RVA doesn't
  alter what the Model is capable of representing or doing.  This works
  the same way that any static multi-dimensional array type can be
  reduced to a one-dimensional array type without loss of functionality.

  And it is *this* point that separates RVAs from storing arrays,
  lists, or other "multi-value" containers as a row/column 

Re: [sqlite] Table within a table??

2009-11-01 Thread CityDev


Darren Duncan wrote:
> 
> 
>   Or at least it is in the version of the relational model 
> that allows non-scalar attribute values, but that is the one that Chris
> Date et 
> al, as well as myself ascribe to.
> 
> 

I didn't read this through but I recall Chris Date defining a relational
database as formed from relations normalised to 1NF. First Normal Form
basically means no substructure ie each column is a single value from a
scalar domain. To say it would be nice to have structured data is ignoring
two things: i) we've got this for just fine without it, and ii) SQL is a
first-order language so it can't easily handle substructure. 
-- 
View this message in context: 
http://old.nabble.com/Table-within-a-table---tp26125451p26154639.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-11-01 Thread Petite Abeille

On Nov 1, 2009, at 2:32 AM, Jay A. Kreibich wrote:

>  Anyways... I've gone on long enough.  Good luck with your design.
>  Think a bit, ask good questions, and hopefully we can all see
>  a different point of view and learn to see something new.

Excellent post, thank you :)

Along the same lines, "Double-thinking in SQL" is worth a read as well:

http://explainextended.com/2009/07/12/double-thinking-in-sql/

--
PA.
http://alt.textdrive.com/nanoki/


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-11-01 Thread P Kishor
+1 Jay, for explaining.
another +1 for explaining so well.
yet another +1 for explaining in such detailed.

Heck, if you had written such a long post full of nonsense, you should
have gotten +1. Since you wrote such a long post that is so useful and
helpful to probably everyone, your post should really be enshrined in
the sqlite wiki.



On Sat, Oct 31, 2009 at 8:32 PM, Jay A. Kreibich  wrote:
>
> On Sat, Oct 31, 2009 at 01:13:31AM -0400, mark m scratched on the wall:
>> O.K.  I think I am starting to get the idea.  It is just so foreign for me
>> to organize things this way. A master work history table for all cases
>> almost seems confusing.  It will just take a bit of adjustment for me
>> to "trust" the database way of doing things.  Text files organized in
>> the way I described has always made it easy for me to figure things
>> out when there was a problem.
>
>  This is not uncommon.  If I had to guess, I'd say you're an experienced
>  applications developer, but are somewhat new to databases.  You're
>  applying your instincts and experience in designing and laying out
>  runtime-data structures and classes to your database design, but find
>  yourself stumbling over some of the minor conceptual differences in
>  this new environment.
>
>  The good news is that this is normal, and nearly everyone makes these
>  mistakes.  The great news is that I've found much of that experience
>  can be utilized and successfully applied to database design as soon
>  as you wrap your head around a few minor differences.
>
>  The biggest stumbling block-- and the one you've hit head-on here--
>  is to think of tables as *instances* of compound data structures.  They
>  look a lot like arrays or dynamic lists, so this is an easy mistake.
>
>  In your case, you need a bunch of records associated with some other
>  table row, so the instinct is to create a brand new table to fit that
>  need.  You'll end up with a bunch of tables with the exact same
>  type-signature (column/type/name pattern), but that's how instances
>  of data structures or classes work-- this is exactly what you'd do
>  if that table was some fancy C++ class that offered a dynamic list
>  of elements.  It also provides for a clear and direct route to get
>  from a main data record to a collection of related sub-members.
>
>  Unfortunately, in the database world, that's the wrong concept and,
>  as others have pointed out, this will lead to no end of problems.
>
>  Here is the first big rule, and it alone will get you pretty far:
>
>                Tables are data structures and classes.
>
>  If you would define a data structure or class to fit some specific
>  and well defined purpose, that's a table.  The .sql file that
>  defines your tables should map to the .h file in your head.
>
>  Don't think of tables themselves as multi-element data structures,
>  think of them as the data structures definitions.  It just happens that
>  in the database environment, every instance of a particular data
>  structure (i.e. a row) happens to be managed by a global instance pool.
>
>  In extension of that, the database world has no "contains a."   All
>  your compound structures are built from "references a."  And the
>  references that bind everything together are not pointers, but ID
>  values.
>
>  The other big stumbling block is that most of these references are
>  "backwards".  If you were building this as an application, you'd have
>  your main records, and each one of those would have a pointer off to
>  some dynamic list or array that held the log records.  In the database
>  world, the main record simply exists with some type of unique
>  identifier.  Rather than the main record pointing to the associated log
>  records, the log records point back and the main record they're
>  associated with.
>
>  This tends to make application developers uncomfortable.  If you had
>  a main record and wanted a list of all the log records associated
>  with it, you'd want to be able to de-reference some pointer or other
>  association directly to some container item, like an array or list of
>  log records.  This is generally what leads to the desire to build
>  multiple tables of the same type, because tables look a lot like
>  instances of a container item.  You're thinking of those tables as
>  *instances* of multi-element data structures, rather than a global
>  collection of all sub-instances of that type.
>
>  That's pretty natural.  After all, it would be an odd way to build an
>  application.  While many environments have global lists of objects
>  (e.g. static class variables that hold a list of every instance of
>  that particular class), you rarely use those lists for building
>  general associations.  In specific, if you were manipulating record X
>  and wanted a list of all the associated log records for X, you would
>  never consider putting a "I belong to data X" field into the log
>  record.  This would require scanning the global 

Re: [sqlite] Table within a table??

2009-11-01 Thread Kees Nuyt
On Sun, 1 Nov 2009 02:27:16 -0400, mark m
 wrote:

> P.S. your developer vs. database perspective should be
> a sticky or FAQ for other newbies on this mailing list.

If Jay allows (I'm sure he does), 
feel free to add an entry to:
http://www.sqlite.org/cvstrac/wiki

The chapter
"Hints For Using SQLite More Effectively"
seems appropriate.
-- 
  (  Kees Nuyt
  )
c[_]
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-11-01 Thread mark m
Well, you pretty much guessed right on just about everything going through
my head.  I am brand new to database programming and have been trying to
learn
just enough to accomplish what I want to do within my own app.

The app I'm working on has evolved over the years into something that could
now
benefit from something more sophisticated than text file data storage.  Your
explanations
were EXTREMELY helpful.  If anything it gives me the confidence that I'm
going down
the right road.  If not for the great help from everyone in this thread I
might have abandoned
this effort altogether.  My developer instincts were telling me that I was
missing
something important or asking the wrong questions.

Thanks again for all the help.

P.S. your developer vs. database perspective should be a sticky or FAQ for
other newbies on this mailing list.


Mark

On Sat, Oct 31, 2009 at 9:32 PM, Jay A. Kreibich  wrote:

>
> On Sat, Oct 31, 2009 at 01:13:31AM -0400, mark m scratched on the wall:
> > O.K.  I think I am starting to get the idea.  It is just so foreign for
> me
> > to organize things this way. A master work history table for all cases
> > almost seems confusing.  It will just take a bit of adjustment for me
> > to "trust" the database way of doing things.  Text files organized in
> > the way I described has always made it easy for me to figure things
> > out when there was a problem.
>
>   This is not uncommon.  If I had to guess, I'd say you're an experienced
>  applications developer, but are somewhat new to databases.  You're
>  applying your instincts and experience in designing and laying out
>  runtime-data structures and classes to your database design, but find
>  yourself stumbling over some of the minor conceptual differences in
>  this new environment.
>
>  The good news is that this is normal, and nearly everyone makes these
>  mistakes.  The great news is that I've found much of that experience
>  can be utilized and successfully applied to database design as soon
>  as you wrap your head around a few minor differences.
>
>  The biggest stumbling block-- and the one you've hit head-on here--
>  is to think of tables as *instances* of compound data structures.  They
>  look a lot like arrays or dynamic lists, so this is an easy mistake.
>
>  In your case, you need a bunch of records associated with some other
>  table row, so the instinct is to create a brand new table to fit that
>  need.  You'll end up with a bunch of tables with the exact same
>  type-signature (column/type/name pattern), but that's how instances
>  of data structures or classes work-- this is exactly what you'd do
>  if that table was some fancy C++ class that offered a dynamic list
>  of elements.  It also provides for a clear and direct route to get
>  from a main data record to a collection of related sub-members.
>
>  Unfortunately, in the database world, that's the wrong concept and,
>  as others have pointed out, this will lead to no end of problems.
>
>  Here is the first big rule, and it alone will get you pretty far:
>
>Tables are data structures and classes.
>
>  If you would define a data structure or class to fit some specific
>  and well defined purpose, that's a table.  The .sql file that
>  defines your tables should map to the .h file in your head.
>
>  Don't think of tables themselves as multi-element data structures,
>  think of them as the data structures definitions.  It just happens that
>  in the database environment, every instance of a particular data
>  structure (i.e. a row) happens to be managed by a global instance pool.
>
>  In extension of that, the database world has no "contains a."   All
>  your compound structures are built from "references a."  And the
>  references that bind everything together are not pointers, but ID
>  values.
>
>  The other big stumbling block is that most of these references are
>  "backwards".  If you were building this as an application, you'd have
>  your main records, and each one of those would have a pointer off to
>  some dynamic list or array that held the log records.  In the database
>  world, the main record simply exists with some type of unique
>  identifier.  Rather than the main record pointing to the associated log
>  records, the log records point back and the main record they're
>  associated with.
>
>  This tends to make application developers uncomfortable.  If you had
>  a main record and wanted a list of all the log records associated
>  with it, you'd want to be able to de-reference some pointer or other
>  association directly to some container item, like an array or list of
>  log records.  This is generally what leads to the desire to build
>  multiple tables of the same type, because tables look a lot like
>  instances of a container item.  You're thinking of those tables as
>  *instances* of multi-element data structures, rather than a global
>  collection of all sub-instances of that type.
>
>  That's pretty 

Re: [sqlite] Table within a table??

2009-10-31 Thread Jay A. Kreibich

On Sat, Oct 31, 2009 at 01:13:31AM -0400, mark m scratched on the wall:
> O.K.  I think I am starting to get the idea.  It is just so foreign for me
> to organize things this way. A master work history table for all cases
> almost seems confusing.  It will just take a bit of adjustment for me
> to "trust" the database way of doing things.  Text files organized in
> the way I described has always made it easy for me to figure things
> out when there was a problem.

  This is not uncommon.  If I had to guess, I'd say you're an experienced
  applications developer, but are somewhat new to databases.  You're
  applying your instincts and experience in designing and laying out
  runtime-data structures and classes to your database design, but find
  yourself stumbling over some of the minor conceptual differences in
  this new environment.

  The good news is that this is normal, and nearly everyone makes these
  mistakes.  The great news is that I've found much of that experience
  can be utilized and successfully applied to database design as soon
  as you wrap your head around a few minor differences.

  The biggest stumbling block-- and the one you've hit head-on here--
  is to think of tables as *instances* of compound data structures.  They
  look a lot like arrays or dynamic lists, so this is an easy mistake.

  In your case, you need a bunch of records associated with some other
  table row, so the instinct is to create a brand new table to fit that
  need.  You'll end up with a bunch of tables with the exact same
  type-signature (column/type/name pattern), but that's how instances
  of data structures or classes work-- this is exactly what you'd do
  if that table was some fancy C++ class that offered a dynamic list
  of elements.  It also provides for a clear and direct route to get
  from a main data record to a collection of related sub-members.

  Unfortunately, in the database world, that's the wrong concept and,
  as others have pointed out, this will lead to no end of problems.
 
  Here is the first big rule, and it alone will get you pretty far:

Tables are data structures and classes.
  
  If you would define a data structure or class to fit some specific
  and well defined purpose, that's a table.  The .sql file that
  defines your tables should map to the .h file in your head.  
  
  Don't think of tables themselves as multi-element data structures,
  think of them as the data structures definitions.  It just happens that
  in the database environment, every instance of a particular data
  structure (i.e. a row) happens to be managed by a global instance pool.
  
  In extension of that, the database world has no "contains a."   All
  your compound structures are built from "references a."  And the
  references that bind everything together are not pointers, but ID
  values.

  The other big stumbling block is that most of these references are
  "backwards".  If you were building this as an application, you'd have
  your main records, and each one of those would have a pointer off to
  some dynamic list or array that held the log records.  In the database
  world, the main record simply exists with some type of unique
  identifier.  Rather than the main record pointing to the associated log
  records, the log records point back and the main record they're
  associated with.

  This tends to make application developers uncomfortable.  If you had
  a main record and wanted a list of all the log records associated
  with it, you'd want to be able to de-reference some pointer or other
  association directly to some container item, like an array or list of
  log records.  This is generally what leads to the desire to build
  multiple tables of the same type, because tables look a lot like
  instances of a container item.  You're thinking of those tables as
  *instances* of multi-element data structures, rather than a global
  collection of all sub-instances of that type.

  That's pretty natural.  After all, it would be an odd way to build an
  application.  While many environments have global lists of objects
  (e.g. static class variables that hold a list of every instance of
  that particular class), you rarely use those lists for building
  general associations.  In specific, if you were manipulating record X
  and wanted a list of all the associated log records for X, you would
  never consider putting a "I belong to data X" field into the log
  record.  This would require scanning the global list of log records
  to get from a main record to all the associated log records.  Not
  only is that somewhat indirect, it sounds very inefficient.

  But this is exactly what you do in the database world.  The main
  difference being that this is exactly the kind of thing that
  databases are very very good at-- both in terms of optimized searches
  for specific records in a large collection (if properly indexed) as
  well as powerful set manipulations that allow very complex
  

Re: [sqlite] Table within a table??

2009-10-30 Thread mark m
O.K.  I think I am starting to get the idea.  It is just so foreign for me
to organize things this way.
A master work history table for all cases almost seems confusing.  It will
just take a bit of adjustment
for me to "trust" the database way of doing things.  Text files organized in
the way I described has always
made it easy for me to figure things out when there was a problem.

I will give this a try.

Thanks very much for all the help.

On Fri, Oct 30, 2009 at 4:04 PM, Darren Duncan wrote:

> mark m wrote:
> > Thanks very much!!  It also occurred to me that I could have a Table
> named
> > "case1" and another
> > named "case1workhist".  The RDBMS wouldn't know they were related but my
> > application could be
> > set up to know this.
> >
> > Here is more detail on my current data organization:
> >
> > Open Cases
> >Case 1...
> >Case 2...
> >   field 1
> >   field 2
> >   work history
> >  item 1
> >  item 2
> > worker ID
> > duration
> > type
> > rate
> >  item 3
> >  ...
> >  ...
> >  item n
> >Case 3
> >...
> >...
> >Case n
> >
> > In my app, a certain case is chosen to be displayed.  All of the above
> > information is displayed in one form
> > or another within my app.  The user can change any given piece of info or
> > add new information.  So, I guess
> > I could just look for the tables "Casen" and "casenworkhist" to display
> my
> > info.  I do however need to go through
> > and calculate all the hours for all open cases and other calculations
> like
> > that.  I want to be sure I'm setting things
> > up in a way that will allow me to do this with minimal overhead.
> >
> > In your method I would only have 1 workhist table??  whereas in my method
> I
> > would have n workhist tables.  Is it
> > better to have only 1 and use a foriegn key like you describe to link
> work
> > history records with a given case??  Is the
> > rule generally to minimize the number of tables??
>
> A rule for relational database best practices is to minimize the number of
> tables that are mutually homogeneous in meaning and structure, and to allow
> or
> exploit multiple tables that are mutually heterogeneous.  If you have a
> conceptual reason for having multiple same-looking tables, then you encode
> that
> as an extra column in the 1 table.
>
> So for example, the work history details for *all* of your cases would go
> in *1*
> work_history table, not a separate table for each case, and you would have
> a
> field in work_history called case_number to identify which records of that
> table
> belong to each case.
>
> Given the hierarchy you mention, a table layout like this might work:
>
>   CREATE TABLE cases (
> case_number INTEGER PRIMARY KEY,
> ,
> 
>   )
>
>   CREATE TABLE workers (
>  worker_id INTEGER PRIMARY KEY,
>  ...
>   )
>
>   CREATE TABLE work_histories (
> work_history_item_number INTEGER PRIMARY KEY,
> case_number INTEGER,
> worker_id INTEGER,
> duration,
> type,
> rate,
> UNIQUE KEY (case_number, worker_id)  # or make this pk instead
> FOREIGN KEY (case_number) REFERENCES cases (case_number),
> FOREIGN KEY (worker_id) REFERENCES workers (worker_id)
>   )
>
> By the way, I name my tables in plural to describe what the table as a
> whole
> represents, (much like how one might name an array variable in a program,
> which
> is what a table is analogous to save for not being ordered), which is for
> example a collection of cases.  But some people will tell you tables should
> be
> named after what an individual record in it represents; I disagree with
> them (it
> is like naming an array after what one of its elements is) but it is
> commonly
> practiced, and its a style issue in the end, the DBMS doesn't care.
>
> -- Darren Duncan
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-10-30 Thread Darren Duncan
mark m wrote:
> Thanks very much!!  It also occurred to me that I could have a Table named
> "case1" and another
> named "case1workhist".  The RDBMS wouldn't know they were related but my
> application could be
> set up to know this.
> 
> Here is more detail on my current data organization:
> 
> Open Cases
>Case 1...
>Case 2...
>   field 1
>   field 2
>   work history
>  item 1
>  item 2
> worker ID
> duration
> type
> rate
>  item 3
>  ...
>  ...
>  item n
>Case 3
>...
>...
>Case n
> 
> In my app, a certain case is chosen to be displayed.  All of the above
> information is displayed in one form
> or another within my app.  The user can change any given piece of info or
> add new information.  So, I guess
> I could just look for the tables "Casen" and "casenworkhist" to display my
> info.  I do however need to go through
> and calculate all the hours for all open cases and other calculations like
> that.  I want to be sure I'm setting things
> up in a way that will allow me to do this with minimal overhead.
> 
> In your method I would only have 1 workhist table??  whereas in my method I
> would have n workhist tables.  Is it
> better to have only 1 and use a foriegn key like you describe to link work
> history records with a given case??  Is the
> rule generally to minimize the number of tables??

A rule for relational database best practices is to minimize the number of 
tables that are mutually homogeneous in meaning and structure, and to allow or 
exploit multiple tables that are mutually heterogeneous.  If you have a 
conceptual reason for having multiple same-looking tables, then you encode that 
as an extra column in the 1 table.

So for example, the work history details for *all* of your cases would go in 
*1* 
work_history table, not a separate table for each case, and you would have a 
field in work_history called case_number to identify which records of that 
table 
belong to each case.

Given the hierarchy you mention, a table layout like this might work:

   CREATE TABLE cases (
 case_number INTEGER PRIMARY KEY,
 ,
 
   )

   CREATE TABLE workers (
 worker_id INTEGER PRIMARY KEY,
 ...
   )

   CREATE TABLE work_histories (
 work_history_item_number INTEGER PRIMARY KEY,
 case_number INTEGER,
 worker_id INTEGER,
 duration,
 type,
 rate,
 UNIQUE KEY (case_number, worker_id)  # or make this pk instead
 FOREIGN KEY (case_number) REFERENCES cases (case_number),
 FOREIGN KEY (worker_id) REFERENCES workers (worker_id)
   )

By the way, I name my tables in plural to describe what the table as a whole 
represents, (much like how one might name an array variable in a program, which 
is what a table is analogous to save for not being ordered), which is for 
example a collection of cases.  But some people will tell you tables should be 
named after what an individual record in it represents; I disagree with them 
(it 
is like naming an array after what one of its elements is) but it is commonly 
practiced, and its a style issue in the end, the DBMS doesn't care.

-- Darren Duncan
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-10-30 Thread P Kishor
On Fri, Oct 30, 2009 at 2:41 AM, mark m  wrote:
> Thanks very much!!  It also occurred to me that I could have a Table named
> "case1" and another
> named "case1workhist".  The RDBMS wouldn't know they were related but my
> application could be
> set up to know this.

Expand Darren's suggestion with the following --

CREATE TABLE worker (
worker_id INTEGER PRIMARY KEY,
b TEXT
  );

  CREATE TABLE work_history (
worker_id INTEGER,
c INTEGER,
d TEXT,
CONSTRAINT FOREIGN KEY (worker_id) REFERENCING worker (worker_id)
case_id INTEGER
  );

CREATE TABLE cases (
case_id INTEGER PRIMARY KEY,
e TEXT
  );

So, above I have added a table for cases where you can store your case
history, and then, in your work_history table, added a case_id FK.


>
> Here is more detail on my current data organization:
>
> Open Cases
>   Case 1...
>   Case 2...
>      field 1
>      field 2
>      work history
>         item 1
>         item 2
>            worker ID
>            duration
>            type
>            rate
>         item 3
>         ...
>         ...
>         item n
>   Case 3
>   ...
>   ...
>   Case n
>
> In my app, a certain case is chosen to be displayed.  All of the above
> information is displayed in one form
> or another within my app.  The user can change any given piece of info or
> add new information.  So, I guess
> I could just look for the tables "Casen" and "casenworkhist" to display my
> info.  I do however need to go through
> and calculate all the hours for all open cases and other calculations like
> that.  I want to be sure I'm setting things
> up in a way that will allow me to do this with minimal overhead.
>
> In your method I would only have 1 workhist table??  whereas in my method I
> would have n workhist tables.  Is it
> better to have only 1 and use a foriegn key like you describe to link work
> history records with a given case??  Is the
> rule generally to minimize the number of tables??
>
> Mark
>
> On Fri, Oct 30, 2009 at 2:44 AM, Darren Duncan wrote:
>
>> mark m wrote:
>> > I'm very new to database programming so this question is pretty basic
>> >
>> > I have data that is currently organized as follows:
>> >
>> > Each case has several fields that contain only one value.  There are
>> several
>> > fields that have a pipe-delimited string
>> > that represents a work history.  Each work history item has its own
>> fields
>> > like data, worker ID etc.  So, as I convert
>> > my text files over to database format, I find myself wanting to have a
>> table
>> > within a table.  So, ideally the case table would
>> > have several singular fields and a field named "work history" that would
>> > contain a table that would have all of the work history
>> > in it.
>> >
>> > But, I haven't found a way to do this.  If this is not possible, what
>> would
>> > be the best way to organize data such as this in
>> > a database program.
>> >
>> > Thanks for the help.
>> >
>> > Mark
>>
>> What you are talking about is perfectly reasonable from a logical
>> standpoint,
>> and in the relational model the feature would be called "relation-valued
>> attributes" or "RVAs".  Or at least it is in the version of the relational
>> model
>> that allows non-scalar attribute values, but that is the one that Chris
>> Date et
>> al, as well as myself ascribe to.  Logically speaking, RVAs are what you
>> get as
>> the intermediate stage of a "GROUP BY", and are the input for aggregate
>> operators like SUM()/COUNT()/MIN()/MAX()/etc.
>>
>> However, SQLite and many other SQL DBMSs don't support RVAs, so you'll have
>> to
>> use a logically equivalent arrangement of 2 sibling tables that have a
>> parent/child (say, "worker"/"work_history") foreign-key relationship.
>>
>> That is, instead of this kind of schema (in pseudocode):
>>
>>   var worker : TABLE {
>>     worker_id : INT, b : TEXT, work_history : TABLE { c : INT, d : TEXT }
>>   }
>>
>> ... you have this kind of schema:
>>
>>   var worker : TABLE { worker_id : INT, b : TEXT }
>>   var work_history : TABLE { worker_id : INT, c : INT, d : TEXT }
>>
>> ... and work_history.worker_id has a foreign key constraint on
>> worker.worker_id
>> .  The SQL to do the latter is approximately:
>>
>>   CREATE TABLE worker (
>>     worker_id INTEGER PRIMARY KEY,
>>     b TEXT
>>   );
>>
>>   CREATE TABLE work_history (
>>     worker_id INTEGER,
>>     c INTEGER,
>>     d TEXT,
>>     CONSTRAINT FOREIGN KEY (worker_id) REFERENCING worker (worker_id)
>>   );
>>
>> You would have a record in "worker" for each distinct "worker_id" and that
>> record contains all the details that aren't part of the work history.  Then
>> you
>> have a record in "work_history" for each record that would have been in the
>> inner table of "worker" had it existed, and you have the added "worker_id"
>> field
>> in "work_history" for every history row that would refer to the same
>> worker.
>> Having the same "worker_id" 

Re: [sqlite] Table within a table??

2009-10-30 Thread mark m
Thanks very much!!  It also occurred to me that I could have a Table named
"case1" and another
named "case1workhist".  The RDBMS wouldn't know they were related but my
application could be
set up to know this.

Here is more detail on my current data organization:

Open Cases
   Case 1...
   Case 2...
  field 1
  field 2
  work history
 item 1
 item 2
worker ID
duration
type
rate
 item 3
 ...
 ...
 item n
   Case 3
   ...
   ...
   Case n

In my app, a certain case is chosen to be displayed.  All of the above
information is displayed in one form
or another within my app.  The user can change any given piece of info or
add new information.  So, I guess
I could just look for the tables "Casen" and "casenworkhist" to display my
info.  I do however need to go through
and calculate all the hours for all open cases and other calculations like
that.  I want to be sure I'm setting things
up in a way that will allow me to do this with minimal overhead.

In your method I would only have 1 workhist table??  whereas in my method I
would have n workhist tables.  Is it
better to have only 1 and use a foriegn key like you describe to link work
history records with a given case??  Is the
rule generally to minimize the number of tables??

Mark

On Fri, Oct 30, 2009 at 2:44 AM, Darren Duncan wrote:

> mark m wrote:
> > I'm very new to database programming so this question is pretty basic
> >
> > I have data that is currently organized as follows:
> >
> > Each case has several fields that contain only one value.  There are
> several
> > fields that have a pipe-delimited string
> > that represents a work history.  Each work history item has its own
> fields
> > like data, worker ID etc.  So, as I convert
> > my text files over to database format, I find myself wanting to have a
> table
> > within a table.  So, ideally the case table would
> > have several singular fields and a field named "work history" that would
> > contain a table that would have all of the work history
> > in it.
> >
> > But, I haven't found a way to do this.  If this is not possible, what
> would
> > be the best way to organize data such as this in
> > a database program.
> >
> > Thanks for the help.
> >
> > Mark
>
> What you are talking about is perfectly reasonable from a logical
> standpoint,
> and in the relational model the feature would be called "relation-valued
> attributes" or "RVAs".  Or at least it is in the version of the relational
> model
> that allows non-scalar attribute values, but that is the one that Chris
> Date et
> al, as well as myself ascribe to.  Logically speaking, RVAs are what you
> get as
> the intermediate stage of a "GROUP BY", and are the input for aggregate
> operators like SUM()/COUNT()/MIN()/MAX()/etc.
>
> However, SQLite and many other SQL DBMSs don't support RVAs, so you'll have
> to
> use a logically equivalent arrangement of 2 sibling tables that have a
> parent/child (say, "worker"/"work_history") foreign-key relationship.
>
> That is, instead of this kind of schema (in pseudocode):
>
>   var worker : TABLE {
> worker_id : INT, b : TEXT, work_history : TABLE { c : INT, d : TEXT }
>   }
>
> ... you have this kind of schema:
>
>   var worker : TABLE { worker_id : INT, b : TEXT }
>   var work_history : TABLE { worker_id : INT, c : INT, d : TEXT }
>
> ... and work_history.worker_id has a foreign key constraint on
> worker.worker_id
> .  The SQL to do the latter is approximately:
>
>   CREATE TABLE worker (
> worker_id INTEGER PRIMARY KEY,
> b TEXT
>   );
>
>   CREATE TABLE work_history (
> worker_id INTEGER,
> c INTEGER,
> d TEXT,
> CONSTRAINT FOREIGN KEY (worker_id) REFERENCING worker (worker_id)
>   );
>
> You would have a record in "worker" for each distinct "worker_id" and that
> record contains all the details that aren't part of the work history.  Then
> you
> have a record in "work_history" for each record that would have been in the
> inner table of "worker" had it existed, and you have the added "worker_id"
> field
> in "work_history" for every history row that would refer to the same
> worker.
> Having the same "worker_id" values in both sibling tables tells the RDBMS
> which
> rows in the 2 tables correspond to each other.
>
> -- Darren Duncan
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Table within a table??

2009-10-30 Thread Darren Duncan
mark m wrote:
> I'm very new to database programming so this question is pretty basic
> 
> I have data that is currently organized as follows:
> 
> Each case has several fields that contain only one value.  There are several
> fields that have a pipe-delimited string
> that represents a work history.  Each work history item has its own fields
> like data, worker ID etc.  So, as I convert
> my text files over to database format, I find myself wanting to have a table
> within a table.  So, ideally the case table would
> have several singular fields and a field named "work history" that would
> contain a table that would have all of the work history
> in it.
> 
> But, I haven't found a way to do this.  If this is not possible, what would
> be the best way to organize data such as this in
> a database program.
> 
> Thanks for the help.
> 
> Mark

What you are talking about is perfectly reasonable from a logical standpoint, 
and in the relational model the feature would be called "relation-valued 
attributes" or "RVAs".  Or at least it is in the version of the relational 
model 
that allows non-scalar attribute values, but that is the one that Chris Date et 
al, as well as myself ascribe to.  Logically speaking, RVAs are what you get as 
the intermediate stage of a "GROUP BY", and are the input for aggregate 
operators like SUM()/COUNT()/MIN()/MAX()/etc.

However, SQLite and many other SQL DBMSs don't support RVAs, so you'll have to 
use a logically equivalent arrangement of 2 sibling tables that have a 
parent/child (say, "worker"/"work_history") foreign-key relationship.

That is, instead of this kind of schema (in pseudocode):

   var worker : TABLE {
 worker_id : INT, b : TEXT, work_history : TABLE { c : INT, d : TEXT }
   }

... you have this kind of schema:

   var worker : TABLE { worker_id : INT, b : TEXT }
   var work_history : TABLE { worker_id : INT, c : INT, d : TEXT }

... and work_history.worker_id has a foreign key constraint on worker.worker_id 
.  The SQL to do the latter is approximately:

   CREATE TABLE worker (
 worker_id INTEGER PRIMARY KEY,
 b TEXT
   );

   CREATE TABLE work_history (
 worker_id INTEGER,
 c INTEGER,
 d TEXT,
 CONSTRAINT FOREIGN KEY (worker_id) REFERENCING worker (worker_id)
   );

You would have a record in "worker" for each distinct "worker_id" and that 
record contains all the details that aren't part of the work history.  Then you 
have a record in "work_history" for each record that would have been in the 
inner table of "worker" had it existed, and you have the added "worker_id" 
field 
in "work_history" for every history row that would refer to the same worker. 
Having the same "worker_id" values in both sibling tables tells the RDBMS which 
rows in the 2 tables correspond to each other.

-- Darren Duncan
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Table within a table??

2009-10-30 Thread mark m
I'm very new to database programming so this question is pretty basic

I have data that is currently organized as follows:

Each case has several fields that contain only one value.  There are several
fields that have a pipe-delimited string
that represents a work history.  Each work history item has its own fields
like data, worker ID etc.  So, as I convert
my text files over to database format, I find myself wanting to have a table
within a table.  So, ideally the case table would
have several singular fields and a field named "work history" that would
contain a table that would have all of the work history
in it.

But, I haven't found a way to do this.  If this is not possible, what would
be the best way to organize data such as this in
a database program.

Thanks for the help.

Mark
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users