Re: [sqlite] Schema design and/or SELECT construction

2009-06-03 Thread Craig Smith
Chris and Jay:

Thank you both very much for  your comments; that solves it.  I am not  
a trained database designer, but I have resources on normalization; I  
simply neglected to consult them, thinking that this was a SELECT  
problem, not a design problem.  Your input was just what I was hoping  
for.

> The concept of core and other keywords is a bit arbitrary.
> What is important (i.e. core) today might not be so tomorrow.
>
> Parsing comma separated lists in a single attribute is
> likely to be a bother.

>  Generally, I would call this a bad idea.  By most people's
>  thinking it also breaks First Normal Form.
>
>  Besides, you can't do database operations on comma separated lists.
>  If you need a one-to-many (one pic to many keywords) then do it right
>  and build an actual one-to-many relationship between two tables.

Craig Smith
cr...@macscripter.net



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Schema design and/or SELECT construction

2009-06-03 Thread Jay A. Kreibich
On Mon, Jun 01, 2009 at 09:21:40PM -0700, Craig Smith scratched on the wall:
> Hello:
> 
> I have about 3000 electronic images that I am preparing to distribute  
> to my family members.  They use a variety of operating systems, so I  
> am providing the images on remote hard drives, with the images divided  
> into folders based on years the images were created.  All images were  
> obtained via scanning of negatives and slides, and each image has been  
> keyworded in EXIF format.
> 
> What I want is to create a master SQLite database catalog of all the  
> images, and I have come up with a proposed schema:
> 
> TABLE main (all pertinent image data such as date, location in folder,  
> etc.)
> TABLE core_keywords (id, name) --This table would only hold the names  
> of my immediate family members, each with a corresponding id
> TABLE other_keyword (id,name) -- all other keywords

> I have thought to create two keyword fields in the main table, one to  
> hold the ids of the core_keywords (comma separated) and one to hold  
> the ids of the other_keywords, also comma separated.

  Generally, I would call this a bad idea.  By most people's
  thinking it also breaks First Normal Form.
  
  Besides, you can't do database operations on comma separated lists.
  If you need a one-to-many (one pic to many keywords) then do it right
  and build an actual one-to-many relationship between two tables.

  Although, in this case, I'd say it is actually many-to-many.

  Might I suggest:

TABLE pictures (pic_id INTEGER PRIMARY KEY, filename NOT NULL UNIQUE, . );
TABLE keywords (key_id INTEGER PRIMARY KEY, is_core bool, word NOT NULL UNIQUE);
TABLE pic_to_key (pic_id, key_id, PRIMARY KEY (pic_id, key_id));
INDEX pic_to_key (key_id, pick_id);  -- just because

  Define your pictures.
  Define your keywords.  You can flag specific keywords as "core."
  Define which pics have which keywords.

  You can then use a keyword to find pictures, or use pictures to find
  keywords.

> What I cannot  
> devise is an elegant method to SELECT based on the core_keywords to  
> achieve the following sorts:
> 
> 1- Find images with a single core_keyword id, that is, only images of  
> a single person, no other core persons in the image

  Lookup the key_id.  Find all the pics with that key idea.  Find all
  the key ids for those pictures.  Count them, select the pics with 1.

> 2- Find images with a specific set of core_keyword ids, such as 1 and  
> 6 or 2 and 5 and 7, etc., with no other core persons in the image

  Same basic idea, left as an exercise to the reader.

  I'll offer this hint, however:

  This is a situation that will sometimes lead to the need to do
  Relational Divison, something that isn't directly supported by SQL.
  Have a look at Celko's article on it:
  http://www.dbazine.com/ofinterest/oi-articles/celko1

  You'll often end up with a "vertical" list of key_ids (a picture
  joined against pic_to_key to get a list of keys) and what you need is
  a "horizontal" list to run through your logic statements.  That all
  points to Relational Division.

  Or you can just move the selection logic to your applicaiton.


   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Schema design and/or SELECT construction

2009-06-03 Thread Chris Peachment
What's wrong with a properly normalised schema like this:

create table main (id, name,...);

create table keyword (id, label);

create table crossref (id_main, id_keyword);

The concept of core and other keywords is a bit arbitrary.
What is important (i.e. core) today might not be so tomorrow.

Parsing comma separated lists in a single attribute is
likely to be a bother.


On Mon, 2009-06-01 at 21:21 -0700, Craig Smith wrote:
> Hello:
> 
> I have about 3000 electronic images that I am preparing to distribute  
> to my family members.  They use a variety of operating systems, so I  
> am providing the images on remote hard drives, with the images divided  
> into folders based on years the images were created.  All images were  
> obtained via scanning of negatives and slides, and each image has been  
> keyworded in EXIF format.
> 
> What I want is to create a master SQLite database catalog of all the  
> images, and I have come up with a proposed schema:
> 
> TABLE main (all pertinent image data such as date, location in folder,  
> etc.)
> TABLE core_keywords (id, name) --This table would only hold the names  
> of my immediate family members, each with a corresponding id
> TABLE other_keyword (id,name) -- all other keywords
> 
> I have thought to create two keyword fields in the main table, one to  
> hold the ids of the core_keywords (comma separated) and one to hold  
> the ids of the other_keywords, also comma separated.  What I cannot  
> devise is an elegant method to SELECT based on the core_keywords to  
> achieve the following sorts:
> 
> 1- Find images with a single core_keyword id, that is, only images of  
> a single person, no other core persons in the image
> 2- Find images with a specific set of core_keyword ids, such as 1 and  
> 6 or 2 and 5 and 7, etc., with no other core persons in the image
> 
> The idea is to create a document with lists of all images that are  
> exclusive to single individuals, specific pairs, etc., so that family  
> members can easily find themselves or groups, regardless of image  
> catalog software they use on their particular systems, which may or  
> may not be able to perform these types of sorts.
> 
> I am not asking anyone to actually write the SELECT statements for me,  
> but rather point me toward the operands that would achieve my goal.  I  
> have read through the documentation, and I cannot seem to generate the  
> logic in my head to SELECT WHERE core_id is only 4.
> 
> If anyone has an idea on a more efficient database design, or TABLE  
> schema, please do not hesitate to proffer your thoughts.  I am hoping  
> to have it all figured out BEFORE I load up the tables with data.  (I  
> am actually still scanning images at this stage, but trying to prepare  
> for the next phase.)
> 
> Thank you very much for your time and consideration.
> 
> 
> Craig Smith
> cr...@macscripter.net
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Schema design and/or SELECT construction

2009-06-03 Thread Craig Smith
Hello:

I have about 3000 electronic images that I am preparing to distribute  
to my family members.  They use a variety of operating systems, so I  
am providing the images on remote hard drives, with the images divided  
into folders based on years the images were created.  All images were  
obtained via scanning of negatives and slides, and each image has been  
keyworded in EXIF format.

What I want is to create a master SQLite database catalog of all the  
images, and I have come up with a proposed schema:

TABLE main (all pertinent image data such as date, location in folder,  
etc.)
TABLE core_keywords (id, name) --This table would only hold the names  
of my immediate family members, each with a corresponding id
TABLE other_keyword (id,name) -- all other keywords

I have thought to create two keyword fields in the main table, one to  
hold the ids of the core_keywords (comma separated) and one to hold  
the ids of the other_keywords, also comma separated.  What I cannot  
devise is an elegant method to SELECT based on the core_keywords to  
achieve the following sorts:

1- Find images with a single core_keyword id, that is, only images of  
a single person, no other core persons in the image
2- Find images with a specific set of core_keyword ids, such as 1 and  
6 or 2 and 5 and 7, etc., with no other core persons in the image

The idea is to create a document with lists of all images that are  
exclusive to single individuals, specific pairs, etc., so that family  
members can easily find themselves or groups, regardless of image  
catalog software they use on their particular systems, which may or  
may not be able to perform these types of sorts.

I am not asking anyone to actually write the SELECT statements for me,  
but rather point me toward the operands that would achieve my goal.  I  
have read through the documentation, and I cannot seem to generate the  
logic in my head to SELECT WHERE core_id is only 4.

If anyone has an idea on a more efficient database design, or TABLE  
schema, please do not hesitate to proffer your thoughts.  I am hoping  
to have it all figured out BEFORE I load up the tables with data.  (I  
am actually still scanning images at this stage, but trying to prepare  
for the next phase.)

Thank you very much for your time and consideration.


Craig Smith
cr...@macscripter.net



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users