Re: [Sqlalchemy-users] db design question

Michael Bayer Fri, 25 Aug 2006 08:03:09 -0700

I would say theres nothing wrong using a one-to-many relationship for  
this.  I am in the school that says "normalize as much as you can but  
not excessively".  also, its something you can change later if you  
wanted, the data can be migrated from one-to-many into a many-to-many  
using a few insert-from-select statements.

the advantages to "many-to-many" in this case are focused around the  
fact that you get a keyword table that nicely stores all keywords  
uniquely, which takes up a lot less space for a database that has  
many, many articles, and might be perceived as "cleaner" if you later  
want to add other relationships up to your keywords, such as a  
"keyword_search_stats" table or something similar.

you could also make the case that if you are trying to query various  
keyword patterns across many articles, its more efficient to compare  
integers to each other instead of strings, although that difference  
depends a lot on what kind of indexing is set up...maybe someone on  
the list can fill us in on if integer indexes are inherently more  
efficient than string indexes.

the many-to-many join produces a much smaller set of rows with which  
to search for strings.  compare the two queries:

        many-to-many:

        select * from articles, article_keywords, keywords where  
articles.id==article_keywords.article_id and
        article_keywords.keyword_id==keywords.id and keywords.name in  
('some', 'keywords')

        one-to-many:

        select * from articles, article_keywords where  
articles.id==article_keywords.article_id and
        article_keywords.name in ('some', 'keywords')

so lets say there are 48000 articles, and the total number of unique  
keywords is 1800.  articles have an average of five keywords so the  
number of "article-keyword" association rows in both cases is about  
240000.

query #1 will select two rows from the "keywords" table, and  
basically needs to scan for two separate strings out of a set of  
1800, once for each.  it then produces a set of integer ids for those  
keywords which are used to join up against the article and  
article_keywords table.  Query #2 on the other hand has to scan  
240000 rows for the two separate strings.

it might not actually make so much of a difference if you have proper  
btree indexes set up...maybe someone has more in-depth knowledge of  
indexing and can add to this.  but for space-saving reasons alone i  
think many-to-many is worth it.

On Aug 25, 2006, at 10:25 AM, Martin Stein wrote:

> Hi all,
>
> Just a short question concerning database design in general and
> SQLAlchemy in particular. Let's take the many-to-many example from the
> docs (chapter Data Mapping), where we have a table containing articles
> and one containing some keywords. Now, I understand the normal  
> layout of
> such a many-to-many relationship: Create 2 separate tables (in this  
> case
> "articles " and "keywords") and create the relationship link using  
> a 3rd
> table ("articlekeywords_table").
>
> I was wondering: In this special case, where one end of the  
> relationship
> consists of very "small" objects - essentially, each keyword object is
> not more than a string - why shouldn't I simply use a one-to-many
> structure like this:
>
> articles_table = Table('articles', metadata,
> Column('article_id', Integer, primary_key = True),
> Column('headline', String(150), key='headline'),
> Column('body', TEXT, key='body'),
> )
>
> my_keywords_table = Table('article_keywords', metadata,
> Column('article_id', Integer, ForeignKey("articles.article_id")),
> Column('keyword_string', String(50))
> )
>
> I am aware that this is not a db-layout which is normalized like the
> theory says you should, but in this special case where the second  
> end of
> the relationship only consists of single strings... why not? What  
> is the
> advantage of the "classic" many-to-many way of doing it?
>
> Probably, I am missing something, so I would appreciate if someone  
> could
> tell me the flaw in my thinking.
>
> Thanks and Cheers,
> Martin
>
>
>
>
> ---------------------------------------------------------------------- 
> ---
> Using Tomcat but need to do more? Need to support web services,  
> security?
> Get stuff done quickly with pre-integrated technology to make your  
> job easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache  
> Geronimo
> http://sel.as-us.falkag.net/sel? 
> cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> Sqlalchemy-users mailing list
> Sqlalchemy-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Sqlalchemy-users mailing list
Sqlalchemy-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users

Re: [Sqlalchemy-users] db design question

Reply via email to