Giuseppe "Cowo" Corbelli wrote:
>>> Anyway, regarding your message: that's a very good news! :-)
>> Talking about the implementation: I think I can easily modify
>> the dbschema.py module to include a function to create the
>> indexes; I'll do this in the next days.
>>
>> Now my doubts are about what indexes to create: Giuseppe (ehi, any
>> other volunteer with a bit of experience in database design is also
>> welcome :-) can you take a look at the currently defined indexes?
>>
>> Critical tables are CastInfo, Name and Title.
>> I'm not sure that the current indexes are the best ones.
> 
> Let's start with these 3.
> What are the most common select operations on these tables?

Some new info:
1) Not all the DBMS create indexes for foreign keys. PostgreSQL 8 and SQLite 3 
don't, at least.

2) From SQLObject sources I discovered another point: the createTable method 
takes a boolean parameter createIndexes. I suggest that Davide (or someone 
with MySQL installed) do a quick test, by creating one or more tables 
containing foreign keys with
tableClass.createTable(createIndexes=False)
If indexes for the foreign keys are not created we should keep the ForeignKeys 
and create indexes at a later time to improve performance.
If they are created, likely they cannot be deleted (various reasons), so we 
can stay with the current CVS (no Foreign Keys, just plain data).


Since indexes are not SQL standard and not every index method supports 
multicolumn indexes I propose to use single-column indexes only. [*]
Just a couple of reminders: primary keys may be already indexed by default 
(PostgreSQL does do this, SQLite doesn't); indexes are most useful when 
retrieving a few rows from a great bunch of data.

CastInfo
   personIDIdx = DatabaseIndex(personID)
   movieIDIdx = DatabaseIndex(movieID)

OK! It's likely you search this table when you already know what person and 
movie you are talking about.

Name
   pNameIdx = DatabaseIndex({'column': name, 'length': 5}, imdbIndex)
   namePcodeCfIdx = DatabaseIndex(namePcodeCf)
   namePcodeNfIdx = DatabaseIndex(namePcodeNf)
   surnamePcodeIdx = DatabaseIndex(surnamePcode)

OK, with 2 points:
1) pNameIdx, I think it would be better to index the uppercased or lowercased 
column, and search for the up or low case. It seems that this can be done with
{'expression':'lower(name)'} but I am still unable to get it working
2) The table is usually searched for a specific imdbID? If so a
imdbIDIdx = DatabaseIndex(imdbID) would be good

Title
   mTitleIdx = DatabaseIndex({'column': title, 'length': 5}, imdbIndex,
                               kind, productionYear)

Because of what I said before [*], I'd suggest to use 4 separate indexes.

-- 
         Giuseppe "Cowo" Corbelli ~\/~ My software: http://cowo.yoda2000.net
-<! Non c'e' niente da dire in proposito. Tutto quello che uno deve fare e'
   colpire i tasti giusti al momento giusto, e lo strumento suona da solo. !>-
                                 J.S. Bach



_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel

Reply via email to