Davide Alberani wrote: Second round of this reply.
>> so we can stay with the current CVS (no Foreign Keys, just plain >> data). > > Makes sense; actually there are few foreign keys, but I think they > doesn't greatly affect performances. I skipped this point in my last reply. The only way to have small impact on performance is to have few records in the table(few compared to bigger tables). Let's take CastInfo, Role foreign key: If I understood MySQL docs correctly there's an index on column Role. There are also a great number of CastInfo elements, forcing index update on each INSERT. IMHO _all_ indexes on big tables should not be created in the beginning, but only when DB population is done. Pro: - We don't need them during INSERT phase, and gain a valuable speed boost - We can create them once INSERT is done Con: - We loose the referential integrity Time to make a choice :-) >> Name >> pNameIdx = DatabaseIndex({'column': name, 'length': 5}, >> imdbIndex) namePcodeCfIdx = DatabaseIndex(namePcodeCf) >> namePcodeNfIdx = DatabaseIndex(namePcodeNf) surnamePcodeIdx = >> DatabaseIndex(surnamePcode) >> >> OK, with 2 points: 1) pNameIdx, I think it would be better >> to index the uppercased or lowercased column, and search >> for the up or low case. It seems that this can be done with >> {'expression':'lower(name)'} but I am still unable to get it working > > Wait: does we need these "pNameIdx" and "mTitleIdx" indexes at all? > > Actually we need to search the "name" or the "title" column (and "imdbIndex" > and friends) only in a specific case: substituting strings in text fields. > E.g.: if in the "plot" information theres a line like that: > ... and she worked with the great actor 'Gino Pino (II)' (qv) ... Well, the plot file contains a great number of plots, so "name" and "title" should be searched a great deal of times. I'd go for the index. > the Name table is scanned for the row with name set to "Gino Pino" > and imdbIndex set to "II"; its personID is returned and a Person > object instanced. > > In this case, searching for person names/movie titles we _always_ > know the complete and correct name (upper/lower case included). Good point. No need to index the up/lowercase. > So: this indexes are probably needed (and as you suggested it's better > to split them in separate indexes); I'm wondering if we can just > index the "name"/"title" column and what's the best length of text > to be considered (5 was choose following my mood of the time, I fear). I'd say not to specify length, because this partial indexing is (AFAIK) supported by MySQL only and, however, we need an index on the full value. -- Giuseppe "Cowo" Corbelli ~\/~ My software: http://cowo.yoda2000.net -<! Non c'e' niente da dire in proposito. Tutto quello che uno deve fare e' colpire i tasti giusti al momento giusto, e lo strumento suona da solo. !>- J.S. Bach _______________________________________________ Imdbpy-devel mailing list Imdbpy-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/imdbpy-devel