Hi Dennis,

I've been reading your email carefully and I'd like to comment on it.

On 28/03/2006, at 14:24, Dennis Cote wrote:

> With these tables you will have 25K rows in the File table, one per file, and 250K rows in the Attribute table, assuming an average of 10 attributes per file (your example had 14).
>
> An index on the attribute name and value pairs will greatly speed up lookups of a particular attribute.
>
> There is still some redundancy in this set of tables, since the names of the attributes are being repeated over and over in the Attribute table. This may not be a big problem, but you really only have 100 attributes, so there is no sense storing 25K copies of the name of a common attribute that applies to all files. This information can be factored out into a third table as shown below.
>
> [...] This will give you three tables (which will not cause a noticeable difference in the database open time). One with 25K rows of file names. One with 100 rows of attribute definition data, which includes the attribute name. And one with 250K rows of attribute value data.
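
Just so I'm sure I understand the layout, here is a rough sketch of those three tables, plus the pair index you suggested (on the attribute id rather than the name, since the name moves to the definition table). The table and column names are my own guesses, not anything you specified:

CREATE TABLE file (
    file_id  INTEGER PRIMARY KEY,
    name     TEXT NOT NULL                -- 25K rows, one per file
);

CREATE TABLE attribute_def (
    attr_id  INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE         -- ~100 rows, one per attribute name
);

CREATE TABLE attribute (
    file_id  INTEGER NOT NULL REFERENCES file(file_id),
    attr_id  INTEGER NOT NULL REFERENCES attribute_def(attr_id),
    value    TEXT                         -- ~250K rows, about 10 per file
);

-- the index on attribute/value pairs you suggested
CREATE INDEX attr_value_idx ON attribute(attr_id, value);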

Due to application requirements, I must rely on LIKE and GLOB to match data, i.e. to find values that contain a given substring, with either a case-sensitive or case-insensitive match.
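
If I'm not mistaken, LIKE gives me the case-insensitive match (at least for ASCII) and GLOB the case-sensitive one, so against the file table from my sketch above the two lookups would be:

-- case-insensitive (for ASCII) substring match
SELECT name FROM file WHERE name LIKE '%finit%';

-- case-sensitive substring match
SELECT name FROM file WHERE name GLOB '*finit*';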

Now, it seems to me that using either LIKE or GLOB will force a row scan anyway, since those matches can't use the index, correct? So your solution would force me to perform that row scan over 250K rows when matching values. That is 10 times more data to scan through.

I'm not sure if there is a way to speed up LIKE or GLOB queries such as:

SELECT myvalue FROM files WHERE myvalue LIKE '%finit%';
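
With the three-table layout, I suppose the equivalent search would have to walk the whole 250K-row attribute table, something like this (again using my guessed names from the sketch above):

SELECT f.name, a.value
  FROM attribute a
  JOIN file f ON f.file_id = a.file_id
 WHERE a.value LIKE '%finit%';   -- leading wildcard, so attr_value_idx can't help here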

Thanks a lot for the advice.

Regards,

-- Tito
