A follow-up to the previous email.

> 
> A) Enabled the collation function for all TEXT columns in the database
> when creating/altering a table. If indexes are created on these columns,
> the given collation function will be used for the index. This is done in
> the 'collation' remote branch in gnome git.
> 

I tried to measure the impact of collation on inserting resources when
collation is enabled by default on all text columns. In this case, the
collation function gets called every time we insert data into a column
that has an index, as the index is kept sorted according to it.
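To illustrate why only indexed columns pay this cost at insert time,
here is a minimal sketch using Python's sqlite3 module. It assumes
SQLite as the underlying store; the table, column, index and collation
names are made up for the example and are not the ones used in the
'collation' branch. It simply counts how often a custom collation
function is invoked while inserting rows.

```python
import sqlite3


def count_collation_calls(indexed: bool, rows: int = 200) -> int:
    """Insert `rows` values and return how often the collation was called."""
    calls = 0

    def compare(a: str, b: str) -> int:
        # Plain codepoint comparison standing in for a real
        # libicu/libunistring/glib collation function.
        nonlocal calls
        calls += 1
        return (a > b) - (a < b)

    conn = sqlite3.connect(":memory:")
    conn.create_collation("demo_collation", compare)
    # The collation is attached to the column, as in case A.
    conn.execute("CREATE TABLE t (title TEXT COLLATE demo_collation)")
    if indexed:
        # The index inherits the column's collation by default.
        conn.execute("CREATE INDEX t_title ON t (title)")
    # Scramble the insertion order so the index has to be searched.
    values = ((f"file-{(i * 7919) % rows:05d}",) for i in range(rows))
    conn.executemany("INSERT INTO t (title) VALUES (?)", values)
    conn.commit()
    conn.close()
    return calls


if __name__ == "__main__":
    print("with index:   ", count_collation_calls(indexed=True))
    print("without index:", count_collation_calls(indexed=False))
```

Without the index the counter should stay at zero, since nothing needs
to be sorted on insert; with the index every insert has to locate its
position in the sorted index, which is where the collation calls come
from.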

I ran several full first-time indexing passes over around 24k files on
my PC. The values given are the best of 4-5 runs. I disabled FTS in the
tests so that the effect of the different parsers is not considered.
However, I only thought of this after I had already run the same tests
with FTS enabled, so as an extra I also give the indexing times with
FTS enabled.

 * no collation: ~100s
 * libicu:       ~101s --> ~105s with FTS
 * libunistring: ~103s --> ~104s with FTS
 * glib:         ~105s --> ~305s with FTS

So, with libicu the effect of collation while inserting data can hardly
be noticed. The libunistring and glib variants, even if slower, also
perform very well.
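
As a quick sanity check, the relative overhead implied by the non-FTS
figures can be worked out as below. This is only arithmetic on the
approximate values listed above; the FTS figures are left out because
no FTS baseline without collation was measured.

```python
# Approximate non-FTS indexing times in seconds, copied from the list above.
baseline_s = 100.0
timings_s = {"libicu": 101.0, "libunistring": 103.0, "glib": 105.0}


def overhead_percent(time_s: float, base_s: float) -> float:
    """Relative slowdown of time_s over base_s, in percent."""
    return (time_s - base_s) / base_s * 100.0


overheads = {
    name: overhead_percent(time_s, baseline_s)
    for name, time_s in timings_s.items()
}

for name, pct in overheads.items():
    print(f"{name}: about +{pct:.0f}% over no collation")
```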


>  * When setting collation in the column (A cases) there seems to be an
> impact on the search time, even if ORDER BY not used (different values
> for glib/icu/unistring in case A.1; and case A.1 compared to B1). This
> is pretty strange, and don't really know why. Someone?

Regarding this comment from the previous email: please disregard it. I
had also enabled collation on the Uri column of the Resources table,
which was wrong. The numbers above were measured with this fix applied.

Cheers,

-- 
Aleksander
