A follow-up to the previous email.

> A) Enabled the collation function for all TEXT columns in the database
> when creating/altering a table. If indexes are created on these columns,
> the given collation function will be used for the index. This is done in
> the 'collation' remote branch in gnome git.

I tried to measure the impact of collation on inserting resources when
collation is enabled by default on all text columns. In this case, the
collation function gets called every time we insert data into a column
that has an index, because the index is kept sorted according to it.

I ran several full first-time indexing passes over around 24k files on
my PC. The values given are the best of 4-5 runs. I disabled FTS in the
tests so that the effect of the different parsers is not considered. I
only did that late, though, after I had already run the same tests with
FTS enabled, so as an extra I also give the indexing times with FTS
enabled.

 * no collation: ~100s
 * libicu:       ~101s --> ~105s with FTS
 * libunistring: ~103s --> ~104s with FTS
 * glib:         ~105s --> ~305s with FTS

So, with libicu the effect of collation while inserting data can hardly
be noticed. The libunistring and glib ones, even if slower, also perform
very well.

> * When setting collation in the column (A cases) there seems to be an
> impact on the search time, even if ORDER BY not used (different values
> for glib/icu/unistring in case A.1; and case A.1 compared to B1). This
> is pretty strange, and don't really know why. Someone?

About this comment in the previous email: forget it. I had also enabled
collation on the Uri column of the Resources table, which is wrong. The
numbers above are given with this fix applied.

Cheers,

-- 
Aleksander
_______________________________________________
tracker-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/tracker-list
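
The point about the collation function being called on every insert into
an indexed column can be illustrated with a small standalone sketch. It
uses Python's built-in sqlite3 module rather than Tracker's own code, and
the collation name (demo_collation), the table and column names, and the
trivial code-point comparison are all made up for the illustration; they
stand in for the libicu/libunistring/glib collators and are not what the
'collation' branch does.

```python
import sqlite3

# Counter for how often SQLite invokes the collation callback.
calls = {"n": 0}


def counting_collation(a: str, b: str) -> int:
    """Hypothetical collation: plain code-point order, plus a call counter."""
    calls["n"] += 1
    return (a > b) - (a < b)


def count_collation_calls(with_index: bool, rows: int = 1000) -> int:
    """Insert `rows` strings and return how many times the collation ran."""
    calls["n"] = 0
    conn = sqlite3.connect(":memory:")
    conn.create_collation("demo_collation", counting_collation)
    # The column carries the collation; an index on it inherits that collation.
    conn.execute("CREATE TABLE t (title TEXT COLLATE demo_collation)")
    if with_index:
        conn.execute("CREATE INDEX t_title ON t (title)")
    conn.executemany(
        "INSERT INTO t (title) VALUES (?)",
        ((f"file-{i:05d}",) for i in range(rows)),
    )
    conn.commit()
    conn.close()
    return calls["n"]


if __name__ == "__main__":
    print("calls without index:", count_collation_calls(with_index=False))
    print("calls with index:   ", count_collation_calls(with_index=True))
```

Without the index the callback is not needed for plain inserts; with the
index, each insert has to find its place in the sorted index, so the
callback runs. The cost of that callback is what the timings above
compare across the three libraries.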
