On Tue, 2006-08-01 at 18:59 +0900, David Balmain wrote:
> Hmmm. Sounds like an interesting application. One solution would be to
> cache the sort index on disk. The problem with this is that the cache
> would still need to be recalculated every time you add more documents
> to the index so you'll still have the long wait occasionally. I'll
> look into it anyway at a later stage.

For my application this wouldn't really be a problem since data is only
loaded maybe once a week. But does the cache need to be recalculated
completely? Database indexes work incrementally.

> Another idea that I can implement now is to add a BYTES sort type
> which would basically sort by the order the terms appear in the index.
> Let's say you index dates in the format "YYYYMMDD" and you sort by
> INTEGER. Every time you load the sort index you need to go through
> every single date and convert it from string to integer. But this is
> unnecessary since the dates are already in order in the index. A BYTES
> sort type would take advantage of this. 

For my date fields this would work.
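
A minimal sketch in plain Ruby of why this holds (this is not Ferret's
API, and the sample dates are made up): fixed-width "YYYYMMDD" strings
come out in the same order whether compared byte by byte or as integers.

```ruby
# Hypothetical sample data: dates stored as fixed-width "YYYYMMDD" strings.
dates = %w[20060801 19991231 20060115 20051130]

# String comparison in Ruby is byte-wise, so no conversion is needed here.
by_bytes = dates.sort

# Sorting as INTEGER means converting every term from string first.
by_integer = dates.sort_by { |d| d.to_i }

# The two orders agree because every string has the same width and uses
# only the digits 0-9, whose byte values rise with their numeric value.
same_order = (by_bytes == by_integer)
puts same_order
```

This only holds while every value has the same width; "999" would sort
after "20060801" by bytes but before it as an integer.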

> You'd get an even bigger
> benefit for ASCII strings. strcoll is used to sort strings but this is
> unnecessary for ASCII strings as they are already correctly ordered in
> the index. Also, the index needs to keep each string in memory which
> would also be unnecessary.

One of my text order fields should have nothing but ASCII. The other is
a title and can include arbitrary UTF-8, so I guess it wouldn't work for
that one.

> Sorry if this isn't very clear. I'm not sure how much it will help.
> We'll have to wait and see.

The BYTES ordering would probably speed it up, but for my specific case
storing it on disk would be perfect. It would probably be a very good
thing in case someone uses Ferret to code command-line tools that access
a common index. Without storing the sort index on disk, it will get
recreated every time a command is run.
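
A rough sketch of the on-disk idea in plain Ruby, under loud assumptions:
FIELD_VALUES, load_sort_order and the cache file name are all
hypothetical, and this is not how Ferret stores or would store its sort
index. It only shows a second run reading the order back instead of
recomputing it.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for one indexed field: document id => field value.
FIELD_VALUES = { 0 => "20060801", 1 => "19991231", 2 => "20060115" }

# Returns document ids ordered by field value, reusing the cache file
# at cache_path when it already exists.
def load_sort_order(cache_path)
  if File.exist?(cache_path)
    # Marshal is only safe for files this program wrote itself.
    return File.open(cache_path, "rb") { |f| Marshal.load(f) }
  end
  order = FIELD_VALUES.keys.sort_by { |id| FIELD_VALUES[id] }
  File.open(cache_path, "wb") { |f| Marshal.dump(order, f) }
  order
end

cache_dir  = Dir.mktmpdir
cache_path = File.join(cache_dir, "date.sortcache")

first_run  = load_sort_order(cache_path) # computed, then written to disk
second_run = load_sort_order(cache_path) # read back from disk

puts first_run == second_run
FileUtils.remove_entry(cache_dir)
```

The sketch never invalidates the cache; a real version would have to
rebuild or update the file whenever documents are added to the index.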

Pedro.

_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk