Hi,

> > I just committed revision 9020. It splits the search input into words
> > (separated by whitespace) and appends each word to the "AND" query.
> > I tested it with a Xapian engine and with Pootle and the toolkit at revision
> > 8822 because of some issues that I have not investigated yet. But I
> > assume it should work well with HEAD, too.
> 
> Thank you for this, Lars. I guess things should work on trunk, but we
> need to test and confirm at some stage. Do you consider this good enough
> to backport to the 1.2 branch for the release of 1.2.1?

Yes - I don't see any potential problems.


> > Just to make sure that I did not neglect anything: is a simple "split"
> > call the right approach to separate words in a language-neutral way?
> > (see line 1020 in Pootle/projects.py)
> 
> It is the best we can do without going into lots of work. The bigger
> question is perhaps how Lucene and Xapian split/tokenise words. We
> might want to get closer to that, rather than doing the 100% correct
> thing.

As far as I remember, there is no obvious way to use Xapian's tokenize
function separately; I only used it indirectly during indexing.
For Lucene, I don't know at all.
I will take a look.
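
To illustrate the whitespace splitting discussed above, here is a minimal sketch. It is not the code at line 1020 of Pootle/projects.py, and the function name is made up for this example. It relies only on Python's built-in str.split(), which, when called without arguments, splits on any run of whitespace and drops empty strings.

```python
def split_search_terms(text):
    """Split raw search input into words on any run of whitespace.

    Hypothetical helper for illustration; not taken from Pootle.
    """
    # str.split() with no argument treats tabs, newlines and repeated
    # spaces as a single separator and never returns empty strings.
    return text.split()


print(split_search_terms("  admin \t pootle "))
```

The limitation is visible for languages that do not separate words with whitespace: such input comes back as a single term, which is where the indexer's own tokeniser would behave differently.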


> > Just to clarify the current behaviour of the search field:
> > 1) every word search is "partial" and case-insensitive - thus "poot" will
> > find "Pootle"
> > 2) Multiple words are split into single words. The single-word queries are
> > partial, too. They are combined by "AND".
> > 3) The order of multi-word queries does not matter: a search for "admin
> > pootle" will return "Pootle Languages Admin Page" (and others).
> > 4) Multiple word input can be a mixture of source and target strings: a
> > search for "remove sprache" will return "Remove Language" which is
> > translated to "Sprache loeschen"
> > 
> > Do you think that this detailed description of the search processing would
> > be suitable for the "searching" wiki page[1] of Pootle? Then I could add it
> > there ...
> > 
> > 
> > I would appreciate any comments!
> > 
> > regards,
> > Lars
> > 
> > [1] http://translate.sourceforge.net/wiki/pootle/searching
> 
> 
> I think we can definitely add it there. There are some definite
> differences between the indexed search and the pogrep search, so it
> would be good to document them well.

I will add a table with examples to the wiki page.
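
As a starting point for such examples, the four rules quoted above can be modelled in a few lines of plain Python. This is only a sketch of the described behaviour, not the indexer code: the function name is invented, and "partial" is approximated here by a substring test, whereas the real index may match differently (for example, only on word prefixes).

```python
def matches(query, source, target):
    """Return True if every query word occurs in the source or target text.

    Hypothetical model of the behaviour described in points 1) to 4).
    """
    # Rules 1 and 4: case-insensitive, searching source and target together.
    haystack = (source + "\n" + target).lower()
    # Rules 2 and 3: split into words, combine with AND, order irrelevant.
    return all(word in haystack for word in query.lower().split())


print(matches("poot", "Pootle", ""))
print(matches("admin pootle", "Pootle Languages Admin Page", ""))
print(matches("remove sprache", "Remove Language", "Sprache loeschen"))
```

Note that in this model an empty query matches every unit, because all() over an empty sequence is True; whether the real search does the same is not stated above.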


> I think there is still another way to really improve things: if we can
> obtain possibly relevant results quickly from the indexer and use a real
> GrepFilter to filter out the less relevant ones from there. This would
> get the behaviour much closer to the non-indexed search, but still with
> a good speedup, I think. Does this sound doable, Lars? Should we try to
> do that for Pootle 1.2.1?

So far I have not noticed any deficiencies or differences in the index-based
search, so I am not sure how this would improve it. Maybe we can discuss
this once we have a table comparing the search results on the wiki page.
I will do my half this evening ...
Regarding possible work on this topic: until Christmas I will barely find time
to work on this. If this delay fits the release date of 1.2.1, then I can do
it afterwards.
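
For the discussion, the two-stage idea quoted above could look roughly like the following sketch. Everything in it is an assumption made for illustration: index_lookup stands in for whatever the indexer returns, units is a plain dict of unit id to text, and a case-insensitive literal regex stands in for the real GrepFilter, whose interface is not used here.

```python
import re


def two_stage_search(query, index_lookup, units):
    """Fetch candidates from the index, then keep only exact phrase matches.

    Hypothetical sketch; index_lookup and units are placeholders.
    """
    # Stage 1: fast, possibly over-generous candidate list from the index.
    candidate_ids = index_lookup(query)
    # Stage 2: slower, stricter check, run on the candidates only.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    return [uid for uid in candidate_ids if pattern.search(units[uid])]


units = {1: "Pootle Languages Admin Page", 2: "Admin Pootle", 3: "Remove Language"}


def fake_index(query):
    # Stand-in for the indexer: AND over words, order ignored.
    words = query.lower().split()
    return [uid for uid, text in units.items()
            if all(w in text.lower() for w in words)]


print(two_stage_search("admin pootle", fake_index, units))
```

In this toy setup the index returns units 1 and 2 for "admin pootle", and the second stage keeps only unit 2, where the words appear as a phrase in that order.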

have a nice day,
Lars
