Hi,

> > I just committed revision 9020. It splits the search input into words
> > (separated by whitespace) and appends each word to the "AND" query.
> > I tested it with a xapian engine and with Pootle and the toolkit at
> > revision 8822 due to some issues that I did not investigate at that
> > moment. But I assume it should work well with HEAD, too.
>
> Thank you for this, Lars. I guess things should work on trunk, but we
> need to test and confirm at some stage. Do you consider this good enough
> to backport to the 1.2 branch for the release of 1.2.1?

Yes - I don't see any potential problems.

> > Just to make sure that I did not neglect anything: is a simple "split"
> > call the right approach to separate words in a language-neutral way?
> > (see line 1020 in Pootle/projects.py)
>
> It is the best we can do without going into lots of work. The bigger
> question is perhaps how Lucene and Xapian split/tokenise words. We
> might want to get closer to that, rather than doing the 100% correct
> thing.

As far as I remember, there is no obvious way to use the tokenize
function of xapian separately. I only used it indirectly during indexing.
For Lucene, I don't know this at all. I will take a look.

> > Just to clarify the current behaviour of the search field:
> > 1) every word search is "partial" and case-insensitive - thus "poot"
> >    will find "Pootle"
> > 2) Multiple words get split into single words. The single queries are
> >    partial, too. They are combined by "AND".
> > 3) The order of multi-word queries does not matter: a search for "admin
> >    pootle" will return "Pootle Languages Admin Page" (and others).
> > 4) Multiple word input can be a mixture of source and target strings: a
> >    search for "remove sprache" will return "Remove Language" which is
> >    translated to "Sprache loeschen"
> >
> > Do you think that this detailed description of the search processing
> > would be suitable for the "searching" wiki page[1] of Pootle? Then I
> > could add it there ...
> >
> > I would appreciate any comments!
> >
> > regards,
> > Lars
> >
> > [1] http://translate.sourceforge.net/wiki/pootle/searching
>
> I think we can definitely add it there. There are some definite
> differences between the indexed search and the pogrep search, so it
> would be good to document them well.

I will add a table with examples to the wiki page.
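
To make the four points above easier to check, here is a minimal sketch of
the described behaviour. It is not the code from Pootle/projects.py and it
does not use xapian or Lucene: the names `matches`, `search` and `UNITS`
are made up for illustration, and a plain case-insensitive substring test
stands in for the engine's "partial" matching, which is an assumption.

```python
# Hypothetical in-memory stand-in for the indexed units: (source, target).
UNITS = [
    ("Pootle Languages Admin Page", ""),
    ("Remove Language", "Sprache loeschen"),
]


def matches(query, source, target):
    """Return True if every whitespace-separated word of `query` occurs,
    case-insensitively and as a substring, in the source or target text."""
    # Source and target are searched together, so a query may mix both.
    haystack = (source + "\n" + target).lower()
    # str.split() without arguments splits on any run of whitespace.
    # all() implements the "AND" combination; word order plays no role.
    return all(word in haystack for word in query.lower().split())


def search(query, units=UNITS):
    """Return the source strings of all units that match `query`."""
    return [source for source, target in units
            if matches(query, source, target)]
```

With this sketch, `search("poot")` and `search("admin pootle")` both return
the "Pootle Languages Admin Page" unit, and `search("remove sprache")`
returns "Remove Language". Note that an empty query matches every unit
here, which may or may not be what the real search does.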

> I think there is still another way to really improve things: if we can
> obtain possibly relevant results quickly from the indexer and use a real
> GrepFilter to filter out the less relevant ones from there. This would
> get the behaviour much closer to the non-indexed search, but still with
> a good speedup, I think. Does this sound doable, Lars? Should we try to
> do that for Pootle 1.2.1?

So far I have not noticed any deficiencies or differences in the
indexing-based search, so I am not sure how this would improve the search.
Maybe we can discuss this once we have a table comparing the search
results on the wiki page. I will do my half this evening ...

Regarding possible work on this topic: until Christmas I will hardly find
time to work on this. If this delay fits the release date of 1.2.1, then
I can do it afterwards.

have a nice day,
Lars

_______________________________________________
Translate-pootle mailing list
Translate-pootle@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/translate-pootle