Hi, looks very cool!
Indeed, I guess stemming still needs some improvements. For instance, searching for "embeter":
http://ec2-50-19-181-163.compute-1.amazonaws.com:8080/xwiki/bin/view/Search/Search?text=embeter
did not return any results, whereas searching for "embêter":
http://ec2-50-19-181-163.compute-1.amazonaws.com:8080/xwiki/bin/view/Search/Search?text=emb%C3%AAter
returns 2 results. Both searches should return the same results.

Keep up the good work!
Guillaume

On Sat, Jun 30, 2012 at 11:03 PM, Paul Libbrecht <[email protected]> wrote:

> Ah, feedback! This is really good.
>
> On 30 June 2012 at 20:26, Ludovic Dubost wrote:
>
> > This is nice progress. I've had a look and I have a few remarks:
> >
> > 1/ Some weird results
> >
> > It seems the results are not always ok. For instance this page
> > http://ec2-50-19-181-163.compute-1.amazonaws.com:8080/xwiki/bin/view/SearchTest/AMultilangPage
> > comes up if I search for "SearchTest" but it does not come up for "liste"
> > Also, these 2 searches report 6 and 1 results but show only 3 and 0 results.
>
> This is due to the multi-lingual document (one document in four languages).
> The multilinguality is, I think, at the top of Savitha's priorities.
>
> > 2/ Advanced queries
> >
> > I was also wondering if we can use advanced queries.
> > I've been trying
> >
> > SearchTest +space:SearchTest
> >
> > and this does not seem to work.
>
> There's a good reason for this: the syntax for search currently in use is
> "Dismax". This is a query-parser that is rather less technical, so it
> avoids such issues as considering an apostrophe as a separator (an issue
> that was reported).
>
> The queries you are suggesting, which I think can be useful, only work
> with the Lucene Query-Parser, and not with dismax. This will be
> configurable but I am not sure which one should be the default.
>
> > 3/ It's important that we end up with at least the same features as in
> > lucene.
>
> Mmmh, not *all* of the features.
> E.g.
> that all fields are stored is really not desired (and almost never
> used in search results).
>
> > For instance being able to query all the fields we could query
> > in lucene is important. For instance:
> > object:XWiki.XWikiUsers
> > should return only users
>
> Something of this sort will be needed to achieve the advanced search
> scenario.
>
> > Ordering and Scoring is also something that existed in lucene. How
> > would this work in SOLR?
>
> A score is already displayed currently.
>
> > 4/ we also want of course the advantages of SOLR, which means
> > faceting. Tags, Spaces, Wikis can be interesting facets
>
> The reason multilingual documents have been a problem thus far is that
> Savitha is also trying to make the language a facet, which is really
> interesting but is raising a number of difficulties.
>
> > 5/ in terms of multilingual search (in case of a multilingual wiki) we
> > need to make sure that you can say that you make a search in a
> > specific language and the correct stemmer is used (if stemming is used
> > at indexing time we need to index the content in each language with
> > the correct stemmer). I saw that you did some things with languages so
> > maybe SOLR has also other ways to handle this.
>
> If you look into the source, you can see some of that.
> Solr can do this very nicely declaratively with the schema.xml and
> solrconfig.xml.
>
> Part of Savitha's intent was to offer an administrative UI to manipulate
> this but I'd personally prefer editing files manually. Or maybe we even
> have to invent an extended schema syntax for XWiki-Solr (thus indicating
> that a field of solr, of this and that type, tokenization and storage, is
> fed by a property x/yz of an xwiki document).
>
> paul
> _______________________________________________
> devs mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/devs
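
To make the "embeter" / "embêter" remark at the top of this mail concrete, here is a minimal Python sketch of accent folding, i.e. reducing both spellings to the same term before matching. It is only an illustration of the idea and an assumption about what would help; it is not what the current Solr analysis chain does, and the function name fold_accents is made up for this example.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return a lowercased copy of text with combining accents removed."""
    # NFD splits an accented letter into base letter + combining mark(s),
    # e.g. "ê" becomes "e" followed by U+0302 (combining circumflex).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


if __name__ == "__main__":
    # Both spellings fold to the same term, so they would hit the same documents.
    print(fold_accents("embêter") == fold_accents("embeter"))
```

If the same folding were applied both at indexing time and at query time, the two searches above would be expected to return the same results.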

