Hi! I'll definitely try Cirrus, but it would still be interesting to see Lucene
working. Besides, every new extension by WMF typically requires a very fresh
MediaWiki version, which can be a burden for third parties.
I tried adding InitializeSettings.php and running ./build and ./lsearchd again.
Still no good: when I search for the word банк, I expect Lucene to also find
банков, банки, банке, etc., and I can see that these word forms are
present in the files
LuceneSearch.jar/uzip://org/apache/lucene/analysis/ru/stemsUnicode.txt
and words.Unicode.txt.
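To make the expected behaviour concrete, here is a minimal Python sketch of stem-based matching. It is not Lucene's Russian analyzer: the suffix list, toy_stem and build_index are hypothetical names invented for this illustration, and the suffix list covers only the word forms mentioned above.

```python
# Toy suffix list for illustration only; a real Russian stemmer handles far more.
SUFFIXES = sorted(["ами", "ов", "ом", "ам", "ах", "и", "е", "а", "у"],
                  key=len, reverse=True)


def toy_stem(word: str) -> str:
    """Strip the longest matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs: dict) -> dict:
    """Map each stem to the set of document ids containing a word with that stem."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(toy_stem(word), set()).add(doc_id)
    return index


# One tiny "page" per word form from the message above.
docs = {1: "банк", 2: "банков", 3: "банки", 4: "банке"}
index = build_index(docs)

# The query is stemmed the same way as the indexed text, so all forms match.
hits = sorted(index.get(toy_stem("банк"), set()))
print(hits)
```

If both indexing and querying go through the same stemmer, a search for банк should return all four pages. Getting only the exact form back suggests the index was built without the Russian stemmer.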
Still, when I search for банк, I get only банк and the following log:
18409 [pool-2-thread-1] INFO org.wikimedia.lsearch.search.SearchEngine -
Using FilterWrapper wrap: {} []
18414 [pool-2-thread-1] INFO org.wikimedia.lsearch.search.SearchEngine -
search wikivote: query=[банк] parsed=[custom(+contents:банк^0.2 relevance
([((P contents:банк) (P sections:банк^0.25))^2.0], (P
alttitle:банк~20^2.5) (P related:банк^12.0)) (P alttitle:банк~20))]
hit=[0] in 7ms using IndexSearcherMul:1391088160991
18439 [pool-2-thread-1] INFO org.wikimedia.lsearch.spell.Suggest -
wikivote for original=[банк] suggest: [банк] using=[] in 18 ms
24262 [pool-2-thread-2] INFO org.wikimedia.lsearch.frontend.HttpHandler -
query:/search/wikivote/%D0%B1%D0%B0%D0%BD%D0%BA?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15%2C90%2C91%2C92%2C93%2C102%2C103%2C106%2C107%2C108%2C109%2C170%2C171offset=0limit=20version=2.1iwlimit=10searchall=1
what:search dbname:wikivote term:банк
24263 [pool-2-thread-2] INFO org.wikimedia.lsearch.search.SearchEngine -
Using FilterWrapper wrap: {} []
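As a sanity check on the request itself, the percent-encoded parts of the HttpHandler line can be decoded with a few lines of Python. The path layout /search/<dbname>/<term> is inferred from the "dbname:wikivote term:банк" part of the log, not taken from lsearchd documentation, and the variable names are only illustrative.

```python
from urllib.parse import unquote

# Request path and namespaces value copied from the HttpHandler log line.
logged_path = "/search/wikivote/%D0%B1%D0%B0%D0%BD%D0%BA"
encoded_namespaces = (
    "0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15"
    "%2C90%2C91%2C92%2C93%2C102%2C103%2C106%2C107%2C108%2C109%2C170%2C171"
)

# Assumed layout: empty prefix, action, database name, search term.
_, action, dbname, encoded_term = logged_path.split("/")
term = unquote(encoded_term)  # percent-decoding defaults to UTF-8

# %2C is an encoded comma, so the value is a comma-separated list of ids.
namespaces = [int(n) for n in unquote(encoded_namespaces).split(",")]

print(action, dbname, term)
print(len(namespaces), "namespaces, including main (0):", 0 in namespaces)
```

The term decodes to банк and the main namespace is in the list, so the query reaches the daemon intact. That points at the index or analyzer configuration rather than the request.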
-
Yury Katkov, WikiVote
On Fri, Jan 31, 2014 at 1:02 AM, Nikolas Everett never...@wikimedia.org wrote:
I hate to say this after all you went through setting up Lucene Search, but
it is end-of-life and not receiving any real support. We're in the process
of replacing it with the combination of
CirrusSearch <https://www.mediawiki.org/wiki/Extension:CirrusSearch> /
Elasticsearch <http://www.elasticsearch.org/>, which work pretty much the
same way the MWSearch/Lucene Search combination does. CirrusSearch has to
be smarter than MWSearch because Elasticsearch doesn't have any MediaWiki
knowledge, but because it links into MediaWiki it can do things like expand
templates. I like it, but I'm biased.
That aside, it looks like Lucene Search is supposed to read
InitializeSettings, which is kind of a WMF-specific thing. You might be able
to trick it into doing so by putting a file called InitializeSettings.php
in the conf directory with the contents:
'wgLanguageCode' => array(
    'your $wgDBname' => 'ru',
),
CirrusSearch, if you care to try it, reads the language code from
wgLanguageCode.
Nik
On Thu, Jan 30, 2014 at 3:39 PM, Yury Katkov katkov.ju...@gmail.com
wrote:
Hi guys!
I've installed the MWSearch and Lucene Search extensions, but I can see that
the search engine doesn't understand Russian morphology (it doesn't
recognize word forms). How can I turn the morphological analyzer on? How
is it done on Russian Wikipedia?
Cheers,
-
Yury Katkov, WikiVote
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l