Thanks for the replies, Nik and Chad. Sounds like I should switch. Is MediaWiki 1.21.2 recent enough for CirrusSearch? I'm going to try this on a development server.
On Nov 7, 2013, at 10:50 AM, Nikolas Everett wrote:

> On Thu, Nov 7, 2013 at 11:37 AM, Jim Hu <[email protected]> wrote:
>
>> Hi Nik,
>>
>> As I was reading the docs for MWSearch, I considered whether I should
>> switch to CirrusSearch, so it may not be a difficult sell. I'd even
>> volunteer to try to update the documentation if you're willing to help walk
>> me through it.
>>
>> But to show how clueless I am... I'm not sure how to check the other end,
>> since I'm not clear on what it's trying to do. Here's my undoubtedly deeply
>> flawed understanding of what happens (this reflects that I'm a biologist by
>> training and badly self-taught on wikis and linux/unix/osx).
>>
>> I'm assuming that the problem is in this first step of the update script
>>
>> java -cp LuceneSearch.jar org.wikimedia.lsearch.oai.IncrementalUpdater -l $@ \
>>
>> It's listing a bunch of update items (the ... in my first post). I am
>> guessing that it pulls info on revisions from the mysql database and
>> converts them to some format that gets sent to the indexer, which I assume
>> is part of apache Lucene. From the error, it's failing to pass that
>> through some socket to the indexer. But I don't know how to see a log for
>> activity on that socket.
>>
>
> You have the right idea but by "the other side" I mean a log on the
> indexer. It is some other java process probably running on the Hexamer
> host that I saw in the indexer logs. It should have something in the
> logs. Hopefully.
>
>
>> My similarly uninformed reading about CirrusSearch is that it uses
>> elasticsearch, which in turn uses Lucene. So if the problem is between the
>> incrementalUpdater and Lucene, I might have similar issues with
>> CirrusSearch. But if CirrusSearch gives more informative errors, that
>> would help!! And maybe I should switch anyway, as it sounds like support
>> for MWsearch will go away at some point.
>>
>
> Lucene is a library that can be embedded in Java applications to provide
> full text searching capabilities (and geospatial search and a few other
> things). Anyway, LuceneSearch is a MediaWiki-specific application that
> provides Lucene's full text search capabilities in a way that the MWSearch
> extension understands.
>
> Elasticsearch serves the same purpose for CirrusSearch as LuceneSearch
> serves for MWSearch. We like Elasticsearch because it is general purpose
> and sees a ton more development than LuceneSearch.
>
> As far as support goes - we haven't done much with LuceneSearch/MWSearch in
> a while. I work on CirrusSearch every day, as does Chad, who seems to have
> replied while I'm sending this email. Elasticsearch itself has had 44
> people submit code to it in the past month. It's a healthier ecosystem
> but it might be a pain to switch. CirrusSearch requires a very recent
> version of MediaWiki, for example.
>
> Nik
> _______________________________________________
> MediaWiki-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l

=====================================
Jim Hu
Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054
