On Fri, Oct 19, 2018 at 3:40 PM Trey Jones <[email protected]> wrote:
>
> Instead of "the capacity" I meant "this capacity", but should have said "this
> feature", referring to Elasticsearch integration, though the information on
> system capacity was still interesting.

Isn't that "capability" more than "capacity"? (I'm trying to improve my English
here.) Though I knew that it sounded ambiguous!

> On Fri, Oct 19, 2018 at 3:57 AM, Guillaume Lederrey <[email protected]>
> wrote:
>>
>> On Thu, Oct 18, 2018 at 4:48 PM Trey Jones <[email protected]> wrote:
>> >
>> > Hi Everyone,
>> >
>> > I'm at WikiConference NA today, and I was chatting with someone from OCLC,
>> > and he mentioned that BlazeGraph can be configured to call out to a
>> > full-text search engine. It looks like it only works with Solr out of the
>> > box, but the documentation mentions that Elasticsearch is a candidate
>> > search endpoint.
>> >
>> > Obviously it wouldn't be worth doing any real work on investigating this
>> > until the BlazeGraph/Amazon situation is clearer, and maybe Stas or others
>> > have looked at it in the past and already know why it isn't worth the
>> > added complexity, but there are some interesting use cases where combining
>> > full text and SPARQL would be useful: for example, if you are looking for a
>> > person, you know part of their name, and some facts about them. In
>> > general, any full-text search with additional structured data constraints.
>> >
>> > Anyone already know anything about the capacity of BlazeGraph?
>>
>> It all depends on what you mean by "capacity" and by "blazegraph". If
>> by capacity you mean do we have enough hardware, the answer is not
>> entirely easy.
>>
>> The cluster servicing the public wdqs endpoint (which probably means
>> "blazegraph" in this context) has widely varying load patterns, is
>> sometimes overloaded and is overall difficult to size correctly
>> (especially since we don't have a good definition of what a good SLO
>> would be, see [1]).
>>
>> The internal wdqs endpoint is in a much better situation, with a more
>> controlled load and a reasonable amount of headroom.
>> I don't have good visibility on the projects that might start using this
>> internal cluster more, so that headroom might be consumed fairly quickly
>> depending on what load we add to the cluster.
>>
>> Last point: I have no idea what that blazegraph / elasticsearch
>> integration looks like, but it sounds like it might be possible to
>> generate arbitrary elasticsearch queries from SPARQL. If that's the
>> case, we don't want to expose such functionality on the public wdqs
>> endpoint, or at least not with our current production elasticsearch
>> backend as the target. That being said, it sounds like a very
>> interesting idea!
>>
>> Have fun!
>>
>>    Guillaume
>>
>>
>> [1] https://phabricator.wikimedia.org/T199228
>>
>> > Thanks,
>> > -Trey
>> >
>> > Trey Jones
>> > Sr. Software Engineer, Search Platform
>> > Wikimedia Foundation
>> > _______________________________________________
>> > Discovery mailing list
>> > [email protected]
>> > https://lists.wikimedia.org/mailman/listinfo/discovery
>>
>> --
>> Guillaume Lederrey
>> Operations Engineer, Search Platform
>> Wikimedia Foundation
>> UTC+2 / CEST

--
Guillaume Lederrey
Operations Engineer, Search Platform
Wikimedia Foundation
UTC+2 / CEST
