Instead of "the capacity" I meant "this capacity", but should have said "this feature", referring to Elasticsearch integration—though the information on system capacity was still interesting.
On Fri, Oct 19, 2018 at 3:57 AM, Guillaume Lederrey <[email protected]> wrote:

> On Thu, Oct 18, 2018 at 4:48 PM Trey Jones <[email protected]> wrote:
> >
> > Hi Everyone,
> >
> > I'm at WikiConference NA today, and I was chatting with someone from
> > OCLC, and he mentioned that BlazeGraph can be configured to call out to a
> > full-text search engine. It looks like it only works with SOLR out of the
> > box, but the documentation mentions that Elasticsearch is a candidate
> > search endpoint.
> >
> > Obviously it wouldn't be worth doing any real work on investigating this
> > until the BlazeGraph/Amazon situation is clearer, and maybe Stas or others
> > have looked at it in the past and already know why it isn't worth the added
> > complexity, but there are some interesting use cases where combining full
> > text and SPARQL would be useful; for example, if you are looking for a
> > person, you know part of their name, and some facts about them. In general,
> > any full-text search with additional structured data constraints.
> >
> > Anyone already know anything about the capacity of BlazeGraph?
>
> It all depends on what you mean by "capacity" and by "blazegraph". If
> by capacity you mean do we have enough hardware, the answer is not
> entirely easy.
>
> The cluster servicing the public wdqs endpoint (which probably means
> "blazegraph" in this context) has widely varying load patterns, is
> sometimes overloaded, and is overall difficult to size correctly
> (especially since we don't have a good definition of what a good SLO
> would be, see [1]).
>
> The internal wdqs endpoint is in a much better situation, with a more
> controlled load and a reasonable amount of headroom. I don't have
> good visibility on the projects that might start using this internal
> cluster more, so that headroom might be consumed fairly quickly
> depending on what load we add to the cluster.
>
> Last point: I have no idea what that blazegraph / elasticsearch
> integration looks like, but it sounds like it might be possible to
> generate arbitrary elasticsearch queries from SPARQL. If that's the
> case, we don't want to expose such functionality on the public wdqs
> endpoint, or at least not with our current production elasticsearch
> backend as the target. That being said, it sounds like a very
> interesting idea!
>
> Have fun!
>
>    Guillaume
>
> [1] https://phabricator.wikimedia.org/T199228
>
> > Thanks,
> > —Trey
> >
> > Trey Jones
> > Sr. Software Engineer, Search Platform
> > Wikimedia Foundation
> > _______________________________________________
> > Discovery mailing list
> > [email protected]
> > https://lists.wikimedia.org/mailman/listinfo/discovery
>
> --
> Guillaume Lederrey
> Operations Engineer, Search Platform
> Wikimedia Foundation
> UTC+2 / CEST
