Hi Oleksii,

If it seems odd to you that MarkLogic continues to pressure you *not* to use the API they built, you are not alone.
Erik, I don't think it's reasonable to suggest that customers who only need to extend or replace a small portion of the Search API rewrite significant parts of its functionality using third-party tools (some of which haven't been updated for years) instead of using the clearly documented extensibility hooks of the ML-provided API.

If MarkLogic wants to push customers with more complex Search API needs in a different direction, that's fine, but it would be a lot more palatable if ML actually did some of the legwork upfront (code examples, blogs, documentation, etc.) to demonstrate how that should be done correctly. At a minimum, it's confusing to be told not to use the provided tools, and frustrating that the suggested alternatives require a lot more work and involve more uncertainty.

Ideally, if ML doesn't want customers using parts of the Search API, they should just build a replacement that they are willing to endorse (maybe even using one of the JS parsers you recommend).

-Will

> On Feb 1, 2017, at 11:58 AM, Erik Hennum <[email protected]> wrote:
>
> Hi, Oleksii:
>
> To be clear, we discourage use of custom grammars.
>
> Besides the JavaScript parser generators that I mentioned previously, you
> might also consider the XQuery approach demonstrated in:
>
>   https://github.com/mblakele/xqysp
>
> These approaches will support more flexible and performant parsers than the
> dynamic grammar of the Search API.
>
> If you have a requirement that can be addressed only by a custom grammar and
> not by one of these approaches, please open a support ticket.
>
>
> Erik Hennum
>
> ________________________________________
> From: [email protected]
> [[email protected]] on behalf of Oleksii Segeda
> [[email protected]]
> Sent: Wednesday, February 01, 2017 7:28 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Custom search grammar
>
> Hi Erik,
>
> Did you figure out how to extend the grammar?
>
> Regards,
> Oleksii Segeda
> IT Analyst
> Information and Technology Solutions
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Oleksii Segeda
> Sent: Monday, January 30, 2017 3:09 PM
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: Re: [MarkLogic Dev General] Custom search grammar
>
> Hi Erik,
>
> Yes, that is the desired behavior.
>
> Ideally, I would like to avoid custom constraints, simply because search
> grammar looks cleaner in the search box. In addition, some of our users are
> already familiar with simple search operators like AND and OR, so BOOST won't
> look alien to them.
>
> I guess postprocessing can be used as you suggested; however, I'm interested
> in a custom search grammar because I may need to extend it more in the future.
>
> Thank you,
> Oleksii Segeda
> IT Analyst
> Information and Technology Solutions
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Erik Hennum
> Sent: Monday, January 30, 2017 2:42 PM
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: Re: [MarkLogic Dev General] Custom search grammar
>
> Hi, Oleksii:
>
> Thanks for providing more detail.
>
> Just to confirm, is it clear that, in a boost query, the right-hand
> term is optional? Documents with only the left-hand term will still
> appear in the results, though with less relevance than documents
> that have both terms.
>
> By contrast, AND-related terms are both required and both
> contribute to relevance.
>
> Anyway, to increase weight, one approach would be to define a tag
> for a quoted phrase and pass the phrase to a Search API custom
> constraint or to cts:parse() with a binding to a query generator function:
>
>   http://docs.marklogic.com/guide/search-dev/cts_query#id_13456
>
> The custom code could then tokenize the phrase and combine the
> terms with a boost-query or and-query, adding appropriate weight.
>
> Another approach would be to do postprocessing of the query tree
> returned by cts:parse() or search:parse() to replace the default
> boost-query or and-query with a query that has more weight.
>
> In either approach, you would then search on the query.
>
> I mention cts:parse() because it parses query text more quickly
> than search:parse().
>
>
> Hoping that helps,
>
> Erik Hennum
>
> ________________________________________
> From: [email protected]
> [[email protected]] on behalf of Oleksii Segeda
> [[email protected]]
> Sent: Monday, January 30, 2017 10:55 AM
> To: [email protected]
> Subject: Re: [MarkLogic Dev General] Custom search grammar
>
> Hi Erik,
>
> I'm trying to boost some parts of the search query. For example, if a user
> types `trade BOOST water`, I want documents with the word "water" to be
> higher in the results.
> cts:boost-query seems to be a perfect fit, but the default BOOST doesn't let
> you specify weights.
>
> My ultimate goal is to convert `trade BOOST water` to something like this:
>
>   cts:boost-query(cts:word-query("trade"), cts:word-query("water", (), 10.0))
>
> Regards,
> Oleksii Segeda
> IT Analyst
> Information and Technology Solutions
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> [email protected]
> Sent: Monday, January 30, 2017 1:08 PM
> To: [email protected]
> Subject: General Digest, Vol 151, Issue 42
>
> Today's Topics:
>
>    1. Custom search grammar (Oleksii Segeda)
>    2. Re: Custom search grammar (Erik Hennum)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 30 Jan 2017 16:51:26 +0000
> From: Oleksii Segeda <[email protected]>
> Subject: [MarkLogic Dev General] Custom search grammar
> To: "[email protected]"
>     <[email protected]>
> Message-ID:
>     <bn1pr0101mb0769b9cdcd5e7697ace8381bcb...@bn1pr0101mb0769.prod.exchangelabs.com>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi there,
>
> I'm trying to declare a custom search grammar. I declared a custom function
> via search options, which is supposed to parse the "BOOST" keyword:
>
> <joiner strength="2" apply="custom-boost"
>   ns="http://worldbankgroup.org/search/grammar" at="/lib/grammar-boost.xqy"
>   tokenize="word">BOOST</joiner>
>
> I declared this function and just copied the existing implementation from the
> impl:joiner-boost function in /MarkLogic/appservices/search/search-impl.xqy:
>
> declare function grammar:custom-boost($ps as map:map, $left as element()?,
>     $opts as element()?) as schema-element(cts:query) {
>   let $symbol := impl:symbol-lookup($ps)
>   let $_ := tdop:advance($ps)
>   let $expr1 := tdop:expression($ps, $symbol/@strength)
>   return
>     if (empty($left))
>     then ($left, impl:msg($ps, <cts:annotation
>       warning="SEARCH-IGNOREDQTEXT:[{string($symbol)} {$expr1}]: expected two arguments"/>))
>     else
>       element { xs:QName($symbol/@element) } {
>         attribute qtextjoin {concat($symbol/string())},
>         attribute strength {$symbol/@strength},
>         attribute qtextgroup {
>           impl:opts($ps)/opt:grammar/opt:starter[@apply eq "grouping"]/(string(), @delimiter/string()) },
>         for $opt in $symbol/@options/tokenize(normalize-space(.), "\s")
>         return <cts:option>{$opt}</cts:option>,
>         element cts:matching-query {
>           attribute qtextref { "schema-element(cts:query)" },
>           $left },
>         element cts:boosting-query {
>           attribute qtextref { "schema-element(cts:query)" },
>           $expr1 }
>       }
> };
>
> Unfortunately this doesn't work, because for some reason impl:symbol-lookup
> returns an empty sequence.
> Any ideas what went wrong here?
>
>
> Oleksii Segeda
> IT Analyst
> Information and Technology Solutions
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://developer.marklogic.com/pipermail/general/attachments/20170130/1958bd77/attachment-0001.html
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: image003.png
> Type: image/png
> Size: 6577 bytes
> Desc: image003.png
> Url :
> http://developer.marklogic.com/pipermail/general/attachments/20170130/1958bd77/attachment-0002.png
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: image004.png
> Type: image/png
> Size: 170 bytes
> Desc: image004.png
> Url :
> http://developer.marklogic.com/pipermail/general/attachments/20170130/1958bd77/attachment-0003.png
>
> ------------------------------
>
> Message: 2
> Date: Mon, 30 Jan 2017 18:07:41 +0000
> From: Erik Hennum <[email protected]>
> Subject: Re: [MarkLogic Dev General] Custom search grammar
> To: MarkLogic Developer Discussion <[email protected]>
> Message-ID:
>     <dfdf2fd50bf5aa42adaf93ff2e3ca1850bd7d...@exchg10-be01.marklogic.com>
> Content-Type: text/plain; charset="windows-1252"
>
> Hi, Oleksii:
>
> Can you explain what you are trying to accomplish?
>
> There may be better ways of doing the same thing than creating a custom
> grammar, which is really a tool of last resort.
>
> For instance, a custom constraint can map a term to a custom query.
>
> For other cases, it's often useful to do postprocessing on the generated
> query.
>
> If a custom grammar really is unavoidable, in many cases a special-purpose
> third-party parsing tool may provide a faster and more flexible alternative
> to the limited custom grammar in the Search API.
>
> For instance, the Jison.js and Peg.js parsers work with server-side
> JavaScript. (A nearly.js parser is also available, though I've heard no
> reports about it yet.)
>
>
> Hoping that helps,
>
>
> Erik Hennum
>
> ------------------------------
>
> End of General Digest, Vol 151, Issue 42
> ****************************************
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
