Unlike the search:parse parser, xqysp has a fixed grammar. The grammar isn't designed to be user-configurable, just generally useful. It treats NEAR as an infix op, not a prefix op. It treats most punctuation like whitespace, as a token boundary.
The Apache license permits you to fork https://github.com/mblakele/xqysp and change its behavior to suit yourself, and that's probably easier than rewriting the parts you don't want to change. The XQuery is fairly straightforward. I would start by modifying the unit tests in test/xqysp.xml to expect the behavior you want, then enable the $DEBUG variable in xqysp.xqy and start experimenting. You'll see quite a lot of debug-state() output in the logs.

The NEAR change would involve moving $TOK-NEAR and $TOK-ONEAR out of $TOKS-INFIX and into $TOKS-PREFIX. I'm not sure what other changes you'd have to make, but that's where testing comes in.

The punctuation change looks uglier because the behavior you want is state-dependent: commas are token boundaries inside NEAR but not elsewhere. Try writing out the EBNF to describe that, and you'll see why I find it unappealing. You'd probably have to add a third parameter to p:word, telling it whether or not to treat commas as token boundaries, and modify the callers to match. Then I think the actual comma-joining could piggyback on the existing behavior for $TOKS-WORD-JOIN, though it would be slightly more complicated because you want that behavior to be parameterized.

-- Mike

On 27 Oct 2012, at 04:53 , Abhishek53 S <[email protected]> wrote:

> Hi All,
>
> We are using XQYSP for search term parsing [Really extraordinary concept]
> inside our solution. We want to use both comma and whitespace (currently it's
> only whitespace) as tokenizers for creating literal nodes for proximity
> search, but not for other clauses.
>
> E.g.
>
> Search Term: "NEAR (cat,dog)" - expected to be parsed as
>
> <root>
>   <expression type="prefix" op="NEAR/100">
>     <group>
>       <literal>cat</literal>
>       <literal>dog</literal>
>     </group>
>   </expression>
> </root>
>
> whereas terms without NEAR clauses should be parsed as
>
> Search Term Phrase: "cat,dog"
>
> <root>
>   <literal>cat,dog</literal>
> </root>
>
> Any way to move ahead :)
>
> Thanks
> Abhishek Srivastav
> Tata Consultancy Services
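
For what it's worth, the state-dependent comma handling discussed above can be sketched in a few lines. This is Python rather than XQuery, and it is not xqysp code: the names PREFIX_OPS, words and parse are hypothetical, the split_commas flag only mirrors the idea of an extra parameter on p:word, and the NEAR distance suffix (/100) is left out. It covers only the kinds of input shown in the question, so treat it as an illustration of the idea and not as a design.

```python
import re

# Hypothetical stand-in for the operators that would move into $TOKS-PREFIX.
PREFIX_OPS = {"NEAR", "ONEAR"}


def words(text, split_commas):
    """Split text into literals; commas are boundaries only when requested."""
    pattern = r"[\s,]+" if split_commas else r"\s+"
    return [w for w in re.split(pattern, text) if w]


def parse(query):
    """Return a nested-tuple tree for a tiny subset of the query syntax."""
    m = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", query)
    if m and m.group(1) in PREFIX_OPS:
        # Inside a prefix operator's group, commas separate literals.
        group = [("literal", w) for w in words(m.group(2), split_commas=True)]
        return ("root", [("expression", m.group(1), [("group", group)])])
    # Everywhere else a comma stays inside the literal.
    return ("root", [("literal", w) for w in words(query, split_commas=False)])


if __name__ == "__main__":
    print(parse("NEAR (cat,dog)"))
    print(parse("cat,dog"))
```

The point of the flag is that the caller, which knows whether it is inside a NEAR group, decides how commas are treated; the word-splitting routine itself stays stateless.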
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
