Cool, +1!

Best,
Yingyi

On Fri, Sep 16, 2016 at 1:54 PM, Taewoo Kim <[email protected]> wrote:

> So, in summary, we agree to use a function format for the full-text search,
> rather than using XQuery syntax. "contains" doesn't have to be
> "string-contains" and "text" doesn't have to be a reserved word.
>
> The possible syntax would be:
>
> *ftcontains*(expression1, expression2, parameter record expression)
> *matches*(expression1, expression2, parameter record expression)
>
> Expression1 is the field on which we conduct a full-text search.
> Expression2 contains the keywords that will be searched for in
> Expression1.
> Parameter Record Expression contains the parameters in a record format.
>
> An example could be: ftcontains($o.title, ["hello","hi"], {"mode":"all"})
> which checks whether $o.title contains both "hello" and "hi".
>
> Chen mentioned that how to pass parameters needs a separate discussion.
> However, for now, passing parameters in a record is a viable solution,
> unless we want to pass each parameter as a separate argument to the
> function itself. In that case it would be harder to remember the position
> of each parameter.
>
> Best,
> Taewoo
>
> On Fri, Sep 16, 2016 at 10:12 AM, Heri Ramampiaro <[email protected]> wrote:
>
> > +1
> >
> > -heri
> >
> > > On Sep 15, 2016, at 19:01, Chen Li <[email protected]> wrote:
> > >
> > > For full-text search, I like "ftcontains()" since it's very intuitive.
> > >
> > > Syntax for advanced full-text features such as stop words, analyzers,
> > > and languages needs a separate discussion.
> > >
> > > Chen
> > >
> > > On Thu, Sep 15, 2016 at 5:58 PM, Taewoo Kim <[email protected]> wrote:
> > >
> > >> @Till: I see. Thanks for the suggestion. It's clearer now.
> > >>
> > >> Best,
> > >> Taewoo
> > >>
> > >> On Thu, Sep 15, 2016 at 5:58 PM, Till Westmann <[email protected]> wrote:
> > >>
> > >>> And as it turns out, we already have some infrastructure to translate a
> > >>> constant record constructor expression into a record in
> > >>> LangRecordParseUtil.
> > >>> So supporting that wouldn’t be too painful.
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>> On 15 Sep 2016, at 17:41, Till Westmann wrote:
> > >>>
> > >>>> One option to express those parameters would be to pass in a
> > >>>> (compile-time constant) record/object. E.g.
> > >>>>
> > >>>> where ftcontains($o.title, ["hello","hi"],
> > >>>>     { "combine": "and", "stop list": "default" })
> > >>>>
> > >>>> That way we could have named optional parameters (please ignore the
> > >>>> ugliness of my chosen parameters) which avoid the problem of dealing
> > >>>> with positions.
> > >>>> We do have a nested data model, so we could put it to good use here :)
> > >>>>
> > >>>> Does this make sense?
> > >>>>
> > >>>> Cheers,
> > >>>> Till
> > >>>>
> > >>>> On 15 Sep 2016, at 16:26, Taewoo Kim wrote:
> > >>>>
> > >>>>> @Till: we can add whether the given search is an AND/OR search, a
> > >>>>> stop list and/or a stemming method. For example, if we use
> > >>>>> ftcontains(), then it might look like:
> > >>>>>
> > >>>>> 1) where ftcontains($o.title, "hello"): find $o where the title
> > >>>>> field contains hello.
> > >>>>> 2) where ftcontains($o.title, ["hello","hi"], any): find $o where
> > >>>>> the title field contains hello *and/or* hi.
> > >>>>> 3) where ftcontains($o.title, ["hello","hi"], all): find $o where
> > >>>>> the title field contains both hello *and* hi.
> > >>>>> 4) where ftcontains($o.title, ["hello","hi"], all, defaultstoplist):
> > >>>>> find $o where the title field contains both hello *and* hi. Also
> > >>>>> apply the default stoplist to the search.
> > >>>>> The default stop list contains a number of common English words
> > >>>>> that can be filtered.
> > >>>>>
> > >>>>> The issue here is that the position of each parameter must be
> > >>>>> observed (e.g., the third one indicates whether we do a
> > >>>>> disjunctive/conjunctive search, and the fourth one tells us which
> > >>>>> stop list we use). So, if we have three parameters, how to
> > >>>>> specify/omit these becomes a challenge.
> > >>>>>
> > >>>>> Best,
> > >>>>> Taewoo
> > >>>>>
> > >>>>> On Thu, Sep 15, 2016 at 4:12 PM, Till Westmann <[email protected]> wrote:
> > >>>>>
> > >>>>>> Makes sense to me (especially as I always think about this specific
> > >>>>>> one as "ftcontains" :) ).
> > >>>>>>
> > >>>>>> Another thing you mentioned is about the parameters that will get
> > >>>>>> added in the future. Could you provide an example for this?
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Till
> > >>>>>>
> > >>>>>> On 15 Sep 2016, at 15:37, Taewoo Kim wrote:
> > >>>>>>
> > >>>>>>> Maybe we could come up with a function form - *ftcontains*(). Here,
> > >>>>>>> ft is an abbreviation for full-text. This function replaces
> > >>>>>>> "contains text" in the XQuery spec. An example might be:
> > >>>>>>>
> > >>>>>>> XQuery spec: where $o.title contains text "hello"
> > >>>>>>> AQL: where ftcontains($o.title, "hello")
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Taewoo
> > >>>>>>>
> > >>>>>>> On Thu, Sep 15, 2016 at 3:18 PM, Taewoo Kim <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>>> @Till: Got it. I agree with you. The issue here for the full-text
> > >>>>>>>> search is that many function parameters that control the behavior
> > >>>>>>>> of full-text search will be added in the future. Maybe this is not
> > >>>>>>>> the issue?
> > >>>>>>>> :-)
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Taewoo
> > >>>>>>>>
> > >>>>>>>> On Thu, Sep 15, 2016 at 3:11 PM, Till Westmann <[email protected]> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> I think that our challenge here is that XQuery is very liberal in
> > >>>>>>>>> the introduction of new keywords, as the grammar is keyword free.
> > >>>>>>>>> However, they often use combinations of words ("contains" "text")
> > >>>>>>>>> to disambiguate.
> > >>>>>>>>> AQL on the other hand is not keyword free and so each time we
> > >>>>>>>>> introduce a new one, we create a backwards compatibility problem.
> > >>>>>>>>> It seems that for AQL using a function-based syntax would create
> > >>>>>>>>> fewer problems.
> > >>>>>>>>>
> > >>>>>>>>> Cheers,
> > >>>>>>>>> Till
> > >>>>>>>>>
> > >>>>>>>>> On 2 Mar 2016, at 18:25, Taewoo Kim wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hello All,
> > >>>>>>>>>>
> > >>>>>>>>>> I would like to suggest a name change for a current function. I
> > >>>>>>>>>> am currently working on Full Text Search features. The XQuery
> > >>>>>>>>>> Full-text search spec [1] states that for a full-text search,
> > >>>>>>>>>> the syntax is *RangeExpr ( "contains" "text" FTSelection
> > >>>>>>>>>> FTIgnoreOption? )?*. As you see, we are going to use "contains
> > >>>>>>>>>> text something". And we already have a contains() function [2]
> > >>>>>>>>>> that does a substring match. So, in order to remove possible
> > >>>>>>>>>> ambiguities between the two features, *contains()* will be
> > >>>>>>>>>> renamed to *string-contains()* when I merge my index-only branch
> > >>>>>>>>>> to the master if there is no strong opinion on this. Thank you.
> > >>>>>>>>>> I will send another note as my merge progresses.
> > >>>>>>>>>> Thank you.
> > >>>>>>>>>>
> > >>>>>>>>>> [1] https://www.w3.org/TR/xpath-full-text-10/#doc-xquery10-FTContainsExpr
> > >>>>>>>>>>
> > >>>>>>>>>> [2] https://asterix-jenkins.ics.uci.edu/job/asterix-test-full/site/asterix-doc/aql/functions.html#StringFunctions
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Taewoo
