Cool, +1!

Best,
Yingyi

On Fri, Sep 16, 2016 at 1:54 PM, Taewoo Kim <wangs...@gmail.com> wrote:

> So, in summary, we agree to use a function format for the full-text search,
> rather than using XQuery syntax. "contains" doesn't have to be
> "string-contains" and "text" doesn't have to be a reserved word.
>
> The possible syntax would be:
>
> *ftcontains*(expression1, expression2, parameter record expression)
> *matches*(expression1, expression2, parameter record expression)
>
> Expression1 is the field on which we conduct the full-text search.
> Expression2 contains the keywords that will be searched for in
> Expression1.
> Parameter Record Expression contains the parameters in a record format.
>
> An example could be: ftcontains($o.title, ["hello","hi"], {"mode":"all"})
> which checks whether $o.title contains both "hello" and "hi".
>
> Chen mentioned that how to pass parameters needs a separate discussion.
> However, for now, passing parameters in a record is a viable solution unless
> we want to pass each parameter as a separate argument to the function itself,
> in which case it would be harder to remember the position of each parameter.
>
> Best,
> Taewoo
>
> On Fri, Sep 16, 2016 at 10:12 AM, Heri Ramampiaro <heri...@gmail.com>
> wrote:
>
> > +1
> >
> > -heri
> >
> > > On Sep 15, 2016, at 19:01, Chen Li <che...@gmail.com> wrote:
> > >
> > > For full-text search, I like "ftcontains()" since it's very intuitive.
> > >
> > > Syntax for advanced full-text features such as stop words, analyzers, and
> > > languages needs a separate discussion.
> > >
> > > Chen
> > >
> > > On Thu, Sep 15, 2016 at 5:58 PM, Taewoo Kim <wangs...@gmail.com> wrote:
> > >
> > >> @Till: I see. Thanks for the suggestion. It's clearer now.
> > >>
> > >> Best,
> > >> Taewoo
> > >>
> > >> On Thu, Sep 15, 2016 at 5:58 PM, Till Westmann <ti...@apache.org> wrote:
> > >>
> > >>> And as it turns out, we already have some infrastructure to translate a
> > >>> constant record constructor expression into a record in
> > >>> LangRecordParseUtil.
> > >>> So supporting that wouldn’t be too painful.
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>>
> > >>> On 15 Sep 2016, at 17:41, Till Westmann wrote:
> > >>>
> > >>> One option to express those parameters would be to pass in a
> > >>>> (compile-time constant) record/object. E.g.
> > >>>>
> > >>>>    where ftcontains($o.title, ["hello","hi"],
> > >>>>                     { "combine": "and", "stop list": "default" })
> > >>>>
> > >>>> That way we could have named optional parameters (please ignore the
> > >>>> ugliness of my chosen parameters) which avoid the problem of dealing
> > >>>> with positions. We do have a nested data model, so we could put it to
> > >>>> good use here :)
> > >>>>
> > >>>> Does this make sense?
> > >>>>
> > >>>> Cheers,
> > >>>> Till
> > >>>>
> > >>>> On 15 Sep 2016, at 16:26, Taewoo Kim wrote:
> > >>>>
> > >>>> @Till: we can add whether the given search is an AND/OR search, a stop
> > >>>>> list, and/or a stemming method. For example, if we use ftcontains(),
> > >>>>> then it might look like:
> > >>>>>
> > >>>>> 1) where ftcontains($o.title, "hello"): find $o where the title field
> > >>>>> contains hello.
> > >>>>> 2) where ftcontains($o.title, ["hello","hi"], any): find $o where the
> > >>>>> title field contains hello *and/or* hi.
> > >>>>> 3) where ftcontains($o.title, ["hello","hi"], all): find $o where the
> > >>>>> title field contains both hello *and* hi.
> > >>>>> 4) where ftcontains($o.title, ["hello","hi"], all, defaultstoplist):
> > >>>>> find $o where the title field contains both hello *and* hi. Also apply
> > >>>>> the default stop list to the search. The default stop list contains a
> > >>>>> number of common English words that can be filtered out.
> > >>>>>
> > >>>>> The issue here is that the position of each parameter must be
> > >>>>> observed (e.g., the third one indicates whether we do a
> > >>>>> disjunctive/conjunctive search; the fourth one tells us which stop
> > >>>>> list we use). So, if we have three parameters, how to specify/omit
> > >>>>> each of them becomes a challenge.
> > >>>>>
> > >>>>> Best,
> > >>>>> Taewoo
> > >>>>>
> > >>>>> On Thu, Sep 15, 2016 at 4:12 PM, Till Westmann <ti...@apache.org> wrote:
> > >>>>>
> > >>>>> Makes sense to me (especially as I always think about this specific one
> > >>>>>> as "ftcontains" :) ).
> > >>>>>>
> > >>>>>> Another thing you mentioned is about the parameters that will get added
> > >>>>>> in the future. Could you provide an example for this?
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Till
> > >>>>>>
> > >>>>>> On 15 Sep 2016, at 15:37, Taewoo Kim wrote:
> > >>>>>>
> > >>>>>> Maybe we could come up with a function form - *ftcontains*(). Here, ft
> > >>>>>>> is an abbreviation for full-text. This function replaces "contains
> > >>>>>>> text" in the XQuery spec. An example might be:
> > >>>>>>>
> > >>>>>>> XQuery spec: where $o.title contains text "hello"
> > >>>>>>> AQL: where ftcontains($o.title, "hello")
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Taewoo
> > >>>>>>>
> > >>>>>>> On Thu, Sep 15, 2016 at 3:18 PM, Taewoo Kim <wangs...@gmail.com>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>> @Till: Got it. I agree with your opinion. The issue here for the
> > >>>>>>>> full-text search is that many function parameters that control the
> > >>>>>>>> behavior of full-text search will be added in the future. Maybe this
> > >>>>>>>> is not an issue? :-)
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Taewoo
> > >>>>>>>>
> > >>>>>>>> On Thu, Sep 15, 2016 at 3:11 PM, Till Westmann <ti...@apache.org>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> I think that our challenge here is that XQuery is very liberal in the
> > >>>>>>>>> introduction of new keywords, as the grammar is keyword free. However,
> > >>>>>>>>> they often use combinations of words ("contains" "text") to
> > >>>>>>>>> disambiguate. AQL, on the other hand, is not keyword free, and so each
> > >>>>>>>>> time we introduce a new one, we create a backwards compatibility
> > >>>>>>>>> problem. It seems that for AQL using a function-based syntax would
> > >>>>>>>>> create fewer problems.
> > >>>>>>>>>
> > >>>>>>>>> Cheers,
> > >>>>>>>>> Till
> > >>>>>>>>>
> > >>>>>>>>> On 2 Mar 2016, at 18:25, Taewoo Kim wrote:
> > >>>>>>>>>
> > >>>>>>>>> Hello All,
> > >>>>>>>>>>
> > >>>>>>>>>> I would like to suggest a change to a current function name. I am
> > >>>>>>>>>> currently working on Full Text Search features. The XQuery Full-text
> > >>>>>>>>>> search spec [1] states that for a full-text search, the syntax is
> > >>>>>>>>>> *RangeExpr ( "contains" "text" FTSelection FTIgnoreOption? )?*. As
> > >>>>>>>>>> you see, we are going to use "contains text something". And we
> > >>>>>>>>>> already have a contains() function [2] that does a substring match.
> > >>>>>>>>>> So, in order to remove possible ambiguities between the two features,
> > >>>>>>>>>> *contains()* will be renamed to *string-contains()* when I merge my
> > >>>>>>>>>> index-only branch to the master if there is no strong opinion on
> > >>>>>>>>>> this. I will send another note as my merge progresses. Thank you.
> > >>>>>>>>>>
> > >>>>>>>>>> [1] https://www.w3.org/TR/xpath-full-text-10/#doc-xquery10-FTContainsExpr
> > >>>>>>>>>>
> > >>>>>>>>>> [2]
> > >>>>>>>>>> https://asterix-jenkins.ics.uci.edu/job/asterix-test-full/site/asterix-doc/aql/functions.html#StringFunctions
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Taewoo
> > >>>>>>>>>>
>
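For anyone skimming the thread, here is a rough Python sketch of how the agreed form, ftcontains(expression1, expression2, parameter record), might behave. It is only an illustration and not AsterixDB code: the naive tokenizer, the default of "any" when no mode is given, and the rejection of unknown parameter names are all assumptions of mine, and "mode" is the only parameter taken from the example in the summary.

```python
import re

# Assumed default; the thread does not say what happens when "mode" is omitted.
DEFAULT_PARAMS = {"mode": "any"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive analyzer)."""
    return set(re.findall(r"\w+", text.lower()))


def ftcontains(field, keywords, params=None):
    """Return True if `field` matches `keywords` under the parameter record `params`."""
    # Expression2 may be a single keyword or a list of keywords.
    if isinstance(keywords, str):
        keywords = [keywords]

    # Named parameters come in as one record, so callers never have to
    # remember positions; anything omitted falls back to the default.
    options = {**DEFAULT_PARAMS, **(params or {})}
    unknown = set(options) - set(DEFAULT_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")

    tokens = tokenize(field)
    hits = [kw.lower() in tokens for kw in keywords]

    if options["mode"] == "all":   # conjunctive search
        return all(hits)
    if options["mode"] == "any":   # disjunctive search
        return any(hits)
    raise ValueError(f"unsupported mode: {options['mode']!r}")
```

With this sketch, ftcontains(title, ["hello","hi"], {"mode":"all"}) is true only when both words occur as whole tokens in the title, which is what separates it from the substring match done by contains().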
