Hi Gleb.

I agree that my `temperature average` example is not the best one, but the main idea was to explain that a function with a reference must be defined according to some rules and cannot be detected without a valid link to some element.

>> I have questions regarding notation, should we support only prefix notation for now?

I think we have to support all kinds of common functions, but maybe start from some reasonable set. We should support as wide a list of functions as possible, but we don't have an artificial goal of supporting every exotic case just to be perfectionists.

>> In general functions can be
>> -   without argument
>> -   with one argument
>> -   with many arguments
>> (We can start from first two variants - as first step.)

I want to take my words back.
Let's support functions with many arguments right away. It is not a big deal, and otherwise it will require refactoring later.

To repeat:
Let's first define the list of functions to support, ordered by their importance (prefix or postfix, with or without arguments, etc.).



I created ticket NLPCRAFT-50; please fix and extend its body if necessary.


Regards,

Sergey


The basic idea of the function enricher is to detect as many different
functions as possible, together with their references to other elements.
For example:
Some model has a user element 'x:temp' with the synonym 'temperature'.
So, for the sentence 'show me average temperature', the token 'average'
should be detected as an 'nlpcraft:function' element with type 'avg' and
a relation to the 'x:temp' element.

1.  In general functions can be

-   without argument
-   with one argument
-   with many arguments
     (We can start with the first two variants as a first step.)


2.  Functions with arguments should have references to other elements.
     These can be user elements or predefined elements like 'nlp:geo', etc.
     See the supported elements in the documentation.
     (I suggest defining a table that describes which functions can be
     related to which elements.)
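
A minimal sketch of what such a table could look like, in Java. The
class name, the method name and the function-to-element pairings are
all invented for illustration; only 'x:temp' and 'nlp:geo' come from
the text above.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical compatibility table: function type -> element IDs it may reference.
final class FunctionCompatibilitySketch {
    // The pairings below are assumptions, not an agreed list.
    static final Map<String, Set<String>> ALLOWED = Map.of(
        "avg", Set.of("x:temp"),
        "max", Set.of("x:temp"),
        "count", Set.of("x:temp", "nlp:geo")
    );

    // True if the given function type is allowed to reference the given element.
    // Unknown function types are never allowed.
    static boolean canReference(String functionType, String elementId) {
        return ALLOWED.getOrDefault(functionType, Set.of()).contains(elementId);
    }
}
```

With a table like this the enricher can reject a pairing such as
'avg' with 'nlp:geo' before it creates any token.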

3.  How to detect
     Examples:


-   average temperature - ok (element 'x:temp' comes after the word
     'average')
-   temperature average - skipped (such functions cannot come after
     their references)

-   average - skipped (it makes no sense without any reference)
-   average <some free words> temperature - skipped (the reference
     should follow the function word without such gaps)

-   average the temperature - ok (gaps that contain only stopwords are
     allowed)
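
The rules above can be sketched roughly as follows. This is only an
illustration under assumptions: the `Tok` record, the stopword set and
the function word set are hypothetical and are not NLPCraft API.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class FunctionDetectionSketch {
    // Hypothetical token view: lower-cased text plus the ID of the element
    // it was already detected as (null for free words and stopwords).
    record Tok(String text, String elementId) {}

    // Assumed values, for illustration only.
    static final Set<String> STOPWORDS = Set.of("the", "a", "an", "of");
    static final Set<String> FUNCTION_WORDS = Set.of("average");

    // Returns the index of the element referenced by the function word at
    // fnIdx, or empty if the function should be skipped.
    static Optional<Integer> findReference(List<Tok> toks, int fnIdx) {
        if (!FUNCTION_WORDS.contains(toks.get(fnIdx).text()))
            return Optional.empty();

        // Only look to the right: a reference before the function is not accepted.
        for (int i = fnIdx + 1; i < toks.size(); i++) {
            Tok t = toks.get(i);

            if (t.elementId() != null)
                return Optional.of(i);   // Reference found.
            if (!STOPWORDS.contains(t.text()))
                return Optional.empty(); // Gap contains a free word.
        }

        return Optional.empty();         // No reference at all.
    }
}
```

For 'average the temperature' this returns index 2; for 'temperature
average', 'average' alone, and 'average daily temperature' it returns
empty.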


4.  This enricher should create a token with

-   name 'nlpcraft:function',
-   mandatory String property 'type' (function name),
-   optional java.util.List<Integer> property 'indexes', which
     -   is omitted for a function without arguments
     -   is a list with a single index for a function with one argument
         (the "indexes" field name is hardcoded for internal enrichers and
         used in some related components)
         Some additional optional parameters might be passed as well.
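
As a rough illustration of this token shape, here is a sketch that
builds only the property map. The class and method names are
hypothetical, and the real enricher would attach these properties to a
token through NLPCraft's own classes, which are not shown here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class FunctionTokenSketch {
    // Token name taken from the description above.
    static final String TOKEN_NAME = "nlpcraft:function";

    // Builds the properties of a function token: 'type' is mandatory,
    // 'indexes' is omitted when the function has no arguments.
    static Map<String, Object> functionProps(String type, List<Integer> indexes) {
        Map<String, Object> props = new LinkedHashMap<>();

        props.put("type", Objects.requireNonNull(type, "type is mandatory"));

        if (!indexes.isEmpty())
            props.put("indexes", List.copyOf(indexes));

        return props;
    }
}
```

For example, `functionProps("avg", List.of(3))` yields 'type' and a
one-element 'indexes' list, while an empty list leaves 'indexes' out.
The same method would also cover functions with many arguments.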


5.  Supported kinds of functions can be

-   basic math
-   sql
-   etc.
     (We can start with math and sql functions and extend the supported
     kinds in later steps.)


6.  Look at the Limit, Relation and Sort enrichers as examples:

-   they have such references to other elements via 'indexes' fields.
-   please also note that this enricher should also be called in a loop
     (like the ones mentioned above), because it can have references to
     nested elements.

-   please also look at how stopwords are processed in these enrichers.
     (If functions with multi-word names exist, stopwords are acceptable
     inside these names: 'X the Y' is ok as 'X Y', where 'X Y' is a valid
     multi-word function name.)
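
One possible way to handle stopwords inside multi-word names, as a
sketch only. The stopword set and the 'standard deviation' name are
assumptions for illustration, and the existing enrichers may process
stopwords differently.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class MultiWordNameSketch {
    // Assumed values, for illustration only.
    static final Set<String> STOPWORDS = Set.of("the", "a", "an", "of");
    static final Set<String> FUNCTION_NAMES = Set.of("standard deviation");

    // Drops stopwords and checks whether the remaining words form a known
    // multi-word function name. Note that this simple version also drops
    // leading and trailing stopwords, not only those inside the name.
    static boolean isFunctionName(List<String> words) {
        String normalized = words.stream()
            .filter(w -> !STOPWORDS.contains(w))
            .collect(Collectors.joining(" "));

        return FUNCTION_NAMES.contains(normalized);
    }
}
```

Here 'standard the deviation' is accepted as 'standard deviation',
while 'standard daily deviation' is not.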


7.  Function names and all their synonyms (I guess these will mostly be
     short forms) can be hardcoded.
     Look at
     org.apache.nlpcraft.server.nlp.enrichers.coordinate.NCCoordinatesEnricher
     as an example for numeric measures.
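
A hardcoded synonym table could be as simple as the sketch below. The
synonyms listed are assumptions for illustration (only 'average' and
'avg' come from the example above), and this is not how
NCCoordinatesEnricher itself is written.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class FunctionSynonymsSketch {
    // Hypothetical hardcoded table: synonym -> function type.
    static final Map<String, String> SYNONYM_TO_TYPE = Map.of(
        "average", "avg",
        "avg", "avg",
        "mean", "avg",
        "maximum", "max",
        "max", "max"
    );

    // Resolves a word to its function type, ignoring case.
    static Optional<String> functionType(String word) {
        return Optional.ofNullable(SYNONYM_TO_TYPE.get(word.toLowerCase(Locale.ROOT)));
    }
}
```

The resolved value would then be used as the 'type' property of the
'nlpcraft:function' token.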
