This is an automated email from the ASF dual-hosted git repository.
aradzinski pushed a commit to branch NLPCRAFT-513
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git
The following commit(s) were added to refs/heads/NLPCRAFT-513 by this push:
new 9f73954 WIP
9f73954 is described below
commit 9f7395422317f4246b8e3f5c3dd051d22c36d2c7
Author: Aaron Radzinski <[email protected]>
AuthorDate: Thu Nov 24 13:49:56 2022 -0800
WIP
---
key-concepts.html => key-concepts-old.html | 2 +-
key-concepts.html | 458 +----------------------------
2 files changed, 17 insertions(+), 443 deletions(-)
diff --git a/key-concepts.html b/key-concepts-old.html
similarity index 99%
copy from key-concepts.html
copy to key-concepts-old.html
index ec3a5ae..3bd5ed8 100644
--- a/key-concepts.html
+++ b/key-concepts-old.html
@@ -37,7 +37,7 @@ id: key_concepts
specifics of the user input processing.
</li>
<li>
- {% scaladoc NCModelClient NCModelClient %} is responsible for communication with the data model.
+ {% scaladoc NCModelClient NCModelClient %} is responsible for interaction with the data model.
</li>
</ul>
diff --git a/key-concepts.html b/key-concepts.html
index ec3a5ae..b3b3b18 100644
--- a/key-concepts.html
+++ b/key-concepts.html
@@ -37,7 +37,7 @@ id: key_concepts
specifics of the user input processing.
</li>
<li>
- {% scaladoc NCModelClient NCModelClient %} is responsible for communication with the data model.
+ {% scaladoc NCModelClient NCModelClient %} is responsible for interaction with the data model.
</li>
</ul>
@@ -56,7 +56,7 @@ id: key_concepts
</section>
<section id="terminology">
- <h2 class="section-title">Terminology<a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
+ <h2 class="section-title">Main Types<a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
Let's start with the nomenclature of the main NLPCraft types:
</p>
@@ -98,8 +98,10 @@ id: key_concepts
<td>
<code>Entity</code> typically represents a real-world
object, such as a person, location, organization,
or product that can often be denoted with a proper name.
It can be abstract or have a physical existence.
- Each <code>entity</code> consists of zero or more
<code>tokens</code>. Combination of entities form one or more parsing
- <code>variants</code>.
+ Each <code>entity</code> consists of zero or more <code>tokens</code> and is therefore represented by zero
+ or more substrings from the original input text. Note that entities may have only a very loose mapping back
+ to the original text, as entities represent higher-level abstractions than tokens. A combination of
+ entities forms one or more parsing <code>variants</code>.
</td>
</tr>
<tr>
@@ -122,462 +124,34 @@ id: key_concepts
The output of the pipeline is further passed as an input
to <a href="intent-matching.html">intent matching</a>.
</td>
</tr>
- <tr>
- <td><b>{% scaladoc NCModelCofig NCModelConfig %}</b></td>
- <td>
- <code>Pipeline</code> is the main configuration property
of the model. Pipeline consists of an ordered sequence
- of <a href="/pipeline-components.html">pipeline
components</a>. User input starts at the first component of the
- pipeline as a simple text and exits the end of the
pipeline as a one or more parsing <code>variants</code>.
- The output of the pipeline is further passed as an input
to <a href="intent-matching.html">intent matching</a>.
- </td>
- </tr>
<tr>
<td><b><a target="scaladoc"
href="/apis/latest/">@NCIntent</a></b></td>
<td>
- <code>Variant</code> is a unique set of
<code>entities</code>. In many cases, a <code>token</code> or a group
- of <code>tokens</code> can be recognized as more than one
<code>entity</code> - resulting in multiple possible
- interpretations of the original sequence of tokens. Each
such interpretation is defined as a parsing <code>variant</code>.
- For example, user input <b>"Look at this crane."</b> can
be interpreted as two <code>variants</code>,
- one of them containing <code>entity</code>
<b>BIRD<sub>[crane]</sub></b> and another containing <code>entity</code>
<b>MACHINE<sub>[crane]</sub></b>.
+ The <a target="scaladoc" href="/apis/latest/">@NCIntent</a> annotation binds a declarative intent to its
+ callback method. The intent generally refers to the goal that the end-user had in mind when speaking
+ or typing the input utterance. The intent has a <em>declarative part or template</em> written in <a href="/intent-matching.html#idl">IDL - Intent Definition Language</a>
+ that strictly defines a particular form of the user input.
+ The intent is also bound to a callback method that will be executed
+ when that intent, i.e. its template, is detected as the best match for a given input.
</td>
</tr>
</tbody>
</table>
-
- <figure>
- <img alt="named entities" class="img-fluid"
src="/images/text-tokens-entities2.png">
- <figcaption><b>Fig 1.</b> Text -> Tokens -> Entities -> Parsing
Variants.</figcaption>
- </figure>
-
- <p>
- When <code>Variant</code> is prepared, the suitable
<code>Intent</code> is trying to matched with it.
- </p>
-
- <table class="gradient-table">
- <thead>
- <tr>
- <th>Term</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
-
- <tr>
- <td><code>Intent</code></td>
- <td>
- <code>Intent</code> is user defined callback method and
rule according to which this callback should be called.
- Most often rule is some template based on expected set of
<code>entities</code> in user input,
- but it can be defined more flexible.
- Parameters extracted from user text input are passed into
callback method.
- This method execution result is provided to user as answer
on his request.
- <code>Intent</code> callbacks are methods defined in
<code>Data Model</code> class annotated by
- <code>intent</code> rules via <a
href="intent-matching.html">IDL</a>.
- </td>
- </tr>
- <tr>
- <td><code>IDL</code></td>
- <td>
- IDL, Intent Definition Language, is a relatively
straightforward declarative language which
- defines a match between the parsed user input represented
as the collection of tokens,
- and the user-define callback method.
- IDL intents are bound to their callbacks via Java
annotation and can be located
- in the same Java annotations or placed in model YAML/JSON
file as well as in external *.idl files.
- </td>
- </tr>
- <tr>
- <td><code>Callback</code></td>
- <td>
- The user defined Scala method which mapped to the
<code>intent</code>.
- This method receives as its parameters normalized values
from user input text according to
- IDL matched terms.
- </td>
- </tr>
- </tbody>
- </table>
-
- <p>
- So, <code>Data Model</code> must be able to do tree following
things:
- </p>
-
- <ul>
- <li>
- Parse user input text as the <code>tokens</code>.
- They are input for searching <code>named entities</code>.
- <code>Tokens</code> parsing components should be included into
<a href="#model-pipeline">Model pipeline</a>.
- </li>
- <li>
- Find <code>named entities</code> based on these parsed
<code>tokens</code>.
- They are input for searching <code>intents</code>.
- <code>Entity</code> parsing components should be included into
<a href="#model-pipeline">Model pipeline</a>.
- </li>
- <li>
- Prepare <code>intents</code> with their callbacks methods
which contain business logic.
- These methods should be defined directly in the model class
definition or the model should have references on them.
- It will be described below. Callback can de defined in model
scala class directly or via references.
- Look at the chapter <a href="intent-matching.html">Intent
Matching</a> content for get more details.
- </li>
- </ul>
-
<p>
- As example, let's prepare the system which can call persons from
your contact list.
- Typical commands are: "<b>Please call to John Smith</b>" or
"<b>Connect me with Barbara Dillan</b>".
- For solving this task this model should be able to recognize in
user text following entities:
- <code>command</code> and <code>person</code> to apply this command.
+ Here's an illustration of how user input text is transformed into a set of parsing variants:
</p>
-
- <p>
- So, when request "<b>Please call to John Smith</b>" received, our
model should be able to:
- </p>
-
- <ul>
- <li>
- Parse tokens splitting user text input:
- "<code>please</code>", "<code>call</code>", "<code>to</code>",
"<code>john</code>", "<code>smith</code>".
- </li>
- <li>
- Find two named entities:
- <ul>
- <li>
- <code>command</code> by token "<code>call</code>".
- </li>
- <li>
- <code>person</code> by tokens "<code>john</code>" and
"<code>smith</code>".
- </li>
- </ul>
- </li>
- <li>
- Have prepared intent:
- <pre class="brush: scala, highlight: [1, 2, 5, 6]">
- @NCIntent("intent=call term(command)={# == 'command'}
term(person)={# == 'person'}")
- def onCommand(
- ctx: NCContext,
- im: NCIntentMatch,
- @NCIntentTerm("command") command: NCEntity,
- @NCIntentTerm("person") person: NCEntity
- ): NCResult = ? // Implement business logic here.
- </pre>
-
- <ul>
- <li>
- <code>Line 1</code> defines intent <code>call</code>
with two conditions
- which expects two named entities in user input text.
- </li>
- <li>
- <code>Line 2</code> defines related callback method
<code>onCommand()</code>.
- </li>
- <li>
- <code>Lines 4 and 5</code> define two callback
method's arguments which are corresponded to
- <code>call</code> intent terms conditions. You can
extract normalized value
- <code>john smith</code> from the <code>person</code>
parameter and use it in the method body
- for getting his phone number etc.
- </li>
- </ul>
- </li>
- </ul>
- </section>
-
- <section id="model-configuration">
- <h2 class="section-title">Model Configuration<a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
-
- <p>
- <code>Data Model</code> configuration represented as
- {% scaladoc NCModelConfig NCModelConfig %}
- contains set of parameters which are described below.
- </p>
- <table class="gradient-table">
- <thead>
- <tr>
- <th>Name</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td><code>id</code>, <code>name</code> and
<code>version</code></td>
- <td>
- Mandatory model properties.
- </td>
- </tr>
- <tr>
- <td><code>description</code>, <code>origin</code></td>
- <td>
- Optional model properties.
- </td>
- </tr>
- <tr>
- <td><code>conversationTimeout</code></td>
- <td>
- Timeout of the user's conversation.
- If user doesn't communicate with the model this time
period STM is going to be cleared.
- Loot at <a href="short-term-memory.html">Conversation</a>
chapter to get more details.
- It is the mandatory parameter with default value.
- </td>
- </tr>
- <tr>
- <td><code>conversationDepth</code></td>
- <td>
- Maximum supported depth the user's conversation.
- Loot at <a href="short-term-memory.html">Conversation</a>
chapter to get more details.
- It is the mandatory parameter with default value.
- </td>
- </tr>
- </tbody>
- </table>
- </section>
-
- <section id="model-pipeline">
- <h2 class="section-title">Model Pipeline<a href="#"><i class="top-link
fas fa-fw fa-angle-double-up"></i></a></h2>
-
- <p>
- Model <code>Pipeline</code> is represented as {% scaladoc
NCPipeline NCPipeline %} and
- contains following components:
- </p>
-
- <table class="gradient-table">
- <thead>
- <tr>
- <th>Component</th>
- <th>Mandatory</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>{% scaladoc NCTokenParser NCTokenParser %}</td>
- <td>Mandatory single</td>
- <td>
- <code>Token parser</code> should be able to parse user
input plain text and split this text
- into <code>tokens</code> list.
- NLPCraft provides two default English language
implementations of token parser.
- Also, project contains examples for <a
href="examples/light_switch_fr.html">French</a> and
- <a href="examples/light_switch_ru.html">Russia</a>
languages token parser implementations.
- </td>
- </tr>
- <tr>
- <td> {% scaladoc NCTokenEnricher NCTokenEnricher %}</td>
- <td>Optional list</td>
- <td>
- <code>Tokens enricher</code> is a component which allow to
add additional properties for prepared tokens,
- like part of speech, quote, stop-words flags or any other.
- NLPCraft provides built-in English language set of token
enrichers implementations.
- Here is an <a
href="custom-components.html#token-enrichers">example</a>.
- </td>
- </tr>
- <tr>
- <td> {% scaladoc NCTokenValidator NCTokenValidator %}</td>
- <td>Optional list</td>
- <td>
- <code>Token validator</code> is a component which allow to
inspect prepared tokens and
- throw an exception to break user input processing.
- Here is an <a
href="custom-components.html#token-validators">example</a>.
- </td>
- </tr>
- <tr>
- <td> {% scaladoc NCEntityParser NCEntityParser %}</td>
- <td>Mandatory list</td>
- <td>
- <code>Entity parser</code> is a component which allow to
find user defined named entities
- based on prepared tokens as input.
- NLPCraft provides wrappers for named-entity recognition
components of
- <a href="https://opennlp.apache.org/">Apache OpenNLP</a>
and
- <a href="https://nlp.stanford.edu/">Stanford NLP</a> and
its own implementations.
- Note that at least one entity parser must be defined.
- Here is an <a
href="custom-components.html#entity-parsers">example</a>.
- </td>
- </tr>
- <tr>
- <td> {% scaladoc NCEntityEnricher NCEntityEnricher %}</td>
- <td>Optional list</td>
- <td>
- <code>Entity enricher</code> is component which allows to
add additional properties for prepared entities.
- Can be useful for extending existing entity enrichers
functionality.
- Here is an <a
href="custom-components.html#entity-enrichers">example</a>.
- </td>
- </tr>
- <tr>
- <td> {% scaladoc NCEntityMapper NCEntityMapper %}</td>
- <td>Optional list</td>
- <td>
- <code>Entity mappers</code> is component which allows to
map one set of entities to another after the entities
- were parsed and enriched. Can be useful for building
complex parsers based on existing.
- Here is an <a
href="custom-components.html#entity-mappers">example</a>.
- </td>
- </tr>
- <tr>
- <td> {% scaladoc NCEntityValidator NCEntityValidator %}</td>
- <td>Optional list</td>
- <td>
- <code>Entity validator</code> is a component which allow
to inspect prepared entities and
- throw an exception to break user input processing.
- Here is an <a
href="custom-components.html#entity-validators">example</a>.
- </td>
- </tr>
- <tr>
- <td> {% scaladoc NCVariantFilter NCVariantFilter %}</td>
- <td>Optional single</td>
- <td>
- <code>Variant filter</code> is a component which allows
filtering detected variants and
- rejecting undesirable.
- Here is an <a
href="custom-components.html#variant-filters">example</a>.
- </td>
- </tr>
- </tbody>
- </table>
-
<figure>
- <img alt="pipeline" class="img-fluid" src="/images/pipeline.png">
- <figcaption><b>Fig 2.</b> Pipeline</figcaption>
+ <img alt="named entities" class="img-fluid" src="/images/text-tokens-entities2.png">
+ <figcaption><b>Fig 1.</b> Text -> Tokens -> Entities -> Parsing Variants.</figcaption>
</figure>
-
- <p>
- Below {% scaladoc NCModel NCModel %} creation example.
- {% scaladoc NCPipeline NCPipeline %} is prepared using
- {% scaladoc NCPipelineBuilder NCPipelineBuilder %} class helper.
- </p>
-
- <pre class="brush: scala, highlight: []">
- val pipeline =
- new NCPipelineBuilder().
- withTokenParser(new NCFrTokenParser()).
- withTokenEnricher(new NCFrLemmaPosTokenEnricher()).
- withTokenEnricher(new NCFrStopWordsTokenEnricher()).
- withEntityParser(new
NCFrSemanticEntityParser("lightswitch_model_fr.yaml")).
- build
- val cfg = NCModelConfig("nlpcraft.lightswitch.fr.ex", "LightSwitch
Example Model FR", "1.0")
-
- val mdl = new NCModel(cfg, pipeline):
- // Add your callbacks definition or references on them here.
- </pre>
-
- <p>
- This flexible system allows to create any pipelines on any
language.
- You can collect NLPCraft predefined components, write your own and
easy reuse custom components.
- </p>
- </section>
-
- <section id="model-behavior">
- <h2 class="section-title">Model Behavior Overriding<a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
-
- <p>
- There are also several {% scaladoc NCModel NCModel %}
- callbacks that you can override to affect model behavior during
- <a href="/intent-matching.html#model_callbacks">intent matching</a>
- to perform logging, debugging, statistic or usage collection,
explicit update or initialization of
- conversation context, security audit or validation:
- </p>
- <table class="gradient-table">
- <thead>
- <tr>
- <th>Method</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>{% scaladoc NCModel#onContext-38d onContext() %}</td>
- <td>
- Overriding this method allows to prepare result before
intent matching.
- </td>
- </tr>
- <tr>
- <td>{% scaladoc NCModel#onMatchedIntent-946 onMatchedIntent()
%}</td>
- <td>
- Overriding this method allows to reject matched intent and
continue matching process.
- </td>
- </tr>
- <tr>
- <td>{% scaladoc NCModel#onResult-fffffaf3 onResult() %}</td>
- <td>
- Overriding this method allows to replace callback method
execution result.
- </td>
- </tr>
- <tr>
- <td>{% scaladoc NCModel#onRejection-4fa onRejection() %}</td>
- <td>
- Overriding this method allows to change operation result
when rejection occurs.
- </td>
- </tr>
- <tr>
- <td>{% scaladoc NCModel#onError-fffff759 onError() %}</td>
- <td>
- Overriding this method allows to change operation result
when any error occurs.
- </td>
- </tr>
- </tbody>
- </table>
- </section>
-
- <section id="client">
- <h2 class="section-title">Client Responsibility<a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
-
- <p>
- <code>Client</code> represented as {% scaladoc NCModelClient
NCModelClient %}
- is necessary for communication with the <code>Data Model</code>.
Base client methods are described below.
- </p>
-
- <table class="gradient-table">
- <thead>
- <tr>
- <th>Method</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>{% scaladoc NCModelClient#ask-fffff9ce ask() %}</td>
- <td>
- Passes user text input to the model and receives back
execution
- {% scaladoc NCResult NCResult %} or
- rejection exception if there isn't any triggered intents.
- {% scaladoc NCResult NCResult %} is wrapper on
- callback method execution result with additional
information.
- </td>
- </tr>
- <tr>
- <td>{% scaladoc NCModelClient#debugAsk-fffff96c debugAsk()
%}</td>
- <td>
- Passes user text input to the model and receives back
callback and its parameters or
- rejection exception if there isn't any triggered intents.
- Main difference from <code>ask</code> that triggered
intent callback method is not called.
- This method and this parameter can be useful in tests
scenarios.
- </td>
- </tr>
- <tr>
- <td>{% scaladoc NCModelClient#clearStm-571 clearStm() %}</td>
- <td>
- Clears STM state. Memory is cleared wholly or with some
predicate.
- Loot at <a href="short-term-memory.html">Conversation</a>
chapter to get more details.
- Second variant of given method with another parameters is
here - {% scaladoc NCModelClient#clearStm-1d8 clearStm() %}.
- </td>
- </tr>
- <tr>
- <td>{% scaladoc NCModelClient#clearDialog-571 clearDialog()
%}</td>
- <td>
- Clears dialog state. Dialog is cleared wholly or with some
predicate.
- Loot at <a href="short-term-memory.html">Conversation</a>
chapter to get more details.
- Second variant of given method with another parameters is
here - {% scaladoc NCModelClient#clearDialog-1d8 clearDialog() %}.
- </td>
- </tr>
- <tr>
- <td>{% scaladoc NCModelClient#close-94c close() %}</td>
- <td>
- Closes client. You can't call another client's methods
after this method was closed.
- </td>
- </tr>
- </tbody>
- </table>
</section>
</div>
<div class="col-md-2 third-column">
<ul class="side-nav">
<li class="side-nav-title">On This Page</li>
<li><a href="#overview">Key Concepts</a></li>
- <li><a href="#terminology">Terminology</a></li>
-<!-- <li><a href="#model-configuration">Model Configuration</a></li>
-->
-<!-- <li><a href="#model-pipeline">Model Pipeline</a></li> -->
-<!-- <li><a href="#model-behavior">Model Behavior Overriding</a></li>
-->
-<!-- <li><a href="#client">Client Responsibility</a></li> -->
+ <li><a href="#terminology">Main Types</a></li>
{% include quick-links.html %}
</ul>
</div>