[incubator-nlpcraft-website] branch master updated: WIP.

aradzinski Sun, 17 Jan 2021 19:35:58 -0800

This is an automated email from the ASF dual-hosted git repository.

aradzinski pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git



The following commit(s) were added to refs/heads/master by this push:
     new 4393572  WIP.
4393572 is described below

commit 4393572494f59267fb5fb6068cc35df66375a4e2
Author: Aaron Radzinski <[email protected]>
AuthorDate: Sun Jan 17 19:35:39 2021 -0800

    WIP.
---
 _data/blogs.yaml                             |   9 +
 blogs/how_to_find_something_in_the_text.html | 418 +++++++++++++++++++++++++++
 2 files changed, 427 insertions(+)

diff --git a/_data/blogs.yaml b/_data/blogs.yaml
index af375ad..84ce2d1 100644
--- a/_data/blogs.yaml
+++ b/_data/blogs.yaml
@@ -17,6 +17,15 @@
 
 - title: Преобразование текстовых запросов в SQL
   url: https://habr.com/ru/post/536720/
+  excerpt: Most of the NLP tasks start with the basic challenge - how to find 
or detect something in the text. Whether you are designing a search engine, 
conversational interface or some sort of classificator you will likely start 
with a problem of how to detect named entities in the input text. These named 
entities can be universal such as dates, countries, cities as well as domain 
specific for your model. It is also important to note that we are talking about 
a class of NLP tasks where [...]
+  author: Aaron Radzinski
+  publish_date: January 20, 2021
+  avatar_url: images/lion.jpg
+  twitter_id: aaron_radzinski
+  href_target: _self
+
+- title: Преобразование текстовых запросов в SQL
+  url: https://habr.com/ru/post/536720/
   excerpt: Большинство разработчиков, когда-либо сталкивавшихся с NLP 
задачами, рано или поздно задумывались над проблемой, обозначенной в заголовке 
статьи. Решений подобного рода создавалось достаточное количество, каждое со 
своими особенностями, плюсами и минусами. Первое, с которым мы с коллегами 
встретились лет 10 назад, и ссылку на которое я не смог сейчас даже найти, было 
оформлено в виде абсолютно нечитаемой диссертации. Мы честно, шаг за шагом 
пытались прорваться сквозь ее страни [...]
   author: Сергей Камов
   publish_date: January 11, 2021
diff --git a/blogs/how_to_find_something_in_the_text.html 
b/blogs/how_to_find_something_in_the_text.html
new file mode 100644
index 0000000..aeabe87
--- /dev/null
+++ b/blogs/how_to_find_something_in_the_text.html
@@ -0,0 +1,418 @@
+---
+active_crumb: How To Find Something In The Text
+layout: blog
+blog_title: How To Find Something In The Text
+author_name: Aaron Radzinski
+author_avatar: images/lion.jpg
+author_twitter_id: aaron_radzinski
+publish_date: January 20, 2021
+---
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<section>
+    <p>
+        In this blog, I'll try to give a high-level overview of STM - 
Short-Term Memory, a technique used to
+        maintain conversational context in NLPCraft. Maintaining the proper 
conversation context - remembering
+        what the current conversation is about - is essential for all human 
interaction and thus essential for
+        computer-based natural language understanding. To my knowledge, 
NLPCraft provides one of the most advanced
+        implementations of STM, especially considering how tightly it is 
integrated with NLPCraft's unique
+        intent-based matching (Google's <a target=google 
href="https://cloud.google.com/dialogflow/";>DialogFlow</a> is very similar yet).
+    </p>
+    <p>
+        Let's dive in.
+    </p>
+</section>
+<section>
+    <h2 class="section-title">Parsing User Input</h2>
+    <p>
+        One of the key objectives when parsing user input sentence for Natural 
Language Understanding (NLU) is to
+        detect all possible semantic entities, a.k.a <em>named entities</em>. 
Let's consider a few examples:
+    </p>
+    <ul>
+        <li>
+            <code>"What's the current weather in Tokyo?"</code><br/>
+            This sentence is fully sufficient for the processing
+            since it contains the topic <code>weather</code> as well as all 
necessary parameters
+            like time (<code>current</code>) and location (<code>Tokyo</code>).
+        </li>
+        <li>
+            <code>"What about Tokyo?"</code><br/>
+            This is an unclear sentence since it does not have the subject of 
the
+            question - what is it about Tokyo?
+        </li>
+        <li>
+            <code>"What's the weather?"</code><br/>
+            This is also unclear since we are missing important parameters
+            of location and time for our request.
+        </li>
+    </ul>
+    <p>
+        Sometimes we can use default values like the current user's location 
and the current time (if they are missing).
+        However, this can easily lead to the wrong interpretation if the 
conversation has an existing context.
+    </p>
+    <p>
+        In real life, as well as in NLP-based systems, we always try to start 
a conversation with a fully defined
+        sentence since without a context the missing information cannot be 
obtained and the sentenced cannot be interpreted.
+    </p>
+</section>
+<section>
+    <h2 class="section-title">Semantic Entities</h2>
+    <p>
+        Let's take a closer look at the named entities from the above examples:
+    </p>
+    <ul>
+        <li>
+            <code>weather</code> - this is an indicator of the subject of the 
conversation. Note that it indicates
+            the type of question rather than being an entity with multiple 
possible values.
+        </li>
+        <li>
+            <code>current</code> - this is an entity of type <code>Date</code> 
with the value of <code>now</code>.
+        </li>
+        <li>
+            <code>Tokyo</code> - this is an entity of type 
<code>Location</code> with two values:
+            <ul>
+                <li><code>city</code> - type of the location.</li>
+                <li><code>Tokyo, Japan</code> - normalized name of the 
location.</li>
+            </ul>
+        </li>
+    </ul>
+    <p>
+        We have two distinct classes of entities:
+    </p>
+    <ul>
+        <li>
+            Entities that have no values and only act as indicators or types. 
The entity <code>weather</code> is the
+            type indicator for the subject of the user input.
+        </li>
+        <li>
+            Entities that additionally have one or more specific values like 
<code>current</code> and <code>Tokyo</code> entities.
+        </li>
+    </ul>
+    <div class="bq success">
+        <div style="display: inline-block; margin-bottom: 20px">
+            <a style="margin-right: 10px" target="opennlp" 
href="https://opennlp.apache.org";><img src="/images/opennlp-logo.png" 
height="32px" alt=""></a>
+            <a style="margin-right: 10px" target="google" 
href="https://cloud.google.com/natural-language/";><img 
src="/images/google-cloud-logo-small.png" height="32px" alt=""></a>
+            <a style="margin-right: 10px" target="stanford" 
href="https://stanfordnlp.github.io/CoreNLP";><img 
src="/images/corenlp-logo.gif" height="48px" alt=""></a>
+            <a style="margin-right: 10px" target="spacy" 
href="https://spacy.io";><img src="/images/spacy-logo.png" height="32px" 
alt=""></a>
+        </div>
+        <p>
+            Note that NLPCraft provides <a 
href="/integrations.html">support</a> for wide variety of named entities (with 
all built-in ones being properly normalized)
+            including  <a href="/integrations.html">integrations</a> with
+            <a target="spacy" href="https://spacy.io/";>spaCy</a>,
+            <a target="stanford" 
href="https://stanfordnlp.github.io/CoreNLP";>Stanford CoreNLP</a>,
+            <a target="opennlp" href="https://opennlp.apache.org/";>OpenNLP</a> 
and
+            <a target="google" 
href="https://cloud.google.com/natural-language/";>Google Natural Language</a>.
+        </p>
+    </div>
+</section>
+<section>
+    <h2 class="section-title">Incomplete Sentences</h2>
+    <p>
+        Assuming previously asked questions about the weather in Tokyo (in the 
span of the ongoing conversation) one
+        could presumably ask the following questions using a <em>shorter, 
incomplete</em>, form:
+    </p>
+    <ul>
+        <li>
+            <code>"What about Kyoto?</code><br/>
+            This question is missing both the subject and the time. However, we
+            can safely assume we are still talking about current weather.
+        </li>
+        <li>
+            <code>"What about tomorrow?"</code><br/>
+            Just like above we automatically assume the weather subject but
+            use <code>Kyoto</code> as the location since it was mentioned the 
last.
+        </li>
+    </ul>
+    <p>
+        These are incomplete sentences. This type of short-hands cannot be 
interpreted without prior context (neither
+        by humans or by machines) since by themselves they are missing 
necessary information.
+        In the context of the conversation, however, these incomplete 
sentences work. We can simply provide one or two
+        entities and rely on the <em>"listener"</em> to recall the rest of 
missing information from a
+        <em>short-term memory</em>, a.k.a conversation context.
+    </p>
+    <p>
+        In NLPCraft, the intent-matching logic will automatically try to find 
missing information in the
+        conversation context (that is automatically maintained). Moreover, it 
will properly treat such recalled
+        information during weighted intent matching since it naturally has 
less "weight" than something that was
+        found explicitly in the user input.
+    </p>
+</section>
+<section>
+    <h2 class="section-title">Short-Term Memory</h2>
+    <p>
+        The short-term memory is exactly that... a memory that keeps only 
small amount of recently used information
+        and that evicts its contents after a short period of inactivity.
+    </p>
+    <p>
+        Let's look at the example from a real life. If you would call your 
friend in a couple of hours asking <code>"What about a day after?"</code>
+        (still talking about weather in Kyoto) - he'll likely be thoroughly 
confused. The conversation is timed out, and
+        your friend has lost (forgotten) its context. You will have to explain 
again to your confused friend what is that you are asking about...
+    </p>
+    <p>
+        NLPCraft has a simple rule that 5 minutes pause in conversation leads 
to the conversation context reset. However,
+        what happens if we switch the topic before this timeout elapsed?
+    </p>
+</section>
+<section>
+    <h2 class="section-title">Context Switch</h2>
+    <p>
+        Resetting the context by the timeout is, obviously, not a hard thing 
to do. What can be trickier is to detect
+        when conversation topic is switched and the previous context needs to 
be forgotten to avoid very
+        confusing interpretation errors. It is uncanny how humans can detect 
such switch with seemingly no effort, and yet
+        automating this task by the computer is anything but effortless...
+    </p>
+    <p>
+        Let's continue our weather-related conversation. All of a sudden, we 
ask about something completely different:
+    </p>
+    <ul>
+        <li>
+            <code>"How much mocha latter at Starbucks?"</code><br/>
+            At this point we should forget all about previous conversation 
about weather and assume going forward
+            that we are talking about coffee in Starbucks.
+        </li>
+        <li>
+            <code>"What about Peet's?"</code><br/>
+            We are talking about latter at Peet's.
+        </li>
+        <li>
+            <code>"...and croissant?"</code><br/>
+            Asking about Peet's crescent-shaped fresh rolls.
+        </li>
+    </ul>
+    <p>
+        Despite somewhat obvious logic the implementation of context switch is 
not an exact science. Sometimes, you
+        can have a "soft" context switch where you don't change the topic of 
the conversation 100% but yet sufficiently
+        enough to forget at least some parts of the previously collected 
context. NLPCraft has a built-in algorithm
+        to detect the hard switch in the conversation. It also exposes API to 
perform a selective reset on the
+        conversation in case of "soft" switch.
+    </p>
+</section>
+<section>
+    <h2 class="section-title">Overriding Entities</h2>
+    <p>
+        As we've seen above one named entity can replace or override an older 
entity in the STM, e.g. <code>"Peet's"</code>
+        replaced <code>"Starbucks"</code> in our previous questions. <b>The 
actual algorithm that governs this logic is one
+        of the most important part of STM implementation.</b> In human 
conversations we perform this logic seemingly
+        subconsciously — but the computer algorithm to do it is not that 
trivial. Let's see how it is done in NLPCraft.
+    </p>
+    <p>
+        One of the important supporting design decision is that an entity can 
belong to one or more groups. You can think of
+        groups as types, or classes of entities (to be mathematically precise 
these are the categories). The entity's
+        membership in such groups is what drives the rule of overriding.
+    </p>
+    <p>
+        Let's look at the specific example.
+    </p>
+    <p>
+        Consider a data model that defined 3 entities:
+    </p>
+    <ul>
+        <li>
+            <code>"sell"</code> (with synonym <code>"sales"</code>)
+        </li>
+        <li>
+            <code>"buy"</code> (with synonym <code>"purchase"</code>)
+        </li>
+        <li>
+            <code>"best_employee"</code> (with synonyms like 
<code>"best"</code>, <code>"best employee"</code>, <code>"best 
colleague"</code>)
+        </li>
+    </ul>
+    <p>
+        Our task is to support for following conversation:
+    </p>
+    <ul>
+        <li>
+            <code>"Give me the sales data"</code><br/>
+            We return sales information since we detected <code>"sell"</code> 
entity by its synonym <code>"sales"</code>.
+        </li>
+        <li>
+            <code>"Who was the best?"</code><br/>
+            We return the best salesmen since we detected 
<code>"best_employee"</code> and we should pick <code>"sell"</code> entity from 
the STM.
+        </li>
+        <li>
+            <code>"OK, give me the purchasing report now."</code><br/>
+            This is a bit trickier. We should return general purchasing data 
and not a best purchaser employee.
+            It feels counter-intuitive but we should NOT take 
<code>"best_employee"</code> entity from STM and, in fact, we should remove it 
from STM.
+        </li>
+        <li>
+            <code>"...and who's the best there?"</code><br/>
+            Now, we should return the best purchasing employee. We detected 
<code>"best_employee"</code> entity and we should pick <code>"buy"</code> 
entity from STM.
+        </li>
+        <li>
+            <code>"One more time - show me the general purchasing data 
again"</code><br/>
+            Once again, we should return a general purchasing report and 
ignore (and remove) <code>"best_employee"</code> from STM.
+        </li>
+    </ul>
+</section>
+<section>
+    <h2 class="section-title">Overriding Rule</h2>
+    <p>
+        Here's the rule we developed at NLPCraft and have been successfully 
using in various models:
+    </p>
+    <div class="bq success">
+        <b>Overriding Rule</b>
+        <p>
+            The entity will override other entity or entities in STM that 
belong to the same group set or its superset.
+        </p>
+    </div>
+    <p>
+        In other words, an entity with a smaller group set (more specific one) 
will override entity with the same
+        or larger group set (more generic one).
+        Let's consider an entity that belongs to the following groups: 
<code>{G1, G2, G3}</code>. This entity:
+    </p>
+    <ul>
+        <li>
+            WILL override existing entity belonging to <code>{G1, G2, 
G3}</code> groups (same set).
+        </li>
+        <li>
+            WILL override existing entity belonging to <code>{G1, G2, G3, 
G4}</code> groups (superset).
+        </li>
+        <li>
+            WILL NOT override existing entity belonging to <code>{G1, 
G2}</code> groups.
+        </li>
+        <li>
+            WIL NOT override existing entity belonging to <code>{G10, 
G20}</code> groups.
+        </li>
+    </ul>
+    <p>
+        Let's come back to our sell/buy/best example. To interpret the 
questions we've outlined above we need to
+        have the following 4 intents:
+    </p>
+    <ul>
+        <li><code>id=sale term={id=='sale'}</code></li>
+        <li><code>id=best_sale_person term={id=='sale'} 
term={id==best_employee}</code></li>
+        <li><code>id=buy term={id=='buy'}</code></li>
+        <li><code>id=buy_best_person term={id=='buy'} 
term={id==best_employee}</code></li>
+    </ul>
+    <p>
+        (this is actual <a href="/intent-matching.html#syntax">Intent DSL</a> 
used by NLPCraft -
+        <code>term</code> here is basically what's often referred to as a slot 
in other systems).
+    </p>
+    <p>
+        We also need to properly configure groups for our entities (names of 
the groups are arbitrary):
+    </p>
+    <ul>
+        <li>Entity <code>"sell"</code> belongs to group <b>A</b></li>
+        <li>Entity <code>"buy"</code> belongs to group <b>B</b></li>
+        <li>Entity <code>"best_employee"</code> belongs to groups <b>A</b> and 
<b>B</b></li>
+    </ul>
+    <p>
+        Let’s run through our example again with this configuration:
+    </p>
+    <ul>
+        <li>
+            <code>"Give me the sales data"</code>
+            <ul>
+                <li>We detected entity from group <b>A</b>.</li>
+                <li>STM is empty at this point.</li>
+                <li>Return general sales report.</li>
+                <li>Store <code>"sell"</code> entity with group <b>A</b> in 
STM.</li>
+            </ul>
+        </li>
+        <li>
+            <code>"Who was the best?"</code>
+            <ul>
+                <li>We detected entity belonging to groups <b>A</b> and 
<b>B</b>.</li>
+                <li>STM has entity belonging to group <b>A</b>.</li>
+                <li><b>{A, B}</b> does NOT override <b>{A}</b>.</li>
+                <li>Return best salesmen report.</li>
+                <li>Store detected <code>"best_employee"</code> entity.</li>
+                <li>STM now has two entities with <b>{A}</b> and <b>{A, B}</b> 
group sets.</li>
+            </ul>
+        </li>
+        <li>
+            <code>"OK, give me the purchasing report now."</code>
+            <ul>
+                <li>We detected <code>"buy"</code> entity with group 
<b>A</b>.</li>
+                <li>STM has two entities with <b>{A}</b> and <b>{A, B}</b> 
group sets.</li>
+                <li><b>{A}</b> overrides both <b>{A}</b> and <b>{A, 
B}</b>.</li>
+                <li>Return general purchasing report.</li>
+                <li>Store <code>"buy"</code> entity with group <b>A</b> in 
STM.</li>
+            </ul>
+        </li>
+    </ul>
+    <p>
+        And so on... easy, huh 😇 In fact, the logic is indeed relatively 
straightforward. It also follows
+        common sense where the logic produced by this rule matches the 
expected human behavior.
+    </p>
+</section>
+<section>
+    <h2 class="section-title">Explicit Context Switch</h2>
+    <p>
+        In some cases you may need to explicitly clear the conversation STM 
without relying on algorithmic behavior.
+        It happens when current and new topic of the conversation share some 
of the same entities. Although at first
+        it sounds counter-intuitive there are many examples of that in day to 
day life.
+    </p>
+    <p>
+        Let’s look at this sample conversation:
+    </p>
+    <ul>
+        <li>
+            <b>Q</b>: <code>"What the weather in Tokyo?"</code><br/>
+            <b>A</b>: Current weather in Tokyo...
+        </li>
+        <li>
+            <b>Q</b>: <code>"Let’s do New York after all then!"</code><br/>
+            <b>A</b>: Without an explicit conversation reset we would return 
current New York weather 🤔
+        </li>
+    </ul>
+    <p>
+        The second question was about going to New York (booking tickets, 
etc.). In real life - your
+        counterpart will likely ask what you mean by "doing New York after 
all" and you’ll have to explain
+        the abrupt change in the topic.
+        You can avoid this confusion by simply saying: "Enough about weather! 
Let’s talk about this weekend plans" - after
+        which the second question becomes clear. That sentence is an explicit 
context switch which you can also detect
+        in the NLPCraft model.
+    </p>
+    <p>
+        In NLPCraft you can also explicitly reset conversation context through 
API or by switching the model on the request.
+    </p>
+</section>
+<section>
+    <h2 class="section-title">Summary</h2>
+    <p>
+        Let’s collect all our thoughts on STM into a few bullet points:
+    </p>
+    <ul>
+        <li>
+            Missing entities in incomplete sentences can be auto-recalled from 
STM.
+        </li>
+        <li>
+            Newly detected type/category entity is likely indicating the 
change of topic.
+        </li>
+        <li>
+            The key property of STM is its short-time storage and overriding 
rule.
+        </li>
+        <li>
+            The explicit context switch is an important mechanism.
+        </li>
+    </ul>
+    <div class="bq info">
+        <b>
+            Short-Term Memory
+        </b>
+        <p>
+            It is uncanny how properly implemented STM can make conversational 
interface <b>feel like a normal human
+            conversation</b>. It allows to minimize the amount of parasitic 
dialogs and Q&A driven interfaces
+            without unnecessarily complicating the implementation of such 
systems.
+        </p>
+    </div>
+</section>
+
+

[incubator-nlpcraft-website] branch master updated: WIP.

Reply via email to