This is an automated email from the ASF dual-hosted git repository.

sergeykamov pushed a commit to branch NLPCRAFT-513
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git
The following commit(s) were added to refs/heads/NLPCRAFT-513 by this push:
     new 8c53a80  WIP.
8c53a80 is described below

commit 8c53a8021d03baa89db623155f612a8fb4548daa
Author: skhdl <skhdlem...@gmail.com>
AuthorDate: Thu Nov 3 16:32:27 2022 +0400

    WIP.
---
 built-in-builder.html       | 27 ++++++++++++++++-----------
 built-in-entity-parser.html |  2 +-
 built-in-token-parser.html  |  2 +-
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/built-in-builder.html b/built-in-builder.html
index dd13e3e..887d662 100644
--- a/built-in-builder.html
+++ b/built-in-builder.html
@@ -28,19 +28,13 @@ id: built-in-builder
  <p>
  {% scaladoc NCPipelineBuilder NCPipelineBuilder %} class is designed for simplifying preparing {% scaladoc NCPipeline NCPipeline %} instance.
- It allows to construct {% scaladoc NCPipeline NCPipeline %} instance
- adding nested components via its methods.
- It also contains a number of methods {% scaladoc NCPipelineBuilder#withSemantic-fffff4b0 withSemantic() %}
+ It allows to prepare {% scaladoc NCPipeline NCPipeline %} instance
+ adding pipeline chain components via its methods.
+ Also, it contains a number of {% scaladoc NCPipelineBuilder#withSemantic-fffff4b0 withSemantic() %}
  methods which allow to prepare pipeline instance based on
  {% scaladoc nlp/parsers/NCSemanticEntityParser NCSemanticEntityParser %} and configured language.
- Currently only <b>English</b> language is supported with broad set of built-in components:
- {% scaladoc nlp/parsers/NCOpenNLPTokenParser NCOpenNLPTokenParser %},
- {% scaladoc nlp/enrichers/NCOpenNLPLemmaPosTokenEnricher NCOpenNLPLemmaPosTokenEnricher %},
- {% scaladoc nlp/enrichers/NCEnStopWordsTokenEnricher NCEnStopWordsTokenEnricher %},
- {% scaladoc nlp/enrichers/NCEnSwearWordsTokenEnricher NCEnSwearWordsTokenEnricher %},
- {% scaladoc nlp/enrichers/NCEnQuotesTokenEnricher NCEnQuotesTokenEnricher %},
- {% scaladoc nlp/enrichers/NCEnDictionaryTokenEnricher NCEnDictionaryTokenEnricher %},
- {% scaladoc nlp/enrichers/NCEnBracketsTokenEnricher NCEnBracketsTokenEnricher %}.
+ Currently only <b>English</b> language is supported.
+ Pipeline for <b>English</b> language is created with useful set of built-in components.
  </p>
  </section>
@@ -57,6 +51,17 @@ id: built-in-builder
  It defines pipeline with all built-in English language components and one semantic entity parser with model defined in <code>lightswitch_model.yaml</code>.
  </li>
+ <li>
+ It adds to the pipeline by default token parser implementation
+ {% scaladoc nlp/parsers/NCOpenNLPTokenParser NCOpenNLPTokenParser %} and
+ following token enrichers implementations:
+ {% scaladoc nlp/enrichers/NCOpenNLPLemmaPosTokenEnricher NCOpenNLPLemmaPosTokenEnricher %},
+ {% scaladoc nlp/enrichers/NCEnStopWordsTokenEnricher NCEnStopWordsTokenEnricher %},
+ {% scaladoc nlp/enrichers/NCEnSwearWordsTokenEnricher NCEnSwearWordsTokenEnricher %},
+ {% scaladoc nlp/enrichers/NCEnQuotesTokenEnricher NCEnQuotesTokenEnricher %},
+ {% scaladoc nlp/enrichers/NCEnDictionaryTokenEnricher NCEnDictionaryTokenEnricher %},
+ {% scaladoc nlp/enrichers/NCEnBracketsTokenEnricher NCEnBracketsTokenEnricher %}.
+ </li>
  </ul>
  <p><b>Pipeline creation example constructed from built-in components:</b></p>
diff --git a/built-in-entity-parser.html b/built-in-entity-parser.html
index 24e1756..fb63ffd 100644
--- a/built-in-entity-parser.html
+++ b/built-in-entity-parser.html
@@ -26,7 +26,7 @@ id: built-in-entity-parser
  <h2 class="section-title">Overview<a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
  <p>
- {% scaladoc NCEntityParser NCEntityParser %} trait is part <a href="api-components.html#model-pipeline">Model Pipeline</a>.
+ {% scaladoc NCEntityParser NCEntityParser %} trait is part of <a href="api-components.html#model-pipeline">Model Pipeline</a>.
  Its implementation should allow to find user defined named entities based on prepared tokens as input.
  </p>
diff --git a/built-in-token-parser.html b/built-in-token-parser.html
index e9bac95..176fe7f 100644
--- a/built-in-token-parser.html
+++ b/built-in-token-parser.html
@@ -26,7 +26,7 @@ id: built-in-token-parser
  <h2 class="section-title">Overview<a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
  <p>
- {% scaladoc NCTokenParser NCTokenParser %} trait is part <a href="api-components.html#model-pipeline">Model Pipeline</a>.
+ {% scaladoc NCTokenParser NCTokenParser %} trait is part of <a href="api-components.html#model-pipeline">Model Pipeline</a>.
  Its implementation should parse user input plain text and split this text into <code>tokens</code> list.
  NLPCraft provides two English language token parser implementations: