Author: buildbot
Date: Thu Oct 17 10:50:17 2013
New Revision: 882978

Log:
Staging update by buildbot for stanbol

Added:
    
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/comention.html
Modified:
    websites/staging/stanbol/trunk/content/   (props changed)
    
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.html
    
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/list.html

Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Thu Oct 17 10:50:17 2013
@@ -1 +1 @@
-1532966
+1533039

Added: 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/comention.html
==============================================================================
--- 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/comention.html
 (added)
+++ 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/comention.html
 Thu Oct 17 10:50:17 2013
@@ -0,0 +1,143 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" 
"http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+  <link href="/css/stanbol.css" rel="stylesheet" type="text/css">
+  <title>Apache Stanbol - Co-Mention Engine</title>
+  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+  <link title="doap" rel="meta" type="application/rdf+xml" href="/doap.rdf"/>
+  <link rel="icon" type="image/png" 
href="/images/stanbol-logo/stanbol-favicon.png"/>
+  <script type="text/javascript">
+    // Google Analytics Tracking Code
+    var _gaq = _gaq || [];
+    _gaq.push(['_setAccount', 'UA-32086816-1']);
+    _gaq.push(['_trackPageview']);
+
+    (function() {
+      var ga = document.createElement('script'); ga.type = 'text/javascript'; 
ga.async = true;
+      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 
'http://www') + '.google-analytics.com/ga.js';
+      var s = document.getElementsByTagName('script')[0]; 
s.parentNode.insertBefore(ga, s);
+    })();
+  </script>  
+</head>
+
+<body>
+  <div id="navigation"> <!-- but auto scroll the menu -->
+    <a href="/index.html"><img alt="Apache Stanbol" width="220" height="101" 
border="0" src="/images/stanbol-logo/stanbol-2010-12-14.png"/></a><br />
+      <ul>
+<li><a href="/docs/trunk/tutorial.html">Getting Started</a></li>
+<li><a href="/docs/trunk/">Documentation</a><ul>
+<li><a href="/docs/trunk/scenarios.html">Usage Scenarios</a></li>
+<li><a href="/docs/trunk/components/">Components</a></li>
+<li><a href="/docs/trunk/production-mode/">Production Mode</a></li>
+</ul>
+</li>
+<li><a href="/development/">Development</a><ul>
+<li><a href="/development/index.html#mailing_lists">Mailing Lists</a></li>
+<li><a href="/development/index.html#issue_tracker">Issue Tracker</a></li>
+<li><a href="/development/index.html#source_code">Source Code</a></li>
+<li><a href="/development/index.html#development_practices">Development 
Practices</a></li>
+</ul>
+</li>
+</ul>
+<h1 id="downloads">Downloads</h1>
+<ul>
+<li><a href="/downloads/">Overview</a><ul>
+<li><a href="/downloads/releases.html">Releases</a></li>
+<li><a href="/downloads/launchers.html">Launchers</a></li>
+</ul>
+</li>
+</ul>
+<h1 id="project">Project</h1>
+<ul>
+<li><a href="/pmc/">PMC</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="/privacy-policy.html">Privacy Policy</a></li>
+</ul>
+<h1 id="archived-docs">Archived Docs</h1>
+<ul>
+<li><a href="/docs/0.9.0-incubating/">0.9.0-incubating</a></li>
+</ul>
+<h1 id="the-asf">The ASF</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a 
Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+<p><br /><a href="/doap.rdf"><img style="margin-left: 1em;" border="0" 
alt="DOAP File" src="/images/doap.png"/></a></p>
+  </div>
+  <div id="content">
+    <div class="breadcrumbs">
+      <ul> <li><a href="/">Home</a></li> <li class="item"><a 
href="/docs/">Docs</a></li> <li class="item"><a 
href="/docs/trunk/">Trunk</a></li> <li class="item"><a 
href="/docs/trunk/components/">Components</a></li> <li class="item"><a 
href="/docs/trunk/components/enhancer/">Enhancer</a></li> <li class="item"><a 
href="/docs/trunk/components/enhancer/engines/">Engines</a></li> </ul>
+    </div>
+    <h1 class="title">Co-Mention Engine</h1>
+    <p>The Co-Mention engine aims to link initial mentions of Entities with 
later references in the Text.</p>
+<p>A typical example is a person who is first mentioned by full name and 
later referred to only by family name, e.g.</p>
+<div class="codehilite"><pre><span class="p">...</span> <span 
class="n">Barack</span> <span class="n">Obama</span> <span 
class="n">gave</span> <span class="n">a</span> <span class="n">talk</span> 
<span class="n">to</span> <span class="n">members</span> <span 
class="n">of</span> <span class="n">the</span> <span class="n">Labor</span> 
<span class="n">Union</span> <span class="p">...</span> <span 
class="n">Obama</span> <span class="n">specially</span> <span 
class="n">mentioned</span> <span class="p">...</span>
+</pre></div>
+
+
+<p><strong>NOTE:</strong> This Engine does <em>NOT</em> provide or use NLP 
co-reference support (e.g. linking a pronoun with the Entity it stands for). 
Its purpose is to (1) link follow-up mentions of Entities with the original 
mention and (2) add the suggestions of the initial mention to follow-up 
mentions.</p>
+<h2 id="configuration">Configuration</h2>
+<p>Because this engine uses the entity linking functionality of the <a 
href="entitylinking">EntityLinkingEngine</a>, its configuration uses properties 
defined by the <a href="entitylinking#entity-linker-configuration">Entity 
Linker Configuration</a>.</p>
+<ul>
+<li><strong>Name</strong> <em>(stanbol.enhancer.engine.name)</em>: The name of 
the Enhancement Engine. This name is used to refer to an <a 
href="index.html">EnhancementEngine</a> in <a 
href="../chains">EnhancementChain</a>s.</li>
+<li><strong>ServiceRanking</strong> <em>(service.ranking)</em>: If multiple 
enhancement engines use the same name, only the one with the highest ranking 
is used.</li>
+<li><strong>Case Sensitivity</strong> 
<em>(enhancer.engines.linking.caseSensitive)</em>: Boolean switch that 
activates or deactivates case-sensitive matching. Note that even with case 
sensitivity activated, an Entity with a label such as "Anaconda" will be 
suggested for the mention "anaconda" in the text. The main difference is the 
confidence value of such a suggestion, because with case sensitivity activated 
the starting letters "A" and "a" are NOT considered to match. See the second 
technical part for details about the matching process. Case sensitivity is 
deactivated by default. Activating it is recommended if controlled 
vocabularies contain abbreviations similar to commonly used words, e.g. CAN 
for Canada.</li>
+<li><strong>Proper Noun Linking</strong> 
<em>(enhancer.engines.linking.properNounsState)</em>: Enables or disables 
proper noun linking when searching for co-mentions. This is disabled by 
default so that Common Nouns are also considered when searching for 
co-mentions. However, for Vocabularies that only contain Proper Nouns 
(Persons, Organizations, ...) enabling it might be useful. For the full 
documentation of this feature see the <a 
href="entitylinking#text-processing-configuration">Text Processing 
Configuration</a> section of the EntityLinking engine.</li>
+<li><strong>Processed Languages</strong> 
<em>(enhancer.engines.linking.processedLanguages)</em>: Allows detailed 
configuration of how NLP processing results are consumed by the Co-Mention 
engine. For the full documentation of this feature see the <a 
href="entitylinking#text-processing-configuration">Text Processing 
Configuration</a> section of the EntityLinking engine.</li>
+</ul>
+<p>The following supported properties are not included in the Felix 
Webconsole configuration dialog and can only be set via OSGi configuration 
files. See the <a href="entitylinking">Entity Linking Engine</a> configuration 
for the full documentation of those properties.</p>
+<ul>
+<li><strong>Min Search Token Length</strong> 
<em>(enhancer.engines.linking.minSearchTokenLength)</em></li>
+<li><strong>Minimum Token Match Score</strong> 
<em>(enhancer.engines.linking.minTokenScore)</em></li>
+<li><strong>Lemma based Matching</strong> 
<em>(enhancer.engines.linking.lemmaMatching)</em></li>
+<li><strong>Max Search Token Distance</strong> 
<em>(enhancer.engines.linking.maxSearchTokenDistance)</em></li>
+<li><strong>Max Search Tokens</strong> 
<em>(enhancer.engines.linking.maxSearchTokens)</em></li>
+</ul>
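+<p>As a rough illustration, the properties described above can be pictured as the key/value dictionary that an OSGi configuration carries. The following Java sketch only builds such a dictionary; it does not call any Stanbol or OSGi API. The class name <code>CoMentionConfigSketch</code>, the engine name <code>comention</code>, the ranking and the token length are assumptions made for this example, while the two boolean values reflect the defaults described above.</p>

```java
import java.util.Dictionary;
import java.util.Hashtable;

// Hypothetical helper, not part of Stanbol: it only collects property keys
// documented on this page together with illustrative values.
class CoMentionConfigSketch {

    static Dictionary<String, Object> exampleProperties() {
        Dictionary<String, Object> props = new Hashtable<>();
        // Engine name as referenced from enhancement chains (assumed value).
        props.put("stanbol.enhancer.engine.name", "comention");
        // Ranking used when several engines share a name (assumed value).
        props.put("service.ranking", 0);
        // Case sensitive matching is deactivated by default.
        props.put("enhancer.engines.linking.caseSensitive", false);
        // Proper noun linking is disabled by default.
        props.put("enhancer.engines.linking.properNounsState", false);
        // One of the properties not shown in the Webconsole dialog (assumed value).
        props.put("enhancer.engines.linking.minSearchTokenLength", 3);
        return props;
    }

    public static void main(String[] args) {
        Dictionary<String, Object> props = exampleProperties();
        System.out.println(props.get("stanbol.enhancer.engine.name"));
    }
}
```

+<p>In an actual installation the same keys would be entered in the Felix Webconsole dialog or in an OSGi configuration file rather than in Java code.</p>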
+<p>The following properties of the EntityLinking engine are ignored:</p>
+<ul>
+<li><strong>Type Mappings</strong> 
<em>(enhancer.engines.linking.typeMappings)</em>: The Co-Mention engine uses 
the dc:type values of the initial mention. Therefore dc:type mappings do not 
need to be specified.</li>
+<li><strong>Default Matching Language</strong> 
<em>(enhancer.engines.linking.defaultMatchingLanguage)</em>: The engine uses 
the language as detected for the parsed document for matching.</li>
+<li><strong>Redirect Field</strong> 
<em>(enhancer.engines.linking.redirectField)</em> and <strong>Redirect 
Mode</strong> <em>(enhancer.engines.linking.redirectMode)</em>: The engine uses 
suggestions of the initial mention. Redirects were already processed for those 
suggestions. Therefore the Co-Mention engine does not need to deal with 
redirects.</li>
+<li><strong>Label Field</strong> 
<em>(enhancer.engines.linking.labelField)</em>: The engine uses the initial 
mention as label to search for co-mentions. Because of that no label field 
needs to be configured.</li>
+<li><strong>Type Field</strong> <em>(enhancer.engines.linking.typeField)</em>: 
The engine uses the types of the suggestions for the initial mentions.</li>
+<li><strong>Suggestions</strong> 
<em>(enhancer.engines.linking.suggestions)</em>: The Co-Mention engine adds 
all suggestions of the initial mention to co-mentions.</li>
+<li><strong>Min Matched Tokens</strong> 
<em>(enhancer.engines.linking.minFoundTokens)</em> is set to '1' meaning that 
at least a single token of the initial mention needs to match co-mentions.</li>
+<li><strong>Min Label Score</strong> 
<em>(enhancer.engines.linking.minLabelScore)</em> is set to '1/4' meaning that 
at least 1/4 of the tokens for the initial mention need to be present in 
co-mentions.</li>
+<li><strong>Min Match Score</strong> 
<em>(enhancer.engines.linking.minMatchScore)</em> is set to a value so that it 
does not filter any results.</li>
+</ul>
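+<p>The combined effect of the fixed <em>Min Matched Tokens</em> and <em>Min Label Score</em> values can be sketched in a few lines of Java. This is a deliberately simplified illustration and not the engine's implementation: the class <code>CoMentionMatchSketch</code> and its methods are hypothetical, tokens are split on whitespace, and matching is an exact, lower-cased comparison, whereas the real engine relies on the matching process of the EntityLinking engine.</p>

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of the two fixed thresholds described above.
class CoMentionMatchSketch {

    static final int MIN_FOUND_TOKENS = 1;      // Min Matched Tokens = 1
    static final double MIN_LABEL_SCORE = 0.25; // Min Label Score = 1/4

    // Splits text on whitespace and lower-cases the tokens (a simplification).
    static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>();
        for (String t : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    // True if the candidate shares at least one token with the initial mention
    // and those shared tokens cover at least 1/4 of the initial mention.
    static boolean isCoMention(String initialMention, String candidate) {
        Set<String> label = tokens(initialMention);
        if (label.isEmpty()) {
            return false;
        }
        Set<String> matched = tokens(candidate);
        matched.retainAll(label);
        double labelScore = (double) matched.size() / label.size();
        return matched.size() >= MIN_FOUND_TOKENS && labelScore >= MIN_LABEL_SCORE;
    }

    public static void main(String[] args) {
        System.out.println(isCoMention("Barack Obama", "Obama"));
        System.out.println(isCoMention("Barack Obama", "Labor Union"));
    }
}
```

+<p>Under these assumptions "Obama" qualifies as a co-mention of "Barack Obama" (one of two tokens matches), while a single matching token out of a five-token initial mention would fall below the 1/4 threshold.</p>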
+  </div>
+  
+  <div id="footer">
+    <div class="copyright">
+      <p>
+        Copyright &copy; 2010 The Apache Software Foundation, Licensed under 
+        the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache 
License, Version 2.0</a>.
+        <br />
+        Apache, Stanbol and the Apache feather and Stanbol logos are 
trademarks of The Apache Software Foundation.
+      </p>
+    </div>
+  </div>
+  
+</body>
+</html>
+

Modified: 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.html
==============================================================================
--- 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.html
 (original)
+++ 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.html
 Thu Oct 17 10:50:17 2013
@@ -273,7 +273,7 @@ Configuration wise this will pre-set the
 <p>If used in combination with an disambiguation Engine one might want to 
consider to suggest Entities where only a single token of multi-token labels do 
match. In such cases a configuration like <em>Min Matched Tokens</em>=1 and 
<em>Min Label Score</em> &lt;= 0.5 (e.g. 0.4) might be considered. With such 
scenarios users will also want to considerable increase the value for <em>Max 
Suggestions</em> (typically values &gt; 10).</p>
 </li>
 <li>
-<p><strong>Min Text Score</strong> 
<em>(enhancer.engines.linking.minTextScore)</em> [0..1]::double: The "Text 
Score" [0..1] represents how well the Label of an Entity matches to the 
selected Span in the Text. It compares the number of matched {@link Token} from 
the label with the number of Tokens enclosed by the Span in the Text an Entity 
is suggested for. Not exact matches for Tokens, or if the Tokens within the 
label do appear in an other order than in the text do also reduce this score. 
Entities are only considered if at least one of their labels cores higher than 
the minimum for all tree of <em>Min Labe Score</em>, <em>Min Text Match 
Score</em> and <em>Min Match Score</em>.</p>
+<p><strong>Min Text Score</strong> 
<em>(enhancer.engines.linking.minTextScore)</em> [0..1]::double: The "Text 
Score" [0..1] represents how well the Label of an Entity matches to the 
selected Span in the Text. It compares the number of matched {@link Token} from 
the label with the number of Tokens enclosed by the Span in the Text an Entity 
is suggested for. Not exact matches for Tokens, or if the Tokens within the 
label do appear in an other order than in the text do also reduce this score. 
Entities are only considered if at least one of their labels cores higher than 
the minimum for all three of <em>Min Label Score</em>, <em>Min Text Match 
Score</em> and <em>Min Match Score</em>.</p>
 </li>
 <li><strong>Min Match Score</strong> 
<em>(enhancer.engines.linking.minMatchScore)</em> [0..1]::double: Defined as 
the product of the "Text Score" with the "Label Score" - meaning that this 
value represents both how well the label matches the text and how much of the 
label is matched with the text. Entities are only considered if at least one of 
their labels cores higher than the minimum for all tree of <em>Min Labe 
Score</em>, <em>Min Text Match Score</em> and <em>Min Match Score</em>. </li>
 <li><strong>Use EntityRankings</strong> 
<em>(enhancer.engines.linking.useEntityRankings)</em> ::boolean (default=true): 
Entity Rankings can be used to define the ranking (popularity, importance, 
connectivity, ...) of an entity relative to other within the knowledge base. 
While fise:confidence values calculated by the EntityLinkingEngie do only 
represent how well a label of the entity do match with the given section in the 
processed text it does make sense for manny use cases to sort Entities with the 
same score based on their entity rankings (e.g. users would expect to get 
"Paris (France)" suggested before "Paris (Texas)" for Paris appearing in a 
text. Enabling this feature will slightly (&lt; 0.1) change the score of 
suggestions to ensure such a ordering.     </li>

Modified: 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/list.html
==============================================================================
--- 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/list.html
 (original)
+++ 
websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/list.html
 Thu Oct 17 10:50:17 2013
@@ -289,6 +289,13 @@
 </ul>
 </li>
 <li>
+<p><strong><a href="comention">Entity Co-Mention Engine</a>:</strong></p>
+<ul>
+<li>Uses initial mentions of an Entity (e.g. 'Barack Obama' in 'Barack Obama 
attended the UN security council ...')</li>
+<li>To detect co-mentions at a later position in the same document (e.g. 
'Obama' in '... Obama indicated consent …') </li>
+</ul>
+</li>
+<li>
 <p><strong>DBpedia Spotlight Annotation Engine:</strong> Integration of the 
DBpedia Spotlight with the Stanbol Enhancer (see <a 
href="https://issues.apache.org/jira/browse/STANBOL-706">STANBOL-706</a>)</p>
 <ul>
 <li>includes NLP, Entity Linking and Disambiguation of Entities using <a 
href="http://dbpedia.org">DBpedia</a> as knowledge base</li>

