Added: uima/site/trunk/uima-website/docs/d/ruta-current/ruta.html URL: http://svn.apache.org/viewvc/uima/site/trunk/uima-website/docs/d/ruta-current/ruta.html?rev=1915326&view=auto ============================================================================== --- uima/site/trunk/uima-website/docs/d/ruta-current/ruta.html (added) +++ uima/site/trunk/uima-website/docs/d/ruta-current/ruta.html Fri Jan 19 12:49:00 2024 @@ -0,0 +1,8543 @@ +<!DOCTYPE html> +<html lang="en"> +<head> +<meta charset="UTF-8"> +<meta http-equiv="X-UA-Compatible" content="IE=edge"> +<meta name="viewport" content="width=device-width, initial-scale=1.0"> +<meta name="generator" content="Asciidoctor 2.0.20"> +<meta name="author" content="Apache UIMA™ Development Community"> +<title>Apache UIMA™ - Ruta</title> +<style> +/*! Asciidoctor default stylesheet | MIT License | https://asciidoctor.org */ +/* Uncomment the following line when using as a custom stylesheet */ +/* @import "https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700"; */ +html{font-family:sans-serif;-webkit-text-size-adjust:100%} +a{background:none} +a:focus{outline:thin dotted} +a:active,a:hover{outline:0} +h1{font-size:2em;margin:.67em 0} +b,strong{font-weight:bold} +abbr{font-size:.9em} +abbr[title]{cursor:help;border-bottom:1px dotted #dddddf;text-decoration:none} +dfn{font-style:italic} +hr{height:0} +mark{background:#ff0;color:#000} +code,kbd,pre,samp{font-family:monospace;font-size:1em} +pre{white-space:pre-wrap} +q{quotes:"\201C" "\201D" "\2018" "\2019"} +small{font-size:80%} +sub,sup{font-size:75%;line-height:0;position:relative;vertical-align:baseline} +sup{top:-.5em} +sub{bottom:-.25em} +img{border:0} +svg:not(:root){overflow:hidden} +figure{margin:0} +audio,video{display:inline-block} +audio:not([controls]){display:none;height:0} +fieldset{border:1px solid silver;margin:0 2px;padding:.35em .625em .75em} 
+legend{border:0;padding:0} +button,input,select,textarea{font-family:inherit;font-size:100%;margin:0} +button,input{line-height:normal} +button,select{text-transform:none} +button,html input[type=button],input[type=reset],input[type=submit]{-webkit-appearance:button;cursor:pointer} +button[disabled],html input[disabled]{cursor:default} +input[type=checkbox],input[type=radio]{padding:0} +button::-moz-focus-inner,input::-moz-focus-inner{border:0;padding:0} +textarea{overflow:auto;vertical-align:top} +table{border-collapse:collapse;border-spacing:0} +*,::before,::after{box-sizing:border-box} +html,body{font-size:100%} +body{background:#fff;color:rgba(0,0,0,.8);padding:0;margin:0;font-family:"Noto Serif","DejaVu Serif",serif;line-height:1;position:relative;cursor:auto;-moz-tab-size:4;-o-tab-size:4;tab-size:4;word-wrap:anywhere;-moz-osx-font-smoothing:grayscale;-webkit-font-smoothing:antialiased} +a:hover{cursor:pointer} +img,object,embed{max-width:100%;height:auto} +object,embed{height:100%} +img{-ms-interpolation-mode:bicubic} +.left{float:left!important} +.right{float:right!important} +.text-left{text-align:left!important} +.text-right{text-align:right!important} +.text-center{text-align:center!important} +.text-justify{text-align:justify!important} +.hide{display:none} +img,object,svg{display:inline-block;vertical-align:middle} +textarea{height:auto;min-height:50px} +select{width:100%} +.subheader,.admonitionblock td.content>.title,.audioblock>.title,.exampleblock>.title,.imageblock>.title,.listingblock>.title,.literalblock>.title,.stemblock>.title,.openblock>.title,.paragraph>.title,.quoteblock>.title,table.tableblock>.title,.verseblock>.title,.videoblock>.title,.dlist>.title,.olist>.title,.ulist>.title,.qlist>.title,.hdlist>.title{line-height:1.45;color:#7a2518;font-weight:400;margin-top:0;margin-bottom:.25em} +div,dl,dt,dd,ul,ol,li,h1,h2,h3,#toctitle,.sidebarblock>.content>.title,h4,h5,h6,pre,form,p,blockquote,th,td{margin:0;padding:0} 
+a{color:#2156a5;text-decoration:underline;line-height:inherit} +a:hover,a:focus{color:#1d4b8f} +a img{border:0} +p{line-height:1.6;margin-bottom:1.25em;text-rendering:optimizeLegibility} +p aside{font-size:.875em;line-height:1.35;font-style:italic} +h1,h2,h3,#toctitle,.sidebarblock>.content>.title,h4,h5,h6{font-family:"Open Sans","DejaVu Sans",sans-serif;font-weight:300;font-style:normal;color:#ba3925;text-rendering:optimizeLegibility;margin-top:1em;margin-bottom:.5em;line-height:1.0125em} +h1 small,h2 small,h3 small,#toctitle small,.sidebarblock>.content>.title small,h4 small,h5 small,h6 small{font-size:60%;color:#e99b8f;line-height:0} +h1{font-size:2.125em} +h2{font-size:1.6875em} +h3,#toctitle,.sidebarblock>.content>.title{font-size:1.375em} +h4,h5{font-size:1.125em} +h6{font-size:1em} +hr{border:solid #dddddf;border-width:1px 0 0;clear:both;margin:1.25em 0 1.1875em} +em,i{font-style:italic;line-height:inherit} +strong,b{font-weight:bold;line-height:inherit} +small{font-size:60%;line-height:inherit} +code{font-family:"Droid Sans Mono","DejaVu Sans Mono",monospace;font-weight:400;color:rgba(0,0,0,.9)} +ul,ol,dl{line-height:1.6;margin-bottom:1.25em;list-style-position:outside;font-family:inherit} +ul,ol{margin-left:1.5em} +ul li ul,ul li ol{margin-left:1.25em;margin-bottom:0} +ul.circle{list-style-type:circle} +ul.disc{list-style-type:disc} +ul.square{list-style-type:square} +ul.circle ul:not([class]),ul.disc ul:not([class]),ul.square ul:not([class]){list-style:inherit} +ol li ul,ol li ol{margin-left:1.25em;margin-bottom:0} +dl dt{margin-bottom:.3125em;font-weight:bold} +dl dd{margin-bottom:1.25em} +blockquote{margin:0 0 1.25em;padding:.5625em 1.25em 0 1.1875em;border-left:1px solid #ddd} +blockquote,blockquote p{line-height:1.6;color:rgba(0,0,0,.85)} +@media screen and (min-width:768px){h1,h2,h3,#toctitle,.sidebarblock>.content>.title,h4,h5,h6{line-height:1.2} +h1{font-size:2.75em} +h2{font-size:2.3125em} 
+h3,#toctitle,.sidebarblock>.content>.title{font-size:1.6875em} +h4{font-size:1.4375em}} +table{background:#fff;margin-bottom:1.25em;border:1px solid #dedede;word-wrap:normal} +table thead,table tfoot{background:#f7f8f7} +table thead tr th,table thead tr td,table tfoot tr th,table tfoot tr td{padding:.5em .625em .625em;font-size:inherit;color:rgba(0,0,0,.8);text-align:left} +table tr th,table tr td{padding:.5625em .625em;font-size:inherit;color:rgba(0,0,0,.8)} +table tr.even,table tr.alt{background:#f8f8f7} +table thead tr th,table tfoot tr th,table tbody tr td,table tr td,table tfoot tr td{line-height:1.6} +h1,h2,h3,#toctitle,.sidebarblock>.content>.title,h4,h5,h6{line-height:1.2;word-spacing:-.05em} +h1 strong,h2 strong,h3 strong,#toctitle strong,.sidebarblock>.content>.title strong,h4 strong,h5 strong,h6 strong{font-weight:400} +.center{margin-left:auto;margin-right:auto} +.stretch{width:100%} +.clearfix::before,.clearfix::after,.float-group::before,.float-group::after{content:" ";display:table} +.clearfix::after,.float-group::after{clear:both} +:not(pre).nobreak{word-wrap:normal} +:not(pre).nowrap{white-space:nowrap} +:not(pre).pre-wrap{white-space:pre-wrap} +:not(pre):not([class^=L])>code{font-size:.9375em;font-style:normal!important;letter-spacing:0;padding:.1em .5ex;word-spacing:-.15em;background:#f7f7f8;border-radius:4px;line-height:1.45;text-rendering:optimizeSpeed} +pre{color:rgba(0,0,0,.9);font-family:"Droid Sans Mono","DejaVu Sans Mono",monospace;line-height:1.45;text-rendering:optimizeSpeed} +pre code,pre pre{color:inherit;font-size:inherit;line-height:inherit} +pre>code{display:block} +pre.nowrap,pre.nowrap pre{white-space:pre;word-wrap:normal} +em em{font-style:normal} +strong strong{font-weight:400} +.keyseq{color:rgba(51,51,51,.8)} +kbd{font-family:"Droid Sans Mono","DejaVu Sans Mono",monospace;display:inline-block;color:rgba(0,0,0,.8);font-size:.65em;line-height:1.45;background:#f7f7f7;border:1px solid #ccc;border-radius:3px;box-shadow:0 1px 0 
rgba(0,0,0,.2),inset 0 0 0 .1em #fff;margin:0 .15em;padding:.2em .5em;vertical-align:middle;position:relative;top:-.1em;white-space:nowrap} +.keyseq kbd:first-child{margin-left:0} +.keyseq kbd:last-child{margin-right:0} +.menuseq,.menuref{color:#000} +.menuseq b:not(.caret),.menuref{font-weight:inherit} +.menuseq{word-spacing:-.02em} +.menuseq b.caret{font-size:1.25em;line-height:.8} +.menuseq i.caret{font-weight:bold;text-align:center;width:.45em} +b.button::before,b.button::after{position:relative;top:-1px;font-weight:400} +b.button::before{content:"[";padding:0 3px 0 2px} +b.button::after{content:"]";padding:0 2px 0 3px} +p a>code:hover{color:rgba(0,0,0,.9)} +#header,#content,#footnotes,#footer{width:100%;margin:0 auto;max-width:62.5em;*zoom:1;position:relative;padding-left:.9375em;padding-right:.9375em} +#header::before,#header::after,#content::before,#content::after,#footnotes::before,#footnotes::after,#footer::before,#footer::after{content:" ";display:table} +#header::after,#content::after,#footnotes::after,#footer::after{clear:both} +#content{margin-top:1.25em} +#content::before{content:none} +#header>h1:first-child{color:rgba(0,0,0,.85);margin-top:2.25rem;margin-bottom:0} +#header>h1:first-child+#toc{margin-top:8px;border-top:1px solid #dddddf} +#header>h1:only-child,body.toc2 #header>h1:nth-last-child(2){border-bottom:1px solid #dddddf;padding-bottom:8px} +#header .details{border-bottom:1px solid #dddddf;line-height:1.45;padding-top:.25em;padding-bottom:.25em;padding-left:.25em;color:rgba(0,0,0,.6);display:flex;flex-flow:row wrap} +#header .details span:first-child{margin-left:-.125em} +#header .details span.email a{color:rgba(0,0,0,.85)} +#header .details br{display:none} +#header .details br+span::before{content:"\00a0\2013\00a0"} +#header .details br+span.author::before{content:"\00a0\22c5\00a0";color:rgba(0,0,0,.85)} +#header .details br+span#revremark::before{content:"\00a0|\00a0"} +#header #revnumber{text-transform:capitalize} +#header 
#revnumber::after{content:"\00a0"} +#content>h1:first-child:not([class]){color:rgba(0,0,0,.85);border-bottom:1px solid #dddddf;padding-bottom:8px;margin-top:0;padding-top:1rem;margin-bottom:1.25rem} +#toc{border-bottom:1px solid #e7e7e9;padding-bottom:.5em} +#toc>ul{margin-left:.125em} +#toc ul.sectlevel0>li>a{font-style:italic} +#toc ul.sectlevel0 ul.sectlevel1{margin:.5em 0} +#toc ul{font-family:"Open Sans","DejaVu Sans",sans-serif;list-style-type:none} +#toc li{line-height:1.3334;margin-top:.3334em} +#toc a{text-decoration:none} +#toc a:active{text-decoration:underline} +#toctitle{color:#7a2518;font-size:1.2em} +@media screen and (min-width:768px){#toctitle{font-size:1.375em} +body.toc2{padding-left:15em;padding-right:0} +#toc.toc2{margin-top:0!important;background:#f8f8f7;position:fixed;width:15em;left:0;top:0;border-right:1px solid #e7e7e9;border-top-width:0!important;border-bottom-width:0!important;z-index:1000;padding:1.25em 1em;height:100%;overflow:auto} +#toc.toc2 #toctitle{margin-top:0;margin-bottom:.8rem;font-size:1.2em} +#toc.toc2>ul{font-size:.9em;margin-bottom:0} +#toc.toc2 ul ul{margin-left:0;padding-left:1em} +#toc.toc2 ul.sectlevel0 ul.sectlevel1{padding-left:0;margin-top:.5em;margin-bottom:.5em} +body.toc2.toc-right{padding-left:0;padding-right:15em} +body.toc2.toc-right #toc.toc2{border-right-width:0;border-left:1px solid #e7e7e9;left:auto;right:0}} +@media screen and (min-width:1280px){body.toc2{padding-left:20em;padding-right:0} +#toc.toc2{width:20em} +#toc.toc2 #toctitle{font-size:1.375em} +#toc.toc2>ul{font-size:.95em} +#toc.toc2 ul ul{padding-left:1.25em} +body.toc2.toc-right{padding-left:0;padding-right:20em}} +#content #toc{border:1px solid #e0e0dc;margin-bottom:1.25em;padding:1.25em;background:#f8f8f7;border-radius:4px} +#content #toc>:first-child{margin-top:0} +#content #toc>:last-child{margin-bottom:0} +#footer{max-width:none;background:rgba(0,0,0,.8);padding:1.25em} +#footer-text{color:hsla(0,0%,100%,.8);line-height:1.44} 
+#content{margin-bottom:.625em} +.sect1{padding-bottom:.625em} +@media screen and (min-width:768px){#content{margin-bottom:1.25em} +.sect1{padding-bottom:1.25em}} +.sect1:last-child{padding-bottom:0} +.sect1+.sect1{border-top:1px solid #e7e7e9} +#content h1>a.anchor,h2>a.anchor,h3>a.anchor,#toctitle>a.anchor,.sidebarblock>.content>.title>a.anchor,h4>a.anchor,h5>a.anchor,h6>a.anchor{position:absolute;z-index:1001;width:1.5ex;margin-left:-1.5ex;display:block;text-decoration:none!important;visibility:hidden;text-align:center;font-weight:400} +#content h1>a.anchor::before,h2>a.anchor::before,h3>a.anchor::before,#toctitle>a.anchor::before,.sidebarblock>.content>.title>a.anchor::before,h4>a.anchor::before,h5>a.anchor::before,h6>a.anchor::before{content:"\00A7";font-size:.85em;display:block;padding-top:.1em} +#content h1:hover>a.anchor,#content h1>a.anchor:hover,h2:hover>a.anchor,h2>a.anchor:hover,h3:hover>a.anchor,#toctitle:hover>a.anchor,.sidebarblock>.content>.title:hover>a.anchor,h3>a.anchor:hover,#toctitle>a.anchor:hover,.sidebarblock>.content>.title>a.anchor:hover,h4:hover>a.anchor,h4>a.anchor:hover,h5:hover>a.anchor,h5>a.anchor:hover,h6:hover>a.anchor,h6>a.anchor:hover{visibility:visible} +#content h1>a.link,h2>a.link,h3>a.link,#toctitle>a.link,.sidebarblock>.content>.title>a.link,h4>a.link,h5>a.link,h6>a.link{color:#ba3925;text-decoration:none} +#content h1>a.link:hover,h2>a.link:hover,h3>a.link:hover,#toctitle>a.link:hover,.sidebarblock>.content>.title>a.link:hover,h4>a.link:hover,h5>a.link:hover,h6>a.link:hover{color:#a53221} +details,.audioblock,.imageblock,.literalblock,.listingblock,.stemblock,.videoblock{margin-bottom:1.25em} +details{margin-left:1.25rem} +details>summary{cursor:pointer;display:block;position:relative;line-height:1.6;margin-bottom:.625rem;outline:none;-webkit-tap-highlight-color:transparent} +details>summary::-webkit-details-marker{display:none} +details>summary::before{content:"";border:solid transparent;border-left:solid;border-width:.3em 
0 .3em .5em;position:absolute;top:.5em;left:-1.25rem;transform:translateX(15%)} +details[open]>summary::before{border:solid transparent;border-top:solid;border-width:.5em .3em 0;transform:translateY(15%)} +details>summary::after{content:"";width:1.25rem;height:1em;position:absolute;top:.3em;left:-1.25rem} +.admonitionblock td.content>.title,.audioblock>.title,.exampleblock>.title,.imageblock>.title,.listingblock>.title,.literalblock>.title,.stemblock>.title,.openblock>.title,.paragraph>.title,.quoteblock>.title,table.tableblock>.title,.verseblock>.title,.videoblock>.title,.dlist>.title,.olist>.title,.ulist>.title,.qlist>.title,.hdlist>.title{text-rendering:optimizeLegibility;text-align:left;font-family:"Noto Serif","DejaVu Serif",serif;font-size:1rem;font-style:italic} +table.tableblock.fit-content>caption.title{white-space:nowrap;width:0} +.paragraph.lead>p,#preamble>.sectionbody>[class=paragraph]:first-of-type p{font-size:1.21875em;line-height:1.6;color:rgba(0,0,0,.85)} +.admonitionblock>table{border-collapse:separate;border:0;background:none;width:100%} +.admonitionblock>table td.icon{text-align:center;width:80px} +.admonitionblock>table td.icon img{max-width:none} +.admonitionblock>table td.icon .title{font-weight:bold;font-family:"Open Sans","DejaVu Sans",sans-serif;text-transform:uppercase} +.admonitionblock>table td.content{padding-left:1.125em;padding-right:1.25em;border-left:1px solid #dddddf;color:rgba(0,0,0,.6);word-wrap:anywhere} +.admonitionblock>table td.content>:last-child>:last-child{margin-bottom:0} +.exampleblock>.content{border:1px solid #e6e6e6;margin-bottom:1.25em;padding:1.25em;background:#fff;border-radius:4px} +.sidebarblock{border:1px solid #dbdbd6;margin-bottom:1.25em;padding:1.25em;background:#f3f3f2;border-radius:4px} +.sidebarblock>.content>.title{color:#7a2518;margin-top:0;text-align:center} +.exampleblock>.content>:first-child,.sidebarblock>.content>:first-child{margin-top:0} 
+.exampleblock>.content>:last-child,.exampleblock>.content>:last-child>:last-child,.exampleblock>.content .olist>ol>li:last-child>:last-child,.exampleblock>.content .ulist>ul>li:last-child>:last-child,.exampleblock>.content .qlist>ol>li:last-child>:last-child,.sidebarblock>.content>:last-child,.sidebarblock>.content>:last-child>:last-child,.sidebarblock>.content .olist>ol>li:last-child>:last-child,.sidebarblock>.content .ulist>ul>li:last-child>:last-child,.sidebarblock>.content .qlist>ol>li:last-child>:last-child{margin-bottom:0} +.literalblock pre,.listingblock>.content>pre{border-radius:4px;overflow-x:auto;padding:1em;font-size:.8125em} +@media screen and (min-width:768px){.literalblock pre,.listingblock>.content>pre{font-size:.90625em}} +@media screen and (min-width:1280px){.literalblock pre,.listingblock>.content>pre{font-size:1em}} +.literalblock pre,.listingblock>.content>pre:not(.highlight),.listingblock>.content>pre[class=highlight],.listingblock>.content>pre[class^="highlight "]{background:#f7f7f8} +.literalblock.output pre{color:#f7f7f8;background:rgba(0,0,0,.9)} +.listingblock>.content{position:relative} +.listingblock code[data-lang]::before{display:none;content:attr(data-lang);position:absolute;font-size:.75em;top:.425rem;right:.5rem;line-height:1;text-transform:uppercase;color:inherit;opacity:.5} +.listingblock:hover code[data-lang]::before{display:block} +.listingblock.terminal pre .command::before{content:attr(data-prompt);padding-right:.5em;color:inherit;opacity:.5} +.listingblock.terminal pre .command:not([data-prompt])::before{content:"$"} +.listingblock pre.highlightjs{padding:0} +.listingblock pre.highlightjs>code{padding:1em;border-radius:4px} +.listingblock pre.prettyprint{border-width:0} +.prettyprint{background:#f7f7f8} +pre.prettyprint .linenums{line-height:1.45;margin-left:2em} +pre.prettyprint li{background:none;list-style-type:inherit;padding-left:0} +pre.prettyprint li code[data-lang]::before{opacity:1} +pre.prettyprint 
li:not(:first-child) code[data-lang]::before{display:none} +table.linenotable{border-collapse:separate;border:0;margin-bottom:0;background:none} +table.linenotable td[class]{color:inherit;vertical-align:top;padding:0;line-height:inherit;white-space:normal} +table.linenotable td.code{padding-left:.75em} +table.linenotable td.linenos,pre.pygments .linenos{border-right:1px solid;opacity:.35;padding-right:.5em;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;user-select:none} +pre.pygments span.linenos{display:inline-block;margin-right:.75em} +.quoteblock{margin:0 1em 1.25em 1.5em;display:table} +.quoteblock:not(.excerpt)>.title{margin-left:-1.5em;margin-bottom:.75em} +.quoteblock blockquote,.quoteblock p{color:rgba(0,0,0,.85);font-size:1.15rem;line-height:1.75;word-spacing:.1em;letter-spacing:0;font-style:italic;text-align:justify} +.quoteblock blockquote{margin:0;padding:0;border:0} +.quoteblock blockquote::before{content:"\201c";float:left;font-size:2.75em;font-weight:bold;line-height:.6em;margin-left:-.6em;color:#7a2518;text-shadow:0 1px 2px rgba(0,0,0,.1)} +.quoteblock blockquote>.paragraph:last-child p{margin-bottom:0} +.quoteblock .attribution{margin-top:.75em;margin-right:.5ex;text-align:right} +.verseblock{margin:0 1em 1.25em} +.verseblock pre{font-family:"Open Sans","DejaVu Sans",sans-serif;font-size:1.15rem;color:rgba(0,0,0,.85);font-weight:300;text-rendering:optimizeLegibility} +.verseblock pre strong{font-weight:400} +.verseblock .attribution{margin-top:1.25rem;margin-left:.5ex} +.quoteblock .attribution,.verseblock .attribution{font-size:.9375em;line-height:1.45;font-style:italic} +.quoteblock .attribution br,.verseblock .attribution br{display:none} +.quoteblock .attribution cite,.verseblock .attribution cite{display:block;letter-spacing:-.025em;color:rgba(0,0,0,.6)} +.quoteblock.abstract blockquote::before,.quoteblock.excerpt blockquote::before,.quoteblock .quoteblock blockquote::before{display:none} +.quoteblock.abstract 
blockquote,.quoteblock.abstract p,.quoteblock.excerpt blockquote,.quoteblock.excerpt p,.quoteblock .quoteblock blockquote,.quoteblock .quoteblock p{line-height:1.6;word-spacing:0} +.quoteblock.abstract{margin:0 1em 1.25em;display:block} +.quoteblock.abstract>.title{margin:0 0 .375em;font-size:1.15em;text-align:center} +.quoteblock.excerpt>blockquote,.quoteblock .quoteblock{padding:0 0 .25em 1em;border-left:.25em solid #dddddf} +.quoteblock.excerpt,.quoteblock .quoteblock{margin-left:0} +.quoteblock.excerpt blockquote,.quoteblock.excerpt p,.quoteblock .quoteblock blockquote,.quoteblock .quoteblock p{color:inherit;font-size:1.0625rem} +.quoteblock.excerpt .attribution,.quoteblock .quoteblock .attribution{color:inherit;font-size:.85rem;text-align:left;margin-right:0} +p.tableblock:last-child{margin-bottom:0} +td.tableblock>.content{margin-bottom:1.25em;word-wrap:anywhere} +td.tableblock>.content>:last-child{margin-bottom:-1.25em} +table.tableblock,th.tableblock,td.tableblock{border:0 solid #dedede} +table.grid-all>*>tr>*{border-width:1px} +table.grid-cols>*>tr>*{border-width:0 1px} +table.grid-rows>*>tr>*{border-width:1px 0} +table.frame-all{border-width:1px} +table.frame-ends{border-width:1px 0} +table.frame-sides{border-width:0 1px} +table.frame-none>colgroup+*>:first-child>*,table.frame-sides>colgroup+*>:first-child>*{border-top-width:0} +table.frame-none>:last-child>:last-child>*,table.frame-sides>:last-child>:last-child>*{border-bottom-width:0} +table.frame-none>*>tr>:first-child,table.frame-ends>*>tr>:first-child{border-left-width:0} +table.frame-none>*>tr>:last-child,table.frame-ends>*>tr>:last-child{border-right-width:0} +table.stripes-all>*>tr,table.stripes-odd>*>tr:nth-of-type(odd),table.stripes-even>*>tr:nth-of-type(even),table.stripes-hover>*>tr:hover{background:#f8f8f7} +th.halign-left,td.halign-left{text-align:left} +th.halign-right,td.halign-right{text-align:right} +th.halign-center,td.halign-center{text-align:center} 
+th.valign-top,td.valign-top{vertical-align:top} +th.valign-bottom,td.valign-bottom{vertical-align:bottom} +th.valign-middle,td.valign-middle{vertical-align:middle} +table thead th,table tfoot th{font-weight:bold} +tbody tr th{background:#f7f8f7} +tbody tr th,tbody tr th p,tfoot tr th,tfoot tr th p{color:rgba(0,0,0,.8);font-weight:bold} +p.tableblock>code:only-child{background:none;padding:0} +p.tableblock{font-size:1em} +ol{margin-left:1.75em} +ul li ol{margin-left:1.5em} +dl dd{margin-left:1.125em} +dl dd:last-child,dl dd:last-child>:last-child{margin-bottom:0} +li p,ul dd,ol dd,.olist .olist,.ulist .ulist,.ulist .olist,.olist .ulist{margin-bottom:.625em} +ul.checklist,ul.none,ol.none,ul.no-bullet,ol.no-bullet,ol.unnumbered,ul.unstyled,ol.unstyled{list-style-type:none} +ul.no-bullet,ol.no-bullet,ol.unnumbered{margin-left:.625em} +ul.unstyled,ol.unstyled{margin-left:0} +li>p:empty:only-child::before{content:"";display:inline-block} +ul.checklist>li>p:first-child{margin-left:-1em} +ul.checklist>li>p:first-child>.fa-square-o:first-child,ul.checklist>li>p:first-child>.fa-check-square-o:first-child{width:1.25em;font-size:.8em;position:relative;bottom:.125em} +ul.checklist>li>p:first-child>input[type=checkbox]:first-child{margin-right:.25em} +ul.inline{display:flex;flex-flow:row wrap;list-style:none;margin:0 0 .625em -1.25em} +ul.inline>li{margin-left:1.25em} +.unstyled dl dt{font-weight:400;font-style:normal} +ol.arabic{list-style-type:decimal} +ol.decimal{list-style-type:decimal-leading-zero} +ol.loweralpha{list-style-type:lower-alpha} +ol.upperalpha{list-style-type:upper-alpha} +ol.lowerroman{list-style-type:lower-roman} +ol.upperroman{list-style-type:upper-roman} +ol.lowergreek{list-style-type:lower-greek} +.hdlist>table,.colist>table{border:0;background:none} +.hdlist>table>tbody>tr,.colist>table>tbody>tr{background:none} +td.hdlist1,td.hdlist2{vertical-align:top;padding:0 .625em} +td.hdlist1{font-weight:bold;padding-bottom:1.25em} +td.hdlist2{word-wrap:anywhere} 
+.literalblock+.colist,.listingblock+.colist{margin-top:-.5em} +.colist td:not([class]):first-child{padding:.4em .75em 0;line-height:1;vertical-align:top} +.colist td:not([class]):first-child img{max-width:none} +.colist td:not([class]):last-child{padding:.25em 0} +.thumb,.th{line-height:0;display:inline-block;border:4px solid #fff;box-shadow:0 0 0 1px #ddd} +.imageblock.left{margin:.25em .625em 1.25em 0} +.imageblock.right{margin:.25em 0 1.25em .625em} +.imageblock>.title{margin-bottom:0} +.imageblock.thumb,.imageblock.th{border-width:6px} +.imageblock.thumb>.title,.imageblock.th>.title{padding:0 .125em} +.image.left,.image.right{margin-top:.25em;margin-bottom:.25em;display:inline-block;line-height:0} +.image.left{margin-right:.625em} +.image.right{margin-left:.625em} +a.image{text-decoration:none;display:inline-block} +a.image object{pointer-events:none} +sup.footnote,sup.footnoteref{font-size:.875em;position:static;vertical-align:super} +sup.footnote a,sup.footnoteref a{text-decoration:none} +sup.footnote a:active,sup.footnoteref a:active{text-decoration:underline} +#footnotes{padding-top:.75em;padding-bottom:.75em;margin-bottom:.625em} +#footnotes hr{width:20%;min-width:6.25em;margin:-.25em 0 .75em;border-width:1px 0 0} +#footnotes .footnote{padding:0 .375em 0 .225em;line-height:1.3334;font-size:.875em;margin-left:1.2em;margin-bottom:.2em} +#footnotes .footnote a:first-of-type{font-weight:bold;text-decoration:none;margin-left:-1.05em} +#footnotes .footnote:last-of-type{margin-bottom:0} +#content #footnotes{margin-top:-.625em;margin-bottom:0;padding:.75em 0} +div.unbreakable{page-break-inside:avoid} +.big{font-size:larger} +.small{font-size:smaller} +.underline{text-decoration:underline} +.overline{text-decoration:overline} +.line-through{text-decoration:line-through} +.aqua{color:#00bfbf} +.aqua-background{background:#00fafa} +.black{color:#000} +.black-background{background:#000} +.blue{color:#0000bf} +.blue-background{background:#0000fa} 
+.fuchsia{color:#bf00bf} +.fuchsia-background{background:#fa00fa} +.gray{color:#606060} +.gray-background{background:#7d7d7d} +.green{color:#006000} +.green-background{background:#007d00} +.lime{color:#00bf00} +.lime-background{background:#00fa00} +.maroon{color:#600000} +.maroon-background{background:#7d0000} +.navy{color:#000060} +.navy-background{background:#00007d} +.olive{color:#606000} +.olive-background{background:#7d7d00} +.purple{color:#600060} +.purple-background{background:#7d007d} +.red{color:#bf0000} +.red-background{background:#fa0000} +.silver{color:#909090} +.silver-background{background:#bcbcbc} +.teal{color:#006060} +.teal-background{background:#007d7d} +.white{color:#bfbfbf} +.white-background{background:#fafafa} +.yellow{color:#bfbf00} +.yellow-background{background:#fafa00} +span.icon>.fa{cursor:default} +a span.icon>.fa{cursor:inherit} +.admonitionblock td.icon [class^="fa icon-"]{font-size:2.5em;text-shadow:1px 1px 2px rgba(0,0,0,.5);cursor:default} +.admonitionblock td.icon .icon-note::before{content:"\f05a";color:#19407c} +.admonitionblock td.icon .icon-tip::before{content:"\f0eb";text-shadow:1px 1px 2px rgba(155,155,0,.8);color:#111} +.admonitionblock td.icon .icon-warning::before{content:"\f071";color:#bf6900} +.admonitionblock td.icon .icon-caution::before{content:"\f06d";color:#bf3400} +.admonitionblock td.icon .icon-important::before{content:"\f06a";color:#bf0000} +.conum[data-value]{display:inline-block;color:#fff!important;background:rgba(0,0,0,.8);border-radius:50%;text-align:center;font-size:.75em;width:1.67em;height:1.67em;line-height:1.67em;font-family:"Open Sans","DejaVu Sans",sans-serif;font-style:normal;font-weight:bold} +.conum[data-value] *{color:#fff!important} +.conum[data-value]+b{display:none} +.conum[data-value]::after{content:attr(data-value)} +pre .conum[data-value]{position:relative;top:-.125em} +b.conum *{color:inherit!important} +.conum:not([data-value]):empty{display:none} 
+dt,th.tableblock,td.content,div.footnote{text-rendering:optimizeLegibility} +h1,h2,p,td.content,span.alt,summary{letter-spacing:-.01em} +p strong,td.content strong,div.footnote strong{letter-spacing:-.005em} +p,blockquote,dt,td.content,td.hdlist1,span.alt,summary{font-size:1.0625rem} +p{margin-bottom:1.25rem} +.sidebarblock p,.sidebarblock dt,.sidebarblock td.content,p.tableblock{font-size:1em} +.exampleblock>.content{background:#fffef7;border-color:#e0e0dc;box-shadow:0 1px 4px #e0e0dc} +.print-only{display:none!important} +@page{margin:1.25cm .75cm} +@media print{*{box-shadow:none!important;text-shadow:none!important} +html{font-size:80%} +a{color:inherit!important;text-decoration:underline!important} +a.bare,a[href^="#"],a[href^="mailto:"]{text-decoration:none!important} +a[href^="http:"]:not(.bare)::after,a[href^="https:"]:not(.bare)::after{content:"(" attr(href) ")";display:inline-block;font-size:.875em;padding-left:.25em} +abbr[title]{border-bottom:1px dotted} +abbr[title]::after{content:" (" attr(title) ")"} +pre,blockquote,tr,img,object,svg{page-break-inside:avoid} +thead{display:table-header-group} +svg{max-width:100%} +p,blockquote,dt,td.content{font-size:1em;orphans:3;widows:3} +h2,h3,#toctitle,.sidebarblock>.content>.title{page-break-after:avoid} +#header,#content,#footnotes,#footer{max-width:none} +#toc,.sidebarblock,.exampleblock>.content{background:none!important} +#toc{border-bottom:1px solid #dddddf!important;padding-bottom:0!important} +body.book #header{text-align:center} +body.book #header>h1:first-child{border:0!important;margin:2.5em 0 1em} +body.book #header .details{border:0!important;display:block;padding:0!important} +body.book #header .details span:first-child{margin-left:0!important} +body.book #header .details br{display:block} +body.book #header .details br+span::before{content:none!important} +body.book #toc{border:0!important;text-align:left!important;padding:0!important;margin:0!important} +body.book #toc,body.book 
#preamble,body.book h1.sect0,body.book .sect1>h2{page-break-before:always} +.listingblock code[data-lang]::before{display:block} +#footer{padding:0 .9375em} +.hide-on-print{display:none!important} +.print-only{display:block!important} +.hide-for-print{display:none!important} +.show-for-print{display:inherit!important}} +@media amzn-kf8,print{#header>h1:first-child{margin-top:1.25rem} +.sect1{padding:0!important} +.sect1+.sect1{border:0} +#footer{background:none} +#footer-text{color:rgba(0,0,0,.6);font-size:.9em}} +@media amzn-kf8{#header,#content,#footnotes,#footer{padding:0}} +</style> +</head> +<body class="book toc2 toc-left"> +<div id="header"> +<h1>Apache UIMA™ - Ruta</h1> +<div class="details"> +<span id="author" class="author">Apache UIMA™ Development Community</span><br> +<span id="revnumber">version 3.4.0</span> +</div> +<div id="toc" class="toc2"> +<div id="toctitle">Ruta Documentation</div> +<ul class="sectlevel1"> +<li><a href="#_ugr.tools.ruta.overview">1. Apache UIMA Ruta Overview</a> +<ul class="sectlevel2"> +<li><a href="#_ugr.tools.ruta.overview.intro">1.1. What is Apache UIMA Ruta?</a></li> +<li><a href="#_ugr.tools.ruta.overview.gettingstarted">1.2. Getting started</a></li> +<li><a href="#_ugr.tools.ruta.overview.coreconcepts">1.3. Core Concepts</a></li> +<li><a href="#_ugr.tools.ruta.overview.examples">1.4. Learning by Example</a></li> +<li><a href="#_ugr.tools.ruta.ae">1.5. UIMA Analysis Engines</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.ae.basic">1.5.1. 
Ruta Engine</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.mainscript">mainScript</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.rules">rules</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.rulesscriptname">rulesScriptName</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.scriptencoding">scriptEncoding</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.scriptpaths">scriptPaths</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.descriptorpaths">descriptorPaths</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.resourcepaths">resourcePaths</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.additionalscripts">additionalScripts</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.additionalengines">additionalEngines</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.additionaluimafitengines">additionalUimafitEngines</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.additionalextensions">additionalExtensions</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.reloadscript">reloadScript</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.seeders">seeders</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.defaultfilteredtypes">defaultFilteredTypes</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.removebasics">removeBasics</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.indexonly">indexOnly</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.indexskiptypes">indexSkipTypes</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.indexonlymentionedtypes">indexOnlyMentionedTypes</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.indexadditionally">indexAdditionally</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.reindexonly">reindexOnly</a></li> +<li><a 
href="#_ugr.tools.ruta.ae.basic.parameter.reindexskiptypes">reindexSkipTypes</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.reindexonlymentionedtypes">reindexOnlyMentionedTypes</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.reindexadditionally">reindexAdditionally</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.indexupdatemode">indexUpdateMode</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.validateinternalindexing">validateInternalIndexing</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.emptyisinvisible">emptyIsInvisible</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.modifydatapath">modifyDataPath</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.strictimports">strictImports</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.typeignorepattern">typeIgnorePattern</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.dynamicanchoring">dynamicAnchoring</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.lowmemoryprofile">lowMemoryProfile</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.simplegreedyforcomposed">simpleGreedyForComposed</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.debug">debug</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.debugwithmatches">debugWithMatches</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.debugaddtoindexes">debugAddToIndexes</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.debugonlyfor">debugOnlyFor</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.profile">profile</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.statistics">statistics</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.createdby">createdBy</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.varnames">varNames</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.varvalues">varValues</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.dictremovews">dictRemoveWS</a></li> 
+<li><a href="#_ugr.tools.ruta.ae.basic.parameter.csvseparator">csvSeparator</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.inferencevisitors">inferenceVisitors</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.maxrulematches">maxRuleMatches</a></li> +<li><a href="#_ugr.tools.ruta.ae.basic.parameter.maxruleelementmatches">maxRuleElementMatches</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.ae.annotationwriter">1.5.2. Annotation Writer</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.annotationwriter.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.annotationwriter.parameter.output">Output</a></li> +<li><a href="#_ugr.tools.ruta.ae.annotationwriter.parameter.encoding">Encoding</a></li> +<li><a href="#_ugr.tools.ruta.ae.annotationwriter.parameter.type">Type</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.ae.plaintext">1.5.3. Plain Text Annotator</a></li> +<li><a href="#_ugr.tools.ruta.ae.modifier">1.5.4. Modifier</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.modifier.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.modifier.parameter.stylemap">styleMap</a></li> +<li><a href="#_ugr.tools.ruta.ae.modifier.parameter.descriptorpaths">descriptorPaths</a></li> +<li><a href="#_ugr.tools.ruta.ae.modifier.parameter.outputlocation">outputLocation</a></li> +<li><a href="#_ugr.tools.ruta.ae.modifier.parameter.outputview">outputView</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.ae.html">1.5.5. HTML Annotator</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.html.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.html.parameter.onlycontent">onlyContent</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter">1.5.6. 
HTML Converter</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.outputview">outputView</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.inputview">inputView</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.newlineinducingtags">newlineInducingTags</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.replacelinebreaks">replaceLinebreaks</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.linebreakreplacement">linebreakReplacement</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.conversionpolicy">conversionPolicy</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.conversionpatterns">conversionPatterns</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.conversionreplacements">conversionReplacements</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.skipwhitespaces">skipWhitespaces</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.processall">processAll</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.newlineinducingtagregexp">newlineInducingTagRegExp</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.gapinducingtags">gapInducingTags</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.gaptext">gapText</a></li> +<li><a href="#_ugr.tools.ruta.ae.htmlconverter.parameter.usespacegap">useSpaceGap</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.ae.stylemap">1.5.7. 
Style Map Creator</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.stylemap.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.stylemap.parameter.stylemap">styleMap</a></li> +<li><a href="#_ugr.tools.ruta.ae.stylemap.parameter.descriptorpaths">descriptorPaths</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.ae.cutter">1.5.8. Cutter</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.cutter.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.cutter.parameter.keep">keep</a></li> +<li><a href="#_ugr.tools.ruta.ae.cutter.parameter.inputview">inputView</a></li> +<li><a href="#_ugr.tools.ruta.ae.cutter.parameter.outputview">outputView</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.ae.view">1.5.9. View Writer</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.view.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.view.parameter.output">output</a></li> +<li><a href="#_ugr.tools.ruta.ae.view.parameter.inputview">inputView</a></li> +<li><a href="#_ugr.tools.ruta.ae.view.parameter.outputview">outputView</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.ae.xmi">1.5.10. XMI Writer</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.ae.xmi.parameter">Configuration Parameters</a> +<ul class="sectlevel5"> +<li><a href="#_ugr.tools.ruta.ae.xmi.parameter.output">Output</a></li> +</ul> +</li> +</ul> +</li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.language">2. Apache UIMA Ruta Language</a> +<ul class="sectlevel2"> +<li><a href="#_ugr.tools.ruta.language.syntax">2.1. Syntax</a></li> +<li><a href="#_ugr.tools.ruta.language.anchoring">2.2. Rule elements and their matching order</a></li> +<li><a href="#_ugr.tools.ruta.language.seeding">2.3. 
Basic annotations and tokens</a></li> +<li><a href="#_ugr.tools.ruta.language.quantifier">2.4. Quantifiers</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.language.quantifier.sg">2.4.1. * Star Greedy</a></li> +<li><a href="#_ugr.tools.ruta.language.quantifier.sr">2.4.2. *? Star Reluctant</a></li> +<li><a href="#_ugr.tools.ruta.language.quantifier.pg">2.4.3. + Plus Greedy</a></li> +<li><a href="#_ugr.tools.ruta.language.quantifier.pr">2.4.4. +? Plus Reluctant</a></li> +<li><a href="#_ugr.tools.ruta.language.quantifier.qg">2.4.5. ? Question Greedy</a></li> +<li><a href="#_ugr.tools.ruta.language.quantifier.qr">2.4.6. ?? Question Reluctant</a></li> +<li><a href="#_ugr.tools.ruta.language.quantifier.mmg">2.4.7. [x,y] Min Max Greedy</a></li> +<li><a href="#_ugr.tools.ruta.language.quantifier.mmr">2.4.8. [x,y]? Min Max Reluctant</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.declarations">2.5. Declarations</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.language.declarations.type">2.5.1. Types</a></li> +<li><a href="#_ugr.tools.ruta.language.declarations.variable">2.5.2. Variables</a></li> +<li><a href="#_ugr.tools.ruta.language.declarations.ressource">2.5.3. Resources</a></li> +<li><a href="#_ugr.tools.ruta.language.declarations.scripts">2.5.4. Scripts</a></li> +<li><a href="#_ugr.tools.ruta.language.declarations.components">2.5.5. Components</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.expressions">2.6. Expressions</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.language.expressions.type">2.6.1. Type Expressions</a></li> +<li><a href="#_ugr.tools.ruta.language.expressions.annotation">2.6.2. Annotation Expressions</a></li> +<li><a href="#_ugr.tools.ruta.language.expressions.number">2.6.3. Number Expressions</a></li> +<li><a href="#_ugr.tools.ruta.language.expressions.string">2.6.4. String Expressions</a></li> +<li><a href="#_ugr.tools.ruta.language.expressions.boolean">2.6.5. 
Boolean Expressions</a></li> +<li><a href="#_ugr.tools.ruta.language.expressions.lists">2.6.6. List Expressions</a></li> +<li><a href="#_ugr.tools.ruta.language.expressions.features">2.6.7. Feature Expressions</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.conditions">2.7. Conditions</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.language.conditions.after">2.7.1. AFTER</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.and">2.7.2. AND</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.before">2.7.3. BEFORE</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.contains">2.7.4. CONTAINS</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.contextcount">2.7.5. CONTEXTCOUNT</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.count">2.7.6. COUNT</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.currentcount">2.7.7. CURRENTCOUNT</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.endswith">2.7.8. ENDSWITH</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.feature">2.7.9. FEATURE</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.if">2.7.10. IF</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.inlist">2.7.11. INLIST</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.is">2.7.12. IS</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.last">2.7.13. LAST</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.mofn">2.7.14. MOFN</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.near">2.7.15. NEAR</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.not">2.7.16. NOT</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.or">2.7.17. OR</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.parse">2.7.18. PARSE</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.partof">2.7.19. PARTOF</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.partofneq">2.7.20. 
PARTOFNEQ</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.position">2.7.21. POSITION</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.regexp">2.7.22. REGEXP</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.score">2.7.23. SCORE</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.size">2.7.24. SIZE</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.startswith">2.7.25. STARTSWITH</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.totalcount">2.7.26. TOTALCOUNT</a></li> +<li><a href="#_ugr.tools.ruta.language.conditions.vote">2.7.27. VOTE</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.actions">2.8. Actions</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.language.actions.add">2.8.1. ADD</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.addfiltertype">2.8.2. ADDFILTERTYPE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.addretaintype">2.8.3. ADDRETAINTYPE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.assign">2.8.4. ASSIGN</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.call">2.8.5. CALL</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.clear">2.8.6. CLEAR</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.color">2.8.7. COLOR</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.configure">2.8.8. CONFIGURE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.create">2.8.9. CREATE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.del">2.8.10. DEL</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.dynamicanchoring">2.8.11. DYNAMICANCHORING</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.exec">2.8.12. EXEC</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.fill">2.8.13. FILL</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.filtertype">2.8.14. FILTERTYPE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.gather">2.8.15. 
GATHER</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.get">2.8.16. GET</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.getfeature">2.8.17. GETFEATURE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.getlist">2.8.18. GETLIST</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.greedyanchoring">2.8.19. GREEDYANCHORING</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.log">2.8.20. LOG</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.mark">2.8.21. MARK</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.markfast">2.8.22. MARKFAST</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.markfirst">2.8.23. MARKFIRST</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.marklast">2.8.24. MARKLAST</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.markonce">2.8.25. MARKONCE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.markscore">2.8.26. MARKSCORE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.marktable">2.8.27. MARKTABLE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.matchedtext">2.8.28. MATCHEDTEXT</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.merge">2.8.29. MERGE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.remove">2.8.30. REMOVE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.removeduplicate">2.8.31. REMOVEDUPLICATE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.removefiltertype">2.8.32. REMOVEFILTERTYPE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.removeretaintype">2.8.33. REMOVERETAINTYPE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.replace">2.8.34. REPLACE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.retaintype">2.8.35. RETAINTYPE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.setfeature">2.8.36. SETFEATURE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.shift">2.8.37. SHIFT</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.split">2.8.38. 
SPLIT</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.transfer">2.8.39. TRANSFER</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.trie">2.8.40. TRIE</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.trim">2.8.41. TRIM</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.unmark">2.8.42. UNMARK</a></li> +<li><a href="#_ugr.tools.ruta.language.actions.unmarkall">2.8.43. UNMARKALL</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.filtering">2.9. Robust extraction using filtering</a></li> +<li><a href="#_ugr.tools.ruta.language.wildcard">2.10. Wildcard #</a></li> +<li><a href="#_ugr.tools.ruta.language.optional">2.11. Optional match _</a></li> +<li><a href="#_ugr.tools.ruta.language.labels">2.12. Label expressions</a></li> +<li><a href="#_ugr.tools.ruta.language.blocks">2.13. Blocks</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.language.blocks.block">2.13.1. BLOCK</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.language.blocks.block.condition">Conditioned statements</a></li> +<li><a href="#_ugr.tools.ruta.language.blocks.block.foreach">Loops with restriction of the matching window</a></li> +<li><a href="#_ugr.tools.ruta.language.blocks.block.procedure">Procedures</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.blocks.foreach">2.13.2. FOREACH</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.inlined">2.14. Inlined rules</a></li> +<li><a href="#_ugr.tools.ruta.language.macro">2.15. Macros for conditions and actions</a></li> +<li><a href="#_ugr.tools.ruta.language.score">2.16. Heuristic extraction using scoring rules</a></li> +<li><a href="#_ugr.tools.ruta.language.modification">2.17. Modification</a></li> +<li><a href="#_ugr.tools.ruta.language.external_resources">2.18. External resources</a> +<ul class="sectlevel3"> +<li><a href="#_wordlists">2.18.1. WORDLISTs</a></li> +<li><a href="#_wordtables">2.18.2. 
WORDTABLEs</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.regexprule">2.19. Simple Rules based on Regular Expressions</a></li> +<li><a href="#_ugr.tools.ruta.language.extensions">2.20. Language Extensions</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.language.extensions.core_ext">2.20.1. Provided Extensions</a> +<ul class="sectlevel4"> +<li><a href="#_ugr.tools.ruta.language.extensions.core_ext.documentblock">DOCUMENTBLOCK</a></li> +<li><a href="#_ugr.tools.ruta.language.extensions.core_ext.onlyfirst">ONLYFIRST</a></li> +<li><a href="#_ugr.tools.ruta.language.extensions.core_ext.onlyonce">ONLYONCE</a></li> +<li><a href="#_ugr.tools.ruta.language.extensions.core_ext.stringfunctions">Stringfunctions</a> +<ul class="sectlevel5"> +<li><a href="#_firstchartouppercaseistringexpression_expr">firstCharToUpperCase(IStringExpression expr)</a></li> +<li><a href="#_replacefirstistringexpression_expr_istringexpressionsearchtermistringexpressionreplacement">replaceFirst(IStringExpression expr, IStringExpressionsearchTerm,IStringExpressionreplacement)</a></li> +<li><a href="#_replaceallistringexpression_expr_istringexpressionsearchtermistringexpressionreplacement">replaceAll(IStringExpression expr, IStringExpressionsearchTerm,IStringExpressionreplacement)</a></li> +<li><a href="#_substringistringexpression_expr_inumberexpression_frominumberexpression_to">substring(IStringExpression expr, INumberExpression from,INumberExpression to)</a></li> +<li><a href="#_tolowercaseistringexpression_expr">toLowerCase(IStringExpression expr)</a></li> +<li><a href="#_touppercaseistringexpression_expr">toUpperCase(IStringExpression expr)</a></li> +<li><a href="#_containsistringexpression_expristringexpression_contains">contains(IStringExpression expr,IStringExpression contains)</a></li> +<li><a href="#_endswithistringexpression_expristringexpression_expr">endsWith(IStringExpression expr,IStringExpression expr)</a></li> +<li><a 
href="#_startswithistringexpression_expristringexpression_expr">startsWith(IStringExpression expr,IStringExpression expr)</a></li> +<li><a href="#_equalsistringexpression_expristringexpression_expr_and_equalsignorecaseexprexpr">equals(IStringExpression expr,IStringExpression expr) and equalsIgnoreCase(expr,expr)</a></li> +<li><a href="#_isemptyistringexpression_expr_and_equalsignorecaseexprexpr">isEmpty(IStringExpression expr) and equalsIgnoreCase(expr,expr)</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.extensions.core_ext.typefunctions">typeFromString</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.extensions.new">2.20.2. Adding new Language Elements</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.language.internal_indxexing">2.21. Internal indexing and reindexing</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.language.internal_indxexing.why">2.21.1. Why additional indexing?</a></li> +<li><a href="#_ugr.tools.ruta.language.internal_indxexing.how">2.21.2. How is it stored, created and updated?</a></li> +<li><a href="#_ugr.tools.ruta.language.internal_indxexing.optimize">2.21.3. How to optimize the performance?</a></li> +</ul> +</li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.workbench">3. Apache UIMA Ruta Workbench</a> +<ul class="sectlevel2"> +<li><a href="#_section.ugr.tools.ruta.workbench.install">3.1. Installation</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.overview">3.2. UIMA Ruta Workbench Overview</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.projects">3.3. UIMA Ruta Projects</a> +<ul class="sectlevel3"> +<li><a href="#_section.ugr.tools.ruta.workbench.projects.create_projects">3.3.1. UIMA Ruta create project wizard</a></li> +</ul> +</li> +<li><a href="#_section.ugr.tools.ruta.workbench.ruta_perspective">3.4. UIMA Ruta Perspective</a> +<ul class="sectlevel3"> +<li><a href="#_section.ugr.tools.ruta.workbench.ruta_perspective.annotation_browser">3.4.1. 
Annotation Browser</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.ruta_perspective.selection">3.4.2. Selection</a></li> +</ul> +</li> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective">3.5. UIMA Ruta Explain Perspective</a> +<ul class="sectlevel3"> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective.applied_rules">3.5.1. Applied Rules</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective.matched_and_failed_rules">3.5.2. Matched Rules and Failed Rules</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective.rule_elements">3.5.3. Rule Elements</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective.inlined_rules">3.5.4. Inlined Rules</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective.covering_rules">3.5.5. Covering Rules</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective.rule_list">3.5.6. Rule List</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective.created_by">3.5.7. Created By</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.explain_perspective.statistics">3.5.8. Statistics</a></li> +</ul> +</li> +<li><a href="#_section.tools.ruta.workbench.cde">3.6. UIMA Ruta CDE perspective</a> +<ul class="sectlevel3"> +<li><a href="#_section.tools.ruta.workbench.cde.documents">3.6.1. CDE Documents view</a></li> +<li><a href="#_section.tools.ruta.workbench.cde.constraints">3.6.2. CDE Constraints view</a></li> +<li><a href="#_section.tools.ruta.workbench.cde.result">3.6.3. CDE Result view</a></li> +</ul> +</li> +<li><a href="#_section.ugr.tools.ruta.workbench.ruta_query">3.7. Ruta Query View</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.testing">3.8. Testing</a> +<ul class="sectlevel3"> +<li><a href="#_section.ugr.tools.ruta.workbench.testing.usage">3.8.1. Usage</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.testing.evaluators">3.8.2. 
Evaluators</a></li> +</ul> +</li> +<li><a href="#_section.tools.ruta.workbench.textruler">3.9. TextRuler</a> +<ul class="sectlevel3"> +<li><a href="#_section.tools.ruta.workbench.textruler.learner">3.9.1. Included rule learning algorithms</a> +<ul class="sectlevel4"> +<li><a href="#_section.tools.ruta.workbench.textruler.lp2">LP2</a></li> +<li><a href="#_section.tools.ruta.workbench.textruler.whisk">WHISK</a></li> +<li><a href="#_section.tools.ruta.workbench.textruler.trabal">TraBaL</a></li> +<li><a href="#_section.tools.ruta.workbench.textruler.kep">KEP</a></li> +</ul> +</li> +<li><a href="#_section.tools.ruta.workbench.textruler.ui">3.9.2. The TextRuler view</a></li> +</ul> +</li> +<li><a href="#_section.tools.ruta.workbench.check">3.10. Check Annotations view</a></li> +<li><a href="#_section.ugr.tools.ruta.workbench.create_dictionaries">3.11. Creation of Tree Word Lists</a></li> +<li><a href="#_ugr.tools.ruta.workbench.apply">3.12. Apply a UIMA Ruta script to a folder</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.howtos">4. Apache UIMA Ruta HowTos</a> +<ul class="sectlevel2"> +<li><a href="#_ugr.tools.ruta.ae.basic.apply">4.1. Apply UIMA Ruta Analysis Engine in plain Java</a></li> +<li><a href="#_ugr.tools.ruta.integration">4.2. Integrating UIMA Ruta in an existing UIMA Annotator</a> +<ul class="sectlevel3"> +<li><a href="#_ugr.tools.ruta.ae.integration.mvn">4.2.1. Adding Ruta to our Annotator</a></li> +<li><a href="#_ugr.tools.ruta.ae.integration.loading">4.2.2. Developing Ruta rules and applying them from inside Java code</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.maven">4.3. UIMA Ruta Maven Plugin</a> +<ul class="sectlevel3"> +<li><a href="#_generate_goal">4.3.1. generate goal</a></li> +<li><a href="#_twl_goal">4.3.2. twl goal</a></li> +<li><a href="#_mtwl_goal">4.3.3. mtwl goal</a></li> +</ul> +</li> +<li><a href="#_ugr.tools.ruta.archetype">4.4. 
UIMA Ruta Maven Archetype</a></li> +<li><a href="#_section.tools.ruta.workbench.textruler.example">4.5. Induce rules with the TextRuler framework</a></li> +<li><a href="#_section.tools.ruta.howto.html">4.6. HTML annotations in plain text</a></li> +<li><a href="#_section.tools.ruta.howto.sorter">4.7. Sorting files with UIMA Ruta</a></li> +<li><a href="#_section.tools.ruta.howto.xml">4.8. Converting XML documents with UIMA Ruta</a></li> +</ul> +</li> +</ul> +</div> +</div> +<div id="content"> +<div id="preamble"> +<div class="sectionbody"> +<div class="paragraph"> +<p>Copyright © 2023 The Apache Software Foundation</p> +</div> +<h4 id="_license_and_disclaimer" class="discrete">License and Disclaimer</h4> +<div class="paragraph"> +<p>The ASF licenses this documentation to you under the Apache License, Version 2.0 (the "License"); +you may not use this documentation except in compliance with the License. You may obtain a copy of +the License at</p> +</div> +<div class="paragraph text-center"> +<p><a href="http://www.apache.org/licenses/LICENSE-2.0" class="bare">http://www.apache.org/licenses/LICENSE-2.0</a></p> +</div> +<div class="paragraph"> +<p>Unless required by applicable law or agreed to in writing, this documentation and its contents are +distributed under the License on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, +either express or implied. See the License for the specific language governing permissions and +limitations under the License.</p> +</div> +<h4 id="_trademarks" class="discrete">Trademarks</h4> +<div class="paragraph"> +<p>All terms mentioned in the text that are known to be trademarks or service marks have been +appropriately capitalized. Use of such terms in this book should not be regarded as affecting the +validity of the trademark or service mark.</p> +</div> +</div> +</div> +<div class="sect1"> +<h2 id="_ugr.tools.ruta.overview"><a class="anchor" href="#_ugr.tools.ruta.overview"></a>1. 
Apache UIMA Ruta Overview</h2> +<div class="sectionbody"> +<div class="sect2"> +<h3 id="_ugr.tools.ruta.overview.intro"><a class="anchor" href="#_ugr.tools.ruta.overview.intro"></a>1.1. What is Apache UIMA Ruta?</h3> +<div class="paragraph"> +<p>Apache UIMA Ruta&#8482; is a rule-based script language supported by Eclipse-based tooling. +The language is designed to enable rapid development of text processing applications within Apache UIMA&#8482;. A special focus lies on the intuitive and flexible domain-specific language for defining patterns of annotations. +Writing rules for information extraction or other text processing applications is a tedious process. +The Eclipse-based tooling for UIMA Ruta, called the Apache UIMA Ruta Workbench, was created to support the user and to facilitate every step when writing UIMA Ruta rules. +Both the Ruta rule language and the UIMA Ruta Workbench integrate smoothly with Apache UIMA.</p> +</div> +</div> +<div class="sect2"> +<h3 id="_ugr.tools.ruta.overview.gettingstarted"><a class="anchor" href="#_ugr.tools.ruta.overview.gettingstarted"></a>1.2. Getting started</h3> +<div class="paragraph"> +<p>This section gives a short roadmap for reading the documentation and some recommendations on how to start developing UIMA Ruta-based applications. +This documentation assumes that the reader knows the core concepts of Apache UIMA. +Knowledge of the meaning and usage of the terms &#8220;CAS&#8221;, &#8220;Feature Structure&#8221;, &#8220;Annotation&#8221;, &#8220;Type&#8221;, &#8220;Type System&#8221; and &#8220;Analysis Engine&#8221; is required. +Please refer to the documentation of Apache UIMA for an introduction.</p> +</div> +<div class="paragraph"> +<p>Inexperienced users who want to learn about UIMA Ruta can start with the next two sections: <a href="#_ugr.tools.ruta.overview.coreconcepts">Section 1.3</a> gives a short overview of the core ideas and features of the UIMA Ruta language and Workbench. +That section introduces the main concepts of the UIMA Ruta language. 
It explains how UIMA Ruta rules are composed and applied, and discusses the advantages of the UIMA Ruta system. +The following <a href="#_ugr.tools.ruta.overview.examples">Section 1.4</a> approaches the UIMA Ruta language from a different perspective. +Here, the language is introduced by examples. +The first example starts by explaining what a simple rule looks like, and each following example extends the syntax or semantics of the UIMA Ruta language. +After reading these two sections, the reader is expected to have gained enough knowledge to start writing a first UIMA Ruta-based application.</p> +</div> +<div class="paragraph"> +<p>The UIMA Ruta Workbench was created to support the user and to facilitate the development process. +It is strongly recommended to use this Eclipse-based IDE since it, for example, automatically configures the component descriptors and provides editing support like syntax checking. <a href="#_section.ugr.tools.ruta.workbench.install">Section 3.1</a> describes how the UIMA Ruta Workbench is installed. +UIMA Ruta rules can also be applied to a CAS without using the UIMA Ruta Workbench. <a href="#_ugr.tools.ruta.ae.basic.apply">Section 4.1</a> contains examples of how to execute UIMA Ruta rules in plain Java. +A good way to get started with UIMA Ruta is to play around with an example UIMA Ruta project, e.g., &#8220;ExampleProject&#8221; in the example-projects of the UIMA Ruta source release. 
This UIMA Ruta project contains some simple rules for processing citation metadata.</p> +</div> +<div class="paragraph"> +<p><a href="#_ugr.tools.ruta.language.language">Chapter 2</a> and <a href="#_ugr.tools.ruta.workbench">Chapter 3</a> provide more detailed descriptions and can be consulted for specific parts of the UIMA Ruta language or the UIMA Ruta Workbench.</p> +</div> +</div> +<div class="sect2"> +<h3 id="_ugr.tools.ruta.overview.coreconcepts"><a class="anchor" href="#_ugr.tools.ruta.overview.coreconcepts"></a>1.3. Core Concepts</h3> +<div class="paragraph"> +<p>The UIMA Ruta language is an imperative rule language extended with scripting elements. +A UIMA Ruta rule defines a pattern of annotations with additional conditions. +If this pattern applies, then the actions of the rule are performed on the matched annotations. +A rule is composed of a sequence of rule elements, and a rule element essentially consists of four parts: a matching condition, an optional quantifier, a list of conditions and a list of actions. +The matching condition is typically an annotation type; the rule element then matches on the text covered by an annotation of that type. +The quantifier specifies whether the rule element must match successfully and how often it may match. +The list of conditions specifies additional constraints that the matched text or annotations need to fulfill. +The list of actions defines the consequences of the rule and often creates new annotations or modifies existing annotations. +The actions are only applied if all rule elements of the rule have successfully matched. +Examples of UIMA Ruta rules can be found in <a href="#_ugr.tools.ruta.overview.examples">Section 1.4</a>.</p> +</div> +<div class="paragraph"> +<p>When UIMA Ruta rules are applied to a document, or more precisely to a CAS, they are always grouped in a script file. 
+However, a UIMA Ruta script file contains not only rules, but also other statements. +First of all, each script file starts with a package declaration followed by a list of optional imports. +Then, common statements like rules, type declarations or blocks build the body and functionality of a script. <a href="#_ugr.tools.ruta.ae.basic.apply">Section 4.1</a> gives an example of how UIMA Ruta scripts can be applied in plain Java. +UIMA Ruta script files are naturally organized in UIMA Ruta projects, which is a concept of the UIMA Ruta Workbench. +The structure of a UIMA Ruta project is described in <a href="#_section.ugr.tools.ruta.workbench.projects">Section 3.3</a>.</p> +</div> +<div class="paragraph"> +<p>The inference of UIMA Ruta rules, that is, the way the rules are applied, can be described as imperative depth-first matching. +In contrast to similar rule-based systems, UIMA Ruta rules are applied in the order they are defined in the script. +The imperative execution of the matching rules may have disadvantages, but also many advantages like an increased rate of development or an easier explanation. +The second main property of the UIMA Ruta inference is the depth-first matching. +When a rule matches on a pattern of annotations, then an alternative is always tracked until it has matched or failed before the next alternative is considered. +The behavior of a rule may change if it has already matched on an early alternative and thus has performed an action that influences some constraints of the rule. +Examples of how UIMA Ruta rules are applied are given in <a href="#_ugr.tools.ruta.overview.examples">Section 1.4</a>.</p> +</div> +<div class="paragraph"> +<p>The UIMA Ruta language provides the possibility to approach an annotation problem in different ways. +Let us distinguish some approaches as an example. +It is common in the UIMA Ruta language to create many annotations of different types. 
+These annotations are probably not the targeted annotations of the domain, but can be helpful to incrementally approximate the annotation of interest. +This enables the user to work “bottom-up” or “top-down”. +In the former approach, the rules incrementally add more complex annotations using simple ones until the target annotation can be created. +In the latter approach, the rules get more specific while partitioning the document into smaller segments, which eventually result in the targeted annotation. +By using many “helper” annotations, the engineering task becomes easier and more comprehensible. +The UIMA Ruta language provides distinctive language elements for different tasks. +There are, for example, actions that are able to create new annotations, actions that are able to remove annotations and actions that are able to modify the offsets of annotations. +This enables, amongst other things, a transformation-based approach. +The user starts by creating general rules that are able to annotate most of the text fragments of interest. +Then, instead of making these rules more complex by adding more conditions for situations where they fail, additional rules are defined that correct the mistakes of the general rules, e.g., by deleting false positive annotations. <a href="#_ugr.tools.ruta.overview.examples">Section 1.4</a> provides some examples of how UIMA Ruta rules can be engineered.</p> +</div> +<div class="paragraph"> +<p>Writing rules manually is a tedious and error-prone process. +The <a href="#_ugr.tools.ruta.workbench">UIMA Ruta Workbench</a> was developed to facilitate writing rules by providing as much tooling support as possible. +This includes, for example, syntax checking and auto completion, which make the development less error-prone. +The user can annotate documents and use these documents as unit tests for test-driven development or quality maintenance. +Sometimes, it is necessary to debug the rules because they do not match as expected. 
+In this case, the explanation perspective provides views that explain every detail of the matching process. +Finally, the UIMA Ruta language can also be used by the tooling, for example, by the “Query” view. +Here, UIMA Ruta rules can be used as query statements in order to investigate annotated documents.</p> +</div> +<div class="paragraph"> +<p>UIMA Ruta smoothly integrates with Apache UIMA. +First of all, the UIMA Ruta rules are applied using a generic Analysis Engine and thus UIMA Ruta scripts can easily be added to Apache UIMA pipelines. +UIMA Ruta also provides the functionality to import and use other UIMA components like Analysis Engines and Type Systems. +UIMA Ruta rules can refer to every type defined in an imported type system, and the UIMA Ruta Workbench generates a type system descriptor file containing all types that were defined in a script file. +Any Analysis Engine can be executed by rules as long as its implementation is available on the classpath. +Therefore, functionality outsourced to an arbitrary Analysis Engine can be added and used within UIMA Ruta.</p> +</div> +</div> +<div class="sect2"> +<h3 id="_ugr.tools.ruta.overview.examples"><a class="anchor" href="#_ugr.tools.ruta.overview.examples"></a>1.4. Learning by Example</h3> +<div class="paragraph"> +<p>This section gives an introduction to the UIMA Ruta language by explaining the rule syntax and inference with some simplified examples. +It is recommended to use the UIMA Ruta Workbench to write UIMA Ruta rules in order to gain advantages like syntax checking. +A short description of how to install the UIMA Ruta Workbench is given <a href="#_section.ugr.tools.ruta.workbench.install">here</a>. +The following examples make use of the annotations added by the default seeding of the UIMA Ruta Analysis Engine. +Their meaning is explained along with the examples.</p> +</div> +<div class="paragraph"> +<p>The first example consists of a declaration of a type followed by a simple rule. 
+Type declarations always start with the keyword “DECLARE” followed by the short name of the new type. +The namespace of the type is equal to the package declaration of the script file. +If there is no package declaration, then the types declared in the script file have no namespace. +It is also possible to create more complex types with features or specific parent types, but this is ignored for now. +In the example, a simple annotation type with the short name “Animal” is defined. +After the declaration of the type, a rule with one rule element is given. +UIMA Ruta rules in general can consist of a sequence of rule elements. +Simple rule elements themselves consist of four parts: a matching condition, an optional quantifier, an optional list of conditions and an optional list of actions. +The rule element in the following example has a matching condition “W”, an annotation type standing for normal words. +Statements like declarations and rules always end with a semicolon.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE Animal; +W{REGEXP("dog") -> MARK(Animal)};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>The rule element also contains one condition and one action, both surrounded by curly braces. +In order to distinguish conditions from actions, they are separated by “->”. +The condition “REGEXP("dog")” indicates that the matched word must match the regular expression “dog”. +If the matching condition and the additional regular expression are fulfilled, then the action is executed, which creates a new annotation of the type “Animal” with the same offsets as the matched token. 
+The default seeder does not actually add annotations of the type “W”, but annotations of the types “SW” and “CW” for lowercase words and capitalized words, which both have the parent type “W”.</p> +</div> +<div class="paragraph"> +<p>It is also possible to add implicit actions and conditions, which have no explicit name, but consist only of an expression. +In the condition part, boolean expressions and feature match expressions can be applied, and in the action part, type expressions and feature assignment expressions can be added. +The following example contains one implicit condition and one implicit action. +The additional condition is a boolean expression (a boolean variable), which is set to “true” and is therefore always fulfilled. +The “MARK” action was replaced by a type expression, which refers to the type “Animal”. +The following rule shows, therefore, the same behavior as the rule in the last example.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE Animal; +BOOLEAN active = true; +W{REGEXP("dog"), active -> Animal};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>There is also a special kind of rule, which follows a different syntax and semantics and enables a simplified creation of annotations based on regular expressions. +The following rule, for example, creates an “Animal” annotation for each occurrence of “dog” or “cat”.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE Animal; +"dog|cat" -> Animal;</code></pre> +</div> +</div> +<div class="paragraph"> +<p>Since it is tedious to create Animal annotations by matching on different regular expressions, we apply an external dictionary in the next example. +The first line defines a word list named “AnimalsList”, which is located in the resource folder (the file “Animals.txt” contains one animal name in each line). 
After the declaration of the type, a rule uses this word list to find all occurrences of animals in the complete document.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>WORDLIST AnimalsList = 'Animals.txt'; +DECLARE Animal; +Document{-> MARKFAST(Animal, AnimalsList)};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>The matching condition of the rule element refers to the complete document, or, more specifically, to the annotation of the type “DocumentAnnotation”, which covers the whole document. +The action “MARKFAST” of this rule element creates an annotation of the type “Animal” for each found entry of the dictionary “AnimalsList”.</p> +</div> +<div class="paragraph"> +<p>The next example introduces rules with more than one rule element, whereby one of them is a composed rule element. +The following rule tries to annotate occurrences of animals separated by commas, e.g., “dog, cat, bird”.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE AnimalEnum; +(Animal COMMA)+{-> MARK(AnimalEnum,1,2)} Animal;</code></pre> +</div> +</div> +<div class="paragraph"> +<p>The rule consists of two rule elements, with “(Animal COMMA)+{-> MARK(AnimalEnum,1,2)}” being the first rule element and “Animal” the second one. +Let us take a closer look at the first rule element. +This rule element is actually composed of two normal rule elements, namely “Animal” and “COMMA”, and contains a greedy quantifier and one action. +This rule element, therefore, matches on one Animal annotation and a following comma. +This is repeated until one of the inner rule elements does not match anymore. +Then, there has to be another Animal annotation afterwards, specified by the second rule element of the rule. +In this case, the rule matches and its action is executed: The MARK action creates a new annotation of the type “AnimalEnum”. 
+However, in contrast to the previous examples, this action also contains two numbers. +These numbers refer to the rule elements that should be used to calculate the span of the created annotation. +The numbers “1, 2” state that the new annotation should start with the first rule element, the composed one, and should end with the second rule element.</p> +</div> +<div class="paragraph"> +<p>Let us make the composed rule element more complex. +The following rule also matches on lists of animals that are separated by semicolons. +A disjunctive rule element is therefore added, indicated by the symbol “|”, which matches on annotations of the type “COMMA” or “SEMICOLON”.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>(Animal (COMMA | SEMICOLON))+{-> MARK(AnimalEnum,1,2)} Animal;</code></pre> +</div> +</div> +<div class="paragraph"> +<p>There are two more special symbols that can be used to link rule elements. +If the symbol “|” is replaced by the symbol <code>&</code> in the last example, then the token after the animal needs to be a comma and a semicolon at the same time, which is of course not possible. +Another symbol with a special meaning is “%”, which can be used not only within a composed rule element (parentheses). This symbol can be interpreted as a global “and”: It links several rules, which only fire if all rules have successfully matched. 
+In the following example, an annotation of the type “FoundIt” is created if the document contains two periods in a row and two commas in a row:</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>PERIOD PERIOD % COMMA COMMA{-> FoundIt};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>There is a “wild card” (“#”) rule element, which can be used to skip some text or annotations until the next rule element is able to match.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE Sentence; +PERIOD #{-> MARK(Sentence)} PERIOD;</code></pre> +</div> +</div> +<div class="paragraph"> +<p>This rule annotates everything between two “PERIOD” annotations with the type “Sentence”. +Please note that the resulting annotation is automatically trimmed using the current filtering settings. +Conditions at wild card rule elements should be avoided and only be used by advanced users.</p> +</div> +<div class="paragraph"> +<p>Another special rule element is called “optional” (“_”). Sometimes, an annotation should be created on a text position if it is not followed by an annotation with a specific property. +In contrast to normal rule elements with an optional quantifier, the optional rule element does not need to match at all.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>W ANY{-PARTOF(NUM)}; +W _{-PARTOF(NUM)};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>The two rules in this example specify the same pattern: A word that is not followed by a number. +The difference between the rules becomes apparent at the border of the matching window, e.g., at the end of the document. +If the document contains only a single word, the first rule will not match successfully because the second rule element already fails at its matching condition. 
+The second rule, however, will successfully match due to the optional rule element.</p> +</div> +<div class="paragraph"> +<p>Rule elements can contain more than one condition. +The rule in the next example tries to identify headlines that are bold, underlined and end with a colon.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE Headline; +Paragraph{CONTAINS(Bold, 90, 100, true), + CONTAINS(Underlined, 90, 100, true), ENDSWITH(COLON) + -> MARK(Headline)};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>The matching condition of this rule element is given with the type “Paragraph”, thus the rule takes a look at all Paragraph annotations. +The rule matches only if the three conditions, separated by commas, are fulfilled. +The first condition “CONTAINS(Bold, 90, 100, true)” states that 90%-100% of the matched paragraph annotation should also be annotated with annotations of the type “Bold”. +The boolean parameter “true” indicates that the amount of Bold annotations should be calculated relative to the matched annotation. +The two numbers “90,100” are, therefore, interpreted as percentages. +The exact calculation of the coverage is dependent on the tokenization of the document and is ignored for now. +The second condition “CONTAINS(Underlined, 90, 100, true)” consequently states that at least 90% of the paragraph should also be covered by annotations of the type “Underlined”. +The third condition “ENDSWITH(COLON)” finally forces the Paragraph annotation to end with a colon. +It is only fulfilled if there is an annotation of the type “COLON” whose end offset is equal to the end offset of the matched Paragraph annotation.</p> +</div> +<div class="paragraph"> +<p>The readability and maintainability of rules do not improve when more conditions are added. +One of the strengths of the UIMA Ruta language is that it provides different approaches to solve an annotation task. 
+The next two examples introduce actions for transformation-based rules.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>Headline{-CONTAINS(W) -> UNMARK(Headline)};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>This rule consists of one condition and one action. +The condition “-CONTAINS(W)” is negated (indicated by the character “-”), and is therefore only fulfilled if there are no annotations of the type “W” within the bounds of the matched Headline annotation. +The action “UNMARK(Headline)” removes the matched Headline annotation. +Put simply, headlines that contain no words at all are not headlines.</p> +</div> +<div class="paragraph"> +<p>The next rule does not remove an annotation, but changes its offsets depending on the context.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>Headline{-> SHIFT(Headline, 1, 2)} COLON;</code></pre> +</div> +</div> +<div class="paragraph"> +<p>Here, the action “SHIFT(Headline, 1, 2)” expands the matched Headline annotation to the next colon if that Headline annotation is followed by a COLON annotation.</p> +</div> +<div class="paragraph"> +<p>UIMA Ruta rules can contain arbitrary conditions and actions, which is illustrated by the next example.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE Month, Year, Date; +ANY{INLIST(MonthsList) -> MARK(Month), MARK(Date,1,3)} + PERIOD? NUM{REGEXP(".{2,4}") -> MARK(Year)};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>This rule consists of three rule elements. +The first one matches on every token whose covered text occurs in a word list named “MonthsList”. +The second rule element is optional and does not need to be fulfilled, which is indicated by the quantifier “?”. 
+The last rule element matches on numbers that fulfill the condition “REGEXP(".{2,4}")” and are therefore between two and four characters long. +If this rule successfully matches on a text passage, then its three actions are executed: An annotation of the type “Month” is created for the first rule element, an annotation of the type “Year” is created for the last rule element and an annotation of the type “Date” is created for the span of all three rule elements. +If the word list contains the correct entries, then this rule matches on strings like “Dec. 2004”, “July 85” or “11.2008” and creates the corresponding annotations.</p> +</div> +<div class="paragraph"> +<p>After introducing the composition of rule elements, the default matching strategy is examined. +The two rules in the next example create an annotation for a sequence of arbitrary tokens and differ only in one condition.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE Text1, Text2; +ANY+{ -> MARK(Text1)}; +ANY+{-PARTOF(Text2) -> MARK(Text2)};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>The first rule matches on each occurrence of an arbitrary token and continues this until the end of the document is reached. +This is caused by the greedy quantifier “+”. +Note that this rule considers each occurrence of a token and is therefore executed for each token, resulting in many overlapping annotations. +This behavior is illustrated with an example: When applied to the document “Peter works for Frank”, the rule creates four annotations with the covered texts “Peter works for Frank”, “works for Frank”, “for Frank” and “Frank”. +The rule first tries to match on the token “Peter” and continues its matching. 
+Then, it tries to match on the token “works” and continues its matching, and so on.</p> +</div> +<div class="paragraph"> +<p>In this example, the second rule creates only one annotation, which covers the complete document. +This is caused by the additional condition “-PARTOF(Text2)”. +The PARTOF condition is fulfilled if the matched annotation is located within an annotation of the given type, or, put simply, if the matched annotation is part of an annotation of the type “Text2”. +When applied to the document “Peter works for Frank”, the rule matches on the first token “Peter”, continues its match and creates an annotation of the type “Text2” for the complete document. +Then it tries to match on the second token “works”, but fails, because this token is already part of a Text2 annotation.</p> +</div> +<div class="paragraph"> +<p>UIMA Ruta rules can be used not only to create or modify annotations, but also to create features for annotations. +The next example defines and assigns a relation of employment by storing the given annotations as feature values.</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE Annotation EmplRelation + (Employee employeeRef, Employer employerRef); +Sentence{CONTAINS(EmploymentIndicator) -> CREATE(EmplRelation, + "employeeRef" = Employee, "employerRef" = Employer)};</code></pre> +</div> +</div> +<div class="paragraph"> +<p>The first statement of this example is a declaration that defines a new type of annotation named “EmplRelation”. +This annotation has two features: One feature with the name “employeeRef” of the type “Employee” and one feature with the name “employerRef” of the type “Employer”. 
+If the parent type is Annotation, then it can be omitted, resulting in the following declaration:</p> +</div> +<div class="listingblock"> +<div class="content"> +<pre class="highlight"><code>DECLARE EmplRelation (Employee employeeRef, Employer employerRef);</code></pre> +</div> +</div> +<div class="paragraph"> +<p>The second statement of the example, which is a simple rule, creates one annotation of the type “EmplRelation” for each Sentence annotation that contains at least one annotation of the type “EmploymentIndicator”. +In addition to creating an annotation, the CREATE action also assigns an annotation of the type “Employee”, which needs to be located within the span of the matched sentence, to the feature “employeeRef” and an Employer annotation to the feature “employerRef”. +The annotations mentioned in this example need to be present in advance.</p> +</div> +<div class="paragraph"> +<p>In order to refer to annotations and, for example, assign them to some features, special kinds of local and global variables can be utilized.
[... 7268 lines stripped ...]