Git commit b447f326e0cd58027893ac4eef16d7778026178e by T.C. Hollingsworth. Committed on 23/02/2014 at 02:14. Pushed by hollingsworth into branch 'frameworks'.
merge highlighting doc into development chapter M +0 -3 doc/kate/index.docbook M +964 -0 doc/katepart/development.docbook D +0 -968 doc/katepart/highlighting.docbook http://commits.kde.org/kate/b447f326e0cd58027893ac4eef16d7778026178e diff --git a/doc/kate/index.docbook b/doc/kate/index.docbook index 3af330b..a953f81 100644 --- a/doc/kate/index.docbook +++ b/doc/kate/index.docbook @@ -16,7 +16,6 @@ <!ENTITY plugins-chapter SYSTEM "plugins.docbook"> <!ENTITY development-chapter SYSTEM "development.docbook"> <!ENTITY vi-chapter SYSTEM "../katepart/vi.docbook"> - <!ENTITY regexp-appendix SYSTEM "../katepart/regular-expressions.docbook"> <!ENTITY % addindex "IGNORE"> <!ENTITY % English "INCLUDE"><!-- change language only here --> ]> @@ -260,8 +259,6 @@ documentation</para></listitem> </chapter> -&highlighting-appendix; - &regexp-appendix; <appendix id="installation"> diff --git a/doc/katepart/development.docbook b/doc/katepart/development.docbook index 69b0090..4381821 100644 --- a/doc/katepart/development.docbook +++ b/doc/katepart/development.docbook @@ -20,6 +20,970 @@ and share your enhancements with the world!</para> </sect1> +<sect1 id="highlight"> +<title>Working with Syntax Highlighting</title> + +<sect2 id="highlight-overview"> + +<title>Overview</title> + +<para>Syntax Highlighting is what makes the editor automatically +display text in different styles/colors, depending on the function of +the string in relation to the purpose of the file. In program source +code for example, control statements may be rendered bold, while data +types and comments get different colors from the rest of the +text. 
This greatly enhances the readability of the text, and thus +helps the author to be more efficient and productive.</para> + +<mediaobject> +<imageobject><imagedata format="PNG" fileref="highlighted.png"/></imageobject> +<textobject><phrase>A Perl function, rendered with syntax +highlighting.</phrase></textobject> +<caption><para>A Perl function, rendered with syntax highlighting.</para> +</caption> +</mediaobject> + +<mediaobject> +<imageobject><imagedata format="PNG" fileref="unhighlighted.png"/></imageobject> +<textobject><phrase>The same Perl function, without +highlighting.</phrase></textobject> +<caption><para>The same Perl function, without highlighting.</para></caption> +</mediaobject> + +<para>Of the two examples, which is easiest to read?</para> + +<para>&kappname; comes with a flexible, configurable and capable system +for doing syntax highlighting, and the standard distribution provides +definitions for a wide range of programming, scripting and markup +languages and other text file formats. In addition you can +provide your own definitions in simple &XML; files.</para> + +<para>&kappname; will automatically detect the right syntax rules when you +open a file, based on the &MIME; Type of the file, determined by its +extension, or, if it has none, the contents. 
Should you experience a +bad choice, you can manually set the syntax to use from the +<menuchoice><guimenu>Tools</guimenu><guisubmenu>Highlighting +</guisubmenu></menuchoice> menu.</para> + +<para>The styles and colors used by each syntax highlight definition +can be configured using the <link +linkend="prefcolors-highlighting-text-styles">Highlighting Text Styles</link> tab of the +<link linkend="config-dialog">Config Dialog</link>, while the &MIME; Types and +file extensions it should be used for are handled by the <link +linkend="pref-open-save-modes-filetypes">Modes & Filetypes</link> +tab.</para> + +<note> +<para>Syntax highlighting is there to enhance the readability of +correct text, but you cannot trust it to validate your text. Marking +text for syntax is difficult depending on the format you are using, +and in some cases the authors of the syntax rules will be proud if 98% +of text gets correctly rendered, though most often you need a rare +style to see the incorrect 2%.</para> +</note> + +<tip> +<para>You can download updated or additional syntax highlight +definitions from the &kappname; website by clicking the +<guibutton>Download Highlighting Files...</guibutton> button in the <link +linkend="pref-open-save-modes-filetypes">Modes & Filetypes</link> tab of the <link +linkend="config-dialog">Config Dialog</link>.</para> +</tip> + +</sect2> + +<sect2 id="katehighlight-system"> + +<title>The &kappname; Syntax Highlight System</title> + +<para>This section will discuss the &kappname; syntax highlighting +mechanism in more detail. It is for you if you want to know about +it, or if you want to change or create syntax definitions.</para> + +<sect3 id="katehighlight-howitworks"> + +<title>How it Works</title> + +<para>Whenever you open a file, one of the first things the &kappname; +editor does is detect which syntax definition to use for the +file. 
While reading the text of the file, and while you type away in +it, the syntax highlighting system will analyze the text using the +rules defined by the syntax definition and mark in it where different +contexts and styles begin and end.</para> + +<para>When you type in the document, the new text is analyzed and marked on the +fly, so that if you delete a character that is marked as the beginning or end +of a context, the style of surrounding text changes accordingly.</para> + +<para>The syntax definitions used by the &kappname; Syntax Highlighting System are +&XML; files, containing +<itemizedlist> +<listitem><para>Rules for detecting the role of text, organized into context blocks</para></listitem> +<listitem><para>Keyword lists</para></listitem> +<listitem><para>Style Item definitions</para></listitem> +</itemizedlist> +</para> + +<para>When analyzing the text, the detection rules are evaluated in +the order in which they are defined, and if the beginning of the +current string matches a rule, the related context is used. The start +point in the text is moved to the final point at which that rule +matched and a new loop of the rules begins, starting in the context +set by the matched rule.</para> + +</sect3> + +<sect3 id="highlight-system-rules"> +<title>Rules</title> + +<para>The detection rules are the heart of the highlighting detection +system. A rule is a string, character or <link +linkend="regular-expressions">regular expression</link> against which +to match the text being analyzed. It contains information about which +style to use for the matching part of the text. It may switch the +working context of the system either to an explicitly mentioned +context or to the previous context used by the text.</para> + +<para>Rules are organized in context groups. A context group is used +for main text concepts within the format, for example quoted text +strings or comment blocks in program source code. 
This ensures that +the highlighting system does not need to loop through all rules when +it is not necessary, and that some character sequences in the text can +be treated differently depending on the current context. +</para> + +<para>Contexts may be generated dynamically to allow the usage of instance +specific data in rules.</para> + +</sect3> + +<sect3 id="highlight-context-styles-keywords"> +<title>Context Styles and Keywords</title> + +<para>In some programming languages, integer numbers are treated +differently from floating point ones by the compiler (the program that +converts the source code to a binary executable), and there may be +characters having a special meaning within a quoted string. In such +cases, it makes sense to render them differently from the surroundings +so that they are easy to identify while reading the text. So even if +they do not represent special contexts, they may be seen as such by +the syntax highlighting system, so that they can be marked for +different rendering.</para> + +<para>A syntax definition may contain as many styles as required to +cover the concepts of the format it is used for.</para> + +<para>In many formats, there are lists of words that represent a +specific concept. For example, in programming languages, control +statements are one concept, data type names another, and built in +functions of the language a third. The &kappname; Syntax Highlighting +System can use such lists to detect and mark words in the text to +emphasize concepts of the text formats.</para> + +</sect3> + +<sect3 id="kate-highlight-system-default-styles"> +<title>Default Styles</title> + +<para>If you open a C++ source file, a &Java; source file and an +<acronym>HTML</acronym> document in &kappname;, you will see that even +though the formats are different, and thus different words are chosen +for special treatment, the colors used are the same. 
This is because +&kappname; has a predefined list of Default Styles which are employed by +the individual syntax definitions.</para> + +<para>This makes it easy to recognize similar concepts in different +text formats. For example, comments are present in almost any +programming, scripting or markup language, and when they are rendered +using the same style in all languages, you do not have to stop and +think to identify them within the text.</para> + +<tip> +<para>All styles in a syntax definition use one of the default +styles. A few syntax definitions use more styles than there are +defaults, so if you use a format often, it may be worth launching the +configuration dialog to see if some concepts use the same +style. For example, there is only one default style for strings, but as +the Perl programming language operates with two types of strings, you +can enhance the highlighting by configuring those to be slightly +different. All <link linkend="kate-highlight-default-styles">available default styles</link> +will be explained later.</para> +</tip> + +</sect3> + +</sect2> + +<sect2 id="katehighlight-xml-format"> +<title>The Highlight Definition &XML; Format</title> + +<sect3> +<title>Overview</title> + +<para>This section is an overview of the Highlight Definition &XML; +format. Based on a small example it will describe the main components +and their meaning and usage. The next section will go into detail with +the highlight detection rules.</para> + +<para>The formal definition, also known as the <acronym>DTD</acronym>, is stored +in the file <filename>language.dtd</filename> which should be +installed on your system in the folder +<filename>$<envar>KDEDIR</envar>/share/apps/katepart/syntax</filename>. 
+</para> + +<variablelist> +<title>Main sections of &kappname; Highlight Definition files</title> + +<varlistentry> +<term>A highlighting file contains a header that sets the XML version and the doctype:</term> +<listitem> +<programlisting> +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE language SYSTEM "language.dtd"> +</programlisting> +</listitem> +</varlistentry> + +<varlistentry> +<term>The root of the definition file is the element <userinput>language</userinput>. +Available attributes are:</term> + +<listitem> +<para>Required attributes:</para> +<para><userinput>name</userinput> sets the name of the language. It appears in the menus and dialogs afterwards.</para> +<para><userinput>section</userinput> specifies the category.</para> +<para><userinput>extensions</userinput> defines file extensions, such as "*.cpp;*.h"</para> + +<para>Optional attributes:</para> +<para><userinput>mimetype</userinput> associates files &MIME; type.</para> +<para><userinput>version</userinput> specifies the current version of the definition file.</para> +<para><userinput>kateversion</userinput> specifies the latest supported &kappname; version.</para> +<para><userinput>casesensitive</userinput> defines, whether the keywords are case sensitive or not.</para> +<para><userinput>priority</userinput> is necessary if another highlight definition file uses the same extensions. 
The higher priority will win.</para> +<para><userinput>author</userinput> contains the name of the author and his email-address.</para> +<para><userinput>license</userinput> contains the license, usually LGPL, Artistic, GPL or others.</para> +<para><userinput>hidden</userinput> defines whether the name should appear in &kappname;'s menus.</para> +<para>So the next line may look like this:</para> +<programlisting> +<language name="C++" version="1.00" kateversion="2.4" section="Sources" extensions="*.cpp;*.h" /> +</programlisting> +</listitem> +</varlistentry> + + +<varlistentry> +<term>Next comes the <userinput>highlighting</userinput> element, which +contains the optional element <userinput>list</userinput> and the required +elements <userinput>contexts</userinput> and <userinput>itemDatas</userinput>.</term> +<listitem> +<para><userinput>list</userinput> elements contain a list of keywords. In +this case the keywords are <emphasis>class</emphasis> and <emphasis>const</emphasis>. +You can add as many lists as you need.</para> +<para>The <userinput>contexts</userinput> element contains all contexts. +The first context is by default the start of the highlighting. There are +two rules in the context <emphasis>Normal Text</emphasis>, which match +the list of keywords with the name <emphasis>somename</emphasis> and a +rule that detects a quote and switches the context to <emphasis>string</emphasis>. +To learn more about rules read the next chapter.</para> +<para>The third part is the <userinput>itemDatas</userinput> element. It +contains all color and font styles needed by the contexts and rules. +In this example, the <userinput>itemData</userinput> <emphasis>Normal Text</emphasis>, +<emphasis>String</emphasis> and <emphasis>Keyword</emphasis> are used. 
+</para> +<programlisting> + <highlighting> + <list name="somename"> + <item> class </item> + <item> const </item> + </list> + <contexts> + <context attribute="Normal Text" lineEndContext="#pop" name="Normal Text" > + <keyword attribute="Keyword" context="#stay" String="somename" /> + <DetectChar attribute="String" context="string" char="&quot;" /> + </context> + <context attribute="String" lineEndContext="#stay" name="string" > + <DetectChar attribute="String" context="#pop" char="&quot;" /> + </context> + </contexts> + <itemDatas> + <itemData name="Normal Text" defStyleNum="dsNormal" /> + <itemData name="Keyword" defStyleNum="dsKeyword" /> + <itemData name="String" defStyleNum="dsString" /> + </itemDatas> + </highlighting> +</programlisting> +</listitem> +</varlistentry> + +<varlistentry> +<term>The last part of a highlight definition is the optional +<userinput>general</userinput> section. It may contain information +about keywords, code folding, comments and indentation.</term> + +<listitem> +<para>The <userinput>comment</userinput> section defines with what +string a single line comment is introduced. You also can define a +multiline comment using <emphasis>multiLine</emphasis> with the +additional attribute <emphasis>end</emphasis>. This is used if the +user presses the corresponding shortcut for <emphasis>comment/uncomment</emphasis>.</para> +<para>The <userinput>keywords</userinput> section defines whether +keyword lists are case sensitive or not. 
Other attributes will be +explained later.</para> +<programlisting> + <general> + <comments> + <comment name="singleLine" start="#"/> + </comments> + <keywords casesensitive="1"/> + </general> +</language> +</programlisting> +</listitem> +</varlistentry> + +</variablelist> + + +</sect3> + +<sect3 id="kate-highlight-sections"> +<title>The Sections in Detail</title> +<para>This part will describe all available attributes for contexts, +itemDatas, keywords, comments, code folding and indentation.</para> + +<variablelist> +<varlistentry> +<term>The element <userinput>context</userinput> belongs in the group +<userinput>contexts</userinput>. A context itself defines context specific +rules such as what should happen if the highlight system reaches the end of a +line. Available attributes are:</term> + + +<listitem> +<para><userinput>name</userinput> states the context name. Rules will use this name +to specify the context to switch to if the rule matches.</para> +<para><userinput>lineEndContext</userinput> defines the context the highlight +system switches to if it reaches the end of a line. This may either be a name +of another context, <userinput>#stay</userinput> to not switch the context +(&eg;. do nothing) or <userinput>#pop</userinput> which will cause it to leave this +context. It is possible to use for example <userinput>#pop#pop#pop</userinput> +to pop three times, or even <userinput>#pop#pop!OtherContext</userinput> to pop +two times and switch to the context named <userinput>OtherContext</userinput>.</para> +<para><userinput>lineEmptyContext</userinput> defines the context if an empty +line is encountered. Default: #stay.</para> +<para><userinput>fallthrough</userinput> defines if the highlight system switches +to the context specified in fallthroughContext if no rule matches. 
+Default: <emphasis>false</emphasis>.</para> +<para><userinput>fallthroughContext</userinput> specifies the next context +if no rule matches.</para> +<para><userinput>dynamic</userinput> if <emphasis>true</emphasis>, the context +remembers strings/placeholders saved by dynamic rules. This is needed for HERE +documents for example. Default: <emphasis>false</emphasis>.</para> +</listitem> +</varlistentry> + + +<varlistentry> +<term>The element <userinput>itemData</userinput> is in the group +<userinput>itemDatas</userinput>. It defines the font style and colors. +So it is possible to define your own styles and colors. However, we +recommend you stick to the default styles if possible so that the user +will always see the same colors used in different languages. Though, +sometimes there is no other way and it is necessary to change color +and font attributes. The attributes name and defStyleNum are required, +the others are optional. Available attributes are:</term> + +<listitem> +<para><userinput>name</userinput> sets the name of the itemData. +Contexts and rules will use this name in their attribute +<emphasis>attribute</emphasis> to reference an itemData.</para> +<para><userinput>defStyleNum</userinput> defines which default style to use. +Available default styles are explained in detail later.</para> +<para><userinput>color</userinput> defines a color. 
Valid formats are +'#rrggbb' or '#rgb'.</para> +<para><userinput>selColor</userinput> defines the selection color.</para> +<para><userinput>italic</userinput> if <emphasis>true</emphasis>, the text will be italic.</para> +<para><userinput>bold</userinput> if <emphasis>true</emphasis>, the text will be bold.</para> +<para><userinput>underline</userinput> if <emphasis>true</emphasis>, the text will be underlined.</para> +<para><userinput>strikeout</userinput> if <emphasis>true</emphasis>, the text will be struck out.</para> +<para><userinput>spellChecking</userinput> if <emphasis>true</emphasis>, the text will be spellchecked.</para> +</listitem> +</varlistentry> + + +<varlistentry> +<term>The element <userinput>keywords</userinput> in the group +<userinput>general</userinput> defines keyword properties. Available attributes are:</term> + +<listitem> +<para><userinput>casesensitive</userinput> may be <emphasis>true</emphasis> +or <emphasis>false</emphasis>. If <emphasis>true</emphasis>, all keywords +are matched case sensitively.</para> +<para><userinput>weakDeliminator</userinput> is a list of characters that +do not act as word delimiters. For example, the dot <userinput>'.'</userinput> +is a word delimiter. 
Assume a keyword in a <userinput>list</userinput> contains +a dot, it will only match if you specify the dot as a weak delimiter.</para> +<para><userinput>additionalDeliminator</userinput> defines additional delimiters.</para> +<para><userinput>wordWrapDeliminator</userinput> defines characters after which a +line wrap may occur.</para> +<para>Default delimiters and word wrap delimiters are the characters +<userinput>.():!+,-<=>%&*/;?[]^{|}~\</userinput>, space (<userinput>' '</userinput>) +and tabulator (<userinput>'\t'</userinput>).</para> +</listitem> +</varlistentry> + + +<varlistentry> +<term>The element <userinput>comment</userinput> in the group +<userinput>comments</userinput> defines comment properties which are used +for <menuchoice><guimenu>Tools</guimenu><guimenuitem>Comment</guimenuitem></menuchoice> and +<menuchoice><guimenu>Tools</guimenu><guimenuitem>Uncomment</guimenuitem></menuchoice>. +Available attributes are:</term> + +<listitem> +<para><userinput>name</userinput> is either <emphasis>singleLine</emphasis> +or <emphasis>multiLine</emphasis>. If you choose <emphasis>multiLine</emphasis> +the attributes <emphasis>end</emphasis> and <emphasis>region</emphasis> are +required.</para> +<para><userinput>start</userinput> defines the string used to start a comment. +In C++ this would be "/*".</para> +<para><userinput>end</userinput> defines the string used to close a comment. +In C++ this would be "*/".</para> +<para><userinput>region</userinput> should be the name of the foldable +multiline comment. Assume you have <emphasis>beginRegion="Comment"</emphasis> +... <emphasis>endRegion="Comment"</emphasis> in your rules, you should use +<emphasis>region="Comment"</emphasis>. This way uncomment works even if you +do not select all the text of the multiline comment. 
The cursor only must be +in the multiline comment.</para> +</listitem> +</varlistentry> + + +<varlistentry> +<term>The element <userinput>folding</userinput> in the group +<userinput>general</userinput> defines code folding properties. +Available attributes are:</term> + +<listitem> +<para><userinput>indentationsensitive</userinput> if <emphasis>true</emphasis>, the code folding markers +will be added indentation based, as in the scripting language Python. Usually you +do not need to set it, as it defaults to <emphasis>false</emphasis>.</para> +</listitem> +</varlistentry> + + +<varlistentry> +<term>The element <userinput>indentation</userinput> in the group +<userinput>general</userinput> defines which indenter will be used. However, we strongly +recommend you omit this element, as the indenter usually will be set by either defining +a File Type or by adding a mode line to the text file. If you specify an indenter though, +you will force a specific indentation on the user, which he might not like at all. +Available attributes are:</term> + +<listitem> +<para><userinput>mode</userinput> is the name of the indenter. 
Available indenters +right now are: <emphasis>normal, cstyle, haskell, lilypond, lisp, python, ruby</emphasis> +and <emphasis>xml</emphasis>.</para> +</listitem> +</varlistentry> + + +</variablelist> + + +</sect3> + +<sect3 id="kate-highlight-default-styles"> +<title>Available Default Styles</title> +<para>Default Styles were <link linkend="kate-highlight-system-default-styles">already explained</link>, +as a short summary: Default styles are predefined font and color styles.</para> +<variablelist> +<varlistentry> +<term>So here are only the list of available default styles:</term> +<listitem> +<para><userinput>dsNormal</userinput>, used for normal text.</para> +<para><userinput>dsKeyword</userinput>, used for keywords.</para> +<para><userinput>dsDataType</userinput>, used for data types.</para> +<para><userinput>dsDecVal</userinput>, used for decimal values.</para> +<para><userinput>dsBaseN</userinput>, used for values with a base other than 10.</para> +<para><userinput>dsFloat</userinput>, used for float values.</para> +<para><userinput>dsChar</userinput>, used for a character.</para> +<para><userinput>dsString</userinput>, used for strings.</para> +<para><userinput>dsComment</userinput>, used for comments.</para> +<para><userinput>dsOthers</userinput>, used for 'other' things.</para> +<para><userinput>dsAlert</userinput>, used for warning messages.</para> +<para><userinput>dsFunction</userinput>, used for function calls.</para> +<para><userinput>dsRegionMarker</userinput>, used for region markers.</para> +<para><userinput>dsError</userinput>, used for error highlighting and wrong syntax.</para> +</listitem> +</varlistentry> +</variablelist> + +</sect3> + +</sect2> + +<sect2 id="kate-highlight-rules-detailled"> +<title>Highlight Detection Rules</title> + +<para>This section describes the syntax detection rules.</para> + +<para>Each rule can match zero or more characters at the beginning of +the string they are tested against. 
If the rule matches, the matching +characters are assigned the style or <emphasis>attribute</emphasis> +defined by the rule, and a rule may ask that the current context is +switched.</para> + +<para>A rule looks like this:</para> + +<programlisting><RuleName attribute="(identifier)" context="(identifier)" [rule specific attributes] /></programlisting> + +<para>The <emphasis>attribute</emphasis> identifies the style to use +for matched characters by name, and the <emphasis>context</emphasis> +identifies the context to use from here.</para> + +<para>The <emphasis>context</emphasis> can be identified by:</para> + +<itemizedlist> +<listitem> +<para>An <emphasis>identifier</emphasis>, which is the name of the other +context.</para> +</listitem> +<listitem> +<para>An <emphasis>order</emphasis> telling the engine to stay in the +current context (<userinput>#stay</userinput>), or to pop back to a +previous context used in the string (<userinput>#pop</userinput>).</para> +<para>To go back more steps, the #pop keyword can be repeated: +<userinput>#pop#pop#pop</userinput></para> +</listitem> +<listitem> +<para>An <emphasis>order</emphasis> followed by an exclamation mark +(<emphasis>!</emphasis>) and an <emphasis>identifier</emphasis>, which +will make the engine first follow the order and then switch to the +other context, e.g. <userinput>#pop#pop!OtherContext</userinput>.</para> +</listitem> +</itemizedlist> + +<para>Some rules can have <emphasis>child rules</emphasis> which are +then evaluated only if the parent rule matched. The entire matched +string will be given the attribute defined by the parent rule. A rule +with child rules looks like this:</para> + +<programlisting> +<RuleName (attributes)> + <ChildRuleName (attributes) /> + ... 
+</RuleName> +</programlisting> + + +<para>Rule specific attributes varies and are described in the +following sections.</para> + + +<itemizedlist> +<title>Common attributes</title> +<para>All rules have the following attributes in common and are +available whenever <userinput>(common attributes)</userinput> appears. +<emphasis>attribute</emphasis> and <emphasis>context</emphasis> +are required attributes, all others are optional. +</para> + +<listitem> +<para><emphasis>attribute</emphasis>: An attribute maps to a defined <emphasis>itemData</emphasis>.</para> +</listitem> +<listitem> +<para><emphasis>context</emphasis>: Specify the context to which the highlighting system switches if the rule matches.</para> +</listitem> +<listitem> +<para><emphasis>beginRegion</emphasis>: Start a code folding block. Default: unset.</para> +</listitem> +<listitem> +<para><emphasis>endRegion</emphasis>: Close a code folding block. Default: unset.</para> +</listitem> +<listitem> +<para><emphasis>lookAhead</emphasis>: If <emphasis>true</emphasis>, the +highlighting system will not process the matches length. +Default: <emphasis>false</emphasis>.</para> +</listitem> +<listitem> +<para><emphasis>firstNonSpace</emphasis>: Match only, if the string is +the first non-whitespace in the line. Default: <emphasis>false</emphasis>.</para> +</listitem> +<listitem> +<para><emphasis>column</emphasis>: Match only, if the column matches. Default: unset.</para> +</listitem> +</itemizedlist> + +<itemizedlist> +<title>Dynamic rules</title> +<para>Some rules allow the optional attribute <userinput>dynamic</userinput> +of type boolean that defaults to <emphasis>false</emphasis>. If dynamic is +<emphasis>true</emphasis>, a rule can use placeholders representing the text +matched by a <emphasis>regular expression</emphasis> rule that switched to the +current context in its <userinput>string</userinput> or +<userinput>char</userinput> attributes. 
In a <userinput>string</userinput>, +the placeholder <replaceable>%N</replaceable> (where N is a number) will be +replaced with the corresponding capture <replaceable>N</replaceable> +from the calling regular expression. In a +<userinput>char</userinput> the placeholder must be a number +<replaceable>N</replaceable> and it will be replaced with the first character of +the corresponding capture <replaceable>N</replaceable> from the calling regular +expression. Whenever a rule allows this attribute it will contain a +<emphasis>(dynamic)</emphasis>.</para> + +<listitem> +<para><emphasis>dynamic</emphasis>: may be <emphasis>(true|false)</emphasis>.</para> +</listitem> +</itemizedlist> + +<sect3 id="highlighting-rules-in-detail"> +<title>The Rules in Detail</title> + +<variablelist> +<varlistentry> +<term>DetectChar</term> +<listitem> +<para>Detect a single specific character. Commonly used for example to +find the ends of quoted strings.</para> +<programlisting><DetectChar char="(character)" (common attributes) (dynamic) /></programlisting> +<para>The <userinput>char</userinput> attribute defines the character +to match.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>Detect2Chars</term> +<listitem> +<para>Detect two specific characters in a defined order.</para> +<programlisting><Detect2Chars char="(character)" char1="(character)" (common attributes) (dynamic) /></programlisting> +<para>The <userinput>char</userinput> attribute defines the first character to match, +<userinput>char1</userinput> the second.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>AnyChar</term> +<listitem> +<para>Detect one character of a set of specified characters.</para> +<programlisting><AnyChar String="(string)" (common attributes) /></programlisting> +<para>The <userinput>String</userinput> attribute defines the set of +characters.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>StringDetect</term> +<listitem> +<para>Detect an exact string.</para> 
+<programlisting><StringDetect String="(string)" [insensitive="true|false"] (common attributes) (dynamic) /></programlisting> +<para>The <userinput>String</userinput> attribute defines the string +to match. The <userinput>insensitive</userinput> attribute defaults to +<emphasis>false</emphasis> and is passed to the string comparison +function. If the value is <emphasis>true</emphasis> insensitive +comparing is used.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>WordDetect</term> +<listitem> +<para>Detect an exact string but additionally require word boundaries +such as a dot <userinput>'.'</userinput> or a whitespace on the beginning +and the end of the word. Think of <userinput>\b<string>\b</userinput> +in terms of a regular expression, but it is faster than the rule <userinput>RegExpr</userinput>.</para> +<programlisting><WordDetect String="(string)" [insensitive="true|false"] (common attributes) (dynamic) /></programlisting> +<para>The <userinput>String</userinput> attribute defines the string +to match. The <userinput>insensitive</userinput> attribute defaults to +<emphasis>false</emphasis> and is passed to the string comparison +function. 
If the value is <emphasis>true</emphasis> insensitive +comparing is used.</para> +<para>Since: Kate 3.5 (KDE 4.5)</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>RegExpr</term> +<listitem> +<para>Matches against a regular expression.</para> +<programlisting><RegExpr String="(string)" [insensitive="true|false"] [minimal="true|false"] (common attributes) (dynamic) /></programlisting> +<para>The <userinput>String</userinput> attribute defines the regular +expression.</para> +<para><userinput>insensitive</userinput> defaults to +<emphasis>false</emphasis> and is passed to the regular expression +engine.</para> +<para><userinput>minimal</userinput> defaults to +<emphasis>false</emphasis> and is passed to the regular expression +engine.</para> +<para>Because the rules are always matched against the beginning of +the current string, a regular expression starting with a caret +(<literal>^</literal>) indicates that the rule should only be +matched against the start of a line.</para> +<para>See <link linkend="regular-expressions">Regular Expressions</link> +for more information on those.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>keyword</term> +<listitem> +<para>Detect a keyword from a specified list.</para> +<programlisting><keyword String="(list name)" (common attributes) /></programlisting> +<para>The <userinput>String</userinput> attribute identifies the +keyword list by name. A list with that name must exist.</para> +<para>The highlighting system processes keyword rules in a very optimized way. 
+This makes it an absolute necessity that any keywords to be matched are +surrounded by defined delimiters, either implied (the default delimiters), +or explicitly specified within the <emphasis>additionalDeliminator</emphasis> +property of the <emphasis>keywords</emphasis> tag.</para> +<para>If a keyword to be matched contains a delimiter character, this +character must be added to the <emphasis>weakDeliminator</emphasis> +property of the <emphasis>keywords</emphasis> tag. This character will then +lose its delimiter property in all <emphasis>keyword</emphasis> rules.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>Int</term> +<listitem> +<para>Detect an integer number.</para> +<para><programlisting><Int (common attributes) (dynamic) /></programlisting></para> +<para>This rule has no specific attributes. Child rules are typically +used to detect combinations of <userinput>L</userinput> and +<userinput>U</userinput> after the number, indicating the integer type +in program code. Actually all rules are allowed as child rules, although +the <acronym>DTD</acronym> only allows the child rule <userinput>StringDetect</userinput>.</para> +<para>The following example matches integer numbers followed by the character 'L'. +<programlisting> +<Int attribute="Decimal" context="#stay" > + <StringDetect attribute="Decimal" context="#stay" String="L" insensitive="true"/> +</Int> +</programlisting></para> + +</listitem> +</varlistentry> + +<varlistentry> +<term>Float</term> +<listitem> +<para>Detect a floating point number.</para> +<para><programlisting><Float (common attributes) /></programlisting></para> +<para>This rule has no specific attributes. 
<userinput>AnyChar</userinput> is +allowed as a child rule and typically used to detect combinations; see rule +<userinput>Int</userinput> for reference.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>HlCOct</term> +<listitem> +<para>Detect an octal number representation.</para> +<para><programlisting><HlCOct (common attributes) /></programlisting></para> +<para>This rule has no specific attributes.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>HlCHex</term> +<listitem> +<para>Detect a hexadecimal number representation.</para> +<para><programlisting><HlCHex (common attributes) /></programlisting></para> +<para>This rule has no specific attributes.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>HlCStringChar</term> +<listitem> +<para>Detect an escaped character.</para> +<para><programlisting><HlCStringChar (common attributes) /></programlisting></para> +<para>This rule has no specific attributes.</para> + +<para>It matches literal representations of characters commonly used in +program code, for example <userinput>\n</userinput> +(newline) or <userinput>\t</userinput> (TAB).</para> + +<para>The following characters will match if they follow a backslash +(<literal>\</literal>): +<userinput>abefnrtv"'?\</userinput>. Additionally, escaped +hexadecimal numbers such as <userinput>\xff</userinput> and +escaped octal numbers such as <userinput>\033</userinput> will +match.</para> + +</listitem> +</varlistentry> + +<varlistentry> +<term>HlCChar</term> +<listitem> +<para>Detect a C character.</para> +<para><programlisting><HlCChar (common attributes) /></programlisting></para> +<para>This rule has no specific attributes.</para> + +<para>It matches C characters enclosed in ticks (Example: <userinput>'c'</userinput>). +The ticks may enclose a simple character or an escaped character. 
+See HlCStringChar for matched escaped character sequences.</para> + +</listitem> +</varlistentry> + +<varlistentry> +<term>RangeDetect</term> +<listitem> +<para>Detect a string with defined start and end characters.</para> +<programlisting><RangeDetect char="(character)" char1="(character)" (common attributes) /></programlisting> +<para><userinput>char</userinput> defines the character starting the range, +<userinput>char1</userinput> the character ending the range.</para> +<para>Useful to detect for example small quoted strings and the like, but +note that since the highlighting engine works on one line at a time, this +will not find strings spanning over a line break.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>LineContinue</term> +<listitem> +<para>Matches a specified char at the end of a line.</para> +<programlisting><LineContinue (common attributes) [char="\"] /></programlisting> +<para><userinput>char</userinput> optional character to match, default is +backslash (<userinput>'\'</userinput>). New since KDE 4.13.</para> +<para>This rule is useful for switching context at end of line. 
This is needed for + example in C/C++ to continue macros or strings.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>IncludeRules</term> +<listitem> +<para>Include rules from another context or language/file.</para> +<programlisting><IncludeRules context="contextlink" [includeAttrib="true|false"] /></programlisting> + +<para>The <userinput>context</userinput> attribute defines which context to include.</para> +<para>If it is a simple string it includes all defined rules into the current context, example: +<programlisting><IncludeRules context="anotherContext" /></programlisting></para> + +<para> +If the string contains a <userinput>##</userinput> the highlight system +will look for a context from another language definition with the given name, +for example +<programlisting><IncludeRules context="String##C++" /></programlisting> +would include the context <emphasis>String</emphasis> from the <emphasis>C++</emphasis> +highlighting definition.</para> +<para>If <userinput>includeAttrib</userinput> attribute is +<emphasis>true</emphasis>, change the destination attribute to the one of +the source. This is required to make, for example, commenting work, if text +matched by the included context is a different highlight from the host +context. +</para> +</listitem> +</varlistentry> + + +<varlistentry> +<term>DetectSpaces</term> +<listitem> +<para>Detect whitespaces.</para> +<programlisting><DetectSpaces (common attributes) /></programlisting> + +<para>This rule has no specific attributes.</para> +<para>Use this rule if you know that there can be several whitespaces ahead, +for example in the beginning of indented lines. 
This rule will skip all +whitespace at once, instead of testing multiple rules and skipping one at a +time due to no match.</para> +</listitem> +</varlistentry> + + +<varlistentry> +<term>DetectIdentifier</term> +<listitem> +<para>Detect identifier strings (as a regular expression: [a-zA-Z_][a-zA-Z0-9_]*).</para> +<programlisting><DetectIdentifier (common attributes) /></programlisting> + +<para>This rule has no specific attributes.</para> +<para>Use this rule to skip a string of word characters at once, rather than +testing with multiple rules and skipping one at a time due to no match.</para> +</listitem> +</varlistentry> + +</variablelist> +</sect3> + +<sect3> +<title>Tips & Tricks</title> + +<itemizedlist> +<para>Once you have understood how the context switching works it will be +easy to write highlight definitions. However, you should carefully check which +rule you choose in which situation. Regular expressions are very powerful, but +they are slow compared to the other rules. So you may consider the following +tips. +</para> + +<listitem> +<para>If you only match two characters use <userinput>Detect2Chars</userinput> +instead of <userinput>StringDetect</userinput>. Likewise, use +<userinput>DetectChar</userinput> to match a single character.</para> +</listitem> +<listitem> +<para>Regular expressions are easy to use but often there is another much +faster way to achieve the same result. Suppose you only want to match +the character <userinput>'#'</userinput> if it is the first non-whitespace character in the +line. A regular expression based solution would look like this: +<programlisting><RegExpr attribute="Macro" context="macro" String="^\s*#" /></programlisting> +You can achieve the same much faster by using: +<programlisting><DetectChar attribute="Macro" context="macro" char="#" firstNonSpace="true" /></programlisting> +If you want to match the regular expression <userinput>'^#'</userinput> you +can still use <userinput>DetectChar</userinput> with the attribute <userinput>column="0"</userinput>. 
+The attribute <userinput>column</userinput> counts characters, so a tab is only one character. +</para> +</listitem> +<listitem> +<para>You can switch contexts without processing characters. Assume that you +want to switch context when you meet the string <userinput>*/</userinput>, but +need to process that string in the next context. The rule below will match, and +the <userinput>lookAhead</userinput> attribute will cause the highlighter to +keep the matched string for the next context. +<programlisting><Detect2Chars attribute="Comment" context="#pop" char="*" char1="/" lookAhead="true" /></programlisting> +</para> +</listitem> +<listitem> +<para>Use <userinput>DetectSpaces</userinput> if you know that many whitespace characters occur.</para> +</listitem> +<listitem> +<para>Use <userinput>DetectIdentifier</userinput> instead of the regular expression <userinput>'[a-zA-Z_]\w*'</userinput>.</para> +</listitem> +<listitem> +<para>Use default styles whenever you can. This way the user will find a familiar environment.</para> +</listitem> +<listitem> +<para>Look into other XML files to see how other people implement tricky rules.</para> +</listitem> +<listitem> +<para>You can validate every XML file by using the command +<command>xmllint --dtdvalid language.dtd mySyntax.xml</command>.</para> +</listitem> +<listitem> +<para>If you repeat a complex regular expression very often you can use +<emphasis>ENTITIES</emphasis>. 
Example:</para> +<programlisting> +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE language SYSTEM "language.dtd" +[ + <!ENTITY myref "[A-Za-z_:][\w.:_-]*"> +]> +</programlisting> +<para>Now you can use <emphasis>&myref;</emphasis> instead of the regular +expression.</para> +</listitem> +</itemizedlist> +</sect3> + +</sect2> + +</sect1> + + <sect1 id="dev-scripting"> <title>Scripting with JavaScript</title> diff --git a/doc/katepart/highlighting.docbook b/doc/katepart/highlighting.docbook deleted file mode 100644 index ee6f67b..0000000 --- a/doc/katepart/highlighting.docbook +++ /dev/null @@ -1,968 +0,0 @@ -<appendix id="highlight"> -<appendixinfo> -<authorgroup> -<author><personname><firstname></firstname></personname></author> -<!-- TRANS:ROLES_OF_TRANSLATORS --> -</authorgroup> -</appendixinfo> -<title>Working with Syntax Highlighting</title> - -<sect1 id="highlight-overview"> - -<title>Overview</title> - -<para>Syntax Highlighting is what makes the editor automatically -display text in different styles/colors, depending on the function of -the string in relation to the purpose of the file. In program source -code for example, control statements may be rendered bold, while data -types and comments get different colors from the rest of the -text. 
This greatly enhances the readability of the text, and thus -helps the author to be more efficient and productive.</para> - -<mediaobject> -<imageobject><imagedata format="PNG" fileref="highlighted.png"/></imageobject> -<textobject><phrase>A Perl function, rendered with syntax -highlighting.</phrase></textobject> -<caption><para>A Perl function, rendered with syntax highlighting.</para> -</caption> -</mediaobject> - -<mediaobject> -<imageobject><imagedata format="PNG" fileref="unhighlighted.png"/></imageobject> -<textobject><phrase>The same Perl function, without -highlighting.</phrase></textobject> -<caption><para>The same Perl function, without highlighting.</para></caption> -</mediaobject> - -<para>Of the two examples, which is easiest to read?</para> - -<para>&kappname; comes with a flexible, configurable and capable system -for doing syntax highlighting, and the standard distribution provides -definitions for a wide range of programming, scripting and markup -languages and other text file formats. In addition you can -provide your own definitions in simple &XML; files.</para> - -<para>&kappname; will automatically detect the right syntax rules when you -open a file, based on the &MIME; Type of the file, determined by its -extension, or, if it has none, the contents. 
Should you experience a -bad choice, you can manually set the syntax to use from the -<menuchoice><guimenu>Tools</guimenu><guisubmenu>Highlighting -</guisubmenu></menuchoice> menu.</para> - -<para>The styles and colors used by each syntax highlight definition -can be configured using the <link -linkend="prefcolors-highlighting-text-styles">Highlighting Text Styles</link> tab of the -<link linkend="config-dialog">Config Dialog</link>, while the &MIME; Types and -file extensions it should be used for are handled by the <link -linkend="pref-open-save-modes-filetypes">Modes & Filetypes</link> -tab.</para> - -<note> -<para>Syntax highlighting is there to enhance the readability of -correct text, but you cannot trust it to validate your text. Marking -text for syntax is difficult depending on the format you are using, -and in some cases the authors of the syntax rules will be proud if 98% -of text gets correctly rendered, though most often you need a rare -style to see the incorrect 2%.</para> -</note> - -<tip> -<para>You can download updated or additional syntax highlight -definitions from the &kappname; website by clicking the -<guibutton>Download Highlighting Files...</guibutton> button in the <link -linkend="pref-open-save-modes-filetypes">Modes & Filetypes</link> tab of the <link -linkend="config-dialog">Config Dialog</link>.</para> -</tip> - -</sect1> - -<sect1 id="katehighlight-system"> - -<title>The &kappname; Syntax Highlight System</title> - -<para>This section will discuss the &kappname; syntax highlighting -mechanism in more detail. It is for you if you want to know about -it, or if you want to change or create syntax definitions.</para> - -<sect2 id="katehighlight-howitworks"> - -<title>How it Works</title> - -<para>Whenever you open a file, one of the first things the &kappname; -editor does is detect which syntax definition to use for the -file. 
While reading the text of the file, and while you type away in -it, the syntax highlighting system will analyze the text using the -rules defined by the syntax definition and mark in it where different -contexts and styles begin and end.</para> - -<para>When you type in the document, the new text is analyzed and marked on the -fly, so that if you delete a character that is marked as the beginning or end -of a context, the style of surrounding text changes accordingly.</para> - -<para>The syntax definitions used by the &kappname; Syntax Highlighting System are -&XML; files, containing -<itemizedlist> -<listitem><para>Rules for detecting the role of text, organized into context blocks</para></listitem> -<listitem><para>Keyword lists</para></listitem> -<listitem><para>Style Item definitions</para></listitem> -</itemizedlist> -</para> - -<para>When analyzing the text, the detection rules are evaluated in -the order in which they are defined, and if the beginning of the -current string matches a rule, the related context is used. The start -point in the text is moved to the final point at which that rule -matched and a new loop of the rules begins, starting in the context -set by the matched rule.</para> - -</sect2> - -<sect2 id="highlight-system-rules"> -<title>Rules</title> - -<para>The detection rules are the heart of the highlighting detection -system. A rule is a string, character or <link -linkend="regular-expressions">regular expression</link> against which -to match the text being analyzed. It contains information about which -style to use for the matching part of the text. It may switch the -working context of the system either to an explicitly mentioned -context or to the previous context used by the text.</para> - -<para>Rules are organized in context groups. A context group is used -for main text concepts within the format, for example quoted text -strings or comment blocks in program source code. 
This ensures that -the highlighting system does not need to loop through all rules when -it is not necessary, and that some character sequences in the text can -be treated differently depending on the current context. -</para> - -<para>Contexts may be generated dynamically to allow the usage of instance -specific data in rules.</para> - -</sect2> - -<sect2 id="highlight-context-styles-keywords"> -<title>Context Styles and Keywords</title> - -<para>In some programming languages, integer numbers are treated -differently from floating point ones by the compiler (the program that -converts the source code to a binary executable), and there may be -characters having a special meaning within a quoted string. In such -cases, it makes sense to render them differently from the surroundings -so that they are easy to identify while reading the text. So even if -they do not represent special contexts, they may be seen as such by -the syntax highlighting system, so that they can be marked for -different rendering.</para> - -<para>A syntax definition may contain as many styles as required to -cover the concepts of the format it is used for.</para> - -<para>In many formats, there are lists of words that represent a -specific concept. For example, in programming languages, control -statements are one concept, data type names another, and built in -functions of the language a third. The &kappname; Syntax Highlighting -System can use such lists to detect and mark words in the text to -emphasize concepts of the text formats.</para> - -</sect2> - -<sect2 id="kate-highlight-system-default-styles"> -<title>Default Styles</title> - -<para>If you open a C++ source file, a &Java; source file and an -<acronym>HTML</acronym> document in &kappname;, you will see that even -though the formats are different, and thus different words are chosen -for special treatment, the colors used are the same. 
This is because -&kappname; has a predefined list of Default Styles which are employed by -the individual syntax definitions.</para> - -<para>This makes it easy to recognize similar concepts in different -text formats. For example, comments are present in almost any -programming, scripting or markup language, and when they are rendered -using the same style in all languages, you do not have to stop and -think to identify them within the text.</para> - -<tip> -<para>All styles in a syntax definition use one of the default -styles. A few syntax definitions use more styles than there are -defaults, so if you use a format often, it may be worth launching the -configuration dialog to see if some concepts use the same -style. For example, there is only one default style for strings, but as -the Perl programming language operates with two types of strings, you -can enhance the highlighting by configuring those to be slightly -different. All <link linkend="kate-highlight-default-styles">available default styles</link> -will be explained later.</para> -</tip> - -</sect2> - -</sect1> - -<sect1 id="katehighlight-xml-format"> -<title>The Highlight Definition &XML; Format</title> - -<sect2> -<title>Overview</title> - -<para>This section is an overview of the Highlight Definition &XML; -format. Based on a small example it will describe the main components -and their meaning and usage. The next section will go into detail with -the highlight detection rules.</para> - -<para>The formal definition, also known as the <acronym>DTD</acronym>, is stored -in the file <filename>language.dtd</filename> which should be -installed on your system in the folder -<filename>$<envar>KDEDIR</envar>/share/apps/katepart/syntax</filename>. 
-</para> - -<variablelist> -<title>Main sections of &kappname; Highlight Definition files</title> - -<varlistentry> -<term>A highlighting file contains a header that sets the XML version and the doctype:</term> -<listitem> -<programlisting> -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE language SYSTEM "language.dtd"> -</programlisting> -</listitem> -</varlistentry> - -<varlistentry> -<term>The root of the definition file is the element <userinput>language</userinput>. -Available attributes are:</term> - -<listitem> -<para>Required attributes:</para> -<para><userinput>name</userinput> sets the name of the language. It appears in the menus and dialogs afterwards.</para> -<para><userinput>section</userinput> specifies the category.</para> -<para><userinput>extensions</userinput> defines file extensions, such as "*.cpp;*.h"</para> - -<para>Optional attributes:</para> -<para><userinput>mimetype</userinput> associates the definition with files by &MIME; type.</para> -<para><userinput>version</userinput> specifies the current version of the definition file.</para> -<para><userinput>kateversion</userinput> specifies the latest supported &kappname; version.</para> -<para><userinput>casesensitive</userinput> defines whether the keywords are case sensitive or not.</para> -<para><userinput>priority</userinput> is necessary if another highlight definition file uses the same extensions. 
The definition with the higher priority wins.</para> -<para><userinput>author</userinput> contains the name of the author and their email address.</para> -<para><userinput>license</userinput> contains the license, usually LGPL, Artistic, GPL or others.</para> -<para><userinput>hidden</userinput> defines whether the name should appear in &kappname;'s menus.</para> -<para>So the next line may look like this:</para> -<programlisting> -<language name="C++" version="1.00" kateversion="2.4" section="Sources" extensions="*.cpp;*.h"> -</programlisting> -</listitem> -</varlistentry> - - -<varlistentry> -<term>Next comes the <userinput>highlighting</userinput> element, which -contains the optional element <userinput>list</userinput> and the required -elements <userinput>contexts</userinput> and <userinput>itemDatas</userinput>.</term> -<listitem> -<para><userinput>list</userinput> elements contain a list of keywords. In -this case the keywords are <emphasis>class</emphasis> and <emphasis>const</emphasis>. -You can add as many lists as you need.</para> -<para>The <userinput>contexts</userinput> element contains all contexts. -The first context is by default the start of the highlighting. There are -two rules in the context <emphasis>Normal Text</emphasis>: one matches -the list of keywords with the name <emphasis>somename</emphasis>, and the other -detects a quote and switches the context to <emphasis>string</emphasis>. -To learn more about rules read the next chapter.</para> -<para>The third part is the <userinput>itemDatas</userinput> element. It -contains all color and font styles needed by the contexts and rules. -In this example, the <userinput>itemData</userinput> <emphasis>Normal Text</emphasis>, -<emphasis>String</emphasis> and <emphasis>Keyword</emphasis> are used. 
-</para> -<programlisting> - <highlighting> - <list name="somename"> - <item> class </item> - <item> const </item> - </list> - <contexts> - <context attribute="Normal Text" lineEndContext="#pop" name="Normal Text" > - <keyword attribute="Keyword" context="#stay" String="somename" /> - <DetectChar attribute="String" context="string" char="&quot;" /> - </context> - <context attribute="String" lineEndContext="#stay" name="string" > - <DetectChar attribute="String" context="#pop" char="&quot;" /> - </context> - </contexts> - <itemDatas> - <itemData name="Normal Text" defStyleNum="dsNormal" /> - <itemData name="Keyword" defStyleNum="dsKeyword" /> - <itemData name="String" defStyleNum="dsString" /> - </itemDatas> - </highlighting> -</programlisting> -</listitem> -</varlistentry> - -<varlistentry> -<term>The last part of a highlight definition is the optional -<userinput>general</userinput> section. It may contain information -about keywords, code folding, comments and indentation.</term> - -<listitem> -<para>The <userinput>comment</userinput> section defines with what -string a single line comment is introduced. You also can define a -multiline comment using <emphasis>multiLine</emphasis> with the -additional attribute <emphasis>end</emphasis>. This is used if the -user presses the corresponding shortcut for <emphasis>comment/uncomment</emphasis>.</para> -<para>The <userinput>keywords</userinput> section defines whether -keyword lists are case sensitive or not. 
Other attributes will be -explained later.</para> -<programlisting> - <general> - <comments> - <comment name="singleLine" start="#"/> - </comments> - <keywords casesensitive="1"/> - </general> -</language> -</programlisting> -</listitem> -</varlistentry> - -</variablelist> - - -</sect2> - -<sect2 id="kate-highlight-sections"> -<title>The Sections in Detail</title> -<para>This part will describe all available attributes for contexts, -itemDatas, keywords, comments, code folding and indentation.</para> - -<variablelist> -<varlistentry> -<term>The element <userinput>context</userinput> belongs in the group -<userinput>contexts</userinput>. A context itself defines context specific -rules such as what should happen if the highlight system reaches the end of a -line. Available attributes are:</term> - - -<listitem> -<para><userinput>name</userinput> states the context name. Rules will use this name -to specify the context to switch to if the rule matches.</para> -<para><userinput>lineEndContext</userinput> defines the context the highlight -system switches to if it reaches the end of a line. This may either be the name -of another context, <userinput>#stay</userinput> to not switch the context -(i.e. do nothing) or <userinput>#pop</userinput> which will cause it to leave this -context. It is possible to use for example <userinput>#pop#pop#pop</userinput> -to pop three times, or even <userinput>#pop#pop!OtherContext</userinput> to pop -two times and switch to the context named <userinput>OtherContext</userinput>.</para> -<para><userinput>lineEmptyContext</userinput> defines the context if an empty -line is encountered. Default: #stay.</para> -<para><userinput>fallthrough</userinput> defines whether the highlight system switches -to the context specified in fallthroughContext if no rule matches. 
-Default: <emphasis>false</emphasis>.</para> -<para><userinput>fallthroughContext</userinput> specifies the next context -if no rule matches.</para> -<para><userinput>dynamic</userinput> if <emphasis>true</emphasis>, the context -remembers strings/placeholders saved by dynamic rules. This is needed for HERE -documents for example. Default: <emphasis>false</emphasis>.</para> -</listitem> -</varlistentry> - - -<varlistentry> -<term>The element <userinput>itemData</userinput> is in the group -<userinput>itemDatas</userinput>. It defines the font style and colors. -So it is possible to define your own styles and colors. However, we -recommend you stick to the default styles if possible so that the user -will always see the same colors used in different languages. Though, -sometimes there is no other way and it is necessary to change color -and font attributes. The attributes name and defStyleNum are required, -the others are optional. Available attributes are:</term> - -<listitem> -<para><userinput>name</userinput> sets the name of the itemData. -Contexts and rules will use this name in their attribute -<emphasis>attribute</emphasis> to reference an itemData.</para> -<para><userinput>defStyleNum</userinput> defines which default style to use. -Available default styles are explained in detail later.</para> -<para><userinput>color</userinput> defines a color. 
Valid formats are -'#rrggbb' or '#rgb'.</para> -<para><userinput>selColor</userinput> defines the selection color.</para> -<para><userinput>italic</userinput> if <emphasis>true</emphasis>, the text will be italic.</para> -<para><userinput>bold</userinput> if <emphasis>true</emphasis>, the text will be bold.</para> -<para><userinput>underline</userinput> if <emphasis>true</emphasis>, the text will be underlined.</para> -<para><userinput>strikeout</userinput> if <emphasis>true</emphasis>, the text will be struck out.</para> -<para><userinput>spellChecking</userinput> if <emphasis>true</emphasis>, the text will be spellchecked.</para> -</listitem> -</varlistentry> - - -<varlistentry> -<term>The element <userinput>keywords</userinput> in the group -<userinput>general</userinput> defines keyword properties. Available attributes are:</term> - -<listitem> -<para><userinput>casesensitive</userinput> may be <emphasis>true</emphasis> -or <emphasis>false</emphasis>. If <emphasis>true</emphasis>, all keywords -are matched case sensitively.</para> -<para><userinput>weakDeliminator</userinput> is a list of characters that -do not act as word delimiters. For example, the dot <userinput>'.'</userinput> -is a word delimiter. 
Assume a keyword in a <userinput>list</userinput> contains -a dot, it will only match if you specify the dot as a weak delimiter.</para> -<para><userinput>additionalDeliminator</userinput> defines additional delimiters.</para> -<para><userinput>wordWrapDeliminator</userinput> defines characters after which a -line wrap may occur.</para> -<para>Default delimiters and word wrap delimiters are the characters -<userinput>.():!+,-<=>%&*/;?[]^{|}~\</userinput>, space (<userinput>' '</userinput>) -and tabulator (<userinput>'\t'</userinput>).</para> -</listitem> -</varlistentry> - - -<varlistentry> -<term>The element <userinput>comment</userinput> in the group -<userinput>comments</userinput> defines comment properties which are used -for <menuchoice><guimenu>Tools</guimenu><guimenuitem>Comment</guimenuitem></menuchoice> and -<menuchoice><guimenu>Tools</guimenu><guimenuitem>Uncomment</guimenuitem></menuchoice>. -Available attributes are:</term> - -<listitem> -<para><userinput>name</userinput> is either <emphasis>singleLine</emphasis> -or <emphasis>multiLine</emphasis>. If you choose <emphasis>multiLine</emphasis> -the attributes <emphasis>end</emphasis> and <emphasis>region</emphasis> are -required.</para> -<para><userinput>start</userinput> defines the string used to start a comment. -In C++ this would be "/*".</para> -<para><userinput>end</userinput> defines the string used to close a comment. -In C++ this would be "*/".</para> -<para><userinput>region</userinput> should be the name of the foldable -multiline comment. Assume you have <emphasis>beginRegion="Comment"</emphasis> -... <emphasis>endRegion="Comment"</emphasis> in your rules, you should use -<emphasis>region="Comment"</emphasis>. This way uncomment works even if you -do not select all the text of the multiline comment. 
The cursor only needs to be -in the multiline comment.</para> -</listitem> -</varlistentry> - - -<varlistentry> -<term>The element <userinput>folding</userinput> in the group -<userinput>general</userinput> defines code folding properties. -Available attributes are:</term> - -<listitem> -<para><userinput>indentationsensitive</userinput> if <emphasis>true</emphasis>, the code folding markers -will be added based on indentation, as in the scripting language Python. Usually you -do not need to set it, as it defaults to <emphasis>false</emphasis>.</para> -</listitem> -</varlistentry> - - -<varlistentry> -<term>The element <userinput>indentation</userinput> in the group -<userinput>general</userinput> defines which indenter will be used. However, we strongly -recommend you omit this element, as the indenter usually will be set by either defining -a File Type or by adding a mode line to the text file. If you specify an indenter though, -you will force a specific indentation on the user, which they might not like at all. -Available attributes are:</term> - -<listitem> -<para><userinput>mode</userinput> is the name of the indenter. 
Available indenters -right now are: <emphasis>normal, cstyle, haskell, lilypond, lisp, python, ruby</emphasis> -and <emphasis>xml</emphasis>.</para> -</listitem> -</varlistentry> - - -</variablelist> - - -</sect2> - -<sect2 id="kate-highlight-default-styles"> -<title>Available Default Styles</title> -<para>Default Styles were <link linkend="kate-highlight-system-default-styles">already explained</link>. -As a short summary: default styles are predefined font and color styles.</para> -<variablelist> -<varlistentry> -<term>Here is the list of available default styles:</term> -<listitem> -<para><userinput>dsNormal</userinput>, used for normal text.</para> -<para><userinput>dsKeyword</userinput>, used for keywords.</para> -<para><userinput>dsDataType</userinput>, used for data types.</para> -<para><userinput>dsDecVal</userinput>, used for decimal values.</para> -<para><userinput>dsBaseN</userinput>, used for values with a base other than 10.</para> -<para><userinput>dsFloat</userinput>, used for float values.</para> -<para><userinput>dsChar</userinput>, used for a character.</para> -<para><userinput>dsString</userinput>, used for strings.</para> -<para><userinput>dsComment</userinput>, used for comments.</para> -<para><userinput>dsOthers</userinput>, used for 'other' things.</para> -<para><userinput>dsAlert</userinput>, used for warning messages.</para> -<para><userinput>dsFunction</userinput>, used for function calls.</para> -<para><userinput>dsRegionMarker</userinput>, used for region markers.</para> -<para><userinput>dsError</userinput>, used for error highlighting and wrong syntax.</para> -</listitem> -</varlistentry> -</variablelist> - -</sect2> - -</sect1> - -<sect1 id="kate-highlight-rules-detailled"> -<title>Highlight Detection Rules</title> - -<para>This section describes the syntax detection rules.</para> - -<para>Each rule can match zero or more characters at the beginning of -the string it is tested against. 
If the rule matches, the matching -characters are assigned the style or <emphasis>attribute</emphasis> -defined by the rule, and a rule may ask that the current context is -switched.</para> - -<para>A rule looks like this:</para> - -<programlisting><RuleName attribute="(identifier)" context="(identifier)" [rule specific attributes] /></programlisting> - -<para>The <emphasis>attribute</emphasis> identifies the style to use -for matched characters by name, and the <emphasis>context</emphasis> -identifies the context to use from here.</para> - -<para>The <emphasis>context</emphasis> can be identified by:</para> - -<itemizedlist> -<listitem> -<para>An <emphasis>identifier</emphasis>, which is the name of the other -context.</para> -</listitem> -<listitem> -<para>An <emphasis>order</emphasis> telling the engine to stay in the -current context (<userinput>#stay</userinput>), or to pop back to a -previous context used in the string (<userinput>#pop</userinput>).</para> -<para>To go back more steps, the #pop keyword can be repeated: -<userinput>#pop#pop#pop</userinput></para> -</listitem> -<listitem> -<para>An <emphasis>order</emphasis> followed by an exclamation mark -(<emphasis>!</emphasis>) and an <emphasis>identifier</emphasis>, which -will make the engine first follow the order and then switch to the -other context, e.g. <userinput>#pop#pop!OtherContext</userinput>.</para> -</listitem> -</itemizedlist> - -<para>Some rules can have <emphasis>child rules</emphasis> which are -then evaluated only if the parent rule matched. The entire matched -string will be given the attribute defined by the parent rule. A rule -with child rules looks like this:</para> - -<programlisting> -<RuleName (attributes)> - <ChildRuleName (attributes) /> - ... 
-</RuleName> -</programlisting> - - -<para>Rule specific attributes vary and are described in the -following sections.</para> - - -<itemizedlist> -<title>Common attributes</title> -<para>All rules have the following attributes in common and are -available whenever <userinput>(common attributes)</userinput> appears. -<emphasis>attribute</emphasis> and <emphasis>context</emphasis> -are required attributes; all others are optional. -</para> - -<listitem> -<para><emphasis>attribute</emphasis>: An attribute maps to a defined <emphasis>itemData</emphasis>.</para> -</listitem> -<listitem> -<para><emphasis>context</emphasis>: Specify the context to which the highlighting system switches if the rule matches.</para> -</listitem> -<listitem> -<para><emphasis>beginRegion</emphasis>: Start a code folding block. Default: unset.</para> -</listitem> -<listitem> -<para><emphasis>endRegion</emphasis>: Close a code folding block. Default: unset.</para> -</listitem> -<listitem> -<para><emphasis>lookAhead</emphasis>: If <emphasis>true</emphasis>, the -highlighting system will not process the match's length, so the matched text stays available to the next context. -Default: <emphasis>false</emphasis>.</para> -</listitem> -<listitem> -<para><emphasis>firstNonSpace</emphasis>: Match only if the string is -the first non-whitespace in the line. Default: <emphasis>false</emphasis>.</para> -</listitem> -<listitem> -<para><emphasis>column</emphasis>: Match only if the column matches. Default: unset.</para> -</listitem> -</itemizedlist> - -<itemizedlist> -<title>Dynamic rules</title> -<para>Some rules allow the optional attribute <userinput>dynamic</userinput> -of type boolean that defaults to <emphasis>false</emphasis>. If dynamic is -<emphasis>true</emphasis>, a rule can use placeholders representing the text -matched by a <emphasis>regular expression</emphasis> rule that switched to the -current context in its <userinput>string</userinput> or -<userinput>char</userinput> attributes.
In a <userinput>string</userinput>, -the placeholder <replaceable>%N</replaceable> (where N is a number) will be -replaced with the corresponding capture <replaceable>N</replaceable> -from the calling regular expression. In a -<userinput>char</userinput> the placeholder must be a number -<replaceable>N</replaceable> and it will be replaced with the first character of -the corresponding capture <replaceable>N</replaceable> from the calling regular -expression. Whenever a rule allows this attribute it will contain a -<emphasis>(dynamic)</emphasis>.</para> - -<listitem> -<para><emphasis>dynamic</emphasis>: may be <emphasis>(true|false)</emphasis>.</para> -</listitem> -</itemizedlist> - -<sect2 id="highlighting-rules-in-detail"> -<title>The Rules in Detail</title> - -<variablelist> -<varlistentry> -<term>DetectChar</term> -<listitem> -<para>Detect a single specific character. Commonly used for example to -find the ends of quoted strings.</para> -<programlisting><DetectChar char="(character)" (common attributes) (dynamic) /></programlisting> -<para>The <userinput>char</userinput> attribute defines the character -to match.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>Detect2Chars</term> -<listitem> -<para>Detect two specific characters in a defined order.</para> -<programlisting><Detect2Chars char="(character)" char1="(character)" (common attributes) (dynamic) /></programlisting> -<para>The <userinput>char</userinput> attribute defines the first character to match, -<userinput>char1</userinput> the second.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>AnyChar</term> -<listitem> -<para>Detect one character of a set of specified characters.</para> -<programlisting><AnyChar String="(string)" (common attributes) /></programlisting> -<para>The <userinput>String</userinput> attribute defines the set of -characters.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>StringDetect</term> -<listitem> -<para>Detect an exact string.</para> 
-<programlisting><StringDetect String="(string)" [insensitive="true|false"] (common attributes) (dynamic) /></programlisting> -<para>The <userinput>String</userinput> attribute defines the string -to match. The <userinput>insensitive</userinput> attribute defaults to -<emphasis>false</emphasis> and is passed to the string comparison -function. If the value is <emphasis>true</emphasis>, case-insensitive -comparison is used.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>WordDetect</term> -<listitem> -<para>Detect an exact string but additionally require word boundaries -such as a dot <userinput>'.'</userinput> or whitespace at the beginning -and the end of the word. Think of <userinput>\b<string>\b</userinput> -in terms of a regular expression, but it is faster than the rule <userinput>RegExpr</userinput>.</para> -<programlisting><WordDetect String="(string)" [insensitive="true|false"] (common attributes) (dynamic) /></programlisting> -<para>The <userinput>String</userinput> attribute defines the string -to match. The <userinput>insensitive</userinput> attribute defaults to -<emphasis>false</emphasis> and is passed to the string comparison -function.
If the value is <emphasis>true</emphasis>, case-insensitive -comparison is used.</para> -<para>Since: Kate 3.5 (KDE 4.5)</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>RegExpr</term> -<listitem> -<para>Matches against a regular expression.</para> -<programlisting><RegExpr String="(string)" [insensitive="true|false"] [minimal="true|false"] (common attributes) (dynamic) /></programlisting> -<para>The <userinput>String</userinput> attribute defines the regular -expression.</para> -<para><userinput>insensitive</userinput> defaults to -<emphasis>false</emphasis> and is passed to the regular expression -engine.</para> -<para><userinput>minimal</userinput> defaults to -<emphasis>false</emphasis> and is passed to the regular expression -engine.</para> -<para>Because the rules are always matched against the beginning of -the current string, a regular expression starting with a caret -(<literal>^</literal>) indicates that the rule should only be -matched against the start of a line.</para> -<para>See <link linkend="regular-expressions">Regular Expressions</link> -for more information on those.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>keyword</term> -<listitem> -<para>Detect a keyword from a specified list.</para> -<programlisting><keyword String="(list name)" (common attributes) /></programlisting> -<para>The <userinput>String</userinput> attribute identifies the -keyword list by name. A list with that name must exist.</para> -<para>The highlighting system processes keyword rules in a very optimized way.
-This makes it absolutely necessary that any keyword to be matched is -surrounded by defined delimiters, either implied (the default delimiters), -or explicitly specified within the <emphasis>additionalDeliminator</emphasis> -property of the <emphasis>keywords</emphasis> tag.</para> -<para>If a keyword to be matched contains a delimiter character, this -respective character must be added to the <emphasis>weakDeliminator</emphasis> -property of the <emphasis>keywords</emphasis> tag. This character will then -lose its delimiter property in all <emphasis>keyword</emphasis> rules.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>Int</term> -<listitem> -<para>Detect an integer number.</para> -<para><programlisting><Int (common attributes) (dynamic) /></programlisting></para> -<para>This rule has no specific attributes. Child rules are typically -used to detect combinations of <userinput>L</userinput> and -<userinput>U</userinput> after the number, indicating the integer type -in program code. In practice all rules are allowed as child rules, although -the <acronym>DTD</acronym> only allows the child rule <userinput>StringDetect</userinput>.</para> -<para>The following example matches integer numbers followed by the character 'L'. -<programlisting> -<Int attribute="Decimal" context="#stay" > - <StringDetect attribute="Decimal" context="#stay" String="L" insensitive="true"/> -</Int> -</programlisting></para> - -</listitem> -</varlistentry> - -<varlistentry> -<term>Float</term> -<listitem> -<para>Detect a floating point number.</para> -<para><programlisting><Float (common attributes) /></programlisting></para> -<para>This rule has no specific attributes.
<userinput>AnyChar</userinput> is -allowed as a child rule and typically used to detect combinations, see rule -<userinput>Int</userinput> for reference.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>HlCOct</term> -<listitem> -<para>Detect an octal number representation.</para> -<para><programlisting><HlCOct (common attributes) /></programlisting></para> -<para>This rule has no specific attributes.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>HlCHex</term> -<listitem> -<para>Detect a hexadecimal number representation.</para> -<para><programlisting><HlCHex (common attributes) /></programlisting></para> -<para>This rule has no specific attributes.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>HlCStringChar</term> -<listitem> -<para>Detect an escaped character.</para> -<para><programlisting><HlCStringChar (common attributes) /></programlisting></para> -<para>This rule has no specific attributes.</para> - -<para>It matches literal representations of characters commonly used in -program code, for example <userinput>\n</userinput> -(newline) or <userinput>\t</userinput> (TAB).</para> - -<para>The following characters will match if they follow a backslash -(<literal>\</literal>): -<userinput>abefnrtv"'?\</userinput>. Additionally, escaped -hexadecimal numbers such as <userinput>\xff</userinput> and -escaped octal numbers, for example <userinput>\033</userinput>, will -match.</para> - -</listitem> -</varlistentry> - -<varlistentry> -<term>HlCChar</term> -<listitem> -<para>Detect a C character.</para> -<para><programlisting><HlCChar (common attributes) /></programlisting></para> -<para>This rule has no specific attributes.</para> - -<para>It matches C characters enclosed in ticks (Example: <userinput>'c'</userinput>). -The ticks may enclose a simple character or an escaped character.
-See HlCStringChar for matched escaped character sequences.</para> - -</listitem> -</varlistentry> - -<varlistentry> -<term>RangeDetect</term> -<listitem> -<para>Detect a string with defined start and end characters.</para> -<programlisting><RangeDetect char="(character)" char1="(character)" (common attributes) /></programlisting> -<para><userinput>char</userinput> defines the character starting the range, -<userinput>char1</userinput> the character ending the range.</para> -<para>Useful to detect for example small quoted strings and the like, but -note that since the highlighting engine works on one line at a time, this -will not find strings spanning over a line break.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>LineContinue</term> -<listitem> -<para>Matches a specified char at the end of a line.</para> -<programlisting><LineContinue (common attributes) [char="\"] /></programlisting> -<para><userinput>char</userinput> optional character to match, default is -backslash (<userinput>'\'</userinput>). New since KDE 4.13.</para> -<para>This rule is useful for switching context at end of line. 
This is needed for - example in C/C++ to continue macros or strings.</para> -</listitem> -</varlistentry> - -<varlistentry> -<term>IncludeRules</term> -<listitem> -<para>Include rules from another context or language/file.</para> -<programlisting><IncludeRules context="contextlink" [includeAttrib="true|false"] /></programlisting> - -<para>The <userinput>context</userinput> attribute defines which context to include.</para> -<para>If it is a simple string it includes all defined rules into the current context, example: -<programlisting><IncludeRules context="anotherContext" /></programlisting></para> - -<para> -If the string contains a <userinput>##</userinput> the highlight system -will look for a context from another language definition with the given name, -for example -<programlisting><IncludeRules context="String##C++" /></programlisting> -would include the context <emphasis>String</emphasis> from the <emphasis>C++</emphasis> -highlighting definition.</para> -<para>If <userinput>includeAttrib</userinput> attribute is -<emphasis>true</emphasis>, change the destination attribute to the one of -the source. This is required to make, for example, commenting work, if text -matched by the included context is a different highlight from the host -context. -</para> -</listitem> -</varlistentry> - - -<varlistentry> -<term>DetectSpaces</term> -<listitem> -<para>Detect whitespaces.</para> -<programlisting><DetectSpaces (common attributes) /></programlisting> - -<para>This rule has no specific attributes.</para> -<para>Use this rule if you know that there can be several whitespaces ahead, -for example in the beginning of indented lines. 
This rule will skip all -whitespace at once, instead of testing multiple rules and skipping one at a -time due to no match.</para> -</listitem> -</varlistentry> - - -<varlistentry> -<term>DetectIdentifier</term> -<listitem> -<para>Detect identifier strings (as a regular expression: [a-zA-Z_][a-zA-Z0-9_]*).</para> -<programlisting><DetectIdentifier (common attributes) /></programlisting> - -<para>This rule has no specific attributes.</para> -<para>Use this rule to skip a string of word characters at once, rather than -testing with multiple rules and skipping one at a time due to no match.</para> -</listitem> -</varlistentry> - -</variablelist> -</sect2> - -<sect2> -<title>Tips & Tricks</title> - -<itemizedlist> -<para>Once you have understood how the context switching works it will be -easy to write highlight definitions. You should, however, carefully consider which -rule to choose in which situation. Regular expressions are very powerful, but -they are slow compared to the other rules, so consider the following -tips. -</para> - -<listitem> -<para>If you only match two characters use <userinput>Detect2Chars</userinput> -instead of <userinput>StringDetect</userinput>. The same applies to -<userinput>DetectChar</userinput> for a single character.</para> -</listitem> -<listitem> -<para>Regular expressions are easy to use but often there is another much -faster way to achieve the same result. Suppose you only want to match -the character <userinput>'#'</userinput> if it is the first non-whitespace character in the -line. A regular expression based solution would look like this: -<programlisting><RegExpr attribute="Macro" context="macro" String="^\s*#" /></programlisting> -You can achieve the same much faster by using: -<programlisting><DetectChar attribute="Macro" context="macro" char="#" firstNonSpace="true" /></programlisting> -If you want to match the regular expression <userinput>'^#'</userinput> you -can still use <userinput>DetectChar</userinput> with the attribute <userinput>column="0"</userinput>.
-The attribute <userinput>column</userinput> counts characters, so a tab counts as only one character. -</para> -</listitem> -<listitem> -<para>You can switch contexts without processing characters. Assume that you -want to switch context when you meet the string <userinput>*/</userinput>, but -need to process that string in the next context. The rule below will match, and -the <userinput>lookAhead</userinput> attribute will cause the highlighter to -keep the matched string for the next context. -<programlisting><Detect2Chars attribute="Comment" context="#pop" char="*" char1="/" lookAhead="true" /></programlisting> -</para> -</listitem> -<listitem> -<para>Use <userinput>DetectSpaces</userinput> if you know that many whitespace characters occur.</para> -</listitem> -<listitem> -<para>Use <userinput>DetectIdentifier</userinput> instead of the regular expression <userinput>'[a-zA-Z_]\w*'</userinput>.</para> -</listitem> -<listitem> -<para>Use default styles whenever you can. This way the user will find a familiar environment.</para> -</listitem> -<listitem> -<para>Look into other XML files to see how other people implement tricky rules.</para> -</listitem> -<listitem> -<para>You can validate every XML file by using the command -<command>xmllint --dtdvalid language.dtd mySyntax.xml</command>.</para> -</listitem> -<listitem> -<para>If you repeat a complex regular expression often you can use -<emphasis>ENTITIES</emphasis>. Example:</para> -<programlisting> -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE language SYSTEM "language.dtd" -[ - <!ENTITY myref "[A-Za-z_:][\w.:_-]*"> -]> -</programlisting> -<para>Now you can use <emphasis>&myref;</emphasis> instead of the regular -expression.</para> -</listitem> -</itemizedlist> -</sect2> - -</sect1> - -</appendix>
