Hi,

On 2012-12-31 14:01, Mauro Condarelli wrote:
> Hi All,
>
> On 31/12/2012 12:48, Mike Unwalla wrote:
>> Hello,
>>
>> Readability is more important than decreasing the size of a file. In my
>> opinion, Step 1 and Step 3 decrease readability. '<marker>' is clearer than
>> '<m>'.
> I completely agree with the above.
>
> The point I was trying to make is that XML doesn't look well suited to
> describing a set of production rules for text transformation
> (disambiguator) or syntax checking (grammar).
>
> In such a case it's common to devise a DSL (Domain Specific Language)
> that describes the problem precisely and thus greatly improves
> readability and maintainability.
> The downside of this approach is the need to build a complete toolchain
> for the new language, including a suitable editor and a compiler.
>
> I was pointing out that Eclipse includes all the tools needed to build
> the necessary framework with very little effort (actually little more
> than writing the BNF grammar for the DSL itself).

Well, I'm not sure if this will be so easy, as conversion of XML 
languages into BNF is not a completely trivial business. There are no 
standard converters between XML Schema and BNF, for example, and I'm not 
sure if XSD is context-free just like BNF. It might be higher in 
Chomsky's hierarchy because it allows for some context-sensitivity in 
element names and regular expressions on the right-hand side of 
productions... I'm not sure how much of this is actually used in our 
.xsd files.

> This can be deployed into Eclipse
> itself (as a plugin) or wrapped in a stand-alone "RCP" application
> acting as a (very fat) editor (complete with syntax-highlighting,
> on-the-fly error detection and auto-completion) for the language files

We already have XML editors that do that, and more.

> that, as a "side effect", also produces some suitable representation of
> the semantics. This "suitable representation" could be in the form of
> compilable Java classes (for speed) or even the current XML syntax (for
> compatibility).

Well, it would be nice to compile our rules for speed, but for the user, 
I still think that a database-like front-end would be much better. The 
DSL seems to replace a hard language to learn with another hard language 
to learn.
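
For what it's worth, a compact notation does not have to replace the XML;
it could simply be expanded into it. Below is a minimal sketch in Python
(not LT code, and not how Daniel necessarily intends to implement it) that
expands the Step 3 notation quoted further down into a regular <pattern>
element. The function name compact_to_xml is hypothetical, and I am
assuming that a leading underscore opens a marker, a trailing underscore
closes it, and that "pos:" and "pos:re:" map onto the existing postag and
postag_regexp attributes.

```python
import xml.etree.ElementTree as ET


def compact_to_xml(compact: str) -> str:
    """Expand a compact pattern string into a <pattern> element (sketch)."""
    pattern = ET.Element("pattern")
    parent = pattern  # element the next <token> is attached to
    for word in compact.split():  # whitespace implies a token boundary
        # Assumption: a leading underscore opens a <marker>.
        if word.startswith("_"):
            parent = ET.SubElement(pattern, "marker")
            word = word[1:]
        # Assumption: a trailing underscore closes it after this token.
        closes = word.endswith("_")
        if closes:
            word = word[:-1]
        token = ET.SubElement(parent, "token")
        if word.startswith("pos:re:"):
            token.set("postag", word[len("pos:re:"):])
            token.set("postag_regexp", "yes")
        elif word.startswith("pos:"):
            token.set("postag", word[len("pos:"):])
        elif word.startswith("re:"):
            token.set("regexp", "yes")
            token.text = word[len("re:"):]
        else:
            token.text = word
        if closes:
            parent = pattern
    return ET.tostring(pattern, encoding="unicode")


print(compact_to_xml("re:foo|bar _myerror_"))
```

For Daniel's example this gives the same structure as his six-line
pattern, only without the indentation. Exceptions and the other advanced
features are not handled, and a token that itself begins or ends with an
underscore would be misread, so the full XML syntax would still be needed
for those cases.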

Regards,
Marcin

>
> Regards
> Mauro
>> In a related reply, Dominique wrote:
>>           It will only marginally reduce size. But shorter tags add less noise
>>           so it's clearer in my opinion. <m> and <s> may look less readable
>>           than <marker> and <suggestion> but since rule developers
>>           use them all the time, they would be well familiar with them.
>>
>> I do not create rules each day. Typically, I work with LT each day for 2 or
>> 3 weeks. Then, I work on other projects for weeks or months.
>>
>> Regards,
>>
>> Mike Unwalla
>> Contact: www.techscribe.co.uk/techw/contact.htm
>>
>>
>> -----Original Message-----
>> From: Daniel Naber [mailto:list2...@danielnaber.de]
>> Sent: 30 December 2012 20:56
>> To: development discussion for LanguageTool
>> Subject: making XML rules more compact?
>>
>> Hi,
>>
>> we have three languages with grammar files that are more than 1 MB large
>> (German, French, Catalan). The German grammar.xml has more than 24,000
>> lines. This size makes editing the files difficult. I have some ideas on how
>> to improve the situation and I'm looking for other ideas and comments:
>>
>> Step 1 - the easy one
>>
>> We can make the syntax a bit more compact and readable by changing some
>> elements:
>>
>> <marker> => <m>
>> <suggestion> => <s>
>> <example type="correct"> => <right>
>> <example type="incorrect"> => <wrong>
>>
>>
>> Step 2 - less repetition (also easy to implement)
>>
>> The contents of <message>, <url>, and <short> should be inherited from a
>> <rulegroup> element to its <rule> elements. This way those elements do not
>> need to be repeated if they are the same for all rules of a rulegroup.
>>
>>
>> Step 3 - an XML-free pattern
>>
>> Add a compact way to describe simple patterns. This is best explained by
>> example. What is now this:
>>
>> <pattern>
>>     <token regexp="yes">foo|bar</token>
>>     <marker>
>>       <token>myerror</token>
>>     </marker>
>> </pattern>
>>
>> ...could be written like this:
>>
>> <p>re:foo|bar _myerror_</p>
>>
>> Thus you don't need "<token>" at all as a whitespace implies a token
>> boundary. The prefix "re:" turns on regular expression matching (the same
>> for "pos:" -> POS tag, "pos:re:" -> POS tag regex). "<marker>" is replaced
>> by underscores. This does not support exceptions and other advanced
>> features, but it turns a 6-line rule into a 1-line rule. This new syntax is
>> optional, i.e. the old one can still be used.
>>
>> What do you think about that? Other suggestions for making rule syntax more
>> compact?
>>
>> Regards
>>    Daniel
>>
>>
>> ------------------------------------------------------------------------------
>> Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
>> MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
>> with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
>> MVPs and experts. SALE $99.99 this month only -- learn more at:
>> http://p.sf.net/sfu/learnmore_122412
>> _______________________________________________
>> Languagetool-devel mailing list
>> Languagetool-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/languagetool-devel