On 2014-03-23 13:13, Dave Pawson wrote:
>
> So if I specify
> java -jar ${langtools}/languagetool-commandline.jar --language EN-GB
> --disable $disRules $*
> there are two grammar files in use?
>   IMHO it would help the user (or at least annoy him/her less) if I was told
> which file / rule is being used.

Well, you get about 8 additional rules. In general, it doesn't make 
much difference if you specify a country variant; this is, I think, the 
only combination where it does matter (we don't have many rules that are 
specific to a country variant). In verbose mode, we already display lots 
of info, but we can add this.

>
> Yes, they are used to generate primary, secondary and tertiary terms
> in the index.
>
> I have asked on the docbook list, I'll provide a stylesheet for docbook
> expanding includes, removing 'extras' such as indexterms.

Right. Keep in mind, however, that integrating corrections will not be 
trivial then. LT reports the position of each mistake (also in its XML 
output), which can be used to highlight the error. If you remove any 
content with a stylesheet, the reported positions no longer match the 
original file, and highlights will show up in the wrong places because 
LT never saw the markup. This is why a stylesheet is not really a 
substitute for an AnnotatedText parser. We rather need to parse DocBook 
with some dedicated Java code, which might be simple anyway.
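
To illustrate the position problem, here is a minimal sketch of the idea: 
strip the tags, but remember where each remaining character came from, so 
that a position reported against the plain text can be mapped back to the 
original file. The class MarkupOffsets and its methods are hypothetical 
names made up for this example; this is not LanguageTool code, and the 
tag handling is deliberately naive (no comments, CDATA, or entities).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LanguageTool: strips XML tags from a
// string while remembering the source index of every character it keeps.
class MarkupOffsets {
    final String plainText;          // text with the tags removed
    final List<Integer> originalPos; // originalPos.get(i) = source index of plainText.charAt(i)

    private MarkupOffsets(String plainText, List<Integer> originalPos) {
        this.plainText = plainText;
        this.originalPos = originalPos;
    }

    // Naive stripper: everything between '<' and '>' counts as markup.
    // Assumes well-formed input with no '>' inside attribute values.
    static MarkupOffsets strip(String source) {
        StringBuilder text = new StringBuilder();
        List<Integer> positions = new ArrayList<>();
        boolean inTag = false;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>' && inTag) {
                inTag = false;
            } else if (!inTag) {
                text.append(c);
                positions.add(i);
            }
        }
        return new MarkupOffsets(text.toString(), positions);
    }

    // Translate a position in the plain text back to the original source.
    int toOriginal(int plainPos) {
        return originalPos.get(plainPos);
    }

    public static void main(String[] args) {
        String source = "<para>A <emphasis>smal</emphasis> typo</para>";
        MarkupOffsets m = strip(source);
        int plainPos = m.plainText.indexOf("smal");
        System.out.println(m.plainText);
        System.out.println(plainPos + " -> " + m.toOriginal(plainPos));
    }
}
```

In this sample the word "smal" starts at position 2 in the stripped text 
but at position 18 in the source, which is exactly the kind of shift that 
a stylesheet-based approach loses track of.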

>
>>
>>>
>>> ========================
>>> Unpaired_brackets error
>>>
>>> In my XML I'm using "'"  single quote as both apostrophe
>>> and single quote (rightly or wrongly).
>>> --disable EN_UNPAIRED_BRACKETS
>>> as a command line option would (presumably) disable match
>>> checking for a number of characters?
>>
>> You could, but LT should handle apostrophes and single quotes without any
>> problems. If it doesn't, please file an issue on github for me:
>>
>> https://github.com/languagetool-org/languagetool/issues?state=open
>>
>> But you can paste the example here, if it's not anything confidential,
>> of course.
>
>
>
> 185.) Line 489, column 15, Rule ID: EN_UNPAIRED_BRACKETS
> Message: Unpaired bracket or similar symbol
> ... key for the front door. <link
> xlink:href="http://www.randrsecurity.com/">R and R securi...
>
> Clearly there isn't an unpaired " character.  Not sure what else it
> might be reporting?
> Not very clear though.

Right. This is just because the tag is split by an end-of-line marker. 
You're apparently using the -b parameter, which treats every single 
end-of-line marker as a paragraph break, and that is wrong for your files.
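
A rough illustration of the difference, assuming that without -b only a 
blank line ends a paragraph: the sketch below splits a made-up sample at 
single line breaks and at blank lines. ParagraphSplit and the sample text 
are hypothetical; this is not how LT implements its splitting, it only 
shows how a tag that wraps across a line ends up in two pieces.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only, not LanguageTool's actual splitting code.
class ParagraphSplit {
    // Every single line break ends a paragraph (what -b asks for).
    static List<String> splitAtSingleBreaks(String text) {
        return Arrays.asList(text.split("\n"));
    }

    // Only blank lines (two or more consecutive line breaks) end a paragraph.
    static List<String> splitAtBlankLines(String text) {
        return Arrays.asList(text.split("\n{2,}"));
    }

    public static void main(String[] args) {
        // Made-up sample: the <link> tag wraps across a line break.
        String text = "See <link\nxlink:href=\"http://example.com/\">this page</link>."
                + "\n\nNext paragraph.";
        System.out.println(splitAtSingleBreaks(text)); // tag is cut in two
        System.out.println(splitAtBlankLines(text));   // tag stays in one piece
    }
}
```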

>
>>
>>>     Is it possible to be more selective?
>>
>> No. We don't have that option.
>
> In which case could a rule be repeated with less content in the set?

Not really, as this is a Java rule.

Anyway, the false positive here is caused only by the end-of-line markers.

Regards,
Marcin

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel