R.J. Baars wrote:
> A long time ago, I chose to have the - as a word char, not separating word
> parts that really belong together.
>
> That is now in the way for the date rules, since a normal date in Dutch
> can also be 15-1-1958.
>
> Is there a solution for this issue? Like tokenizing when the dash is
> within a number? Or get the date values from the date string using regexp
> catching ?
>
> Ruud

Hi Ruud

It should not be a problem. Have a look at rule DATE_JOUR[4] in French.

  $ echo "Vendredi, 28-08-2014." | \
      java -jar .../languagetool-commandline.jar -c utf-8 -l fr -v
  Expected text language: French
  Working on STDIN...
  2434 rules activated for language French
  <S> Vendredi[vendredi/A,],[,/M nonfin,] 28-08-2014[28-08-2014/null,].[./M fin,</S>,]<P/>
  Disambiguator log:

  RB-ADVERBES:1 Vendredi[vendredi/N m s*,vendredi/A*] -> Vendredi[vendredi/A*]

  1.) Line 1, column 1, Rule ID: DATE_JOUR[4]
  Message: La date « Vendredi, 28-08-2014 » n’est pas un vendredi mais un jeudi.
  Vendredi, 28-08-2014.
  ^^^^^^^^^^^^^^^^^^^^

  Time: 970ms for 1 sentences (1.0 sentences/sec)

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
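
In the verbose output, 28-08-2014 stays a single token and DATE_JOUR[4] still fires, so the day, month and year can be read out of the token itself with regexp capturing. Below is a minimal Python sketch of that idea, outside LanguageTool. The names DATE_TOKEN, parse_date_token and actual_weekday are hypothetical and only for illustration; this is not LanguageTool's API and not necessarily how DATE_JOUR is implemented.

```python
import re
from datetime import date

# Hypothetical pattern: day-month-year kept inside one token, e.g. "15-1-1958".
DATE_TOKEN = re.compile(r"(\d{1,2})-(\d{1,2})-(\d{4})")

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def parse_date_token(token):
    """Return a date for a d-m-yyyy token, or None if it is not a valid date."""
    match = DATE_TOKEN.fullmatch(token)
    if match is None:
        return None
    day, month, year = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # The digits matched the pattern but do not form a real date (31-02-2014).
        return None


def actual_weekday(token):
    """Return the English weekday name for a date token, or None."""
    parsed = parse_date_token(token)
    return None if parsed is None else WEEKDAYS[parsed.weekday()]


# The date from the French example; LanguageTool reported it as a "jeudi".
print(actual_weekday("28-08-2014"))  # Thursday
```

A rule could then compare that weekday with the one written before the date, without splitting the token on the dash.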