Hi

I measured LT speed using the command-line version of LanguageTool.
Recorded numbers are the real (wall-clock) time reported by the Linux time
command.
Measurements were made on my laptop:
- xubuntu-14.04.3
- i5-3317U CPU, 1.7 GHz, 4 logical cores (2 physical), SSD
- java version "1.8.0_60"

I measured:
- several versions of LT (from 1.8 to latest 3.2 snapshot)
- for 3 languages (fr, de, eo)
- with an empty document (to measure startup time)
- and with a 500-line document (first 500 lines from Tatoeba in each language):
  * 24871 bytes for French
  * 24773 bytes for German
  * 15878 bytes for Esperanto

Each configuration was measured 4 times and I recorded
the smallest and biggest times (real time in seconds). I used
this kind of command to get the 4 measurements:

  $ (for t in 1 2 3 4; do time -p java -jar \
     lt-2.6/LanguageTool-2.6/languagetool-commandline.jar \
     -l de < 500-de.csv > /dev/null ; done ) 2>&1 | grep -w real
  real 5.72
  real 3.88
  real 5.04
  real 5.01
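
Reducing the four "real" lines to the smallest and biggest value can be
sketched with a small awk filter. This is only an illustration: minmax is a
hypothetical helper name, not something shipped with LT, and the sample input
is the output shown above.

```shell
# minmax: read "real <seconds>" lines (as printed by `time -p`) on stdin
# and print the smallest and biggest value.
minmax() {
  awk '$1 == "real" {
         t = $2 + 0
         if (n == 0 || t < min) min = t
         if (n == 0 || t > max) max = t
         n++
       }
       END { if (n > 0) printf "%.2fs %.2fs\n", min, max }'
}

# Feed it the four measurements shown above (LT 2.6, German, 500 lines).
printf 'real 5.72\nreal 3.88\nreal 5.04\nreal 5.01\n' | minmax
# prints: 3.88s 5.72s
```

The same filter could replace the "grep -w real" at the end of the loop, so
that each configuration directly yields one table cell.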

I used the LanguageTool-*.zip files downloaded from
https://languagetool.org/download/. For the LT 3.2 snapshot,
I used git commit 4f2f48f647074a60543c2e7204bfe2152d753f4a (2015/10/23).

Here are the results:

==========================================================================================
                    language fr               language de               language eo
------------  ------------------------  ------------------------  ------------------------
LanguageTool  empty        500          empty        500          empty        500
version       document     lines        document     lines        document     lines
------------  -----------  -----------  -----------  -----------  -----------  -----------
1.8           0.55s 0.59s  3.62s 3.83s  0.99s 1.01s  3.44s 3.72s  0.58s 0.62s  1.87s 1.91s
1.9           0.55s 0.59s  3.34s 3.75s  0.96s 1.06s  3.63s 3.69s  0.56s 0.62s  1.66s 2.05s
2.0           0.54s 0.57s  3.56s 4.01s  0.98s 1.03s  3.77s 4.10s  0.56s 0.62s  1.72s 2.06s
2.1           0.81s 0.88s  4.05s 5.08s  1.08s 1.12s  4.43s 4.80s  0.66s 0.67s  1.85s 2.15s
2.2           0.82s 0.89s  3.83s 5.06s  1.06s 1.15s  4.42s 4.91s  0.65s 0.67s  1.68s 1.94s
2.3           0.92s 0.99s  4.50s 4.83s  1.06s 1.19s  4.59s 4.77s  0.69s 0.71s  1.67s 1.83s
2.4.1         0.98s 1.05s  3.92s 4.19s  1.15s 1.17s  3.84s 4.11s  0.64s 0.73s  1.72s 1.87s
2.5           0.94s 1.04s  4.72s 4.92s  1.13s 1.21s  3.62s 4.80s  0.65s 0.72s  1.66s 1.80s
2.6           1.01s 1.06s  4.90s 5.35s  1.15s 1.21s  3.88s 5.72s  0.68s 0.75s  1.69s 1.85s
2.7           1.00s 1.07s  5.23s 5.98s  1.10s 1.18s  4.05s 5.15s  0.72s 0.74s  2.81s 2.91s
2.8           0.99s 1.05s  5.25s 6.21s  1.16s 1.20s  4.69s 5.23s  0.73s 0.78s  2.81s 2.87s
2.9           1.06s 1.10s  5.03s 5.84s  1.25s 1.39s  5.16s 5.66s  0.76s 0.80s  2.41s 2.84s
3.0           1.08s 1.11s  5.50s 5.80s  1.31s 1.35s  5.44s 6.08s  0.76s 0.79s  2.70s 2.87s
3.1           1.20s 1.24s  6.26s 6.77s  1.32s 1.35s  5.52s 6.22s  0.79s 0.82s  2.59s 2.74s
3.2-SNAPSHOT  1.26s 1.34s  5.94s 7.26s  1.29s 1.41s  5.80s 6.23s  0.82s 0.85s  2.48s 2.62s
==========================================================================================

LT tends to become slower release after release, but as more rules are
added, that is not really unexpected. Startup time for French has more than
doubled (from 0.55-0.59s in 1.8 to 1.26-1.34s in the 3.2 snapshot), which is
not nice.
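
As a quick check of the "more than doubled" remark, the slowdown factor can be
worked out from the fastest runs in the table. A minimal sketch; ratio is a
hypothetical helper name used only here:

```shell
# ratio OLD NEW: print NEW / OLD with two decimals.
ratio() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

ratio 0.55 1.26   # French, empty document, 1.8 vs 3.2 snapshot; prints 2.29
ratio 3.62 5.94   # French, 500 lines, 1.8 vs 3.2 snapshot; prints 1.64
```

So startup grew faster than the time for the 500-line document, at least for
French.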

Multi-threading was introduced in LT-2.7 but the numbers above don't show
any improvement. Maybe I needed to use a bigger document than 500 lines.

I'm not sure whether we can draw useful conclusions from this, but I'm
sharing it in case it is interesting to you.

I don't have the time, but if someone does, it would be good to automate such
measurements with a script, so that speed can be measured more systematically
for more languages.
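
A rough starting point for such a script could look like the sketch below.
It is not a finished tool: the bench helper, the directory layout
(lt-<version>/LanguageTool-<version>/), the 500-<lang>.csv file names, the
version subset and the use of /dev/null as the empty document are all
assumptions modelled on the command shown earlier.

```shell
#!/bin/bash
# bench RUNS INPUT CMD...
# Run CMD RUNS times with INPUT on stdin and print the smallest and biggest
# "real" time reported by bash's `time -p`.
bench() {
  local runs=$1 input=$2 i=0
  shift 2
  while [ "$i" -lt "$runs" ]; do
    # `time` writes to the shell's stderr, so redirect the whole group.
    { time -p "$@" < "$input" > /dev/null 2>&1; } 2>&1
    i=$((i + 1))
  done | awk '$1 == "real" {
                sub(",", ".", $2)      # tolerate a locale decimal comma
                t = $2 + 0
                if (n == 0 || t < min) min = t
                if (n == 0 || t > max) max = t
                n++
              }
              END { if (n > 0) printf "%.2fs %.2fs\n", min, max }'
}

for v in 2.6 2.7 3.1; do               # hypothetical subset of versions
  jar="lt-$v/LanguageTool-$v/languagetool-commandline.jar"
  [ -f "$jar" ] || continue            # skip versions that are not unpacked
  for lang in fr de eo; do
    [ -f "500-$lang.csv" ] || continue # skip languages without a test file
    printf '%-6s %s  empty: %s  500 lines: %s\n' "$v" "$lang" \
      "$(bench 4 /dev/null java -jar "$jar" -l "$lang")" \
      "$(bench 4 "500-$lang.csv" java -jar "$jar" -l "$lang")"
  done
done
```

Note that the sketch discards the output and error messages of the measured
command, so a failing java invocation would show up only as a suspiciously
small time.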

Regards
Dominique
------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
