Tjones has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/333683 )
Change subject: Update PHP TextCat Models to 10K n-grams
......................................................................
Update PHP TextCat Models to 10K n-grams
Update all LM/ and LM-query/ models to 10K n-grams. The number of spaces
('_') counted in the LM models has gone down by 2 for every model, but
this does not change the rank statistics for any model.
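As a rough illustration of why a count that drops by 2 can leave the rank
statistics untouched, here is a minimal Python sketch. It assumes n-grams are
ranked by descending count. The counts are invented for the example and do
not come from any real model, and ranks() is a hypothetical helper, not part
of TextCat.

```python
def ranks(counts):
    """Map each n-gram to its 1-based rank, highest count first.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    ordered = sorted(counts, key=lambda gram: (-counts[gram], gram))
    return {gram: position + 1 for position, gram in enumerate(ordered)}


# Hypothetical counts; '_' stands for a space, as in the .lm files.
old_counts = {"_": 1002, "e": 700, "a": 650, "_t": 300}

# Same model with the '_' count lowered by 2.
new_counts = {**old_counts, "_": old_counts["_"] - 2}

# The ordering is unchanged because '_' is still well ahead of the next n-gram.
print(ranks(old_counts) == ranks(new_counts))
```

The ranks would only shift if the lowered count fell to or below that of the
next n-gram in the list.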
Update lm2php.php to handle the slightly changed Perl model format (a stray
space was removed).
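A parser can accept both the old and the new line shape by treating the space
as optional. The Python sketch below assumes each model line is an n-gram, a
tab, an optional stray space, and an integer count. That layout is an
assumption about the Perl output, and parse_lm_line() is a hypothetical
helper, not the code in lm2php.php.

```python
import re

# Assumed line shape: n-gram, a tab, optional stray spaces, an integer count.
LINE_RE = re.compile(r"^(\S+)\t *(\d+)\s*$")


def parse_lm_line(line):
    """Return (ngram, count) for a model line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Both the older form (tab plus space) and the newer form (tab only) parse.
print(parse_lm_line("_\t 1002\n"))
print(parse_lm_line("_\t1000\n"))
```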
Add a couple of test cases whose results differ for model sizes above 5K
(the previous maximum).
Bug: T155672
Change-Id: If35912574e833a677459531f994ae95f314b042d
---
M LM-query/ar.lm
M LM-query/bg.lm
M LM-query/bn.lm
M LM-query/cs.lm
M LM-query/de.lm
M LM-query/el.lm
M LM-query/en.lm
M LM-query/es.lm
M LM-query/fa.lm
M LM-query/fr.lm
M LM-query/he.lm
M LM-query/hi.lm
M LM-query/hy.lm
M LM-query/id.lm
M LM-query/it.lm
M LM-query/ja.lm
M LM-query/ka.lm
M LM-query/ko.lm
M LM-query/nl.lm
M LM-query/pl.lm
M LM-query/pt.lm
M LM-query/ru.lm
M LM-query/sv.lm
M LM-query/ta.lm
M LM-query/te.lm
M LM-query/th.lm
M LM-query/tr.lm
M LM-query/uk.lm
M LM-query/vi.lm
M LM-query/zh.lm
M LM/af.lm
M LM/ar.lm
M LM/be.lm
M LM/bg.lm
M LM/bn.lm
M LM/br.lm
M LM/bs.lm
M LM/ca.lm
M LM/cs.lm
M LM/cy.lm
M LM/da.lm
M LM/de.lm
M LM/el.lm
M LM/en.lm
M LM/eo.lm
M LM/es.lm
M LM/et.lm
M LM/eu.lm
M LM/fa.lm
M LM/fi.lm
M LM/fr.lm
M LM/ga.lm
M LM/gu.lm
M LM/he.lm
M LM/hi.lm
M LM/hr.lm
M LM/hu.lm
M LM/hy.lm
M LM/id.lm
M LM/is.lm
M LM/it.lm
M LM/ja.lm
M LM/jv.lm
M LM/ka.lm
M LM/kn.lm
M LM/ko.lm
M LM/la.lm
M LM/lt.lm
M LM/lv.lm
M LM/ml.lm
M LM/mr.lm
M LM/ms.lm
M LM/my.lm
M LM/nl.lm
M LM/no.lm
M LM/or.lm
M LM/pl.lm
M LM/pnb.lm
M LM/pt.lm
M LM/ro.lm
M LM/ru.lm
M LM/sco.lm
M LM/sh.lm
M LM/sk.lm
M LM/sl.lm
M LM/sq.lm
M LM/sr.lm
M LM/su.lm
M LM/sv.lm
M LM/ta.lm
M LM/te.lm
M LM/th.lm
M LM/tk.lm
M LM/tl.lm
M LM/tr.lm
M LM/uk.lm
M LM/ur.lm
M LM/vi.lm
M LM/zh-yue.lm
M LM/zh.lm
M README.md
M lm2php.php
M tests/TextCatTest.php
103 files changed, 1,140,086 insertions(+), 79 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/wikimedia/textcat refs/changes/83/333683/1
--
To view, visit https://gerrit.wikimedia.org/r/333683
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: If35912574e833a677459531f994ae95f314b042d
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/textcat
Gerrit-Branch: master
Gerrit-Owner: Tjones <[email protected]>
_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits