Text indexing Wikidata

Neubert, Joachim Fri, 18 Feb 2022 00:59:43 -0800

Text indexing the truthy Wikidata dump took 13:10 h for 1.5 billion labels
(partly using text:LowerCaseKeywordAnalyzer) on the massively parallel machine.
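
For scale, here is a rough throughput estimate from those two figures. It is only a back-of-the-envelope sketch: it assumes the label count is close to 1.5 billion and that the whole 13:10 h was spent indexing. The variable and function names are made up for illustration.

```python
# Rough indexing throughput from the figures quoted above.
# Assumptions: ~1.5 billion labels, 13 h 10 min wall-clock time.

def throughput(n_items: float, seconds: float) -> float:
    """Return items processed per second."""
    return n_items / seconds

labels = 1.5e9                     # approximate number of labels indexed
elapsed_s = 13 * 3600 + 10 * 60    # 13:10 h expressed in seconds

rate = throughput(labels, elapsed_s)
print(f"{elapsed_s} s elapsed, about {rate:,.0f} labels/s")
```

That comes to roughly 31,600 labels per second overall.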


I observed a CPU usage of 100-250 %, and wonder whether I could do something to
speed it up. My command line was simply

java -cp /opt/fuseki/fuseki-server.jar jena.textindexer --debug --desc=/tmp/temp.ttl

(apache-jena-fuseki-4.5.0-SNAPSHOT)

Cheers, Joachim

--
Joachim Neubert

ZBW - Leibniz Information Centre for Economics
Neuer Jungfernstieg 21
20354 Hamburg
Phone +49-40-42834-462
