[Wikidata-bugs] [Maniphest] [Commented On] T238002: WDQS Munger should be multi threaded
dcausse added a comment.

Separating parsing, munging, and writing into multiple threads nearly doubled the speed of the munger:

old: real 1371m34.618s  user 1854m48.672s  sys 24m44.480s
new: real  731m20.495s  user 1798m42.176s  sys 30m7.888s

I should have linked https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/553758 to this task.

Since the RDF parser is the limiting factor, I think we will have to do the entity delimitation without an RDF parser if we want to further improve the speed of this step. We could also consider switching to the `nt` format, which I'm sure will be a lot faster to parse, if the size overhead is acceptable.

TASK DETAIL
  https://phabricator.wikimedia.org/T238002

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Smalyshev, Gehel, Aklapper, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
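For readers unfamiliar with the approach, the parse / munge / write separation described in this comment can be pictured as three threads connected by bounded queues. The following is only a minimal sketch of that structure, not the code from the linked Gerrit change: the names `pipeline`, `parse`, `munge`, and `write` are hypothetical, error handling is omitted, and in CPython the global interpreter lock would limit any CPU-bound speedup, so it illustrates the shape of the pipeline rather than its performance.

```python
import queue
import threading

_DONE = object()  # sentinel that tells the next stage the stream has ended


def pipeline(chunks, parse, munge, write, maxsize=100):
    """Run parse -> munge -> write, each stage in its own thread.

    Bounded queues apply back-pressure, so a fast stage cannot run
    arbitrarily far ahead of a slow one. One thread per stage plus FIFO
    queues keeps the output in input order.
    """
    to_munge = queue.Queue(maxsize)
    to_write = queue.Queue(maxsize)

    def parser():
        for chunk in chunks:
            to_munge.put(parse(chunk))
        to_munge.put(_DONE)

    def munger():
        while (item := to_munge.get()) is not _DONE:
            to_write.put(munge(item))
        to_write.put(_DONE)

    def writer():
        while (item := to_write.get()) is not _DONE:
            write(item)

    threads = [threading.Thread(target=stage) for stage in (parser, munger, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With a single thread per stage, the slowest stage bounds the overall throughput, which matches the observation that the RDF parser is the limiting factor.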
[Wikidata-bugs] [Maniphest] [Commented On] T238002: WDQS Munger should be multi threaded
Smalyshev added a comment.

Per-item data are mostly independent, so different items can easily be processed in parallel. However, that would require splitting the incoming data per item (note that item data do not necessarily have the item URI as subject: there are statements, references, values, sitelinks, etc.).
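One way to picture per-item splitting without a full RDF parser is to rely on dump order rather than on subjects. The sketch below assumes (this is an assumption, not something confirmed in this thread) that the input is line-based N-Triples and that each item's block starts with triples whose subject is a `Special:EntityData` URI. Everything up to the next such marker is treated as one item, so statement, reference, value, and sitelink triples stay with their item even though their subjects differ. The names `ENTITY_DATA_PREFIX` and `split_entities` are hypothetical.

```python
from typing import Iterable, Iterator, List

# Assumed boundary marker: subject of the first triples of each item block.
ENTITY_DATA_PREFIX = "<https://www.wikidata.org/wiki/Special:EntityData/"


def split_entities(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group N-Triples lines into per-item chunks using dump order.

    In N-Triples the subject is the text before the first whitespace,
    so no RDF parsing is needed to read it.
    """
    chunk: List[str] = []
    current = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject = line.split(None, 1)[0]
        if subject.startswith(ENTITY_DATA_PREFIX) and subject != current:
            if chunk:
                yield chunk  # close the previous item's block
            chunk = []
            current = subject
        chunk.append(line)
    if chunk:
        yield chunk
```

Any triples that precede the first marker (for example a dump header) come out as a chunk of their own. Each yielded chunk could then be handed to a separate worker for munging.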