Hello, I'm processing Wikipedia dumps. For now I'm copying some dumps into my tool's path (/data/project/tool/dumps) to preserve them for my study, because only the last 2 dumps are kept in /public/dumps. When I launch the job with jsub, the script reads them from there.
But I have a question: is /public/dumps faster than /data/project? I mean in RPM or any other technical respect, or are they the same? By the way, when processing dumps I have found that reading from a 7z dump is faster than from a bz2 one, so I think the hard disks play a more important role here than the CPU. Thanks
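
For reference, this is roughly the kind of reader I mean (only a sketch, assuming Python 3 and the 7z command-line tool being available; the paths and filenames are illustrative, not the real dump names). The bz2 dump is decompressed in-process, while the 7z dump is streamed through a 7z subprocess, which is where I see the speed difference:

    # Sketch only: stream lines from a bz2 dump vs. a 7z dump.
    # Assumes Python 3 and the '7z' binary on the PATH; paths are examples.
    import bz2
    import subprocess

    def read_bz2_lines(path):
        # Decompress the .bz2 dump in-process and yield text lines.
        with bz2.open(path, 'rt', encoding='utf-8', errors='replace') as f:
            for line in f:
                yield line

    def read_7z_lines(path):
        # Stream the .7z dump through the 7z tool; '-so' sends the
        # decompressed data to stdout, which we read line by line.
        proc = subprocess.Popen(['7z', 'x', '-so', path],
                                stdout=subprocess.PIPE)
        for raw in proc.stdout:
            yield raw.decode('utf-8', errors='replace')
        proc.stdout.close()
        proc.wait()

    if __name__ == '__main__':
        n = 0
        for line in read_7z_lines('/data/project/tool/dumps/example-pages-articles.xml.7z'):
            n += 1
        print(n, 'lines read')

With readers like these the job is mostly I/O-bound, which is why I suspect the disks matter more than the CPU.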
