[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-10-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #17 from Adam Wight s...@ludd.net 2011-10-06 18:06:02 UTC --- What about saving several indexes of data each in their own file? For illustration, tlwiki-20110926-pages-meta-history.xml.bz2.index-on-revision.sqlite3
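
The per-index sidecar file suggested in comment #17 could look roughly like the sketch below. It is only an illustration: the table name, the columns (revision_id, page_id, byte_offset) and the helper names are assumptions, not anything specified in the bug.

```python
import sqlite3


def build_revision_index(db_path, entries):
    """Store (revision_id, page_id, byte_offset) rows in a sidecar SQLite file.

    The schema is hypothetical; `entries` is any iterable of 3-tuples.
    """
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS revision_index ("
            " revision_id INTEGER PRIMARY KEY,"
            " page_id INTEGER NOT NULL,"
            " byte_offset INTEGER NOT NULL)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO revision_index VALUES (?, ?, ?)", entries
        )
    return conn


def lookup_offset(conn, revision_id):
    """Return the recorded byte offset for a revision, or None if unknown."""
    row = conn.execute(
        "SELECT byte_offset FROM revision_index WHERE revision_id = ?",
        (revision_id,),
    ).fetchone()
    return row[0] if row else None
```

Keeping one index per file, as the comment proposes, means a consumer only downloads the lookups it needs; an index on page title would simply be a second SQLite file with its own table.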

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-08-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #12 from Ariel T. Glenn ar...@wikimedia.org 2011-08-29 18:07:24 UTC --- (In response to comment 11) No, they aren't, but I have a C library that could be used to build such an index without a ton of work, for bzip2 files;

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-08-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 Andrew Dunbar hippytr...@gmail.com changed: What|Removed |Added CC|

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-08-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #14 from Ariel T. Glenn ar...@wikimedia.org 2011-08-29 19:39:55 UTC --- Yeah, I'm familiar with seek-bzip2, but it didn't do what I needed for my use case. I wanted to be able to easily locate a given XML page in a dump file

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-08-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 Ángel González keis...@gmail.com changed: What|Removed |Added CC||keis...@gmail.com

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-08-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #16 from Ariel T. Glenn ar...@wikimedia.org 2011-08-29 22:19:33 UTC --- See Administrators'_noticeboard/Incidents, a total of 561938 revs last time I looked (which was over a month ago, surely even worse now).

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-06-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #11 from Adam Wight s...@ludd.net 2011-06-04 11:07:57 UTC --- Make it a requirement that the compression library is able to report compressed block boundaries as it is working, so an index can be generated. This will open many
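
A minimal sketch of the idea in comment #11, under one stated substitution: blocks inside a single bzip2 stream are bit-aligned, so instead of asking the library for block boundaries this example compresses each chunk as its own bzip2 stream and records the byte offset where each stream starts. The function names and the index tuple layout are assumptions made for illustration.

```python
import bz2


def write_indexed_bz2(out, chunks):
    """Compress each chunk as a separate bzip2 stream and index its position.

    Returns a list of (compressed_offset, compressed_length, uncompressed_length).
    """
    index = []
    offset = 0
    for chunk in chunks:
        compressed = bz2.compress(chunk)
        out.write(compressed)
        index.append((offset, len(compressed), len(chunk)))
        offset += len(compressed)
    return index


def read_chunk(inp, entry):
    """Decompress a single chunk by seeking straight to its recorded stream."""
    offset, length, _ = entry
    inp.seek(offset)
    return bz2.decompress(inp.read(length))
```

The concatenated output is still a valid multi-stream bzip2 file, so ordinary tools can decompress it end to end, and summing the third field of the index gives the total uncompressed size without decompressing anything.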

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-06-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 Diederik van Liere dvanli...@gmail.com changed: What|Removed |Added Keywords||analytics

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-06-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #9 from Platonides platoni...@gmail.com 2011-06-03 22:00:31 UTC --- Diederik, they are not created uncompressed in memory. I think we should just move to xz (mainly for the space benefits), which would provide the uncompressed

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-06-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #10 from Diederik van Liere dvanli...@gmail.com 2011-06-03 22:04:31 UTC --- xz compression sounds good to me! -- Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email --- You are receiving this mail

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-06-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 Platonides platoni...@gmail.com changed: What|Removed |Added CC||platoni...@gmail.com

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-06-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #6 from Brion Vibber br...@wikimedia.org 2011-06-02 21:54:24 UTC --- (In reply to comment #5) Dump files are generated directly to their compressed form, so these exact things aren't really possible to put in. You can just

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-06-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #7 from Platonides platoni...@gmail.com 2011-06-02 22:35:03 UTC --- Sorry, I didn't pay enough attention to the first post; I was thinking of giving that metadata separately.

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-06-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 --- Comment #8 from Diederik van Liere dvanli...@gmail.com 2011-06-02 22:40:04 UTC --- Or alternatively, first create the page XML elements and once that's done and you have collected metadata like number of articles, uncompressed size, etc.
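
Since the dumps are written straight to their compressed form, one way to still collect the metadata discussed in comments #6 to #8 is to tally it while the data passes through the compressor and publish it separately. The sketch below assumes a hypothetical CountingBz2Writer class and metadata field names; it is not how the dump scripts actually work.

```python
import bz2


class CountingBz2Writer:
    """Feed pages to a bzip2 compressor while tallying uncompressed statistics."""

    def __init__(self, out):
        self._out = out
        self._compressor = bz2.BZ2Compressor()
        self.uncompressed_size = 0
        self.page_count = 0

    def write_page(self, page_xml):
        """Compress one page element and update the running totals."""
        data = page_xml.encode("utf-8")
        self.uncompressed_size += len(data)
        self.page_count += 1
        self._out.write(self._compressor.compress(data))

    def finish(self):
        """Flush the compressor and return the collected metadata."""
        self._out.write(self._compressor.flush())
        return {
            "uncompressed_size": self.uncompressed_size,
            "page_count": self.page_count,
        }
```

The dictionary returned by finish() could be serialized (for example as JSON) into a small file next to the dump, which avoids rewriting the compressed file to add a header after the fact.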

[Bug 26499] Include uncompressed size and other metadata in each dump file

2011-02-24 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=26499 Adam Wight s...@ludd.net changed: What|Removed |Added Summary|Include size of the dump|Include uncompressed size