[Bug 28956] Dumps should be incremental

2013-10-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #30 from Quim Gil q...@wikimedia.org ---
If you have open tasks or bugs left, one possibility is to list them at
https://www.mediawiki.org/wiki/Google_Code-In and volunteer yourself as a mentor.

We have heard from Google and free software projects participating in Code-in
that students participating in this program have done great work finishing
and polishing GSoC projects, often mentored by the former GSoC student.
The key is to be able to split the pending work into small tasks.

More information in the wiki page. If you have questions you can ask there or
you can contact me directly.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 28956] Dumps should be incremental

2013-10-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #28 from Svick gsv...@gmail.com ---
GSoC is over and the code is mostly done (repo operations/dumps/incremental,
branch gsoc). There are some remaining bugs (bug 64633) and TODOs
(https://www.mediawiki.org/wiki/User:Svick/Incremental_dumps/TODO). After that
is done, the code should be ready for production.



[Bug 28956] Dumps should be incremental

2013-10-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #29 from Svick gsv...@gmail.com ---
Typo in last comment: the correct link for remaining bugs is bug 54633.



[Bug 28956] Dumps should be incremental

2013-10-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Quim Gil q...@wikimedia.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |54633



[Bug 28956] Dumps should be incremental

2013-07-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #27 from Svick gsv...@gmail.com ---
I have now started working on this. For more information and updates, see
[[mw:User:Svick/Incremental dumps]].



[Bug 28956] Dumps should be incremental

2013-05-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #24 from Dmitriy Sintsov ques...@rambler.ru ---
No, I didn't know one could perform such dumps. Which options of
maintenance/dumpBackup.php are used?



[Bug 28956] Dumps should be incremental

2013-05-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #25 from Ariel T. Glenn ar...@wikimedia.org ---
pass 1: php -q dumpBackup.php --wiki=somewikiorother --stub --quiet
--force-normal --output=gzip:somestubname.xml.gz --revrange --revstart
somerevnum --revend otherrevnum

pass 2: php -q dumpTextPass.php --wiki=somewikiorother
--stub=gzip:somestubname.xml.gz --force-normal --quiet --spawn=php
--output=bzip2:somerevisioncontentname.xml.bz2

The trick is to have the starting and ending revision ids for your range. Do
note again that this does not address deleted/hidden revisions etc.
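
As a rough illustration, the two passes above could be assembled by a small
wrapper like the following. This is only a sketch: the function name and its
parameters are hypothetical, and the flags are copied verbatim from the two
commands above rather than checked against any particular MediaWiki version.

```python
from typing import List


def build_two_pass_commands(wiki: str, rev_start: int, rev_end: int,
                            stub_path: str, content_path: str) -> List[List[str]]:
    """Return argv lists for pass 1 (stubs) and pass 2 (revision text).

    Hypothetical helper; the options mirror the commands quoted above.
    """
    if rev_start > rev_end:
        raise ValueError("rev_start must not be greater than rev_end")

    # Pass 1: write stubs (metadata only) for the chosen revision id range.
    stub_cmd = [
        "php", "-q", "dumpBackup.php",
        f"--wiki={wiki}", "--stub", "--quiet", "--force-normal",
        f"--output=gzip:{stub_path}",
        "--revrange", "--revstart", str(rev_start), "--revend", str(rev_end),
    ]

    # Pass 2: read the stubs back and fill in the revision text.
    text_cmd = [
        "php", "-q", "dumpTextPass.php",
        f"--wiki={wiki}", f"--stub=gzip:{stub_path}",
        "--force-normal", "--quiet", "--spawn=php",
        f"--output=bzip2:{content_path}",
    ]
    return [stub_cmd, text_cmd]
```

Each list could then be handed to subprocess.run() in order. As noted above,
nothing here accounts for deleted or hidden revisions.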



[Bug 28956] Dumps should be incremental

2013-05-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #26 from Dmitriy Sintsov ques...@rambler.ru ---
It's not fully automated. I propose a solution where the revrange is determined
automatically by the script itself, dumps are automatically split into daily /
weekly files, and importDump.php can import such files.

If I had more time and were not so short of money, I would write such a
patch, but unfortunately I can't. I am not well enough to become a Wikimedia
developer, while MediaWiki freelancing here in Russia turned out to be a
financial disaster (not many MediaWiki-related jobs here). So I have currently
turned to coding for other frameworks and do not know whether I will return.
But I still read Wikitech.



[Bug 28956] Dumps should be incremental

2013-05-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #23 from Ariel T. Glenn ar...@wikimedia.org ---
(In reply to comment #22)

Are you aware of the adds/changes dumps, which are basically a daily dump of new
revisions (without, however, notification of deletions etc.)?



[Bug 28956] Dumps should be incremental

2013-05-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #21 from Ariel T. Glenn ar...@wikimedia.org ---
(In reply to comment #20)

> The biggest problem is slowness of xml dumps, so SQL dumps also should be
> created in such way.

If I understand you correctly, you're suggesting that the text revisions be
dumped using e.g. mysqldump in order to make them faster.  While the production
of the XML dumps for WMF projects is very slow for large projects, using
mysqldump isn't feasible, for a few reasons:

* Text revisions live in external storage clusters in separate databases and
tables.  Older revisions might live in a different cluster than newer ones.  For
any given revision the way to find out where the text content is stored is to
check the pointer in the wiki's text table.
* Some text revisions are hidden from public view (deleted or oversighted) and
should not be included in the dumps.
* We have all of the metadata that should accompany the text of each page, for
bot users, researchers and importers alike.  This is a convenience measure more
than anything else but a very popular one. Of course if there were some other
proposal for packaging the metadata in the glorious new dump format to come,
this issue could be addressed.



[Bug 28956] Dumps should be incremental

2013-05-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #22 from Dmitriy Sintsov ques...@rambler.ru ---
Sure, Wikimedia installations are very special, so SQL dumps are out of the
question for them.

But actually my idea is very different. I suggest splitting XML dumps into daily
files automatically during the backup / import process:
wpen_2013-05-04.xml
wpen_2013-05-05.xml
wpen_2013-05-06.xml

Of course, once a full day has passed and its dump file already exists, that
daily dump should not be re-created, just quickly skipped.

If there are too many XML files, one may either use a nested directory
tree:
wpen/2013/05/04.xml
wpen/2013/05/05.xml
wpen/2013/05/06.xml

or perform weekly dumps (numbered by week of the year):
wpen_2013-01.xml
wpen_2013-02.xml
...
wpen_2013-52.xml
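
The naming schemes above, and the rule that already-written days are skipped,
could be sketched roughly as follows. All function names are hypothetical, and
the weekly variant assumes ISO week numbering, which can reach week 53 in some
years; the comment above does not say which numbering is meant.

```python
import datetime
from typing import Iterable, List


def daily_name(prefix: str, day: datetime.date) -> str:
    """Flat daily file, e.g. wpen_2013-05-04.xml."""
    return f"{prefix}_{day.isoformat()}.xml"


def nested_name(prefix: str, day: datetime.date) -> str:
    """Nested directory tree, e.g. wpen/2013/05/04.xml."""
    return f"{prefix}/{day:%Y/%m/%d}.xml"


def weekly_name(prefix: str, day: datetime.date) -> str:
    """Weekly file named by year and week number (ISO weeks assumed)."""
    year, week, _weekday = day.isocalendar()
    return f"{prefix}_{year}-{week:02d}.xml"


def pending_days(prefix: str, first: datetime.date, today: datetime.date,
                 existing: Iterable[str]) -> List[datetime.date]:
    """List complete days in [first, today) that have no daily file yet.

    The current day is excluded because it has not fully passed.
    """
    have = set(existing)
    days = []
    day = first
    while day < today:
        if daily_name(prefix, day) not in have:
            days.append(day)
        day += datetime.timedelta(days=1)
    return days
```

A dump run would then only need to produce files for the days returned by
pending_days().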



[Bug 28956] Dumps should be incremental

2013-05-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Jeremy Coffman jcoffma...@yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jcoffma...@yahoo.com

--- Comment #16 from Jeremy Coffman jcoffma...@yahoo.com ---
Hello,

Here is my proposal:
http://www.mediawiki.org/wiki/User:J.a.coffman/GSoc_2013_Proposal

Thanks,
Jeremy Coffman



[Bug 28956] Dumps should be incremental

2013-05-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #17 from Quim Gil q...@wikimedia.org ---
(In reply to comment #10)
> So I had (not knowing about Wywin's) also started working on a proposal for
> this project too... [[mw:User:Legoktm/a]] (still in drafting).

Just a note to confirm that Legoktm has submitted a GSoC proposal related to
this report: https://www.mediawiki.org/wiki/User:Legoktm/GSoC_2013

And yes, we have received Jeremy's proposal as well. 

Good luck to all candidates!



[Bug 28956] Dumps should be incremental

2013-05-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Dmitriy Sintsov ques...@rambler.ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ques...@rambler.ru

--- Comment #18 from Dmitriy Sintsov ques...@rambler.ru ---
Why not just select a revision.rev_timestamp range to dump only revisions created
during some time interval (say, a day or a week)?
It should not be too hard to implement CLI options to select timestamp ranges
for maintenance/dumpBackup.php. Then such dumps can be made by Wikimedia.
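
A minimal sketch of what selecting by timestamp might look like, assuming
rev_timestamp is stored in MediaWiki's 14-digit YYYYMMDDHHMMSS form. The
function names and the query are illustrative only; no such option of
dumpBackup.php is implied.

```python
import datetime
from typing import Tuple


def day_range(day: datetime.date) -> Tuple[str, str]:
    """Return the [start, end) bounds of one UTC day as 14-digit timestamps."""
    start = datetime.datetime.combine(day, datetime.time.min)
    end = start + datetime.timedelta(days=1)
    fmt = "%Y%m%d%H%M%S"
    return start.strftime(fmt), end.strftime(fmt)


def revision_range_query(day: datetime.date) -> Tuple[str, Tuple[str, str]]:
    """Build a parameterised query for the revisions created on `day`.

    Hypothetical query; the half-open interval avoids counting a revision
    made exactly at midnight in two consecutive daily dumps.
    """
    sql = ("SELECT rev_id FROM revision "
           "WHERE rev_timestamp >= %s AND rev_timestamp < %s "
           "ORDER BY rev_id")
    return sql, day_range(day)
```

The smallest and largest rev_id returned could serve as the start and end of
a revision range for that day.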



[Bug 28956] Dumps should be incremental

2013-05-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Quim Gil q...@wikimedia.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pavlovic.sanja...@gmail.com

--- Comment #19 from Quim Gil q...@wikimedia.org ---
Just a note to say that Sanja Pavlovic has submitted a GSoC proposal related to
this report:
https://www.mediawiki.org/wiki/User:Sanja_pavlovic/GSOC/OPW_application

Good luck!



[Bug 28956] Dumps should be incremental

2013-05-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #20 from Dmitriy Sintsov ques...@rambler.ru ---
Even full dumps could be written into separate files per day or per week, so
the dump operation would be full and incremental at the same time. It would
produce a multi-file dump; however, maintenance/importDump.php could also be
modified to import from such multi-file dumps.

The biggest problem is slowness of xml dumps, so SQL dumps also should be
created in such way.



[Bug 28956] Dumps should be incremental

2013-05-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Jamie Thingelstad ja...@thingelstad.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ja...@thingelstad.com

--- Comment #14 from Jamie Thingelstad ja...@thingelstad.com ---
I found Svick's proposal via wikitech-l and noted on the talk page my hope
that it would support remote incremental dumps via the API. See the comment at

https://www.mediawiki.org/wiki/User_talk:Svick/Incremental_dumps#Consider_remote_backups_as_well_26965

I'm sharing it here so others can see and comment as well. Allowing remote
differentials would allow for services (like WikiApiary, Archiveteam and
others) to provide a robust backup service to large numbers of wikis.



[Bug 28956] Dumps should be incremental

2013-05-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #15 from Quim Gil q...@wikimedia.org ---
Just a note to say that Shao Hong has submitted a GSoC proposal related to
this report: https://www.mediawiki.org/wiki/User:Shaohong



[Bug 28956] Dumps should be incremental

2013-04-27 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #13 from Quim Gil q...@wikimedia.org ---
Just a note to say that Svick has submitted his proposal officially.

Legoktm, you are encouraged to apply to the GSoC tool asap.

Ariel, please sign up as possible mentor for all these proposals in the GSoC
tool. We are recommending two co-mentors per project, based on previous
experiences. All the better if a second co-mentor joins.



[Bug 28956] Dumps should be incremental

2013-04-26 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #12 from Svick gsv...@gmail.com ---
This project is starting to get crowded: I have also added my own proposal:
[[mw:User:Svick/Incremental dumps]].



[Bug 28956] Dumps should be incremental

2013-04-24 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Legoktm legoktm.wikipe...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |legoktm.wikipe...@gmail.com

--- Comment #10 from Legoktm legoktm.wikipe...@gmail.com ---
So I had (not knowing about Wywin's) also started working on a proposal for
this project too... [[mw:User:Legoktm/a]] (still in drafting).



[Bug 28956] Dumps should be incremental

2013-04-24 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #11 from Quim Gil q...@wikimedia.org ---
Competition is good! Having several students pursue a single project idea might
look like a lower chance of being accepted. However, it also shows strong
interest in the idea itself and therefore helps us promote it among the other
ideas proposed.

All this to say: keep working on a great proposal and good luck!

PS: this is one of the reasons why we encourage candidates to share their plans
in the community channels as soon as possible:
https://www.mediawiki.org/wiki/Mentorship_programs/Application_template



[Bug 28956] Dumps should be incremental

2013-04-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #9 from Quim Gil q...@wikimedia.org ---
Just a note to say that user Wywin has applied to GSoC with a proposal related
to this report. Good luck!

https://www.mediawiki.org/wiki/User:Wywin



[Bug 28956] Dumps should be incremental

2013-03-26 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #8 from Ariel T. Glenn ar...@wikimedia.org ---
brain dump of thoughts here:
http://www.mediawiki.org/wiki/User:ArielGlenn/Dumps_new_format_%28deltas,_changesets%29
 not meant to be binding in any way, some of it likely to be pure crapola too.



[Bug 28956] Dumps should be incremental

2013-03-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Quim Gil q...@wikimedia.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |q...@wikimedia.org

--- Comment #7 from Quim Gil q...@wikimedia.org ---
Ariel has proposed this feature request as a Google Summer of Code Project at
http://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#XML_dumps

We have accepted it and a shorter version is now listed at
https://www.mediawiki.org/wiki/Summer_of_Code_2013#Incremental_data_dumps

Pasting here Ariel's recommendation for an implementation, just in case:

This could be achieved by designing the right output format for the XML files
containing text for all revisions. It would need:
* a smart choice for compression of multiple items together,
* an index into the compressed blocks,
* a way to remove content quickly, possibly leaving zeroed blocks behind,
* a way to re-use empty blocks.

To use the new archive format, we would need:
* tools to convert to bz2 or 7z (so users can keep all their existing scripts
for the dumps),
* a format for storing isolated sets of changes (so dump users can download
just these sets),
* a script to apply those changes to the above format (so users can run the
script against the change set and their full dump to update their copy). It
would likely need to take as input an XML file of new pages and new revisions
for old pages, as well as a list of pages and/or revisions that have been
deleted in the meantime.

This would entail no changes to MediaWiki core; all of the work would be done
by a separate set of tools.
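
To make the archive-format requirements above more concrete, here is a toy
in-memory sketch: revisions are compressed in groups, an index maps each
revision id to its block, removal zeroes the old block, and zeroed space is
re-used first. The class, its method names, the JSON payload and the first-fit
strategy are all assumptions made for illustration, not the proposed format.

```python
import bz2
import json
from typing import Dict, List, Tuple


class BlockArchive:
    """Toy archive: groups of revisions compressed together, with an index."""

    def __init__(self) -> None:
        self.data = bytearray()                      # stands in for the archive file
        self.slots: Dict[int, Tuple[int, int]] = {}  # block id -> (offset, size)
        self.index: Dict[int, int] = {}              # revision id -> block id
        self.free: List[Tuple[int, int]] = []        # zeroed (offset, size) holes
        self.next_block = 0

    def _store(self, block_id: int, revisions: Dict[int, str]) -> None:
        payload = bz2.compress(json.dumps(revisions).encode("utf-8"))
        # First fit: re-use a zeroed hole if one is large enough.
        for i, (offset, size) in enumerate(self.free):
            if size >= len(payload):
                del self.free[i]
                if size > len(payload):
                    self.free.append((offset + len(payload), size - len(payload)))
                break
        else:
            # No suitable hole: grow the archive at the end.
            offset = len(self.data)
            self.data.extend(bytes(len(payload)))
        self.data[offset:offset + len(payload)] = payload
        self.slots[block_id] = (offset, len(payload))

    def _load(self, block_id: int) -> Dict[int, str]:
        offset, size = self.slots[block_id]
        raw = bz2.decompress(bytes(self.data[offset:offset + size]))
        # JSON object keys are strings, so convert them back to revision ids.
        return {int(k): v for k, v in json.loads(raw).items()}

    def _release(self, block_id: int) -> None:
        offset, size = self.slots.pop(block_id)
        self.data[offset:offset + size] = bytes(size)  # leave a zeroed block behind
        self.free.append((offset, size))

    def add_block(self, revisions: Dict[int, str]) -> int:
        """Compress a group of revisions together and index each of them."""
        block_id = self.next_block
        self.next_block += 1
        self._store(block_id, revisions)
        for rev_id in revisions:
            self.index[rev_id] = block_id
        return block_id

    def get(self, rev_id: int) -> str:
        return self._load(self.index[rev_id])[rev_id]

    def remove(self, rev_id: int) -> None:
        """Drop one revision; only its own block is rewritten."""
        block_id = self.index.pop(rev_id)
        remaining = self._load(block_id)
        del remaining[rev_id]
        self._release(block_id)
        if remaining:
            self._store(block_id, remaining)
```

The point of the index is that removing a revision touches a single block
rather than forcing the whole archive to be recompressed.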



[Bug 28956] Dumps should be incremental

2012-06-18 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Hydra hy...@alphacorp.tk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |hy...@alphacorp.tk
         Resolution|                            |FIXED

--- Comment #5 from Hydra hy...@alphacorp.tk 2012-06-18 13:08:45 UTC ---
Since November 2011, incremental dumps (also known as adds/changes dumps) have
been available at http://dumps.wikimedia.org/other/incr/.

However, this feature is still marked as experimental, so there is no
guarantee that it works fully.

Marking this bug as resolved.



[Bug 28956] Dumps should be incremental

2012-06-18 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Ariel T. Glenn ar...@wikimedia.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |

--- Comment #6 from Ariel T. Glenn ar...@wikimedia.org 2012-06-18 13:21:05 UTC ---
Actually the incrementals are adds/changes dumps, as documented at
http://wikitech.wikimedia.org/view/Dumps/Adds-changes_dumps

A bit more needs to be done before I would consider them equivalent to
incrementals, so reopening.



[Bug 28956] Dumps should be incremental

2011-05-20 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #3 from Mark A. Hershberger m...@everybody.org 2011-05-20 22:18:45 UTC ---
(In reply to comment #2)
> On the one hand it's sort of like security through obscurity; we're relying on
> good will and inconvenience more than anything else to make the system work.
> OTOH it's maybe better than ignoring the issue.  Thoughts?

I see your concerns.  Where should this be discussed?

> We would still need to generate fulls of course on a regular basis,

My thought, probably naive, is that the diffs between these fulls may be
sufficient for Ted's original request.  Probably not strictly source code diffs
(though, now I am curious), but maybe a log of changes could be created that
would be more compact than the full dump but still allow a person to take the
last image they had up to the latest.

Hrmm...  I smell a project if no one else has tried this yet.



[Bug 28956] Dumps should be incremental

2011-05-20 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Mark A. Hershberger m...@everybody.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|Unprioritized               |Normal

--- Comment #4 from Mark A. Hershberger m...@everybody.org 2011-05-20 22:19:12 UTC ---
(In reply to comment #2)
> On the one hand it's sort of like security through obscurity; we're relying on
> good will and inconvenience more than anything else to make the system work.
> OTOH it's maybe better than ignoring the issue.  Thoughts?

I see your concerns.  Where should this be discussed?

> We would still need to generate fulls of course on a regular basis,

My thought, probably naive, is that the diffs between these fulls may be
sufficient for Ted's original request.  Probably not strictly source code diffs
(though, now I am curious), but maybe a log of changes could be created that
would be more compact than the full dump but still allow a person to take the
last image they had up to the latest.

Hrmm...  I smell a project if no one else has tried this yet.



[Bug 28956] Dumps should be incremental

2011-05-14 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

--- Comment #2 from Ariel T. Glenn ar...@wikimedia.org 2011-05-14 06:03:05 UTC ---
Yes, actually I have a number of thoughts on the issue, but first we have to
deal with the political side of the issue, which is that deleted revisions
get deleted or oversighted for a reason, and if we only produce incremental
dumps on a regular basis, those revisions don't get removed from what's
produced.  At least, they wouldn't with the existing system.  We should talk
about the consequences of that.  We might be dealing with copyrighted material
which has since been removed, or information that identifies a user; those are
the two big cases in my mind.  The way the dumps are supposed to work is that
eventually we don't make the old copies public any more; space is reused, and
so downloaders pick up the new files.  Of course if someone wanted to keep a
copy of the old files they could, but in practice that doesn't happen for the
en dumps, as we saw several months ago when we had the server outage. 

On the one hand it's sort of like security through obscurity; we're relying on
good will and inconvenience more than anything else to make the system work. 
OTOH it's maybe better than ignoring the issue.  Thoughts? 

We would still need to generate fulls of course on a regular basis, and I am
guessing that we would need to provide a script that would merge incrementals
with fulls, since page moves and deletion information would need to be included
in the incrementals.
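
A rough sketch of what such a merge script might do, using plain dictionaries
in place of real dump files. The data layout (pages keyed by page id, with
"moves", "pages", "deleted_pages" and "deleted_revisions" sections in the
incremental) is entirely hypothetical; it only illustrates why moves and
deletions have to travel with the incremental.

```python
from typing import Any, Dict


def apply_incremental(full: Dict[int, Dict[str, Any]],
                      incr: Dict[str, Any]) -> Dict[int, Dict[str, Any]]:
    """Return a new full dump with one incremental applied (inputs untouched)."""
    merged = {
        page_id: {"title": page["title"], "revisions": dict(page["revisions"])}
        for page_id, page in full.items()
    }

    # Page moves: same page id, new title.
    for page_id, new_title in incr.get("moves", {}).items():
        if page_id in merged:
            merged[page_id]["title"] = new_title

    # New pages, and new revisions of existing pages.
    for page_id, page in incr.get("pages", {}).items():
        target = merged.setdefault(
            page_id, {"title": page["title"], "revisions": {}})
        target["revisions"].update(page["revisions"])

    # Deletions are applied last so that removed content never survives a merge.
    for page_id in incr.get("deleted_pages", []):
        merged.pop(page_id, None)
    for rev_id in incr.get("deleted_revisions", []):
        for page in merged.values():
            page["revisions"].pop(rev_id, None)

    return merged
```

Anyone who keeps the old full dump still has the deleted material, so this
does not settle the policy question raised above; it only keeps the merged
copy clean.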



[Bug 28956] Dumps should be incremental

2011-05-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=28956

Mark A. Hershberger m...@everybody.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m...@everybody.org
         AssignedTo|tf...@wikimedia.org         |ar...@wikimedia.org

--- Comment #1 from Mark A. Hershberger m...@everybody.org 2011-05-13 21:59:38 UTC ---
re-assigning to Ariel as she is the one responsible for backups.  Ariel,
thoughts?
