So... this seems to have snuck back in a month ago:
https://www.mediawiki.org/wiki/Special:Code/MediaWiki/101021
https://bugzilla.wikimedia.org/show_bug.cgi?id=21860
Have we resolved the deployment questions on how to actually do the change?
Just want to make sure ops has plenty of warning
On 28/11/11 08:29, Brion Vibber wrote:
So... this seems to have snuck back in a month ago:
https://www.mediawiki.org/wiki/Special:Code/MediaWiki/101021
https://bugzilla.wikimedia.org/show_bug.cgi?id=21860
I don't think it really snuck, Rob has been talking about it for a
while, see e.g.
I have no idea about the schema changes, but to choose a digest for
detection of identity reverts is pretty simple. The really difficult
part is to choose a locally sensitive hash or fingerprint that works
for very similar revisions with a lot of content.
I would propose that the digest is stored
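As a rough illustration of the "simple" half of that point, here is a minimal sketch of identity-revert detection with a digest. The function name and the (rev_id, text) input shape are assumptions made for this example, not anything from MediaWiki itself; it only catches byte-identical text and says nothing about the harder near-duplicate case.

```python
import hashlib

def find_identity_reverts(revisions):
    """Yield (rev_id, earlier_rev_id) when a revision's text is byte-identical
    to an earlier revision of the same page.

    `revisions` is an iterable of (rev_id, text) in chronological order.
    This shape is a hypothetical stand-in for real revision data.
    """
    seen = {}  # hex digest -> rev_id of the first revision with that text
    for rev_id, text in revisions:
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if digest in seen:
            yield rev_id, seen[digest]
        else:
            seen[digest] = rev_id

# Hypothetical page history: revision 3 restores the text of revision 1.
history = [(1, "A"), (2, "A vandalised"), (3, "A"), (4, "A improved")]
reverts = list(find_identity_reverts(history))
```

With a stored digest column, the same comparison could be done on short fixed-length strings instead of full revision text.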
On 17-09-2011, Sat, at 22:55 -0700, Robert Rohde wrote:
On Sat, Sep 17, 2011 at 4:56 PM, Anthony wikim...@inbox.org wrote:
snip
For offline analyses, there's no need to change the online database tables.
Need? That's debatable, but one of the major motivators is the
developers wikitech-l@lists.wikimedia.org
Sent: Sun, Sep 18, 2011 05:56:15 GMT+00:00
Subject: Re: [Wikitech-l] Adding MD5 / SHA1 column to revision table
(discussing r94289)
On Sat, Sep 17, 2011 at 4:56 PM, Anthony wikim...@inbox.org wrote:
On Sat, Sep 17, 2011 at 6:46 PM, Robert Rohde raro...@gmail.com
On 09/18/2011 08:55 AM, Robert Rohde wrote:
people find ways to improve the attacks on SHA-1. (The existing
attacks usually require the ability to feed arbitrary binary strings
into the hash function. Given that both browsers and Mediawiki will
tend to reject binary data placed in an edit
What is the threat?
-Original message-
From: Ilmari Karonen nos...@vyznev.net
To: Wikimedia developers wikitech-l@lists.wikimedia.org
Sent: Sun, Sep 18, 2011 20:20:34 GMT+00:00
Subject: Re: [Wikitech-l] Adding MD5 / SHA1 column to revision table
On Sun, Sep 18, 2011 at 7:24 AM, Russell N. Nelson - rnnelson
rnnel...@clarkson.edu wrote:
It is meaningless to talk about cryptography without a threat model, just as
Robert says. Is anybody actually attacking us?
You mean, like Grawp?
On Sun, Sep 18, 2011 at 2:33 AM, Ariel T. Glenn ar...@wikimedia.org wrote:
On 17-09-2011, Sat, at 22:55 -0700, Robert Rohde wrote:
On Sat, Sep 17, 2011 at 4:56 PM, Anthony wikim...@inbox.org wrote:
snip
For offline analyses, there's no need to change the online database
On Sun, Sep 18, 2011 at 7:24 AM, Russell N. Nelson - rnnelson
rnnel...@clarkson.edu wrote:
It is meaningless to talk about cryptography without a threat model, just as
Robert says. Is anybody actually attacking us? Or are we worried about
accidental collisions?
I believe it began as
On Sun, Sep 18, 2011 at 5:30 PM, Chad innocentkil...@gmail.com wrote:
On Sun, Sep 18, 2011 at 7:24 AM, Russell N. Nelson - rnnelson
rnnel...@clarkson.edu wrote:
It is meaningless to talk about cryptography without a threat model, just as
Robert says. Is anybody actually attacking us? Or are
On Sun, Sep 18, 2011 at 5:47 PM, Anthony wikim...@inbox.org wrote:
On Sun, Sep 18, 2011 at 5:30 PM, Chad innocentkil...@gmail.com wrote:
On Sun, Sep 18, 2011 at 7:24 AM, Russell N. Nelson - rnnelson
rnnel...@clarkson.edu wrote:
It is meaningless to talk about cryptography without a threat
Chad wrote:
For those of us who do not know...what the heck is a Grawp attack?
Does it involve generating hash collisions?
-Chad
It's the name of a wikipedia vandal.
http://en.wikipedia.org/wiki/User:Grawp
___
Wikitech-l mailing list
On Sun, Sep 18, 2011 at 5:50 PM, Chad innocentkil...@gmail.com wrote:
On Sun, Sep 18, 2011 at 5:47 PM, Anthony wikim...@inbox.org wrote:
On Sun, Sep 18, 2011 at 5:30 PM, Chad innocentkil...@gmail.com wrote:
On Sun, Sep 18, 2011 at 7:24 AM, Russell N. Nelson - rnnelson
rnnel...@clarkson.edu
On Sun, Sep 18, 2011 at 6:01 PM, Anthony wikim...@inbox.org wrote:
There's also a
description at http://en.wikipedia.org/wiki/User:Grawp , which does
not do justice to the mad hacker skillz of this individual and his
intent on finding bugs in mediawiki and exploiting them.
(and/or the Grawp
Anthony wrote:
It does not involve generating hash collisions, but it involves
finding various bugs in mediawiki and using them to vandalise, often
by injecting javascript. The best description I could find was at
Encyclopedia Dramatica, which seems to be taken down (there's a cache
if you
On Sun, Sep 18, 2011 at 7:20 PM, Anthony wikim...@inbox.org wrote:
On Sun, Sep 18, 2011 at 7:07 PM, bawolff bawolff...@gmail.com wrote:
Anthony wrote:
The pages you link to seem to indicate he's nothing more than a
willy-on-wheels type vandal, who at worst tricked an admin into doing
a delete
On Fri, Sep 16, 2011 at 6:48 PM, Thomas Gries m...@tgries.de wrote:
Was there a certain reason to choose base 36?
Why not recode to base 62 and save 3 bytes per checksum?
I don't know, this was way, way before my time. But then, why use base
62 if you can use base 64? Encoders/decoders for
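To make the base 36 / base 62 / base 64 comparison concrete, here is a small sketch that counts how many characters a digest needs in each base and converts one digest as a check. It treats the digest as one large integer written positionally; the helper names and the sample input are made up for the example, and padded Base64 as produced by standard encoders would be slightly longer than the count shown here.

```python
import hashlib
import string

def encoded_length(bits, base):
    """Number of base-`base` digits needed for the largest `bits`-bit value."""
    n = (1 << bits) - 1
    digits = 0
    while n > 0:
        n //= base
        digits += 1
    return digits

def to_base(n, alphabet):
    """Write a non-negative integer using `alphabet` as the digit set."""
    if n == 0:
        return alphabet[0]
    base = len(alphabet)
    out = []
    while n > 0:
        n, rem = divmod(n, base)
        out.append(alphabet[rem])
    return "".join(reversed(out))

BASE36 = string.digits + string.ascii_lowercase
BASE62 = BASE36 + string.ascii_uppercase

# Worst-case lengths for a 128-bit (MD5) and a 160-bit (SHA-1) digest.
lengths = {
    (name, base): encoded_length(bits, base)
    for name, bits in (("md5", 128), ("sha1", 160))
    for base in (16, 36, 62, 64)
}

# Convert one sample digest (hypothetical input text).
digest = int(hashlib.md5(b"example text").hexdigest(), 16)
as_base36 = to_base(digest, BASE36)
as_base62 = to_base(digest, BASE62)
```

By this count an MD5 digest needs 32 characters in hex, 25 in base 36, and 22 in either base 62 or base 64, which matches the 3-byte saving mentioned above. A SHA-1 digest needs 40, 31, 27 and 27 respectively.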
Roan Kattouw wrote:
On Fri, Sep 16, 2011 at 6:48 PM, Thomas Gries m...@tgries.de wrote:
Was there a certain reason to choose base 36?
Why not recode to base 62 and save 3 bytes per checksum?
I don't know, this was way, way before my time. But then, why use base
62 if you can use base
On Sat, Sep 17, 2011 at 8:26 AM, Roan Kattouw roan.katt...@gmail.com wrote:
Minor detail: I think it's more likely we'll use SHA-1 hashes rather
than MD5 hashes.
Is there a good reason to prefer SHA-1?
Both have weaknesses allowing one to construct a collision (with
considerable effort), but I
On Sat, Sep 17, 2011 at 6:46 PM, Robert Rohde raro...@gmail.com wrote:
Is there a good reason to prefer SHA-1?
Both have weaknesses allowing one to construct a collision (with
considerable effort)
Considerable effort? I can create an MD5 collision in a few minutes
on my home computer. Is
On Sat, Sep 17, 2011 at 4:56 PM, Anthony wikim...@inbox.org wrote:
On Sat, Sep 17, 2011 at 6:46 PM, Robert Rohde raro...@gmail.com wrote:
Is there a good reason to prefer SHA-1?
Both have weaknesses allowing one to construct a collision (with
considerable effort)
Considerable effort? I can
RE:
http://www.mediawiki.org/wiki/Requests_for_comment/Database_field_for_checksum_of_page_text#Field_type
Recently, Adding MD5 / SHA1 column to revision table (discussing r94289)
was discussed.
For some applications, I use the technique of representing the 128 bit
of md5 or other checksums
On Fri, Sep 16, 2011 at 8:15 AM, Thomas Gries m...@tgries.de wrote:
For some applications, I use the technique of representing the 128 bit
of md5 or other checksums
as base-62 character strings
instead of hexadecimal (base-16) strings.
MediaWiki already uses a similar
On 16.09.2011 11:24, Roan Kattouw wrote:
For some applications, I use the technique of representing the 128 bit
of md5 or other checksums
as base-62 character strings
instead of hexadecimal (base-16) strings.
MediaWiki already uses a similar technique, storing SHA-1 hashes
On 9/16/11 9:48 AM, Thomas Gries wrote:
On 16.09.2011 11:24, Roan Kattouw wrote:
For some applications, I use the technique of representing the 128 bit
of md5 or other checksums
as base-62 character strings
instead of hexadecimal (base-16) strings.
MediaWiki already uses
On 11-09-16 09:48 AM, Thomas Gries wrote:
On 16.09.2011 11:24, Roan Kattouw wrote:
For some applications, I use the technique of representing the 128 bit
of md5 or other checksums
as base-62 character strings
instead of hexadecimal (base-16) strings.
MediaWiki already uses a
On Fri, Sep 16, 2011 at 9:48 AM, Thomas Gries m...@tgries.de wrote:
On 16.09.2011 11:24, Roan Kattouw wrote:
For some applications, I use the technique of representing the 128 bit
of md5 or other checksums
as base-62 character strings
instead of hexadecimal (base-16)
2011/9/4 MZMcBride z...@mzmcbride.com
Diederik van Liere wrote:
I've suggested to generate bulk checksums as well but both Brion and Ariel see
the primary purpose of this field to check the validity of the dump generating
process and so they want to generate the checksums straight from
On Sat, Sep 3, 2011 at 12:33 AM, Rob Lanphier ro...@wikimedia.org wrote:
I generally suspect that a standard index is going to be a waste for
the most urgent uses of this. It will rarely be interesting to search
for common hashes between articles. The far more common case will be
to search
Thanks for moving the page.
Diederik
On 2011-09-04, at 3:29 PM, Krinkle wrote:
2011/9/4 MZMcBride z...@mzmcbride.com
Diederik van Liere wrote:
I've suggested to generate bulk checksums as well but both Brion and Ariel see
the primary purpose of this field to check the validity of the dump
Hi,
I've suggested to generate bulk checksums as well but both Brion and Ariel see
the primary purpose of this field to check the validity of the dump generating
process and so they want to generate the checksums straight from the external
storage.
In a general sense, there are two use cases
On Sat, Sep 3, 2011 at 2:20 AM, Asher Feldman afeld...@wikimedia.org wrote:
Is code written to populate rev_sha1 on each new edit?
I believe that was part of Aaron's code that got reverted, yes.
Offline generation of hashes is definitely possible, but the only
reason you'd do it is to minimize
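For what offline generation might look like, here is a minimal sketch that hashes revision texts and writes tab-separated (rev_id, digest) rows for a later bulk load. The function name, the input shape, and the choice of hex output are all assumptions for illustration; a real backfill would read from dumps or external storage and use whatever encoding the schema settles on.

```python
import csv
import hashlib
import io

def write_hash_rows(revisions, out):
    """Write one "rev_id<TAB>sha1" line per revision and return the row count.

    `revisions` is an iterable of (rev_id, text); this shape is hypothetical.
    The digest is written as 40 hex characters purely for simplicity.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    count = 0
    for rev_id, text in revisions:
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        writer.writerow([rev_id, digest])
        count += 1
    return count

# Usage with an in-memory buffer standing in for an output file.
buf = io.StringIO()
rows_written = write_hash_rows([(1, ""), (2, "hello")], buf)
```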
On Thu, Aug 18, 2011 at 7:40 AM, Diederik van Liere dvanli...@gmail.com wrote:
Hi!
I am starting this thread because Brion's revision r94289 reverted
r94289 [0] stating "core schema change with no discussion" [1].
Bumping this: What are the remaining open questions regarding this
schema change?
Would it be possible to generate offline hashes for the bulk of our revision
corpus via dumps and load that into prod to minimize the time and impact of
the backfill?
When using for analysis, will we wish the new columns had partial indexes
(first 6 characters?)
Is code written to populate
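One rough way to reason about a 6-character prefix index is to estimate how many rows would share each prefix. The sketch below assumes uniformly distributed hashes; the row count is a made-up placeholder, and both alphabet sizes are shown because the encoding of the column is an open question in this thread.

```python
def expected_rows_per_prefix(total_rows, prefix_chars, alphabet_size):
    """Average number of rows sharing one prefix, assuming uniform hashes."""
    buckets = alphabet_size ** prefix_chars
    return total_rows / buckets

# Hypothetical table size, chosen only for illustration.
HYPOTHETICAL_ROWS = 500_000_000

hex_estimate = expected_rows_per_prefix(HYPOTHETICAL_ROWS, 6, 16)
base36_estimate = expected_rows_per_prefix(HYPOTHETICAL_ROWS, 6, 36)
```

Under these assumptions a 6-character hex prefix leaves a few dozen candidate rows per lookup, while a 6-character base 36 prefix leaves fewer than one on average.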
Bug 2939 is one relevant bug to this, it could probably use an index.
[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=2939
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
On 11-09-02 05:20 PM, Asher Feldman wrote:
Would it be possible to generate offline hashes for
On Fri, Sep 2, 2011 at 5:47 PM, Daniel Friesen
li...@nadir-seen-fire.com wrote:
On 11-09-02 05:20 PM, Asher Feldman wrote:
When using for analysis, will we wish the new columns had partial indexes
(first 6 characters?)
Bug 2939 is one relevant bug to this, it could probably use an index.
[1]
On 11-09-02 09:33 PM, Rob Lanphier wrote:
On Fri, Sep 2, 2011 at 5:47 PM, Daniel Friesen
li...@nadir-seen-fire.com wrote:
On 11-09-02 05:20 PM, Asher Feldman wrote:
When using for analysis, will we wish the new columns had partial indexes
(first 6 characters?)
Bug 2939 is one relevant bug to
Hi!
I am starting this thread because Brion's revision r94289 reverted
r94289 [0] stating "core schema change with no discussion" [1].
Bugs 21860 [2] and 25312 [3] advocate for the inclusion of a hash
column (either md5 or sha1) in the revision table. The primary use
case of this column will be to