https://bugzilla.wikimedia.org/show_bug.cgi?id=54736

       Web browser: ---
            Bug ID: 54736
           Summary: Inconsistency bug in file re-upload (possible race
                    condition) potentially causing data loss
           Product: Wikimedia
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: Media storage
          Assignee: wikibugs-l@lists.wikimedia.org
          Reporter: bawolff...@gmail.com
                CC: aschulz4...@gmail.com, bawolff...@gmail.com,
                    bda...@wikimedia.org, fflo...@wikimedia.org,
                    mtrac...@member.fsf.org
    Classification: Unclassified
   Mobile Platform: ---

Denniss pointed me to https://commons.wikimedia.org/wiki/?curid=28611474

Basically, the original version of the file is missing (oi_archive_name is an
empty string). On top of that, the two old versions have the exact same upload
timestamp, down to the second, which seems unlikely to be correct (unless both
were uploaded by UploadWizard at the same time, perhaps using chunked upload or
the upload stash?).

The img_metadata looks similar, except that the orientation field differs. The
files also have different sha1s and different sizes. That would support the
hypothesis that the user saw the file looked wrong, rotated it in an image
manipulation program, and re-uploaded it (however, that does not fit with the
identical timestamps).

So the weird things happening here are:
*The original file is gone
*The second version has the exact same img_timestamp as the first version
**Furthermore, the log_timestamps are the same too. The log_ids are such that
it appears the timestamps are correct
*The page_id in the log entry for the second version is incorrect (it is 0,
which could indicate something went wrong, perhaps with slave lag. Normally the
logging code does a db query by itself to determine what log_page to use, and I
believe it hits a slave for that)
*There is no dummy edit to the file description page for the first re-upload
*It is hard to be sure, but it appears the original file stayed visible for a
long time (at least 23 hours after the second version was uploaded, as that was
when it was tagged as needing rotation). Then rotate bot came along and rotated
things, but it rotated the wrong file: the second version, which was already
rotated
**Possibly caused by a race condition with the varnishes: the re-upload sends
cache purges, but they happen before the first image is rendered, so the
thumbnail gets stuck in the varnish cache. (I think the issue where the original
file is gone is more important than thumbnails stuck in cache, assuming that is
what is happening here)
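
For anyone wanting to look for other files with the same symptoms, the first
two checks above could be sketched roughly as follows. This is only an
illustration: the rows are made-up placeholder data rather than the real rows
for this file, find_anomalies is a hypothetical helper, and oi_timestamp is
assumed to be the old-version counterpart of img_timestamp.

```python
from collections import Counter


def find_anomalies(old_versions):
    """Return notes for old-version rows that show the symptoms listed above."""
    notes = []
    # Symptom 1: an old version whose archived file name is empty.
    for row in old_versions:
        if row["oi_archive_name"] == "":
            notes.append(f"empty oi_archive_name at {row['oi_timestamp']}")
    # Symptom 2: several old versions sharing one timestamp, to the second.
    counts = Counter(row["oi_timestamp"] for row in old_versions)
    for timestamp, count in counts.items():
        if count > 1:
            notes.append(f"{count} old versions share timestamp {timestamp}")
    return notes


# Placeholder rows (hypothetical values, not taken from the database).
rows = [
    {"oi_archive_name": "", "oi_timestamp": "20000101000000"},
    {"oi_archive_name": "20000102000000!Example.jpg",
     "oi_timestamp": "20000101000000"},
]

for note in find_anomalies(rows):
    print(note)
```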

Possible theory: The user uses UploadWizard to upload all the files in some
folder. This person accidentally selects the non-rotated version, and then
later selects the rotated version, in the same batch, and enters both as though
they should have the same filename. UploadWizard uploads both, and tries to
publish both to the same file name. Since the long part of the upload has
already taken place (since it's stashed), only the quick publish step has to
happen, and the two publish steps can run at the same time. A race condition in
filebackend then causes data loss of the original file.
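
To make the theory concrete, here is a toy model of the kind of check-then-act
race I have in mind. It is not MediaWiki or filebackend code: the dict-based
store, the function names, the file name and the timestamp are all made up for
illustration, and the two-step publish is an assumption about how such a race
could arise, not a description of the actual code path.

```python
def check_exists(store, name):
    """Step 1 of a publish: does a current version of this file exist?"""
    return name in store["current"]


def write_file(store, name, data, existed, timestamp):
    """Step 2 of a publish: archive the old version if step 1 saw one, then write."""
    if existed:
        # Hypothetical archive key scheme, used only by this toy model.
        store["archive"][f"{timestamp}!{name}"] = store["current"][name]
    store["current"][name] = data


def simulate(interleaved):
    """Publish two stashed uploads to one name, serially or interleaved."""
    store = {"current": {}, "archive": {}}
    name, timestamp = "Example.jpg", "20000101000000"  # placeholders
    if interleaved:
        # Both publishes run their existence check before either one writes.
        a_existed = check_exists(store, name)
        b_existed = check_exists(store, name)
        write_file(store, name, "unrotated", a_existed, timestamp)
        write_file(store, name, "rotated", b_existed, timestamp)
    else:
        # Each publish completes before the next one starts.
        write_file(store, name, "unrotated", check_exists(store, name), timestamp)
        write_file(store, name, "rotated", check_exists(store, name), timestamp)
    return store


print("serial:     ", simulate(interleaved=False))
print("interleaved:", simulate(interleaved=True))
```

In the serial case the first upload ends up in the archive. In the interleaved
case neither publish sees an existing file, so nothing is archived and the
first upload is simply overwritten.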

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l