bawolff,

i don’t think the issue with the records has to do with the characters in the 
xml file. i think it has to do with commons timing out while downloading large 
mediafiles from a web page, and possibly with the mediafile being served in a 
way that the UploadBase::verifyFile() method doesn’t like.

if you, or anyone on the list, have some time to review the latest patch at 
https://gerrit.wikimedia.org/r/#/c/127839/, please do so, and +2 it if you’re 
okay with it, or let me know what you think needs to be changed, so that it can 
be deployed to the beta cluster and production servers. once the patch has been 
deployed to the beta cluster we can test fae’s records using this new preview 
method.


with kind regards,
dan


On May 2, 2014, at 09:00 , Brian Wolff <[email protected]> wrote:

> Could you include what the binary code for your ae was if possible (on unix 
> computers, possibly also mac, the hd or hexdump command can tell you this) or 
> just attach the xml in question to a bug (since there is a possibility that 
> your email client might change the character)?
> 
> The encoding should be utf8 with NFC, but even if it's not quite correct, 
> gw/mw should convert it.
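
A minimal Python sketch of the byte check asked about above, for cases where
hd or hexdump is not to hand. It prints the UTF-8 bytes of a string and reports
whether the string is already in NFC. The helper name describe() and the sample
strings are illustrative assumptions, not taken from the XML in question.

```python
import unicodedata


def describe(text):
    """Return (UTF-8 bytes as hex, True if text is already NFC)."""
    raw = text.encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    is_nfc = unicodedata.normalize("NFC", text) == text
    return hex_bytes, is_nfc


# U+00C6 is the precomposed ae ligature; in UTF-8 it is the two bytes c3 86.
print(describe("\u00c6gean"))

# "e" followed by a combining acute accent (U+0301) is valid Unicode
# but not NFC; normalizing it yields the single code point U+00E9.
decomposed = "e\u0301"
print(describe(decomposed))
print(describe(unicodedata.normalize("NFC", decomposed)))
```

Running the same function over the title field read from the XML file would
show whether the editor or the mail client changed the character.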
> 
> --bawolff
> On May 2, 2014 3:42 AM, "Fæ" <[email protected]> wrote:
> >
> > Thanks for the detailed investigation Dan. There must be some oddity in the 
> > way I'm creating my xml (Python generated, then edited in JEdit for any 
> > tweaks, should be 'utf-8') so I'll continue plugging at it.
> >
> > I keep missing your emails and finding them under my spam folder, no idea 
> > why.
> >
> > Fae
> >
> >
> > On 1 May 2014 20:04, dan entous <[email protected]> wrote:
> >>
> >> characters
> >> ----------
> >> i have a test xml i use to test titles and added the characters you 
> >> mentioned. i had no problem uploading the test xml file. here are 2 
> >> results that seem to indicate that there should not be an issue with the 
> >> characters:
> >>
> >> http://commons.wikimedia.beta.wmflabs.org/wiki/File:The_%22King%E2%80%99s_of_Hungary_-_%C3%86%22_holding_c%C3%B6uncil_%26_in_his_tent_%26_on_the_battl%C3%A9field_-_Froissart%27s_Chronicles_(Volume_IV,_part_2)_(1470-1475),_f.84_-_BL_Harley_MS_4380.jpg
> >>
> >> http://commons.wikimedia.beta.wmflabs.org/wiki/File:Dice_players_-_Lo_L%C3%ADbro_de_Multi_B%C8%A9lli_Mirac%C3%BCli_(14th_C),_f.9v_-_BL_%C3%85dd_MS_22557.jpg
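
To see which characters such a title actually carries, the percent-encoded
path can be decoded. This is only a sketch: the path segment is copied from
the second URL above, and the variable names are illustrative.

```python
from urllib.parse import unquote

# Path segment copied from the second beta-cluster URL above.
encoded = (
    "File:Dice_players_-_Lo_L%C3%ADbro_de_Multi_B%C8%A9lli_Mirac%C3%BCli"
    "_(14th_C),_f.9v_-_BL_%C3%85dd_MS_22557.jpg"
)

# unquote() decodes the %XX escapes as UTF-8 by default;
# MediaWiki shows underscores in titles as spaces.
title = unquote(encoded).replace("_", " ")

# Collect the distinct non-ASCII characters, sorted by code point.
non_ascii = sorted({ch for ch in title if ord(ch) > 127})

print(title)
print([f"U+{ord(ch):04X}" for ch in non_ascii])
```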
> >>
> >>
> >> example record
> >> --------------
> >> i tested the example record locally and after about 2 minutes i got the 
> >> message:
> >>
> >> The file you submitted was too large. original URL: 
> >> http://link.nypl.org/2Qqj_oLvSbWRwPxtB1rq_wZ evaluated URL: 
> >> http://link.nypl.org/2Qqj_oLvSbWRwPxtB1rq_wZ
> >>
> >> my wiki was set to a limit of 100mb, so i up’d it to 1000mb.
> >>
> >> i also switched to the new preview branch i have in gerrit, 
> >> https://gerrit.wikimedia.org/r/#/c/127839/, for bug 
> >> https://bugzilla.wikimedia.org/show_bug.cgi?id=63864, which no longer 
> >> downloads an image to the wiki during the preview step. instead it 
> >> downloads all mediafiles in a background job.
> >>
> >> the job successfully completed after 3 minutes and the image was viewable 
> >> in my local wiki.
> >>
> >> i also took a look at our wikitech instance and saw that you had uploaded 
> >> the image there without issue. i also repeated the upload but got the 
> >> message:
> >>
> >> “This file did not pass file verification.”
> >>
> >> this seems to have been thrown by UploadBase.php, so i'd have to look 
> >> further into that issue. but i also suspect that commons may have just 
> >> timed out on the download of the image in the preview step. this type of 
> >> error seems similar to bug 63864. i just need someone to +2 the patch i 
> >> made so that we can test the new preview step on the beta cluster.
> >>
> >>
> >> with kind regards,
> >> dan
> >>
> >>
> >> On May 1, 2014, at 18:35 , Federico Leva (Nemo) <[email protected]> wrote:
> >>
> >> > Fæ, 01/05/2014 14:59:
> >> >> Instead, a good example of characters giving a problem is the file at
> >> >> [1]. This caused the GWT run to halt but was successfully loaded once
> >> >> I changed the "Æ" (ae ligature) character in Ægean to a simple "A".
> >> >> The only cause of this failure must have been the character, which is
> >> >> allowed in the mediawiki software.
> >> >>
> >> >> Links
> >> >> 1.https://commons.wikimedia.org/wiki/File:A_new_map_of_the_islands_of_the_Agean_Sea,_together_with_the_island_of_Crete,_and_the_adjoining_isles._NYPL1630716.tiff
> >> >
> >> > Thanks, this gives you clear steps to reproduce and makes a valuable bug 
> >> > report. Please file. :)
> >> >
> >> > Nemo
> >> >
> >> >
> >> > _______________________________________________
> >> > Glamtools mailing list
> >> > [email protected]
> >> > https://lists.wikimedia.org/mailman/listinfo/glamtools
> >>
> >>
> >
> >
> >
> >
> > -- 
> > [email protected] https://commons.wikimedia.org/wiki/User:Fae
> > Personal and confidential, please do not circulate or re-quote.
> >
> >

