The file description pages contain "<!-- <metadata_mapped_json>" followed by the JSON data, and the field "gwtoolset-url-to-the-media-file" in that data looks like exactly what you need. So if you can get the full wikitext of the page that belongs to each file, it should be easy to parse out this part and store it as JSON. I do not know exactly how to fetch the wikitext, but that seems fairly trivial.

Kind regards,
Bas
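As a rough illustration of the parsing step, here is a minimal Python sketch. It assumes the JSON sits between "<metadata_mapped_json>" and "</metadata_mapped_json>" inside an HTML comment; the closing tag, the helper names and the sample wikitext are assumptions made for the example, so check them against a real file page first.

```python
import json
import re

# Assumed layout of the GWToolset comment in the wikitext:
#   <!-- <metadata_mapped_json>{...}</metadata_mapped_json> -->
# Verify this against an actual file description page before relying on it.
_METADATA_RE = re.compile(
    r"<!--\s*<metadata_mapped_json>(.*?)</metadata_mapped_json>\s*-->",
    re.DOTALL,
)


def extract_mapped_metadata(wikitext):
    """Return the embedded metadata as a dict, or None if the block is absent."""
    match = _METADATA_RE.search(wikitext)
    if match is None:
        return None
    # json.loads raises ValueError if the embedded text is not valid JSON.
    data = json.loads(match.group(1))
    return data if isinstance(data, dict) else None


def source_url(wikitext, field="gwtoolset-url-to-the-media-file"):
    """Return the media-file URL recorded at upload time, or None."""
    metadata = extract_mapped_metadata(wikitext)
    if metadata is None:
        return None
    return metadata.get(field)


# Hypothetical wikitext, made up for this example.
sample = (
    "== Summary ==\n"
    "<!-- <metadata_mapped_json>"
    '{"gwtoolset-url-to-the-media-file": "http://example.org/media/1.ogv"}'
    "</metadata_mapped_json> -->\n"
)
print(source_url(sample))
```

The wikitext itself can be requested from the API with action=query, prop=revisions and rvprop=content for a given file title; that step is left out here so the sketch runs without network access.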
Date: Thu, 26 Nov 2015 11:36:59 -0500
From: [email protected]
To: [email protected]
Subject: Re: [Glamtools] Getting metadata back from Wikimedia Commons

We don't have great support for this. The best I know of is the "credit" field in

https://commons.wikimedia.org/w/api.php?generator=categorymembers&gcmtitle=%20category:%20Media%20from%20Open%20Beelden%20%20&prop=imageinfo&gcmtype=file&iiprop=extmetadata|sha1&action=query&formatversion=2&gcmlimit=max&format=json

but you might need to parse some HTML (unless the data is included in EXIF/XMP, in which case there is a different API query you can use). Also keep in mind that you have to use the continue parameter to get the next page of results.

Hope that helps,
--bawolff

On Thursday, November 26, 2015, Jesse de Vos <[email protected]> wrote:
> Hi everyone,
>
> We’re trying to get a clearer picture of what material we have on Wikimedia
> Commons so that our next batch upload doesn’t duplicate material that is
> already on Commons. The category "Media from Open Beelden" contains all the
> files, and we would like to have all the metadata on Commons for that category
> (specifically the 'source' URL) to match against our new content upload.
>
> Does anyone know how to gather this using the Commons API? Basically, a call
> to the API with the “File:title” field that would return a JSON object with
> all the metadata is exactly what we need. Help would be much appreciated!
>
> Best,
>
> Jesse
>
> --
>
> Kind regards,
>
> Jesse de Vos
> Researcher Interactive and New Media
>
> T 035 - 677 39 37
> Available: Mon through Thu
>
> Nederlands Instituut voor Beeld en Geluid
> Media Parkboulevard 1, 1217 WE Hilversum | Postbus 1060, 1200 BB Hilversum
> | beeldengeluid.nl
> _______________________________________________
> Glamtools mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/glamtools
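The query bawolff describes, together with the continue parameter, could be sketched in Python roughly as follows. This is only an illustration: the function names, the User-Agent string and the injectable fetch argument are assumptions of the example, and the "Credit" value is returned as raw HTML that still needs parsing.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"


def fetch_json(params):
    """Perform one GET request against the Commons API and decode the JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url,
        # Placeholder User-Agent for this sketch; use your own contact details.
        headers={"User-Agent": "glam-metadata-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_category_files(category, fetch=fetch_json):
    """Yield every page object in the category, following 'continue' batches."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "generator": "categorymembers",
        "gcmtitle": category,
        "gcmtype": "file",
        "gcmlimit": "max",
        "prop": "imageinfo",
        "iiprop": "extmetadata|sha1",
    }
    continuation = {}
    while True:
        data = fetch({**params, **continuation})
        # With formatversion=2 the pages come back as a list.
        for page in data.get("query", {}).get("pages", []):
            yield page
        if "continue" not in data:
            break
        # Send the returned continue values back with the next request.
        continuation = data["continue"]


def credit_of(page):
    """Return the raw (HTML) 'Credit' value of a page, or None if missing."""
    info = (page.get("imageinfo") or [{}])[0]
    return info.get("extmetadata", {}).get("Credit", {}).get("value")
```

Calling iter_category_files("Category:Media from Open Beelden") would walk the whole category one batch at a time. Because a generator is combined with prop=imageinfo here, a page can show up in one batch without its imageinfo and again in a later batch, so results may need to be merged by title.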
