Re: [Wikidata] Update frequency on the Wikidata Query API
Just ask WDQ for a list of item ids, then pass them to the live API (wbgetentities: https://www.wikidata.org/w/api.php?action=help&modules=wbgetentities). You may miss some recently edited items, but at least you won't base your edits on outdated revisions (and the baserevid argument of wbeditentity, https://www.wikidata.org/w/api.php?action=help&modules=wbeditentity, completely eliminates that risk).

On 19/06/2015 00:10, Andra Waagmeester wrote:
> Indeed, the ultimate truth source is on the Wikidata site itself. However, I am not aware of a way to query the Wikidata site for a list of items fitting a certain condition (e.g. all Wikidata items containing a claim with the NCBI Entrez Gene (P351) property). It is here that I need to rely on WDQ (and WDQS) and potentially risk missing existing items due to delays in updating WDQ (and WDQS). I would like to know if I could rely on a given time frame (be it seconds, hours, days, or one week). I currently assume a delay of a week, but I don't know how accurate this assumption is.
>
> Regards,

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
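The workflow suggested above (take ids from WDQ, re-read them from the live API, then edit with baserevid) could be sketched roughly as follows. This is only a sketch under assumptions: the helper names are hypothetical, the batch size of 50 reflects the per-request id limit for regular accounts, and the actual HTTP calls, login and edit token handling are left out.

```python
import json
from typing import Any, Dict, Iterator, List

API_URL = "https://www.wikidata.org/w/api.php"
BATCH_SIZE = 50  # wbgetentities id limit per request for regular (non-bot) accounts


def batches(ids: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def getentities_params(ids: List[str]) -> Dict[str, str]:
    """Parameters for one wbgetentities request.

    props=info makes the response carry `lastrevid` for every entity,
    which is the value to pass on as baserevid when editing.
    """
    return {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "props": "info|claims",
        "format": "json",
    }


def editentity_params(entity_id: str, data: Dict[str, Any],
                      baserevid: int, token: str) -> Dict[str, str]:
    """Parameters for a wbeditentity POST that is tied to a known revision.

    With baserevid set, the API can detect that the item changed after
    it was read instead of silently building on an outdated revision.
    """
    return {
        "action": "wbeditentity",
        "id": entity_id,
        "data": json.dumps(data),
        "baserevid": str(baserevid),
        "token": token,
        "format": "json",
    }
```

A bot would send one GET per batch to API_URL with getentities_params(...), remember each entity's lastrevid, and later POST editentity_params(...) with that value.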
Re: [Wikidata] Update frequency on the Wikidata Query API
Wikidata is a wiki like any other, so you can just click "What links here" on Property:P351 and pick up all items that way.

On Fri, Jun 19, 2015 at 9:14 AM, Ricordisamoa ricordisa...@openmailbox.org wrote:
> Just ask WDQ for a list of item ids, then pass them to the live API (wbgetentities: https://www.wikidata.org/w/api.php?action=help&modules=wbgetentities). You may miss some recently edited items, but at least you won't base your edits on outdated revisions (and the baserevid argument of wbeditentity, https://www.wikidata.org/w/api.php?action=help&modules=wbeditentity, completely eliminates that risk).
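For a bot, the "What links here" idea above maps onto the MediaWiki API's list=backlinks query. The following is a rough sketch, not a tested bot: the helper names are hypothetical, the HTTP request itself is omitted, and the response shape assumes the current continuation format. Note also that backlinks cover every page linking to the property, so items using P351 only in a qualifier or reference would presumably be listed too.

```python
from typing import Any, Dict, List, Optional


def backlinks_params(property_id: str, cont: Optional[str] = None) -> Dict[str, str]:
    """Parameters for listing main-namespace pages that link to a property page."""
    params = {
        "action": "query",
        "list": "backlinks",
        "bltitle": "Property:" + property_id,
        "blnamespace": "0",  # namespace 0 is where items (Q...) live
        "bllimit": "max",
        "format": "json",
    }
    if cont is not None:
        # Value taken from the previous response to fetch the next page.
        params["blcontinue"] = cont
    return params


def item_ids(response: Dict[str, Any]) -> List[str]:
    """Extract page titles (item ids such as 'Q42') from one response."""
    return [page["title"] for page in response.get("query", {}).get("backlinks", [])]


def next_continue(response: Dict[str, Any]) -> Optional[str]:
    """Return the continuation token, or None when the listing is complete."""
    return response.get("continue", {}).get("blcontinue")
```

The caller would loop: request with backlinks_params("P351", cont), collect item_ids(response), and stop once next_continue(response) returns None.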
Re: [Wikidata] Update frequency on the Wikidata Query API
Hi there,

Are you aware of the revision URL parameter? See the last paragraph of https://www.wikidata.org/wiki/Wikidata:Data_access#Linked_Data_interface. Hopefully this helps.

Cheers,
Tom

--
Dr. Thomas Steiner, Employee, Google Inc.
http://blog.tomayac.com, http://twitter.com/tomayac

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iFy0uwAntT0bE3xtRa5AfeCheCkthAtTh3reSabiGbl0ck0fjumBl3DCharaCTersAttH3b0ttom.hTtP5://xKcd.c0m/1181/
-----END PGP SIGNATURE-----
Re: [Wikidata] Update frequency on the Wikidata Query API
Sorry, realizing only now that this is for the Query API, not the Linked Data interface. My bad, please ignore my previous reply.

--
Dr. Thomas Steiner, Employee, Google Inc.
http://blog.tomayac.com, http://twitter.com/tomayac
Re: [Wikidata] Update frequency on the Wikidata Query API
Hi!

> I have seen that SPARQL query service and it indeed is an interesting alternative. In terms of stability and update frequency how different is the SPARQL query service from the Wikidata Query API?

In terms of stability: it's beta, so while we try to keep it up and running smoothly, it is not out of the question that it can be taken down at any moment, either because we found a bug or because we need to update something, and the data model can change too. We do not expect substantial changes in the data model anymore, and we try to keep it up and running (it doesn't help that we are in the middle of a large labs outage right now: https://wikitech.wikimedia.org/wiki/Incident_documentation/20150617-LabsNFSOutage ) and synced continuously (i.e. no more than minutes behind Wikidata edits), but as long as it's beta we can give no guarantees on anything. We're working hard to make it production-quality, but that will take a bit more time.

The difference between WDQ and WDQS/SPARQL is that SPARQL is a full-featured language for querying triple-based (RDF) data sets, and it allows very complex queries. It is also a standard in the linked data world. You can use the translator (http://tools.wmflabs.org/wdq2sparql/w2s.php) - once the labs outage ends, of course - to convert between WDQ syntax and SPARQL. Also check out the other links on the WDQS beta page for short intros about how things are done with SPARQL and examples of queries you can run.

--
Stas Malyshev
smalys...@wikimedia.org
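As an illustration of the kind of SPARQL query discussed above, here is a minimal sketch that builds a query for all items carrying a given property (P351 in this thread). Several things are assumptions rather than facts from the thread: the `wdt:` prefix and the endpoint URL are those of the later production service, not necessarily the beta data model or host mentioned here, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the later production endpoint, not the beta host from the thread.
ENDPOINT = "https://query.wikidata.org/sparql"


def items_with_property(property_id: str, limit: int = 100) -> str:
    """Build a SPARQL query listing items that have a direct claim for a property."""
    if not (property_id.startswith("P") and property_id[1:].isdigit()):
        raise ValueError("expected a property id such as 'P351'")
    return (
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "SELECT ?item ?value WHERE {\n"
        f"  ?item wdt:{property_id} ?value .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def query_url(sparql: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": sparql, "format": "json"})
```

Keeping the LIMIT small is one way to stay under the timeout limits mentioned for the beta; a full export of a heavily used property would likely need paging.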
Re: [Wikidata] Update frequency on the Wikidata Query API
Hi!

> How often is the WDQ api really being updated? Is it possible to query wikidata live, with WDQ and if not, are there alternatives that would allow this?

We currently have a SPARQL query service in beta [1], which is updated constantly from Wikidata. Note that since it's beta, it is not yet stable, either operationally or data-model-wise, so please be aware of this. It also has timeout limits that, for now, won't allow you to run queries that are too complex. But if you want to check it out and see if it fits your use case, you are most welcome.

[1] http://wdqs-beta.wmflabs.org/

--
Stas Malyshev
smalys...@wikimedia.org
Re: [Wikidata] Update frequency on the Wikidata Query API
Hi Stas,

I have seen that SPARQL query service and it is indeed an interesting alternative. In terms of stability and update frequency, how different is the SPARQL query service from the Wikidata Query API?

Cheers,
Andra

On Thu, Jun 18, 2015 at 9:20 PM, Stas Malyshev smalys...@wikimedia.org wrote:
> Hi!
>
>> How often is the WDQ api really being updated? Is it possible to query wikidata live, with WDQ and if not, are there alternatives that would allow this?
>
> We currently have a SPARQL query service in beta [1], which is updated constantly from Wikidata. Note that since it's beta, it is not yet stable, either operationally or data-model-wise, so please be aware of this. It also has timeout limits that, for now, won't allow you to run queries that are too complex. But if you want to check it out and see if it fits your use case, you are most welcome.
>
> [1] http://wdqs-beta.wmflabs.org/
Re: [Wikidata] Update frequency on the Wikidata Query API
On 18.06.2015 21:40, Thomas Steiner wrote:
> Sorry, realizing only now that this is for the Query API, not the Linked Data interface. My bad, please ignore my previous reply.

Maybe it would still be a good idea for bots to do a check against the live JSON data right before the edit. Checking live JSON is an additional step but should hardly slow down bots (which are throttled anyway).

The way that updates work *in all systems* (polling small lists of recent changes at intervals and hoping that this leads to a complete change history), it seems quite possible that such systems will sometimes miss an update, at least in the long run and under varying conditions (high server load, network troubles, update script down for a while, whatever). Insufficient update frequency is maybe not the biggest problem here (it should be in the range of one to a few minutes for all of the services).

Regards,
Markus
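The pre-edit check against live JSON suggested above might look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the entity dictionary is assumed to be the per-item object from the live JSON (for example from Special:EntityData or wbgetentities), and fetching it over HTTP is left out.

```python
from typing import Any, Dict, List

# Live JSON for one item; the item's data sits under ["entities"][<id>] in the response.
ENTITY_DATA_URL = "https://www.wikidata.org/wiki/Special:EntityData/{id}.json"


def claim_values(entity: Dict[str, Any], property_id: str) -> List[Any]:
    """Collect the main values of all statements for one property."""
    values = []
    for statement in entity.get("claims", {}).get(property_id, []):
        snak = statement.get("mainsnak", {})
        # 'somevalue' and 'novalue' snaks carry no datavalue, so skip them.
        if snak.get("snaktype") == "value":
            values.append(snak["datavalue"]["value"])
    return values


def safe_to_add(entity: Dict[str, Any], property_id: str,
                value: Any, expected_revid: int) -> bool:
    """Return True only if the item is unchanged and lacks the value.

    `expected_revid` is the revision the bot based its decision on,
    e.g. the one seen when the query service result was produced.
    """
    if entity.get("lastrevid") != expected_revid:
        return False  # the item was edited in the meantime; re-evaluate it
    return value not in claim_values(entity, property_id)
```

The design choice here is to refuse the edit on any revision mismatch rather than trying to merge, which keeps the bot conservative at the cost of occasionally re-processing an item.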
Re: [Wikidata] Update frequency on the Wikidata Query API
Indeed, the ultimate truth source is on the Wikidata site itself. However, I am not aware of a way to query the Wikidata site for a list of items fitting a certain condition (e.g. all Wikidata items containing a claim with the NCBI Entrez Gene (P351) property). It is here that I need to rely on WDQ (and WDQS) and potentially risk missing existing items due to delays in updating WDQ (and WDQS). I would like to know if I could rely on a given time frame (be it seconds, hours, days, or one week). I currently assume a delay of a week, but I don't know how accurate this assumption is.

Regards,

On Thu, Jun 18, 2015 at 10:23 PM, Stas Malyshev smalys...@wikimedia.org wrote:
> Hi!
>
>> The way that updates work *in all systems* (polling small lists of recent changes at intervals and hoping that this leads to a complete change history), it seems quite possible that such systems will sometimes miss an update, at least in the long run and under varying conditions (high server load, network troubles, update script down for a while, whatever). Insufficient update frequency is maybe not the biggest problem here (it should be in the range of one to a few minutes for all of the services).
>
> Very important point with which I agree - it is completely possible that update polling misses an update. WDQS is no exception, and it usually does not treat this as a problem, as the next update can fill in the missed one. However, the ultimate truth source is on the Wikidata site only.
>
> Beware of the caches though - if you ask for the same data on the same URL twice, I think you can get the same result even if the underlying data changed in the meantime.
>
> --
> Stas Malyshev
> smalys...@wikimedia.org
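To make the quoted point about polling concrete, here is a toy model of how reading only a small window of recent changes at intervals can skip changes during a burst. It is purely illustrative and makes no claim about how WDQ or WDQS actually implement their updaters; the function name and the numbers are invented for the example.

```python
from typing import List


def missed_changes(changes_per_interval: List[int], window: int) -> List[int]:
    """Simulate a poller that reads only the newest `window` changes each interval.

    Changes get sequential ids starting at 1. After every interval the
    poller looks at the tail of the feed; anything that has already
    scrolled out of that tail is never seen.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    feed: List[int] = []  # all change ids so far, oldest first
    seen = set()
    next_id = 1
    for count in changes_per_interval:
        for _ in range(count):
            feed.append(next_id)
            next_id += 1
        seen.update(feed[-window:])  # one poll: newest `window` entries only
    return [change for change in feed if change not in seen]


# Quiet period: every change is still inside the window when polled.
print(missed_changes([3, 4, 2], window=5))  # []
# Burst of 8 changes in one interval: the oldest three scroll out unseen.
print(missed_changes([3, 8, 2], window=5))  # [4, 5, 6]
```

In this model a missed change is only repaired if the same item is edited again later and that later change is picked up, which matches the remark that the next update can fill in the missed one.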