Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-19 Thread Ricordisamoa
Just ask WDQ for a list of item ids, then pass them to the live API
(wbgetentities:
https://www.wikidata.org/w/api.php?action=help&modules=wbgetentities).
You may miss some recently edited items, but at least you wouldn't base
your edits upon outdated revisions (and the baserevid argument of
wbeditentity,
https://www.wikidata.org/w/api.php?action=help&modules=wbeditentity,
completely eliminates the risk).
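
A minimal sketch of the second step (passing the ids to the live API), in Python with only the standard library. The helper names, the User-Agent string and the batch size of 50 ids per request (the usual limit for non-bot accounts) are assumptions for illustration, not something stated in this thread.

```python
import json
import urllib.parse
import urllib.request

API = "https://www.wikidata.org/w/api.php"
BATCH = 50  # assumed per-request id limit for regular accounts


def batches(ids, size=BATCH):
    """Yield successive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def entity_params(ids):
    """Build the query parameters for one wbgetentities call."""
    return {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "props": "info|claims",  # info carries lastrevid, claims the statements
        "format": "json",
    }


def fetch_live(ids):
    """Return {item id: entity JSON} read from the live site (needs network)."""
    entities = {}
    for batch in batches(ids):
        url = API + "?" + urllib.parse.urlencode(entity_params(batch))
        # Hypothetical User-Agent; use one that identifies your own bot.
        request = urllib.request.Request(url, headers={"User-Agent": "example-bot/0.1"})
        with urllib.request.urlopen(request) as response:
            entities.update(json.load(response).get("entities", {}))
    return entities
```

Each returned entity carries a lastrevid field, which is the value one would send as baserevid with wbeditentity so that the edit is rejected if the item changed in between.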


On 19/06/2015 00:10, Andra Waagmeester wrote:
Indeed, the ultimate truth source is the Wikidata site itself.
However, I am not aware of a way to query the Wikidata site for a list
of items fitting a certain condition (e.g. all Wikidata items
containing a claim with the NCBI Entrez Gene (P351) property).

It is here that I need to rely on WDQ (and WDQS), and potentially risk
missing existing items because of the delay with which WDQ (and WDQS)
gets updated.

I would like to know if I could rely on a given time frame, be it
seconds, hours, days, or one week.

I currently assume a delay of a week, but I don't know how accurate
this assumption is.


Regards,



On Thu, Jun 18, 2015 at 10:23 PM, Stas Malyshev
smalys...@wikimedia.org wrote:


Hi!

 The way that updates work *in all systems* (polling small lists of
 recent changes at intervals and hoping that this leads to a complete
 change history), it seems quite possible that such systems will
 sometimes miss an update, at least in the long run and under varying
 conditions (high server load, network troubles, update script down for a
 while, whatever). Insufficient update frequency is maybe not the biggest
 problem here (it should be in the range of one to a few minutes for all
 of the services).

Very important point with which I agree - it is completely possible that
update polling misses an update, WDQS is no exception and it usually
does not treat it as a problem, as the next update can fill up the
missed one. However the ultimate truth source is on the wikidata site
only. Beware of the caches though - if you ask for the same data on the
same URL twice, I think you can get the same result even if the
underlying data changed in the meantime.

--
Stas Malyshev
smalys...@wikimedia.org

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata






Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-19 Thread Jane Darnell
Wikidata is a wiki like any other, so you can just click "What links here"
on Property:P351 and pick up all items that way.
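
For bots, the same listing is available through the standard MediaWiki backlinks query. The sketch below assumes Python with only the standard library; the helper names and User-Agent string are hypothetical. The result may be a superset of what is wanted, since any item linking to the property page is listed, not only items with a main claim for it.

```python
import json
import urllib.parse
import urllib.request

API = "https://www.wikidata.org/w/api.php"


def backlink_params(property_id, cont=None):
    """Parameters for one page of 'What links here' results, items only."""
    params = {
        "action": "query",
        "list": "backlinks",
        "bltitle": "Property:" + property_id,
        "blnamespace": "0",  # main namespace, where items live
        "bllimit": "max",
        "format": "json",
    }
    if cont:
        params.update(cont)  # continuation values from the previous response
    return params


def titles(page):
    """Extract item ids (page titles) from one decoded API response."""
    return [link["title"] for link in page.get("query", {}).get("backlinks", [])]


def items_linking_to(property_id):
    """Yield every item id linking to the property page (needs network)."""
    cont = None
    while True:
        url = API + "?" + urllib.parse.urlencode(backlink_params(property_id, cont))
        # Hypothetical User-Agent; use one that identifies your own bot.
        request = urllib.request.Request(url, headers={"User-Agent": "example-bot/0.1"})
        with urllib.request.urlopen(request) as response:
            page = json.load(response)
        yield from titles(page)
        cont = page.get("continue")
        if not cont:
            break
```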

On Fri, Jun 19, 2015 at 9:14 AM, Ricordisamoa ricordisa...@openmailbox.org
wrote:

 Just ask WDQ for a list of item ids, then pass them to the live API
 (wbgetentities:
 https://www.wikidata.org/w/api.php?action=help&modules=wbgetentities).
 You may miss some recently edited items, but at least you wouldn't base
 your edits upon outdated revisions (and the baserevid argument of
 wbeditentity,
 https://www.wikidata.org/w/api.php?action=help&modules=wbeditentity,
 completely eliminates the risk).



Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-18 Thread Thomas Steiner
Hi there,

Are you aware of the revision URL parameter? Last paragraph of
https://www.wikidata.org/wiki/Wikidata:Data_access#Linked_Data_interface.
This hopefully should help.

Cheers,
Tom


-- 
Dr. Thomas Steiner, Employee, Google Inc.
http://blog.tomayac.com, http://twitter.com/tomayac

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iFy0uwAntT0bE3xtRa5AfeCheCkthAtTh3reSabiGbl0ck0fjumBl3DCharaCTersAttH3b0ttom.hTtP5://xKcd.c0m/1181/
-END PGP SIGNATURE-


Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-18 Thread Thomas Steiner
Sorry, realizing only now that this is for the Query API, not the
Linked Data interface. My bad, please ignore my previous reply.



Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-18 Thread Stas Malyshev
Hi!

 I have seen that SPARQL query service and it indeed is an interesting
 alternative. In terms of stability and update frequency how different is
 the SPARQL query service from the Wikidata Query API?

In terms of stability: it's beta, so while we try to keep it up and
running smoothly, it could be taken down at any moment, either because
we found a bug or because we need to update something, and the data
model can change too. We do not expect substantial changes to the data
model anymore, and we try to keep the service up and running (it doesn't
help that we are in the middle of a large labs outage right now:
https://wikitech.wikimedia.org/wiki/Incident_documentation/20150617-LabsNFSOutage
) and synced continuously (i.e. no more than minutes behind Wikidata
edits), but as long as it's beta we can give no guarantees on anything.
We're working hard to make it production-quality, but that will take a
bit more time.

The difference between WDQ and WDQS/SPARQL is that SPARQL is a
full-featured language for querying triple-based (RDF) data sets, and
allows very complex queries. It is also a standard in the linked data
world. You can use the translator
(http://tools.wmflabs.org/wdq2sparql/w2s.php) - once the labs outage
ends, of course - to convert between WDQ syntax and SPARQL. Also check
out the other links on the WDQS beta page for short intros about how
things are done with SPARQL and examples of queries you can run.
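
As an illustration of what such a query looks like, here is a small Python sketch that builds a SPARQL query for "all items with a claim for a given property" (roughly what a WDQ query such as CLAIM[351] expresses) and the URL to request it. The endpoint address and the wdt: prefix are assumptions taken from the present-day production service; this thread only mentions the beta, whose data model may have differed. The function names are hypothetical.

```python
import urllib.parse

# Assumption: present-day production endpoint, not the beta discussed here.
ENDPOINT = "https://query.wikidata.org/sparql"


def items_with_property_query(property_id, limit=None):
    """Return a SPARQL query selecting items with a direct claim for the property."""
    query = "SELECT ?item ?value WHERE { ?item wdt:%s ?value }" % property_id
    if limit is not None:
        query += " LIMIT %d" % limit
    return query


def query_url(query):
    """Build a GET URL asking the endpoint for JSON results."""
    return ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})


# Example: the first ten items carrying an NCBI Entrez Gene (P351) claim.
url = query_url(items_with_property_query("P351", limit=10))
```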

-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-18 Thread Stas Malyshev
Hi!

 How often is the WDQ api really being updated? Is it possible to query
 wikidata live, with WDQ and if not, are there alternatives that would
 allow this?

We currently have a SPARQL query service in beta[1], which is updated
constantly from Wikidata. Note that since it's beta it is not yet stable,
either operationally or data-model-wise, so please be aware of this. It
also has timeout limits that, for now, won't allow you to run queries
that are too complex. But if you want to check it out and see if it fits
your use case, you are most welcome.

[1] http://wdqs-beta.wmflabs.org/
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-18 Thread Andra Waagmeester
Hi Stas,

I have seen the SPARQL query service, and it is indeed an interesting
alternative. In terms of stability and update frequency, how different is
the SPARQL query service from the Wikidata Query API?

Cheers,
Andra

On Thu, Jun 18, 2015 at 9:20 PM, Stas Malyshev smalys...@wikimedia.org
wrote:

 Hi!

  How often is the WDQ api really being updated? Is it possible to query
  wikidata live, with WDQ and if not, are there alternatives that would
  allow this?

 We currently have SPARQL query service in beta[1], which is updated
 constantly from Wikidata. Note that since it's beta it is not stable yet
 both operationally and data-model-wise, so please be aware of this, also
 it has timeout limits that won't allow you for now to run queries that
 are too complex. But if you want to check it out and see if that fits
 your use case you are most welcome.

 [1] http://wdqs-beta.wmflabs.org/
 --
 Stas Malyshev
 smalys...@wikimedia.org



Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-18 Thread Markus Krötzsch

On 18.06.2015 21:40, Thomas Steiner wrote:

Sorry, realizing only now that this is for the Query API, not the
Linked Data interface. My bad, please ignore my previous reply.



Maybe it would still be a good idea for bots to do a check against the
live JSON data right before the edit. Checking the live JSON is an
additional step, but it should hardly slow down bots (which are throttled
anyway).
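
A sketch of what such a pre-edit check could look like, assuming the bot has already fetched the item's live JSON (for example from wbgetentities or Special:EntityData) as a Python dict. The function name is hypothetical; the claims and lastrevid keys are the ones found in the entity JSON.

```python
def plan_edit(entity, property_id):
    """Decide whether a claim for `property_id` should still be added.

    Returns None if the live item already has a statement for the property,
    otherwise the live revision id, to be sent along with the edit as
    baserevid so that a concurrent change is detected.
    """
    if entity.get("claims", {}).get(property_id):
        return None
    return entity["lastrevid"]


# Example with a hand-written stand-in for live entity JSON.
live = {"id": "Q1", "lastrevid": 100, "claims": {"P351": [{"id": "Q1$x"}]}}
skip = plan_edit(live, "P351")   # None: the claim is already there
base = plan_edit(live, "P352")   # 100: safe to add, using this base revision
```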


Given the way that updates work *in all systems* (polling small lists of
recent changes at intervals and hoping that this leads to a complete
change history), it seems quite possible that such systems will
sometimes miss an update, at least in the long run and under varying
conditions (high server load, network troubles, update script down for a
while, whatever). Insufficient update frequency is maybe not the biggest
problem here (it should be in the range of one to a few minutes for all
of the services).
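
To make the failure mode concrete, here is a toy simulation in Python. It is not how WDQ or WDQS are actually implemented; it only models a feed that remembers a limited number of recent changes and a poller that reads it once per interval. All names and numbers are made up for the illustration.

```python
from collections import deque


class ChangeFeed:
    """Toy recent-changes feed that only remembers the latest `window` changes."""

    def __init__(self, window):
        self.recent = deque(maxlen=window)  # older entries fall off the left
        self.next_id = 1

    def edit(self, count=1):
        """Record `count` new changes with increasing ids."""
        for _ in range(count):
            self.recent.append(self.next_id)
            self.next_id += 1

    def poll(self, since):
        """Return the remembered change ids newer than `since`."""
        return [c for c in self.recent if c > since]


def run(edits_per_interval, window):
    """Poll once after each interval; return the change ids that were never seen."""
    feed, seen, last = ChangeFeed(window), set(), 0
    for count in edits_per_interval:
        feed.edit(count)
        batch = feed.poll(last)
        seen.update(batch)
        if batch:
            last = batch[-1]
    return set(range(1, feed.next_id)) - seen


steady = run([3, 3, 3], window=5)    # nothing missed
burst = run([3, 12, 3], window=5)    # changes 4..10 fell out of the window
```

A burst of edits (or an update script that is down for a while) has the same effect: more changes accumulate between two polls than the feed still shows, and the gap goes unnoticed unless something else re-syncs the affected items.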


Regards,

Markus



Re: [Wikidata] Update frequency on the Wikidata Query API

2015-06-18 Thread Andra Waagmeester
Indeed, the ultimate truth source is the Wikidata site itself. However,
I am not aware of a way to query the Wikidata site for a list of items
fitting a certain condition (e.g. all Wikidata items containing a claim
with the NCBI Entrez Gene (P351) property).

It is here that I need to rely on WDQ (and WDQS), and potentially risk
missing existing items because of the delay with which WDQ (and WDQS)
gets updated.

I would like to know if I could rely on a given time frame, be it
seconds, hours, days, or one week.

I currently assume a delay of a week, but I don't know how accurate this
assumption is.

Regards,



On Thu, Jun 18, 2015 at 10:23 PM, Stas Malyshev smalys...@wikimedia.org
wrote:

 Hi!

  The way that updates work *in all systems* (polling small lists of
  recent changes at intervals and hoping that this leads to a complete
  change history), it seems quite possible that such systems will
  sometimes miss an update, at least in the long run and under varying
  conditions (high server load, network troubles, update script down for a
  while, whatever). Insufficient update frequency is maybe not the biggest
  problem here (it should be in the range of one to a few minutes for all
  of the services).

 Very important point with which I agree - it is completely possible that
 update polling misses an update, WDQS is no exception and it usually
 does not treat it as a problem, as the next update can fill up the
 missed one. However the ultimate truth source is on the wikidata site
 only. Beware of the caches though - if you ask for the same data on the
 same URL twice, I think you can get the same result even if the
 underlying data changed in the meantime.

 --
 Stas Malyshev
 smalys...@wikimedia.org
