Adding analytics@, a public e-mail list where you can post questions such
as this one.

>that doesn’t tell us how often entities are accessed through
>Special:EntityData or wbgetclaims
>Does this data already exist, even in the form of raw access logs?
Is this data always requested via HTTP from an API endpoint that will hit a
Varnish cache? (Daniel can probably answer this.)
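
A quick way to probe this from the outside is to look at the x-cache
response header that the Wikimedia caching layer adds. A minimal Python
sketch (assuming the standard requests library; the entity and User-Agent
below are just illustrative, not from this thread):

import requests

resp = requests.get(
    "https://www.wikidata.org/wiki/Special:EntityData/Q42.json",
    headers={"User-Agent": "cache-check-sketch/0.1 (example)"},
)
# x-cache lists the cache hosts the request passed through, e.g.
# "cp3062 miss, cp3063 hit/21" -- a "hit" entry means the response came
# from the edge cache rather than the application servers.
print(resp.headers.get("x-cache", "<no x-cache header>"))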
From what I see in our data, we have requests like the following:

www.wikidata.org /w/api.php ?action=wbgetclaims&format=json&entity=Q633155
www.wikidata.org /w/api.php ?callback=jQuery11130020702992017004984_1465195743367&format=json&action=wbgetclaims&property=P373&entity=Q5296&_=1465195743368
www.wikidata.org /w/api.php ?action=wbgetclaims&format=json&entity=Q573612
www.wikidata.org /w/api.php ?action=wbgetclaims&format=json&entity=Q472729
www.wikidata.org /w/api.php ?action=wbgetclaims&format=json&entity=Q349797
www.wikidata.org /w/api.php ?action=compare&torev=344163911&fromrev=344163907&format=json
www.wikidata.org /w/api.php ?action=wbgetentities&format=xml&ids=Q2356135
www.wikidata.org /w/api.php ?action=wbgetentities&format=xml&ids=Q2355988
www.wikidata.org /w/api.php ?action=compare&torev=344164023&fromrev=344163948&format=json

If the data you are interested in can be inferred from these requests,
there is no additional data gathering needed.
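
For instance, per-entity access counts could be derived from the query
strings alone. A rough Python sketch (the parameter handling here is my
assumption; the real request logs may be laid out differently):

from collections import Counter
from urllib.parse import parse_qs

# Query strings copied from the sample requests above.
sample_queries = [
    "action=wbgetclaims&format=json&entity=Q633155",
    "callback=jQuery11130020702992017004984_1465195743367&format=json"
    "&action=wbgetclaims&property=P373&entity=Q5296&_=1465195743368",
    "action=wbgetclaims&format=json&entity=Q573612",
    "action=wbgetentities&format=xml&ids=Q2356135",
]

counts = Counter()
for query in sample_queries:
    params = parse_qs(query)
    # wbgetclaims takes a single "entity" parameter; wbgetentities takes
    # "ids", which may be a "|"-separated list.
    for entity in params.get("entity", []):
        counts[entity] += 1
    for ids in params.get("ids", []):
        counts.update(ids.split("|"))

print(counts.most_common())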


>If not, what effort would be required to gather this data? For the
>purposes of my proposal to the U.S. Census Bureau I am estimating around
>six weeks of effort for this for one person working full-time. If it will
>take more time I will need to know.
I think I have mentioned this before on an e-mail thread, but without
knowing the details of what you want to do we cannot give you a time
estimate. What are the exact metrics you are interested in? Is the
project described anywhere on Meta?

Thanks,

Nuria

On Thu, Jun 30, 2016 at 11:45 AM, James Hare <[email protected]> wrote:

> Copying Lydia Pintscher and Daniel Kinzler (with whom I’ve discussed this
> very topic).
>
> I am interested in metrics that describe how Wikidata is used. While we do
> have views on individual pages, that doesn’t tell us how often entities are
> accessed through Special:EntityData or wbgetclaims. Nor does it tell us how
> often statements/RDF triples show up in the Wikidata Query Service. Does
> this data already exist, even in the form of raw access logs? If not, what
> effort would be required to gather this data? For the purposes of my
> proposal to the U.S. Census Bureau I am estimating around six weeks of
> effort for this for one person working full-time. If it will take more time
> I will need to know.
>
>
> Thank you,
> James Hare
>
> On Thursday, June 2, 2016 at 2:18 PM, Nuria Ruiz wrote:
>
> James:
>
> >My current operating assumption is that it would take one person,
> >working on a full time basis, around six weeks to go from raw access logs
> >to a functioning API that would provide information on how many times a
> >Wikidata entity was accessed through the various APIs and the query
> >service. Do you believe this to be an accurate level of effort estimation
> >based on your experience with past projects of this nature?
> You are starting from the assumption that we do have the data you are
> interested in in the logs, which I am not sure is the case. Have you
> checked this with the Wikidata developers?
>
> Analytics 'automagically' collects data from logs about *page* requests;
> any other request collection (and it seems that yours fits this scenario)
> needs to be instrumented. I would send an e-mail to the analytics@ public
> list and the Wikidata folks to ask how to harvest the data you are
> interested in. It doesn't sound like it is being collected at this time, so
> your project scope might be quite a bit bigger than you think.
>
> Thanks,
>
> Nuria
>
>
>
>
> On Thu, Jun 2, 2016 at 5:06 AM, James Hare <[email protected]> wrote:
>
> Hello Nuria,
>
> I am currently developing a proposal for the U.S. Census Bureau to
> integrate their datasets with Wikidata. As part of this, I am interested in
> getting Wikidata usage metrics beyond the page view data currently
> available. My concern is that the page views API gives you information only
> on how many times a *page* is accessed – but Wikidata is not really used
> in this way. More often it is the case that Wikidata’s information is
> accessed through the API endpoints (wbgetclaims etc.), through
> Special:EntityData, and through the Wikidata Query Service. If we have information
> on usage through those mechanisms, that would give me much better
> information on Wikidata’s usage.
>
> To the extent these metrics are important to my prospective client, I am
> willing to provide in-kind support to the analytics team to make this
> information available, including expenses associated with the NDA process
> (I understand that such a person may need to deal with raw access logs that
> include PII). My current operating assumption is that it would take one
> person, working on a full time basis, around six weeks to go from raw
> access logs to a functioning API that would provide information on how many
> times a Wikidata entity was accessed through the various APIs and the query
> service. Do you believe this to be an accurate level of effort estimation
> based on your experience with past projects of this nature?
>
> Please let me know if you have any questions. I am happy to discuss my
> idea with you further.
>
>
> Regards,
> James Hare
>
