Re: [Wikidata-l] Bot request: 250+ thousands person data

2014-04-29 Thread Gerard Meijssen
Hoi,
When Wikipedia has an approach to specific articles that are not compatible
with Wikidata, we can create items that fit our need and keep the original
item for what it is .. for instance a list of people (in the case of the
Wright brothers).

The notion that Wikidata defers to Wikipedia is not one we can keep, because
there are bound to be Wikipedias that differ in their approach and have
separate articles for Wilbur and Orville Wright.

Yes, it is good to hope for better algorithms in the future. In the meantime,
consider what percentage is actually wrong, and that quite often not having
data is more damaging than having data that can be manipulated with queries
and tools. No data is no grip at all. We do have queries in WDQ/Autolist and
we have tools in ToolScript and pywikipedia.

IMHO the most important thing we should do to get better quality is report
on differences. This helps all projects involved in an import / export /
comparison.
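
A minimal sketch of such a difference report, assuming the values from each
project have already been extracted into plain dictionaries keyed by item id.
The function name, the ids and the years below are hypothetical and only
illustrate the idea:

```python
from typing import Dict, List, Optional, Tuple

Value = Optional[int]


def report_differences(
    wikipedia: Dict[str, int], wikidata: Dict[str, int]
) -> List[Tuple[str, Value, Value]]:
    """List every key whose value differs between the two sources.

    A key that is missing on one side is reported with None for that side.
    """
    report = []
    for key in sorted(wikipedia.keys() | wikidata.keys()):
        left = wikipedia.get(key)
        right = wikidata.get(key)
        if left != right:
            report.append((key, left, right))
    return report


# Hypothetical birth years keyed by hypothetical item ids.
from_wikipedia = {"Q1": 1867, "Q2": 1871}
from_wikidata = {"Q1": 1867, "Q2": 1872, "Q3": 1900}
for item, wp_value, wd_value in report_differences(from_wikipedia, from_wikidata):
    print(item, wp_value, wd_value)
```

The resulting list can be handed to both projects, so that each side can
check the entries that disagree.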
Thanks,
  Gerard


On 29 April 2014 09:15, John Mark Vandenberg jay...@gmail.com wrote:

 On Sun, Apr 27, 2014 at 8:28 PM, Amir Ladsgroup ladsgr...@gmail.com
 wrote:
  There are some problems with using the Bio template; for example, it is
  used for a group of people:
 
  https://it.wikipedia.org/wiki/Fratelli_Wright

 This is quite a difficult problem.  Also look for infoboxes not at the
 top of a page, because the Wikipedia page contains two concepts.  Here
 is an example with {{Bio}}:

 https://it.wikipedia.org/wiki/Slashdot

 In the journals area, I faced this many times with the article about a
 society not having an infobox for the society, but including an
 infobox in a section for their primary journal.

 My bot has some very hacky code to detect the infobox type in a few
 languages:

 https://www.wikidata.org/wiki/User:JVbot/periodicalbot.py
 (the first function)

 It would be good if we can create an algorithm that detects all these
 anomalies, or a special hidden parameter added to the invocation, to
 exclude those templates from automated parsing, but also lists all
 pages like this so that those pages can be split on the Wikipedias
 (unless notability rules prevent the split).
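
 One possible starting point for such a detector, as a rough sketch only: it
 assumes the raw wikitext of a page is already available as a string, and it
 simply checks whether the template is invoked before or after the first
 section heading. The function name and the sample wikitext are hypothetical;
 this is not the code from periodicalbot.py.

```python
import re

# A wikitext section heading such as "== Storia ==" on a line of its own.
HEADING = re.compile(r"^==.*==\s*$", re.MULTILINE)


def template_position(wikitext: str, template: str = "Bio") -> str:
    """Return 'lead', 'section' or 'absent' for the first use of a template.

    'section' means the template first appears after a section heading,
    which hints that the page may cover more than one concept.
    """
    invocation = re.compile(
        r"\{\{\s*" + re.escape(template) + r"\s*[|}\n]", re.IGNORECASE
    )
    match = invocation.search(wikitext)
    if match is None:
        return "absent"
    heading = HEADING.search(wikitext)
    if heading is not None and heading.start() < match.start():
        return "section"
    return "lead"


# Hypothetical page text: the template sits inside a section, not at the top.
page = "Intro text\n== Storia ==\nMore text\n== Fondatore ==\n{{Bio|Nome = X}}\n"
print(template_position(page))
```

 Pages that come back as 'section' could be collected into a list for human
 review instead of being parsed automatically.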

 --
 John Vandenberg

 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l



Re: [Wikidata-l] Bot request: 250+ thousands person data

2014-04-29 Thread Luca Martinelli
2014-04-28 17:08 GMT+02:00 David Cuenca dacu...@gmail.com:
 On Mon, Apr 28, 2014 at 3:10 PM, Luca Martinelli martinellil...@gmail.com
 wrote:
 I recalled the fact quite correctly:
 https://it.wikipedia.org/wiki/Modulo:Bio takes dates of birth and
 death from Wikidata. I think we can discuss extending this to
 gender, and later to other fields.
 That's perfect, because that means that the bot can just delete the text on
 import.

I would say -1 for the moment. We first need to talk about it and
create hidden categories in order to control the retrievals. There's
time to delete. :)

-- 
Luca Sannita Martinelli
http://it.wikipedia.org/wiki/Utente:Sannita



Re: [Wikidata-l] Autocomplete API for a search field in an external app

2014-04-29 Thread Christian Dullweber
Hi,

The autocomplete box uses the wbsearchentities API. You could try to
copy jquery.wikibase.entityselector.js from
wikibase/lib/resources/jquery.wikibase; it looks like it works independently
of Wikibase. It needs the jqueryui suggester.

best regards,

Christian



2014-04-28 18:40 GMT+02:00 Maxime Lathuilière gro...@maxlath.eu:

  Hi,

 I'm looking for a way to reproduce the autocomplete on wikidata search
 field in an external app (to tag resources with wikidata entities ids), but
 so far I couldn't find what could allow me to do this in the API doc.
 In a little experiment https://github.com/maxlath/wikidata-autocomplete,
 I tried to use the API action *wbsearchentities*, but it appears to be
 inefficient until I reach the full name... Is there already a way to query
 such an autocomplete API, or any plans to implement one? Or even better,
 is there a way to easily re-use the wikidata search field widget in a
 non-PHP project? :D

 Any clue welcome :)

 Best regards,

 Max

 --


 Maxime Lathuilière
 maxlath.eu
 @maxlath
 Zorglub27 https://www.wikidata.org/wiki/User:Zorglub27
 Contributions https://www.wikidata.org/wiki/Special:Contributions/Zorglub27





Re: [Wikidata-l] Autocomplete API for a search field in an external app

2014-04-29 Thread Thomas Steiner
Hi Maxime,

After some quick reverse engineering of the site with the Chrome
Developer Tools, here's the API it is using:

http://www.wikidata.org/w/api.php
?callback=[YOUR CALLBACK NAME]
&action=wbsearchentities
&format=json
&language=en
&type=item
&continue=0
&_=[TIMESTAMP (as a cache buster)]
&search=[YOUR QUERY]

If you don't need the callback, then the API is as follows (note the
missing cache buster):

http://www.wikidata.org/w/api.php
&action=wbsearchentities
&format=json
&language=en
&type=item
&continue=0
&search=[YOUR QUERY]
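
As a small sketch, the callback-free request can be assembled in Python with
the standard library, which takes care of the '?' and '&' separators and of
escaping the query. The function name build_search_url is hypothetical; the
endpoint and the parameters are the ones listed above.

```python
from urllib.parse import urlencode

# Endpoint as quoted in this message.
API_ENDPOINT = "http://www.wikidata.org/w/api.php"


def build_search_url(query: str, language: str = "en", offset: int = 0) -> str:
    """Build a wbsearchentities URL without the JSONP callback."""
    params = {
        "action": "wbsearchentities",
        "format": "json",
        "language": language,
        "type": "item",
        "continue": offset,
        "search": query,
    }
    # urlencode joins the pairs with '&' and escapes spaces and other
    # special characters in the values.
    return API_ENDPOINT + "?" + urlencode(params)


print(build_search_url("Wright brothers"))
```

The resulting URL can be passed to whichever HTTP client the external app
already uses.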

Hope this helps.

Best,
Tom

-- 
Thomas Steiner, Employee, Google Inc.
http://blog.tomayac.com, http://twitter.com/tomayac

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iFy0uwAntT0bE3xtRa5AfeCheCkthAtTh3reSabiGbl0ck0fjumBl3DCharaCTersAttH3b0ttom.hTtP5://xKcd.c0m/1181/
-END PGP SIGNATURE-



Re: [Wikidata-l] Autocomplete API for a search field in an external app

2014-04-29 Thread Thomas Steiner
Small correction:

 If you don't need the callback, then the API is as follows (note the
 missing cache buster):

 http://www.wikidata.org/w/api.php
 &action=wbsearchentities
?action=wbsearchentities (replace '&' with '?'; copy-and-paste
oversight, sorry).

-- 
Thomas Steiner, Employee, Google Inc.
http://blog.tomayac.com, http://twitter.com/tomayac




Re: [Wikidata-l] Bot request: 250+ thousands person data

2014-04-29 Thread Daniel Kinzler
On 29.04.2014 03:53, Amir Ladsgroup wrote:
 It's not a big deal; parsing it would be no problem. I can use it when
 parsing data from the Bio template in Italian Wikipedia, but I have to use
 the precision argument in the snak. Am I right?

Yes, exactly.

 What value do I have to set for precision if I have just the year (and no
 month and day)?

If you just have the year, the precision value would be 9. This is arbitrary and
obscure, sorry. I have filed a bug to fix this:
https://bugzilla.wikimedia.org/show_bug.cgi?id=64593

For reference, here is the table of precisions to be used for time values, as
defined in the TimeValue class:

const PRECISION_Ga = 0; // Gigayear
const PRECISION_100Ma = 1; // 100 Megayears
const PRECISION_10Ma = 2; // 10 Megayears
const PRECISION_Ma = 3; // Megayear
const PRECISION_100ka = 4; // 100 Kiloyears
const PRECISION_10ka = 5; // 10 Kiloyears
const PRECISION_ka = 6; // Kiloyear
const PRECISION_100a = 7; // 100 years
const PRECISION_10a = 8; // 10 years
const PRECISION_YEAR = 9;
const PRECISION_MONTH = 10;
const PRECISION_DAY = 11;
const PRECISION_HOUR = 12;
const PRECISION_MINUTE = 13;
const PRECISION_SECOND = 14;
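
For a bot that parses dates from a template, choosing among the three most
common values could look like the following minimal sketch. The function name
precision_for is hypothetical; the numeric values are copied from the table
above.

```python
from typing import Optional

# Values copied from the precision table above.
PRECISION_YEAR = 9
PRECISION_MONTH = 10
PRECISION_DAY = 11


def precision_for(
    year: int, month: Optional[int] = None, day: Optional[int] = None
) -> int:
    """Pick the precision that matches the date parts that are known."""
    if month is None:
        if day is not None:
            raise ValueError("a day without a month cannot be represented")
        return PRECISION_YEAR
    if day is None:
        return PRECISION_MONTH
    return PRECISION_DAY


print(precision_for(1871))         # only the year is known
print(precision_for(1871, 8, 19))  # full date is known
```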


If you have something like between 1846 and 1855, you can use the before and
after fields of the time value:

  time: +0001850-00-00T00:00:00Z,
  precision: 9,
  before: 4,
  after: 5

This means the main value is 1850, given as a year, with a lower bound four
years before and an upper bound 5 years after the main value (before and after
are given in the unit specified by the precision value). The main value is
what is going to be displayed per default; it will also be used for sorting
query results (once we have queries).

This is a bit complicated, but should allow you to actually represent uncertain
dates. We made it so you can be precise about the uncertainty :)
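
As an illustration only, the fields of the example above can be derived from
an uncertain year range as sketched below. The function name and the
year_width parameter are hypothetical; the time string simply mirrors the
layout shown in this message, and the format expected by the live API should
be checked before using it.

```python
from typing import Dict, Union


def year_range_to_time_value(
    main_year: int, earliest: int, latest: int, year_width: int = 7
) -> Dict[str, Union[str, int]]:
    """Describe an uncertain year as main value plus before/after bounds.

    Only years of the common era are handled. before and after are counted
    in years because the precision is 9 (year).
    """
    if main_year < 0 or not earliest <= main_year <= latest:
        raise ValueError("need 0 <= earliest <= main_year <= latest")
    return {
        # Zero-padded year, month and day left at 00 as in the example above.
        "time": f"+{main_year:0{year_width}d}-00-00T00:00:00Z",
        "precision": 9,
        "before": main_year - earliest,
        "after": latest - main_year,
    }


# "Between 1846 and 1855", displayed as 1850.
print(year_range_to_time_value(1850, 1846, 1855))
```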

HTH
Daniel



-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata-l] Bot request: 250+ thousands person data

2014-04-29 Thread Luca Martinelli
On 29 Apr 2014 09:31, Gerard Meijssen gerard.meijs...@gmail.com wrote:

 Hoi,
 When Wikipedia has an approach to specific articles that are not
compatible with Wikidata, we can create items that fit our need and keep
the original item for what it is .. for instance a list of people (in the
case of the Wright brothers).

 The notion that Wikidata defers to Wikipedia is not one we can keep, because
 there are bound to be Wikipedias that differ in their approach and have
 separate articles for Wilbur and Orville Wright.

Exactly, I kinda had the same problem with Sacco and Vanzetti when I was
uploading Italian authority codes. They have two different codes in the
Italian national library system, but have a joint article on Wikipedia.

L.


Re: [Wikidata-l] Bot request: 250+ thousands person data

2014-04-29 Thread David Cuenca
On Tue, Apr 29, 2014 at 12:48 PM, Daniel Kinzler 
daniel.kinz...@wikimedia.de wrote:

 If you have something like between 1846 and 1855, you can use the
 before and after fields of the time value:

   time: +0001850-00-00T00:00:00Z,
   precision: 9,
   before: 4,
   after: 5

 This means the main value is 1850, given as a year, with a lower bound
 four years before and an upper bound 5 years after the main value (before
 and after are given in the unit specified by the precision value). The
 main value is what is going to be displayed per default; it will also be
 used for sorting query results (once we have queries).


Is it possible to have just a lower bound, leaving the upper one open? I am
thinking of uses like
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Generic#earliest_date

For things like circa I don't see any clear solution other than
inventing some ranges...

Micru