Re: [Wikidata] Help with SPARQL or API or something to get subcategories

2016-10-08 Thread Kingsley Idehen
On 10/6/16 2:57 PM, Thad Guidry wrote:
> Hello team :)
>
> So while I'm helping with the Wikidata - Schema.org mappings, a
> request came in to expose subcategories of an existing Wikipedia category.
>
> For example, say I start with this
> topic: https://www.wikidata.org/wiki/Q27119725  Parking facilities
>
> The topic's main category is shown as "Category:Parking facilities"
> and that has links to Wikipedia, specifically a Wikipedia category
> link, and where the WP category page has subcategories that I would
> like to expose somehow in whichever way is *easiest* currently with
> our tools, apis, etc.
>
> Can it all be done in SPARQL against some services that already expose
> WP subcategories given a specific category ?  Or is there an API that
> does this already ?  other tools that might expose WP categories ?
>
> The IDEAL GOAL is to query 'equivalent class' =
> schema.org/ParkingFacility  and get
> back the WP categories *in one shot or query or api call.*
>
> http://schema.org/ParkingFacility
>
>  * Parking facilities in India
>  * Parking facilities in the United States
>  * Aircraft hangars
>  * Garages (parking)
>  * Railway depots
>
>
> Any gurus ?

Hi Thad,

If there are owl:equivalentClass mappings in some Linked Data Space, and
the SPARQL service associated with said Data Space supports
owl:equivalentClass reasoning, then the answer to your question is yes.

What is unknown right now are the class mappings between Wikidata and
Schema.org. If a dump of those exists, the rest is trivial :)
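
For illustration only, here is a minimal Python sketch of how such a lookup
might be assembled in two steps. It assumes the mapping is recorded on the
Wikidata item with P1709 (equivalent class) and that the item carries P910
(topic's main category); whether a given item actually has those statements
is not confirmed in this thread. The function names are hypothetical. The
subcategories themselves are requested from the MediaWiki API
(list=categorymembers with cmtype=subcat), since they are not stored in
Wikidata. The code only builds the query text and the request URL; it sends
nothing.

```python
from urllib.parse import urlencode

# English Wikipedia API endpoint; other wikis use their own host.
ENWIKI_API = "https://en.wikipedia.org/w/api.php"


def main_category_query(schema_class_iri: str) -> str:
    """Build a SPARQL query for items mapped to a Schema.org class.

    Assumption: the mapping is stored as P1709 (equivalent class) and the
    item has P910 (topic's main category).
    """
    return f"""
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
SELECT ?item ?category ?page WHERE {{
  ?item wdt:P1709 <{schema_class_iri}> ;   # equivalent class
        wdt:P910 ?category .               # topic's main category
  ?page schema:about ?category ;           # sitelink of the category item
        schema:isPartOf <https://en.wikipedia.org/> .
}}
""".strip()


def subcategory_request_url(category_title: str, limit: int = 50) -> str:
    """Build a MediaWiki API URL listing the subcategories of a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category_title,
        "cmtype": "subcat",   # only subcategories, no pages or files
        "cmlimit": limit,
        "format": "json",
    }
    return f"{ENWIKI_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(main_category_query("http://schema.org/ParkingFacility"))
    print(subcategory_request_url("Category:Parking facilities"))
```

This is two calls rather than the "one shot" asked for; a single query would
need a service that exposes both the class mappings and the category tree.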

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Info WorldUniversity
Lydia and Wikidatans,

If we consider atoms and nano-scale entities, and other entities at
similarly different scales, I wonder whether Wikidata string data types
will eventually need higher character limits (also in anticipation of data
for a realistic virtual earth/universe at a) street view, b)
neuronal/cellular, c) molecular, and d) nano/atomic levels, etc.)?

Scott
http://twitter.com/WorldUnivAndSch

On Oct 8, 2016 5:32 AM, "Egon Willighagen" 
wrote:

> Dear Thomas,
>
> On Sat, Oct 8, 2016 at 12:07 PM, Thomas Douillard <
> thomas.douill...@gmail.com> wrote:
>
>> Probably a silly question but ... did you all consider creating a
>> datatype for molecule representation? This seems to be a very similar
>> use case to mathematical formulas. Essentially we're not dealing with a raw
>> string but a representation of molecule formulas, with its own encoding ...
>>
>
> The InChI is actually not a structural representation, but a derived
> unique identifier.
>
> What you propose would, however, apply to the SMILES. That one is
> generally of about the same size as the InChI, and there your solution
> sounds like a great idea!
>
> Egon
>
>
>> Changing the limit seems to be a poor workaround compared to a dedicated
>> datatype - nobody seems to have found a relevant use case and it seems to
>> me that we're essentially abusing strings for storing blobs ...
>>
>> 2016-10-08 11:33 GMT+02:00 Egon Willighagen :
>>
>>>
>>>
>>> On Sat, Oct 8, 2016 at 11:28 AM, Lydia Pintscher <
>>> lydia.pintsc...@wikimedia.de> wrote:
>>>
>>>> On Sat, Oct 8, 2016 at 11:23 AM, Egon Willighagen
>>>> wrote:
>>>> > Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234
>>>> ...
>>>>
>>>> External identifier then. Cool. And for string like in
>>>> https://www.wikidata.org/wiki/Property:P233? Sebastian's initial email
>>>> says 1500 to 2000. Is this still a good number after this discussion?
>>>>
>>>
>>> Yes, that would cover more than 99.9% of all InChIs in PubChem. (See
>>> Sebastian's reply earlier in this thread.)
>>>
>>> Egon
>>>
>>> --
>>> E.L. Willighagen
>>> Department of Bioinformatics - BiGCaT
>>> Maastricht University (http://www.bigcat.unimaas.nl/)
>>> Homepage: http://egonw.github.com/
>>> LinkedIn: http://se.linkedin.com/in/egonw
>>> Blog: http://chem-bla-ics.blogspot.com/
>>> PubList: http://www.citeulike.org/user/egonw/tag/papers
>>> ORCID: -0001-7542-0286
>>> ImpactStory: https://impactstory.org/u/egonwillighagen


Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Egon Willighagen
Dear Thomas,

On Sat, Oct 8, 2016 at 12:07 PM, Thomas Douillard <
thomas.douill...@gmail.com> wrote:

> Probably a silly question but ... did you all consider creating a datatype
> for molecule representation? This seems to be a very similar use case to
> mathematical formulas. Essentially we're not dealing with a raw string but a
> representation of molecule formulas, with its own encoding ...
>

The InChI is actually not a structural representation, but a derived unique
identifier.

What you propose would, however, apply to the SMILES. That one is generally
of about the same size as the InChI, and there your solution sounds like a
great idea!
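
To make the "dedicated datatype" idea a bit more concrete, here is a
hypothetical Python sketch of a value type that validates its input instead
of accepting any string. The class name SmilesValue and its checks are my
own assumptions for illustration; this is not an existing Wikibase datatype,
and the checks are deliberately crude (a character whitelist plus balanced
brackets), far from a real SMILES parser.

```python
from dataclasses import dataclass

# Crude character whitelist (assumption); the real SMILES grammar is richer.
_ALLOWED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    "()[]=#$:/\\@+-.%*"
)
_CLOSERS = {")": "(", "]": "["}


@dataclass(frozen=True)
class SmilesValue:
    """Hypothetical typed value: a string that passed basic SMILES checks."""

    text: str

    def __post_init__(self) -> None:
        if not self.text:
            raise ValueError("empty SMILES string")
        unexpected = set(self.text) - _ALLOWED
        if unexpected:
            raise ValueError(f"unexpected characters: {sorted(unexpected)}")
        # Branches "(...)" and bracket atoms "[...]" must be balanced.
        stack = []
        for ch in self.text:
            if ch in "([":
                stack.append(ch)
            elif ch in _CLOSERS:
                if not stack or stack.pop() != _CLOSERS[ch]:
                    raise ValueError("unbalanced brackets")
        if stack:
            raise ValueError("unbalanced brackets")


if __name__ == "__main__":
    print(SmilesValue("CC(=O)O"))  # acetic acid, accepted
    try:
        SmilesValue("CC(=O")
    except ValueError as err:
        print("rejected:", err)
```

The point of the sketch is only that a typed value can reject malformed
input at entry time, which a raw string with a higher length limit cannot.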

Egon


> Changing the limit seems to be a poor workaround compared to a dedicated
> datatype - nobody seems to have found a relevant use case and it seems to
> me that we're essentially abusing strings for storing blobs ...
>
> 2016-10-08 11:33 GMT+02:00 Egon Willighagen :
>
>>
>>
>> On Sat, Oct 8, 2016 at 11:28 AM, Lydia Pintscher <
>> lydia.pintsc...@wikimedia.de> wrote:
>>
>>> On Sat, Oct 8, 2016 at 11:23 AM, Egon Willighagen
>>>  wrote:
>>> > Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234
>>> ...
>>>
>>> External identifier then. Cool. And for string like in
>>> https://www.wikidata.org/wiki/Property:P233? Sebastian's initial email
>>> says 1500 to 2000. Is this still a good number after this discussion?
>>
>> Yes, that would cover more than 99.9% of all InChIs in PubChem. (See
>> Sebastian's reply earlier in this thread.)
>>
>> Egon




Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Daniel Kinzler
That was discussed and declined a while ago, see
. Though I think the proposed
realization was presentational rather than functional. I'll have to re-read the
discussion, though.

Am 08.10.2016 um 12:07 schrieb Thomas Douillard:
> Probably a silly question but ... did you all consider creating a datatype for
> molecule representation? This seems to be a very similar use case to
> mathematical formulas. Essentially we're not dealing with a raw string but a
> representation of molecule formulas, with its own encoding ...
>
> Changing the limit seems to be a poor workaround compared to a dedicated
> datatype - nobody seems to have found a relevant use case and it seems to me
> that we're essentially abusing strings for storing blobs ...
>
> 2016-10-08 11:33 GMT+02:00 Egon Willighagen:
>
>> On Sat, Oct 8, 2016 at 11:28 AM, Lydia Pintscher
>> <lydia.pintsc...@wikimedia.de> wrote:
>>
>>> On Sat, Oct 8, 2016 at 11:23 AM, Egon Willighagen
>>> <egon.willigha...@gmail.com> wrote:
>>> > Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234 ...
>>>
>>> External identifier then. Cool. And for string like in
>>> https://www.wikidata.org/wiki/Property:P233? Sebastian's initial email
>>> says 1500 to 2000. Is this still a good number after this discussion?
>>
>> Yes, that would cover more than 99.9% of all InChIs in PubChem. (See
>> Sebastian's reply earlier in this thread.)
>>
>> Egon


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Thomas Douillard
Probably a silly question but ... did you all consider creating a datatype
for molecule representation? This seems to be a very similar use case to
mathematical formulas. Essentially we're not dealing with a raw string but a
representation of molecule formulas, with its own encoding ...

Changing the limit seems to be a poor workaround compared to a dedicated
datatype - nobody seems to have found a relevant use case and it seems to me
that we're essentially abusing strings for storing blobs ...

2016-10-08 11:33 GMT+02:00 Egon Willighagen :

>
>
> On Sat, Oct 8, 2016 at 11:28 AM, Lydia Pintscher <
> lydia.pintsc...@wikimedia.de> wrote:
>
>> On Sat, Oct 8, 2016 at 11:23 AM, Egon Willighagen
>>  wrote:
>> > Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234
>> ...
>>
>> External identifier then. Cool. And for string like in
>> https://www.wikidata.org/wiki/Property:P233? Sebastian's initial email
>> says 1500 to 2000. Is this still a good number after this discussion?
>
> Yes, that would cover more than 99.9% of all InChIs in PubChem. (See
> Sebastian's reply earlier in this thread.)
>
> Egon


Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Egon Willighagen
On Sat, Oct 8, 2016 at 11:28 AM, Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> On Sat, Oct 8, 2016 at 11:23 AM, Egon Willighagen
>  wrote:
> > Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234
> ...
>
> External identifier then. Cool. And for string like in
> https://www.wikidata.org/wiki/Property:P233? Sebastian's initial email
> says 1500 to 2000. Is this still a good number after this discussion?

Yes, that would cover more than 99.9% of all InChIs in PubChem. (See
Sebastian's reply earlier in this thread.)

Egon



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Lydia Pintscher
On Sat, Oct 8, 2016 at 11:23 AM, Egon Willighagen
 wrote:
> Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234 ...

External identifier then. Cool. And for string like in
https://www.wikidata.org/wiki/Property:P233? Sebastian's initial email
says 1500 to 2000. Is this still a good number after this discussion?


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Egon Willighagen
On Sat, Oct 8, 2016 at 11:19 AM, Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> On Sat, Oct 8, 2016 at 11:14 AM, Egon Willighagen
>  wrote:
> > For small compounds this is answered by Sebastian's analysis... 5K would
> > cover all currently known small molecules. 1K would cover 99.9%.
>
> Ok. That is for strings, correct? Input for other use cases?


Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234 ...

Egon



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Lydia Pintscher
On Sat, Oct 8, 2016 at 11:14 AM, Egon Willighagen
 wrote:
> For small compounds this is answered by Sebastian's analysis... 5K would
> cover all currently known small molecules. 1K would cover 99.9%.

Ok. That is for strings, correct? Input for other use cases?

> Lydia, do I understand that a formal request needs to be filed? Who will do
> that?

I'll handle that part for string. It was mostly about me not wanting
to increase the limit on external identifiers without a request. I
have not seen one but I might have overlooked it.


Cheers
Lydia



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Egon Willighagen
On Sat, Oct 8, 2016 at 11:07 AM, Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:
>
> Based on this my proposal is to increase string and URL and
> potentially external identifier if you request it. One open question
> is still what the new limit should be.
>

For small compounds this is answered by Sebastian's analysis... 5K would
cover all currently known small molecules. 1K would cover 99.9%.
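
For anyone who wants to redo this kind of estimate on their own data, a
minimal Python sketch of the coverage calculation follows. The function
names are hypothetical and the sample lengths are made up for the example;
they are not Sebastian's PubChem numbers.

```python
import math


def coverage(lengths, limit):
    """Fraction of strings whose length fits within `limit`."""
    if not lengths:
        raise ValueError("no lengths given")
    return sum(1 for n in lengths if n <= limit) / len(lengths)


def smallest_limit_covering(lengths, target):
    """Smallest length limit that covers at least `target` (0..1] of the strings."""
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths)
    # Position in the sorted list where the cumulative share first reaches
    # the target. Floating-point rounding may pick the next longer string.
    idx = max(0, math.ceil(target * len(ordered)) - 1)
    return ordered[idx]


if __name__ == "__main__":
    sample = [10, 20, 30, 40]  # made-up identifier lengths
    print(coverage(sample, 25))
    print(smallest_limit_covering(sample, 0.75))
```

Run against the real length distribution, the second function gives the
limit needed for a chosen coverage such as 99.9%.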

Lydia, do I understand that a formal request needs to be filed? Who will do
that?

Egon



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Lydia Pintscher
Hi everyone,

I've been thinking more about this and we also discussed this within
the development team. Here's my thinking at this point:

* We do have data that you all want to see in Wikidata that is
currently prevented by the limit. That is not good.
* I agree that we all share a good general understanding that
Wikidata is not the place to store long free texts. However, I still
fear that new people in particular do not understand this initially.
We could mitigate this by, for example, giving the user a hint when
their input is getting too long even if it is still within the limit.
Twitter does this in a nice way when you are getting close to the
140 character limit. However, that is not implemented right now.
* I do worry about licensing and copyright issues, especially with the
following properties:
https://www.wikidata.org/wiki/Property:P2795
https://www.wikidata.org/wiki/Property:P1683
https://www.wikidata.org/wiki/Property:P1684
https://www.wikidata.org/wiki/Property:P2315
I did a rough survey of properties that look potentially troublesome
to me, and they all seem to be monolingual text. I am not worried
about increasing the limit for external identifier and URL. It looks
like string is also okish at this point in time.
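
The input hint mentioned in the second point could look roughly like the
following Python sketch. It is not how the Wikidata UI works today. The
function name length_hint, the 80% warning threshold and the message
wording are assumptions made for illustration, and 400 is simply the
current limit named in the subject of this thread.

```python
def length_hint(text, limit=400, warn_ratio=0.8):
    """Return a hint for the user, or None while the input is comfortably short.

    `warn_ratio` (assumed value) is the share of the limit at which the
    user starts to see a remaining-characters hint.
    """
    n = len(text)
    if n > limit:
        return f"Too long: {n - limit} characters over the limit of {limit}."
    if n >= warn_ratio * limit:
        return f"{limit - n} characters left."
    return None


if __name__ == "__main__":
    for sample in ("a" * 100, "a" * 350, "a" * 410):
        print(len(sample), length_hint(sample))
```

The warning fires while the input is still valid, which is the Twitter-like
behaviour described above; the hard limit check stays separate.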

Based on this, my proposal is to increase the limit for string and URL,
and potentially for external identifier if you request it. One open
question is still what the new limit should be.


Cheers
Lydia
