Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-28 Thread hellmann
Hi Gerard,

I was not trying to judge here. I was just saying that it wasn't much data in 
the end.
For me Freebase was basically cherry-picked. 

Meanwhile, the data we extract is more pertinent to the goal of having Wikidata 
cover the info boxes. We still have ~ 500 million statements left. But none of 
it is used yet. Hopefully we can change that. 

Meanwhile, Google crawls all the references and extracts facts from there. We 
don't have that available, but there is Linked Open Data. 

--
Sebastian 

On September 27, 2019 5:26:43 PM GMT+02:00, Gerard Meijssen 
 wrote:
>Hoi,
>I totally reject the assertion that it was so bad. I have always had the
>opinion that the main issue was an atrocious user interface. Add to this
>the people that have Wikipedia notions about quality. They have and had a
>detrimental effect on both the quantity and quality of Wikidata.
>
>When you add the functionality that is being built by the datawranglers
>at DBpedia, it becomes easy/easier to compare the data from Wikipedias
>with Wikidata (and why not Freebase), add what has consensus and curate
>the differences. This will enable a true datasense of quality and allow
>us to provide a much improved service.
>Thanks,
>  GerardM

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-27 Thread Gerard Meijssen
Hoi,
I totally reject the assertion that it was so bad. I have always had the opinion
that the main issue was an atrocious user interface. Add to this the people
that have Wikipedia notions about quality. They have and had a detrimental
effect on both the quantity and quality of Wikidata.

When you add the functionality that is being built by the datawranglers at
DBpedia, it becomes easy/easier to compare the data from Wikipedias with
Wikidata (and why not Freebase), add what has consensus and curate the
differences. This will enable a true datasense of quality and allow us to
provide a much improved service.
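
One possible reading of "add what has consensus and curate the differences", as a minimal sketch. The source names, the sample values and the rule that consensus means every source agrees are assumptions made for illustration; this does not describe an existing DBpedia or Wikidata tool.

```python
def split_by_consensus(values_by_source):
    """Return (consensus_value, differences) for one statement.

    Assumption: a value has consensus only when every source reports it.
    """
    distinct = set(values_by_source.values())
    if len(distinct) == 1:
        return distinct.pop(), {}
    # No agreement: hand every source's value to a human curator.
    return None, dict(values_by_source)


# Hypothetical values for a single statement, keyed by source name.
agreeing = {"enwiki": "1879", "dewiki": "1879", "wikidata": "1879"}
conflicting = {"enwiki": "1879", "dewiki": "1878", "wikidata": "1879"}

print(split_by_consensus(agreeing))
print(split_by_consensus(conflicting))
```

A looser rule, such as a majority vote, would add more statements automatically at the cost of more errors; the strict rule keeps everything contested in the curation queue.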
Thanks,
  GerardM

On Fri, 27 Sep 2019 at 15:54, Marco Fossati  wrote:

> Hey Sebastian,
>
> On 9/20/19 10:22 AM, Sebastian Hellmann wrote:
> > Not much of Freebase did end up in Wikidata.
>
> Dropping here some pointers to shed light on the migration of Freebase
> to Wikidata, since I was partially involved in the process:
> 1. WikiProject [1];
> 2. the paper behind [2];
> 3. datasets to be migrated [3].
>
> I can confirm that the migration has stalled: as of today, *528
> thousand* Freebase statements were curated by the community, out of *10
> million*. By 'curated', I mean approved or rejected.
> These numbers come from two queries against the primary sources tool
> database.
>
> The stall is due to several causes: in my opinion, the most important
> one was the bad quality of sources [4,5] coming from the Knowledge Vault
> project [6].
>
> Cheers,
>
> Marco
>
> [1] https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase
> [2]
>
> http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44818.pdf
> [3]
> https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool/Version_1#Data
> [4]
>
> https://www.wikidata.org/wiki/Wikidata_talk:Primary_sources_tool/Archive/2017#Quality_of_sources
> [5]
>
> https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements#A_whitelist_for_sources
> [6] https://www.cs.ubc.ca/~murphyk/Papers/kv-kdd14.pdf


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-27 Thread Sebastian Hellmann

Hi Marco,

I think I looked at it some years ago, and it still sounds like less 
than 5% made it, which matches what I remember.


-- Sebastian

On 27.09.19 15:53, Marco Fossati wrote:

> Hey Sebastian,
>
> On 9/20/19 10:22 AM, Sebastian Hellmann wrote:
> > Not much of Freebase did end up in Wikidata.
>
> Dropping here some pointers to shed light on the migration of Freebase
> to Wikidata, since I was partially involved in the process:
>
> 1. WikiProject [1];
> 2. the paper behind [2];
> 3. datasets to be migrated [3].
>
> I can confirm that the migration has stalled: as of today, *528
> thousand* Freebase statements were curated by the community, out of
> *10 million*. By 'curated', I mean approved or rejected.
> These numbers come from two queries against the primary sources tool
> database.
>
> The stall is due to several causes: in my opinion, the most important
> one was the bad quality of sources [4,5] coming from the Knowledge
> Vault project [6].
>
> Cheers,
>
> Marco
>
> [1] https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase
> [2] http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44818.pdf
> [3] https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool/Version_1#Data
> [4] https://www.wikidata.org/wiki/Wikidata_talk:Primary_sources_tool/Archive/2017#Quality_of_sources
> [5] https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements#A_whitelist_for_sources
> [6] https://www.cs.ubc.ca/~murphyk/Papers/kv-kdd14.pdf


--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) 
Competence Center

at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, 
http://linguistics.okfn.org, https://www.w3.org/community/ld4lt 


Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-27 Thread Marco Fossati

Hey Sebastian,

On 9/20/19 10:22 AM, Sebastian Hellmann wrote:

Not much of Freebase did end up in Wikidata.


Dropping here some pointers to shed light on the migration of Freebase 
to Wikidata, since I was partially involved in the process:

1. WikiProject [1];
2. the paper behind [2];
3. datasets to be migrated [3].

I can confirm that the migration has stalled: as of today, *528 
thousand* Freebase statements were curated by the community, out of *10 
million*. By 'curated', I mean approved or rejected.
These numbers come from two queries against the primary sources tool 
database.
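
As a quick check of the ratio behind these two counts, a minimal sketch; the variable names are illustrative, and only the two figures come from this message.

```python
# Counts quoted in the message; the variable names are illustrative only.
curated_statements = 528_000      # approved or rejected by the community
total_statements = 10_000_000     # Freebase statements offered for curation

curated_share = curated_statements / total_statements
print(f"{curated_share:.2%} of the statements were curated")
```

Since 'curated' covers both approved and rejected statements, this share is an upper bound on the fraction that was actually accepted.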


The stall is due to several causes: in my opinion, the most important 
one was the bad quality of sources [4,5] coming from the Knowledge Vault 
project [6].


Cheers,

Marco

[1] https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase
[2] 
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44818.pdf
[3] 
https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool/Version_1#Data
[4] 
https://www.wikidata.org/wiki/Wikidata_talk:Primary_sources_tool/Archive/2017#Quality_of_sources
[5] 
https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements#A_whitelist_for_sources

[6] https://www.cs.ubc.ca/~murphyk/Papers/kv-kdd14.pdf



Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Kingsley Idehen
On 9/22/19 11:55 AM, Kingsley Idehen wrote:
> On 9/21/19 6:30 PM, Kingsley Idehen wrote:
>> On 9/20/19 1:31 PM, Denny Vrandečić wrote:
>>> Yes, you're touching exactly on the problems I had during the
>>> evaluation - I couldn't even figure out what DBpedia is.
>> Hi Denny and Sebastian,
>>
>> To reiterate and/or clarify.
>>
>> DBpedia is a community project comprising RDF datasets constructed from
>> Wikipedia content that's deployed using Linked Data principles.
>
>
> A little clearer, as the definition above was a little too concise:
>
> DBpedia is a community project comprising a variety of data curation tools, 
> services (Linked Data lookup and SPARQL), and RDF datasets constructed from 
> Wikipedia that's deployed using Linked Data principles and cross-referenced 
> with other data sources as illustrated in the Linked Open Data Cloud (the 
> world's largest Knowledge Graph)[1][2].
>
> This project has recently spawned a Databus effort which addresses historic 
> challenges associated with dataset curation, publication, discovery, and 
> monetization [3].
>
> [1] https://lod-cloud.ne2
>
> [2] 
> https://medium.com/virtuoso-blog/what-is-the-linked-open-data-cloud-and-why-is-it-important-1901a7cb7b1f
>  -- what is the LOD Cloud and why is it important? 
>
> [3] https://databus.dbpedia.org/ -- Databus 
>

TypoFix:


[1] https://lod-cloud.net

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this





Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Kingsley Idehen
On 9/21/19 6:30 PM, Kingsley Idehen wrote:
> On 9/20/19 1:31 PM, Denny Vrandečić wrote:
>> Yes, you're touching exactly on the problems I had during the
>> evaluation - I couldn't even figure out what DBpedia is.
> Hi Denny and Sebastian,
>
> To reiterate and/or clarify.
>
> DBpedia is a community project comprising RDF datasets constructed from
> Wikipedia content that's deployed using Linked Data principles.


A little clearer, as the definition above was a little too concise:

DBpedia is a community project comprising a variety of data curation tools, 
services (Linked Data lookup and SPARQL), and RDF datasets constructed from 
Wikipedia that are deployed using Linked Data principles and cross-referenced 
with other data sources as illustrated in the Linked Open Data Cloud (the 
world's largest Knowledge Graph)[1][2].

This project has recently spawned a Databus effort which addresses historic 
challenges associated with dataset curation, publication, discovery, and 
monetization [3].

[1] https://lod-cloud.ne2

[2] 
https://medium.com/virtuoso-blog/what-is-the-linked-open-data-cloud-and-why-is-it-important-1901a7cb7b1f
 -- what is the LOD Cloud and why is it important? 

[3] https://databus.dbpedia.org/ -- Databus 

-- 
Regards,

Kingsley Idehen   


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Kingsley Idehen
On 9/22/19 1:34 AM, hellm...@informatik.uni-leipzig.de wrote:
> Hi Kingsley,
>
> that describes the core of the glue that DBpedia is. The definition
> leads to people downloading the EN DBpedia dataset and running
> statistics that will only discover what data is wrong or missing in
> the smallest parts of DBpedia.


The question was "What is DBpedia?" . What is misleading about it being
about Wikipedia content transformed into RDF and deployed using Linked
Data principles?

>
> What happened to "LOD is the largest knowledge graph on earth" ? 


The question wasn't "What is the LOD Cloud?", or am I missing something
here?


> Querying more Freebase data from DBpedia via Linked Data is a use case
> since over 10 years now using ontologies as a GPS.


Freebase is yet another derivative of Wikipedia content, isn't it?


>
> Also the definition you give limits the community to people who have
> edited 10 Scala Classes in the extraction framework, which is probably
> 10 people altogether.


Look, can't you simply make a clear statement of what is missing from my
definition of DBpedia? I sense you are talking about all the other
utilities that have been developed by the project beyond dataset
production e.g., services like DBpedia Spotlight etc?


>
> So this is the most exclusionist view I can think of.
>
> What you wrote here is adequate:
> https://medium.com/openlink-software-blog/what-is-dbpedia-and-why-is-it-important-d306b5324f90
>
> What you wrote in your email as a summary is very narrow and
> misleading, see Markus Kroetzsch's email. People will continue to
> measure DBpedia by exactly the part of the data that is loaded in the
> Virtuoso SPARQL endpoint unless we make the derivatives downloadable
> outside of HTTP LD requests.


You really have to try using a slightly better tone when communicating.

You could simply say:

Kingsley, here are some things that could be overlooked based on the
description you presented:

Item 1..N.

I'll just fix it, or worst case agree to disagree.


Kingsley

>
> -- Sebastian
>
>
> On September 22, 2019 12:30:24 AM GMT+02:00, Kingsley Idehen
>  wrote:
>
> On 9/20/19 1:31 PM, Denny Vrandečić wrote:
>
> Yes, you're touching exactly on the problems I had during the
> evaluation - I couldn't even figure out what DBpedia is. 
>
>
> Hi Denny and Sebastian,
>
> To reiterate and/or clarify.
>
> DBpedia is a community project comprising RDF datasets constructed from
> Wikipedia content that's deployed using Linked Data principles.
>
> The description above implies the following re focus breakdown:
>
> [1] Dataset creation -- this cannot be created in line with Linked Data
> principles without the items that follow
>
> [2] Linked Data Deployment -- without this there is nothing to look-up
> re follow-your-nose exploration
>
> [3] SPARQL Query Services  -- without this there is nothing to query
>
> Over the years I've written a number of posts addressing the key
> question "what is DBpedia?"
>
> [1]
> 
> https://medium.com/openlink-software-blog/what-is-dbpedia-and-why-is-it-important-d306b5324f90
> -- What is DBpedia, and why is it important?
>
> [2]
> 
> https://medium.com/virtuoso-blog/on-the-mutually-beneficial-nature-of-dbpedia-and-wikidata-5fb2b9f22ada
> -- Mutually beneficial nature of Wikidata and DBpedia
>
>
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity. 


-- 
Regards,

Kingsley Idehen   


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Kingsley Idehen
On 9/21/19 7:35 PM, Andra Waagmeester wrote:
> Agree, I am also interested in seeing this. I recently did a small
> comparison on science awards on coverage of laureates in both DBpedia
> and wikidata and came to the same conclusion. The difference sometimes
> was quite substantial in favour of Wikidata. 


Are you not able to share SPARQL Query Results page links for this?

-- 
Regards,

Kingsley Idehen   


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Gerard Meijssen
Hoi,
From my perspective the point of a data set is for it to be used. The
extent to which it is used defines how useful an individual data set is. I
even blogged about it [1].
Thanks,
 GerardM

[1]
https://ultimategerardm.blogspot.com/2019/09/comparing-datasets-bigger-or-better-or.html

On Sun, 22 Sep 2019 at 11:29,  wrote:

> DBpedia actually has no data, we provide tools to more effectively use
> OTHER PEOPLE'S DATA, e.g. Wikipedia.
>
> Here is an image of the maximum size of the new scalable and actually bulk
> downloadable DBpedia via Databus in let's say one or two years:
>
> https://lod-cloud.net/
>
> With Download As Wikidata Q's and P's Option.
>
> It's there, just hard to download in bulk.
>
> LG,
> Sebastian
>
>
> On September 22, 2019 10:41:10 AM GMT+02:00, Markus Kroetzsch <
> markus.kroetz...@tu-dresden.de> wrote:
>>
>> On 22/09/2019 08:48, Sebastian Hellmann wrote:
>> ...
>>
>>>
>>> The formula here is quite easy: If you look at DBpedia's data in detail
>>> or a part of it, it will not shine so much since it is extracted,
>>>
>>
>> Sure, but I think that this is not clear to many people who are
>> currently using DBpedia as a dataset (even if only for testing/research
>> purposes). Also, there would surely be value in analysing the
>> differences more closely. I agree with you that quantitatively, Wikidata
>> might be orders of magnitudes ahead. Yet, there can still be individual
>> bits of information that are in DBpedia but missing from Wikidata so far.
>>
>> For example, DBpedia EN has 32 people educated at the University of
>> Leipzig, whereas Wikidata has 1217. Nevertheless, there is, for example,
>> John Henry Wright (Q6238997), who is known to DBpedia but not to
>> Wikidata (yet). Such cases might be worth systematic weeding out so that
>> we can really come to the point where Wikidata is a strict superset of
>> all (correct) data in DBpedia.
>>
>> Cheers,
>>
>> Markus


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread hellmann
DBpedia actually has no data, we provide tools to more effectively use OTHER 
PEOPLE'S DATA, e.g. Wikipedia. 

Here is an image of the maximum size of the new scalable and actually bulk 
downloadable DBpedia via Databus in let's say one or two years:

https://lod-cloud.net/

With Download As Wikidata Q's and P's Option. 

It's there, just hard to download in bulk. 

LG, 
Sebastian 


On September 22, 2019 10:41:10 AM GMT+02:00, Markus Kroetzsch 
 wrote:
>On 22/09/2019 08:48, Sebastian Hellmann wrote:
>...
>>
>> The formula here is quite easy: If you look at DBpedia's data in detail
>> or a part of it, it will not shine so much since it is extracted,
>
>Sure, but I think that this is not clear to many people who are
>currently using DBpedia as a dataset (even if only for testing/research
>purposes). Also, there would surely be value in analysing the
>differences more closely. I agree with you that quantitatively, Wikidata
>might be orders of magnitude ahead. Yet, there can still be individual
>bits of information that are in DBpedia but missing from Wikidata so far.
>
>For example, DBpedia EN has 32 people educated at the University of
>Leipzig, whereas Wikidata has 1217. Nevertheless, there is, for example,
>John Henry Wright (Q6238997), who is known to DBpedia but not to
>Wikidata (yet). Such cases might be worth systematic weeding out so that
>we can really come to the point where Wikidata is a strict superset of
>all (correct) data in DBpedia.
>
>Cheers,
>
>Markus



Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Markus Kroetzsch

On 22/09/2019 08:48, Sebastian Hellmann wrote:
...

> The formula here is quite easy: If you look at DBpedia's data in detail
> or a part of it, it will not shine so much since it is extracted,


Sure, but I think that this is not clear to many people who are 
currently using DBpedia as a dataset (even if only for testing/research 
purposes). Also, there would surely be value in analysing the 
differences more closely. I agree with you that quantitatively, Wikidata 
might be orders of magnitude ahead. Yet, there can still be individual 
bits of information that are in DBpedia but missing from Wikidata so far.


For example, DBpedia EN has 32 people educated at the University of 
Leipzig, whereas Wikidata has 1217. Nevertheless, there is, for example, 
John Henry Wright (Q6238997), who is known to DBpedia but not to 
Wikidata (yet). Such cases might be worth systematic weeding out so that 
we can really come to the point where Wikidata is a strict superset of 
all (correct) data in DBpedia.
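
Finding such cases amounts to a set difference between the two result sets. A minimal sketch, assuming both sides have already been mapped to Wikidata Q-ids; the function name and the sample sets are hypothetical and are not real query results (only Q6238997 is taken from the example above).

```python
def missing_from(reference: set[str], other: set[str]) -> set[str]:
    """Return identifiers present in `other` but absent from `reference`."""
    return other - reference


# Toy data, NOT real query results: placeholder ids plus the one id
# mentioned in the message (Q6238997, John Henry Wright).
wikidata_alumni = {"Q_EXAMPLE_1", "Q_EXAMPLE_2", "Q_EXAMPLE_3"}
dbpedia_alumni = {"Q_EXAMPLE_2", "Q6238997"}

gaps = missing_from(wikidata_alumni, dbpedia_alumni)
print(sorted(gaps))
```

Each identifier in `gaps` is a candidate statement to check and, if correct, add to Wikidata.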


Cheers,

Markus





Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-21 Thread hellmann
Hi Kingsley,

that describes the core of the glue that DBpedia is. The definition leads to 
people downloading the EN DBpedia dataset and running statistics that will only 
discover what data is wrong or missing in the smallest parts of DBpedia.

What happened to "LOD is the largest knowledge graph on earth"? Querying more 
Freebase data from DBpedia via Linked Data has been a use case for over 10 years 
now, using ontologies as a GPS.

Also the definition you give limits the community to people who have edited 10 
Scala Classes in the extraction framework, which is probably 10 people 
altogether.

So this is the most exclusionist view I can think of. 

What you wrote here is adequate:
https://medium.com/openlink-software-blog/what-is-dbpedia-and-why-is-it-important-d306b5324f90

What you wrote in your email as a summary is very narrow and misleading, see 
Markus Kroetzsch's email. People will continue to measure DBpedia by exactly 
the part of the data that is loaded in the Virtuoso SPARQL endpoint  unless we 
make the derivatives downloadable outside of HTTP LD requests. 

-- Sebastian


On September 22, 2019 12:30:24 AM GMT+02:00, Kingsley Idehen 
 wrote:
>On 9/20/19 1:31 PM, Denny Vrandečić wrote:
>> Yes, you're touching exactly on the problems I had during the
>> evaluation - I couldn't even figure out what DBpedia is.
>
>Hi Denny and Sebastian,
>
>To reiterate and/or clarify.
>
>DBpedia is a community project comprising RDF datasets constructed from
>Wikipedia content that's deployed using Linked Data principles.
>
>The description above implies the following re focus breakdown:
>
>[1] Dataset creation -- this cannot be created in line with Linked Data
>principles without the items that follow
>
>[2] Linked Data Deployment -- without this there is nothing to look-up
>re follow-your-nose exploration
>
>[3] SPARQL Query Services  -- without this there is nothing to query
>
>Over the years I've written a number of posts addressing the key
>question "what is DBpedia?"
>
>[1]
>https://medium.com/openlink-software-blog/what-is-dbpedia-and-why-is-it-important-d306b5324f90
>-- What is DBpedia, and why is it important?
>
>[2]
>https://medium.com/virtuoso-blog/on-the-mutually-beneficial-nature-of-dbpedia-and-wikidata-5fb2b9f22ada
>-- Mutually beneficial nature of Wikidata and DBpedia
>
>
>-- 
>Regards,
>
>Kingsley Idehen  


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-21 Thread Kingsley Idehen
On 9/20/19 1:31 PM, Denny Vrandečić wrote:
> Yes, you're touching exactly on the problems I had during the
> evaluation - I couldn't even figure out what DBpedia is.

Hi Denny and Sebastian,

To reiterate and/or clarify.

DBpedia is a community project comprising RDF datasets constructed from
Wikipedia content that's deployed using Linked Data principles.

The description above implies the following re focus breakdown:

[1] Dataset creation -- this cannot be created in line with Linked Data
principles without the items that follow

[2] Linked Data Deployment -- without this there is nothing to look-up
re follow-your-nose exploration

[3] SPARQL Query Services  -- without this there is nothing to query
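
Items [2] and [3] can be illustrated with a minimal sketch that only prepares the two kinds of request and sends nothing over the network. It assumes the public endpoint at https://dbpedia.org/sparql and a server that honours content negotiation; the example resource URI and both helper names are chosen purely for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

SPARQL_ENDPOINT = "https://dbpedia.org/sparql"        # assumed public endpoint
RESOURCE_URI = "http://dbpedia.org/resource/Leipzig"  # illustrative resource


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL (query text goes in `query`)."""
    return f"{endpoint}?{urlencode({'query': query})}"


def linked_data_request(uri: str) -> Request:
    """Prepare, without sending, a look-up that asks for Turtle."""
    return Request(uri, headers={"Accept": "text/turtle"})


query = f"SELECT ?p ?o WHERE {{ <{RESOURCE_URI}> ?p ?o }} LIMIT 10"
url = sparql_get_url(SPARQL_ENDPOINT, query)
req = linked_data_request(RESOURCE_URI)

print(url)
print(req.get_header("Accept"))
```

Passing `url` or `req` to `urllib.request.urlopen` would perform the actual query or look-up; following the URIs found in the response is the follow-your-nose step.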

Over the years I've written a number of posts addressing the key
question "what is DBpedia?"

[1]
https://medium.com/openlink-software-blog/what-is-dbpedia-and-why-is-it-important-d306b5324f90
-- What is DBpedia, and why is it important?

[2]
https://medium.com/virtuoso-blog/on-the-mutually-beneficial-nature-of-dbpedia-and-wikidata-5fb2b9f22ada
-- Mutually beneficial nature of Wikidata and DBpedia


-- 
Regards,

Kingsley Idehen   


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-21 Thread Markus Kroetzsch

On 20/09/2019 17:53, Denny Vrandečić wrote:
...

> I have been working on a comparison of DBpedia, Wikidata, and Freebase
> (and since you've read my thesis, you know that's a thing I know a bit
> about). Simple evaluation, coverage, correctness, nothing dramatically
> fancy. But I am torn about publishing it, because, d'oh, people may
> (with good reasons) dismiss it as being biased. And truth be told - the
> simple fact that I don't know DBpedia as well as I know Wikidata and
> Freebase might indeed have led to errors, mistakes, and stuff I missed
> in the evaluation. But you know what would help?


I would also be very interested in seeing this. I had a closer look at 
DBpedia recently for a tutorial and was surprised by how different the 
data is in comparison to Wikidata. A methodological comparison would 
surely be helpful.


Of course, it has to be fair, taking into account that each DBpedia edition
is based on a Wikipedia in one language (and hence is always missing
entities that Wikidata has). For example, I recently computed the
difference between the following two:


(1) The set of all pairs of ancestors that one can find by following 
(paths of) parent relations on EN DBpedia.
(2) The set of all pairs of ancestors that one can find by following 
(paths of) mother/father relations on Wikidata, but visiting only items 
that are present in English Wikipedia.


I am not sure if this is fair or not, but I found it an interesting 
setup (non-local effects of incompleteness) -- and (2) is a nice 
illustration of something you cannot achieve in SPARQL on principled 
grounds ;-).
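
The comparison described above can be sketched outside SPARQL with a small graph traversal. The following is only a minimal illustration, not the code used for the actual computation: the function name `ancestor_pairs`, the `allowed` parameter, and the toy edge list are all assumptions made for the example. Setup (1) corresponds to calling it without a filter, setup (2) to passing the set of items present in English Wikipedia as `allowed`.

```python
from collections import defaultdict


def ancestor_pairs(parent_edges, allowed=None):
    """Return all (descendant, ancestor) pairs reachable via parent edges.

    parent_edges: iterable of (child, parent) tuples (hypothetical input format).
    allowed: optional set of nodes that may be visited. Edges touching any
             other node are ignored, so no path can pass through them.
    """
    parents = defaultdict(set)
    for child, parent in parent_edges:
        if allowed is None or (child in allowed and parent in allowed):
            parents[child].add(parent)

    pairs = set()
    for start in list(parents):
        # Depth-first walk upwards from each node that has a known parent.
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # also guards against cycles in dirty data
            seen.add(node)
            pairs.add((start, node))
            stack.extend(parents.get(node, ()))
    return pairs


if __name__ == "__main__":
    # Toy data: d -> c -> b -> a, where "x -> y" means y is a parent of x.
    edges = [("c", "b"), ("b", "a"), ("d", "c")]
    everything = ancestor_pairs(edges)
    restricted = ancestor_pairs(edges, allowed={"a", "b", "d"})
    print(sorted(everything - restricted))
```

On this toy chain, leaving out the single middle item c removes not only the pairs that mention c but also (d, b) and (d, a), which is the non-local effect of incompleteness mentioned above.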


Cheers,

Markus





Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-21 Thread hellmann
One more thing I would be interested in: I don't think comparing Wikidata and
Freebase to DBpedia will make sense, as these are sources for us. However, we
could compare DBpedia, including the Wikidata and Freebase part, to the Google
Knowledge Graph and repeat this every three months to guide our community in
integrating more sources. Can we do that?

-- Sebastian 

On September 20, 2019 8:07:28 PM GMT+02:00, "Denny Vrandečić" wrote:
>I would love your input! I will send the link here, and any
>contribution
>will be welcome :)
>
>Thank you!
>
>On Fri, Sep 20, 2019 at 11:05 AM Samuel Klein 
>wrote:
>
>> I'm also interested in this comparison and intersection, and glad to
>share
>> perspective + help.  Warmly, SJ
>>
>> On Fri, Sep 20, 2019 at 1:32 PM Denny Vrandečić 
>> wrote:
>>
>>> Yes, you're touching exactly on the problems I had during the
>evaluation
>>> - I couldn't even figure out what DBpedia is. Thanks, your help will
>be
>>> very much appreciated.
>>>
>>> OK, I will send a link the week after the next, and then we can
>start
>>> working on it :) I am very much looking forward to it.

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Denny Vrandečić
I would love your input! I will send the link here, and any contribution
will be welcome :)

Thank you!

On Fri, Sep 20, 2019 at 11:05 AM Samuel Klein wrote:

> I'm also interested in this comparison and intersection, and glad to share
> perspective + help.  Warmly, SJ
>
> On Fri, Sep 20, 2019 at 1:32 PM Denny Vrandečić 
> wrote:
>
>> Yes, you're touching exactly on the problems I had during the evaluation
>> - I couldn't even figure out what DBpedia is. Thanks, your help will be
>> very much appreciated.
>>
>> OK, I will send a link the week after the next, and then we can start
>> working on it :) I am very much looking forward to it.
>>
>> On Fri, Sep 20, 2019 at 10:11 AM Sebastian Hellmann <
>> hellm...@informatik.uni-leipzig.de> wrote:
>>
>>> Na, I am quite open, albeit impulsive. The information given was quite
>>> good and some of my concerns regarding the involvement of Google were also
>>> lifted or relativized. Mainly due to the fact that there seems to be a
>>> sense of awareness.
>>>
>>> I am just studying  economic principles, which are very powerful. I also
>>> have the feeling that free and open stuff just got a lot more commercial
>>> and I am still struggling with myself whether this is good or not. Also
>>> whether DBpedia should become frenemies with BigTech. Or funny things like
>>> many funding agencies try to push for national sustainability options, but
>>> most of the time, they suggest to use the GitHub Platform. Wikibase could
>>> be an option here.
>>>
>>> I have to apologize for the Knowledge Graph Talk thing. I was a bit
>>> grumpy, because I thought I wasted a lot of time on the Talk page that
>>> could have been invested in making the article better (WP:BE_BOLD style),
>>> but now I think, it might have been my own mistake. So apologies for
>>> lashing out there.
>>>
>>> (see comments below)
>>> On 20.09.19 17:53, Denny Vrandečić wrote:
>>>
>>> Sebastian,
>>>
>>> "I don't want to facilitate conspiracy theories, but ..."
>>> "[I am] interested in what is the truth behind the truth"
>>>
>>> I am sorry, I truly am, but this *is* the language I know from
>>> conspiracy theorists. And given that, I cannot imagine that there is
>>> anything I can say that could convince you otherwise. Therefore there is no
>>> real point for me in engaging with this conversation on these terms, I
>>> cannot see how it would turn constructive.
>>>
>>> The answers to many of your questions are public and on the record.
>>> Others tried to point you to them (thanks), but you dismiss them as not
>>> fitting your narrative.
>>>
>>> So here's a suggestion, which I think might be much more constructive
>>> and forward-looking:
>>>
>>> I have been working on a comparison of DBpedia, Wikidata, and Freebase
>>> (and since you've read my thesis, you know that's a thing I know a bit
>>> about). Simple evaluation, coverage, correctness, nothing dramatically
>>> fancy. But I am torn about publishing it, because, d'oh, people may (with
>>> good reasons) dismiss it as being biased. And truth be told - the simple
>>> fact that I don't know DBpedia as well as I know Wikidata and Freebase
>>> might indeed have led to errors, mistakes, and stuff I missed in the
>>> evaluation. But you know what would help?
>>>
>>> You.
>>>
>>> My suggestion is that I publish my current draft, and then you and I
>>> work together on it, publicly, in the open, until we reach a state we
>>> both consider correct enough for publication.
>>>
>>> What do you think?
>>>
>>> Sure, we are doing statistics at the moment as well. It is a bit hard to
>>> define what DBpedia is nowadays as we are rebranding the remixed datasets,
>>> now that we can pick up links and other data from the Databus. It might not
>>> even be a real dataset anymore, but glue between datasets focusing on the
>>> speed of integration and ease of quality improvement. Also still working on
>>> the concrete Sync Targets for GlobalFactSync (
>>> https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE)
>>> as well.
>>>
>>> One question I have is whether Wikidata is effective/efficient or where
>>> it is effective and where it could use improvement as a chance for
>>> collaboration.
>>>
>>> So yes any time.
>>>
>>> -- Sebastian
>>>
>>>
>>> Cheers,
>>> Denny
>>>
>>> P.S.: I am travelling the next week, so I may ask for patience

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Samuel Klein
I'm also interested in this comparison and intersection, and glad to share
perspective + help.  Warmly, SJ

On Fri, Sep 20, 2019 at 1:32 PM Denny Vrandečić wrote:

> Yes, you're touching exactly on the problems I had during the evaluation -
> I couldn't even figure out what DBpedia is. Thanks, your help will be
> very much appreciated.
>
> OK, I will send a link the week after the next, and then we can start
> working on it :) I am very much looking forward to it.
>
> On Fri, Sep 20, 2019 at 10:11 AM Sebastian Hellmann <
> hellm...@informatik.uni-leipzig.de> wrote:
>
>> Na, I am quite open, albeit impulsive. The information given was quite
>> good and some of my concerns regarding the involvement of Google were also
>> lifted or relativized. Mainly due to the fact that there seems to be a
>> sense of awareness.
>>
>> I am just studying  economic principles, which are very powerful. I also
>> have the feeling that free and open stuff just got a lot more commercial
>> and I am still struggling with myself whether this is good or not. Also
>> whether DBpedia should become frenemies with BigTech. Or funny things like
>> many funding agencies try to push for national sustainability options, but
>> most of the time, they suggest to use the GitHub Platform. Wikibase could
>> be an option here.
>>
>> I have to apologize for the Knowledge Graph Talk thing. I was a bit
>> grumpy, because I thought I wasted a lot of time on the Talk page that
>> could have been invested in making the article better (WP:BE_BOLD style),
>> but now I think, it might have been my own mistake. So apologies for
>> lashing out there.
>>
>> (see comments below)
>> On 20.09.19 17:53, Denny Vrandečić wrote:
>>
>> Sebastian,
>>
>> "I don't want to facilitate conspiracy theories, but ..."
>> "[I am] interested in what is the truth behind the truth"
>>
>> I am sorry, I truly am, but this *is* the language I know from conspiracy
>> theorists. And given that, I cannot imagine that there is anything I can
>> say that could convince you otherwise. Therefore there is no real point for
>> me in engaging with this conversation on these terms, I cannot see how it
>> would turn constructive.
>>
>> The answers to many of your questions are public and on the record.
>> Others tried to point you to them (thanks), but you dismiss them as not
>> fitting your narrative.
>>
>> So here's a suggestion, which I think might be much more constructive and
>> forward-looking:
>>
>> I have been working on a comparison of DBpedia, Wikidata, and Freebase
>> (and since you've read my thesis, you know that's a thing I know a bit
>> about). Simple evaluation, coverage, correctness, nothing dramatically
>> fancy. But I am torn about publishing it, because, d'oh, people may (with
>> good reasons) dismiss it as being biased. And truth be told - the simple
>> fact that I don't know DBpedia as well as I know Wikidata and Freebase
>> might indeed have led to errors, mistakes, and stuff I missed in the
>> evaluation. But you know what would help?
>>
>> You.
>>
>> My suggestion is that I publish my current draft, and then you and I
>> work together on it, publicly, in the open, until we reach a state we
>> both consider correct enough for publication.
>>
>> What do you think?
>>
>> Sure, we are doing statistics at the moment as well. It is a bit hard to
>> define what DBpedia is nowadays as we are rebranding the remixed datasets,
>> now that we can pick up links and other data from the Databus. It might not
>> even be a real dataset anymore, but glue between datasets focusing on the
>> speed of integration and ease of quality improvement. Also still working on
>> the concrete Sync Targets for GlobalFactSync (
>> https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE)
>> as well.
>>
>> One question I have is whether Wikidata is effective/efficient or where
>> it is effective and where it could use improvement as a chance for
>> collaboration.
>>
>> So yes any time.
>>
>> -- Sebastian
>>
>>
>> Cheers,
>> Denny
>>
>> P.S.: I am travelling the next week, so I may ask for patience
>>
>>
>> On Fri, Sep 20, 2019 at 8:11 AM Thad Guidry wrote:
>>
>>> Thank you for sharing your opinions, Sebastian.
>>>
>>> Cheers,
>>> Thad
>>> https://www.linkedin.com/in/thadguidry/

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread hellmann
Just an ominous note here. It has to do with the property of the semantic web of
only having one schema and several IDs for the same things; then it is just a
matter of how to partition it again and distribute it to where people need the
information, and of establishing feedback in the opposite direction. Basically an
implemented variation of what Kingsley has been saying for years.

Waiting for your message. 

Best regards,
Sebastian 



On September 20, 2019 7:31:36 PM GMT+02:00, "Denny Vrandečić" wrote:
>Yes, you're touching exactly on the problems I had during the
>evaluation -
>I couldn't even figure out what DBpedia is. Thanks, your help will be
>very much appreciated.
>
>OK, I will send a link the week after the next, and then we can start
>working on it :) I am very much looking forward to it.
>
>On Fri, Sep 20, 2019 at 10:11 AM Sebastian Hellmann <
>hellm...@informatik.uni-leipzig.de> wrote:
>
>> Na, I am quite open, albeit impulsive. The information given was
>quite
>> good and some of my concerns regarding the involvement of Google were
>also
>> lifted or relativized. Mainly due to the fact that there seems to be
>a
>> sense of awareness.
>>
>> I am just studying  economic principles, which are very powerful. I
>also
>> have the feeling that free and open stuff just got a lot more
>commercial
>> and I am still struggling with myself whether this is good or not.
>Also
>> whether DBpedia should become frenemies with BigTech. Or funny things
>like
>> many funding agencies try to push for national sustainability
>options, but
>> most of the time, they suggest to use the GitHub Platform. Wikibase
>could
>> be an option here.
>>
>> I have to apologize for the Knowledge Graph Talk thing. I was a bit
>> grumpy, because I thought I wasted a lot of time on the Talk page
>that
>> could have been invested in making the article better (WP:BE_BOLD
>style),
>> but now I think, it might have been my own mistake. So apologies for
>> lashing out there.
>>
>> (see comments below)
>> On 20.09.19 17:53, Denny Vrandečić wrote:
>>
>> Sebastian,
>>
>> "I don't want to facilitate conspiracy theories, but ..."
>> "[I am] interested in what is the truth behind the truth"
>>
>> I am sorry, I truly am, but this *is* the language I know from
>conspiracy
>> theorists. And given that, I cannot imagine that there is anything I
>can
>> say that could convince you otherwise. Therefore there is no real
>point for
>> me in engaging with this conversation on these terms, I cannot see
>how it
>> would turn constructive.
>>
>> The answers to many of your questions are public and on the record.
>Others
>> tried to point you to them (thanks), but you dismiss them as not
>fitting
>> your narrative.
>>
>> So here's a suggestion, which I think might be much more constructive
>and
>> forward-looking:
>>
>> I have been working on a comparison of DBpedia, Wikidata, and
>Freebase
>> (and since you've read my thesis, you know that's a thing I know a
>bit
>> about). Simple evaluation, coverage, correctness, nothing
>dramatically
>> fancy. But I am torn about publishing it, because, d'oh, people may
>(with
>> good reasons) dismiss it as being biased. And truth be told - the
>simple
>> fact that I don't know DBpedia as well as I know Wikidata and
>Freebase
>> might indeed have led to errors, mistakes, and stuff I missed in the
>> evaluation. But you know what would help?
>>
>> You.
>>
>> My suggestion is that I publish my current draft, and then you and I work
>> together on it, publicly, in the open, until we reach a state we both
>> consider correct enough for publication.
>>
>> What do you think?
>>
>> Sure, we are doing statistics at the moment as well. It is a bit hard
>to
>> define what DBpedia is nowadays as we are rebranding the remixed
>datasets,
>> now that we can pick up links and other data from the Databus. It
>might not
>> even be a real dataset anymore, but glue between datasets focusing on
>the
>> speed of integration and ease of quality improvement. Also still
>working on
>> the concrete Sync Targets for GlobalFactSync (
>>
>https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE)
>> as well.
>>
>> One question I have is whether Wikidata is effective/efficient or
>where it
>> is effective and where it could use improvement as a chance for
>> collaboration.
>>
>> So yes any time.
>>
>> -- Sebastian
>>
>>
>> Cheers,
>> Denny
>>
>> P.S.: I am travelling the next week, so I may ask for patience
>>
>>
>> On Fri, Sep 20, 2019 at 8:11 AM Thad Guidry 
>wrote:
>>
>>> Thank you for sharing your opinions, Sebastian.
>>>
>>> Cheers,
>>> Thad
>>> https://www.linkedin.com/in/thadguidry/

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Denny Vrandečić
Yes, you're touching exactly on the problems I had during the evaluation -
I couldn't even figure out what DBpedia is. Thanks, your help will be
very much appreciated.

OK, I will send a link the week after the next, and then we can start
working on it :) I am very much looking forward to it.

On Fri, Sep 20, 2019 at 10:11 AM Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Na, I am quite open, albeit impulsive. The information given was quite
> good and some of my concerns regarding the involvement of Google were also
> lifted or relativized. Mainly due to the fact that there seems to be a
> sense of awareness.
>
> I am just studying  economic principles, which are very powerful. I also
> have the feeling that free and open stuff just got a lot more commercial
> and I am still struggling with myself whether this is good or not. Also
> whether DBpedia should become frenemies with BigTech. Or funny things like
> many funding agencies try to push for national sustainability options, but
> most of the time, they suggest to use the GitHub Platform. Wikibase could
> be an option here.
>
> I have to apologize for the Knowledge Graph Talk thing. I was a bit
> grumpy, because I thought I wasted a lot of time on the Talk page that
> could have been invested in making the article better (WP:BE_BOLD style),
> but now I think, it might have been my own mistake. So apologies for
> lashing out there.
>
> (see comments below)
> On 20.09.19 17:53, Denny Vrandečić wrote:
>
> Sebastian,
>
> "I don't want to facilitate conspiracy theories, but ..."
> "[I am] interested in what is the truth behind the truth"
>
> I am sorry, I truly am, but this *is* the language I know from conspiracy
> theorists. And given that, I cannot imagine that there is anything I can
> say that could convince you otherwise. Therefore there is no real point for
> me in engaging with this conversation on these terms, I cannot see how it
> would turn constructive.
>
> The answers to many of your questions are public and on the record. Others
> tried to point you to them (thanks), but you dismiss them as not fitting
> your narrative.
>
> So here's a suggestion, which I think might be much more constructive and
> forward-looking:
>
> I have been working on a comparison of DBpedia, Wikidata, and Freebase
> (and since you've read my thesis, you know that's a thing I know a bit
> about). Simple evaluation, coverage, correctness, nothing dramatically
> fancy. But I am torn about publishing it, because, d'oh, people may (with
> good reasons) dismiss it as being biased. And truth be told - the simple
> fact that I don't know DBpedia as well as I know Wikidata and Freebase
> might indeed have led to errors, mistakes, and stuff I missed in the
> evaluation. But you know what would help?
>
> You.
>
> My suggestion is that I publish my current draft, and then you and I work
> together on it, publicly, in the open, until we reach a state we both
> consider correct enough for publication.
>
> What do you think?
>
> Sure, we are doing statistics at the moment as well. It is a bit hard to
> define what DBpedia is nowadays as we are rebranding the remixed datasets,
> now that we can pick up links and other data from the Databus. It might not
> even be a real dataset anymore, but glue between datasets focusing on the
> speed of integration and ease of quality improvement. Also still working on
> the concrete Sync Targets for GlobalFactSync (
> https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE)
> as well.
>
> One question I have is whether Wikidata is effective/efficient or where it
> is effective and where it could use improvement as a chance for
> collaboration.
>
> So yes any time.
>
> -- Sebastian
>
>
> Cheers,
> Denny
>
> P.S.: I am travelling the next week, so I may ask for patience
>
>
> On Fri, Sep 20, 2019 at 8:11 AM Thad Guidry wrote:
>
>> Thank you for sharing your opinions, Sebastian.
>>
>> Cheers,
>> Thad
>> https://www.linkedin.com/in/thadguidry/
>>
>>
>> On Fri, Sep 20, 2019 at 9:43 AM Sebastian Hellmann <
>> hellm...@informatik.uni-leipzig.de> wrote:
>>
>>> Hi Thad,
>>> On 20.09.19 15:28, Thad Guidry wrote:
>>>
>>> With my tech evangelist hat on...
>>>
>>> Google's philanthropy is nearly boundless when it comes to the promotion
>>> of knowledge.  Why? Because indeed it's in their best interest otherwise no
>>> one can prosper without knowledge.  They aggregate knowledge for the
>>> benefit of mankind, and then make a profit through advertising ... all
>>> while making that knowledge extremely easy to be found for the world.
>>>
>>>
>>> I am neither pro-Google or anti-Google per se. Maybe skeptical and
>>> interested in what is the truth behind the truth. Google is not synonym to
>>> philanthropy. Wikimedia is or at least I think they are doing many things
>>> right. Google is a platform, so primarily they "aggregate knowledge for
>>> their benefit" while creating enough incentives in form of accessibility
>>> 

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Sebastian Hellmann
Well, I am quite open, albeit impulsive. The information given was quite
good, and some of my concerns regarding the involvement of Google were
also lifted or put into perspective, mainly because there seems to
be a sense of awareness.


I am just studying economic principles, which are very powerful. I also
have the feeling that free and open stuff just got a lot more commercial,
and I am still struggling with myself over whether this is good or not, and
also whether DBpedia should become frenemies with BigTech. Another funny
thing: many funding agencies try to push for national sustainability
options, but most of the time they suggest using the GitHub platform.
Wikibase could be an option here.


I have to apologize for the Knowledge Graph Talk thing. I was a bit 
grumpy, because I thought I wasted a lot of time on the Talk page that 
could have been invested in making the article better (WP:BE_BOLD 
style), but now I think it might have been my own mistake. So apologies
for lashing out there.


(see comments below)

On 20.09.19 17:53, Denny Vrandečić wrote:

> Sebastian,
>
> "I don't want to facilitate conspiracy theories, but ..."
> "[I am] interested in what is the truth behind the truth"
>
> I am sorry, I truly am, but this *is* the language I know from
> conspiracy theorists. And given that, I cannot imagine that there is
> anything I can say that could convince you otherwise. Therefore there
> is no real point for me in engaging with this conversation on these
> terms, I cannot see how it would turn constructive.
>
> The answers to many of your questions are public and on the record.
> Others tried to point you to them (thanks), but you dismiss them as
> not fitting your narrative.
>
> So here's a suggestion, which I think might be much more constructive
> and forward-looking:
>
> I have been working on a comparison of DBpedia, Wikidata, and Freebase
> (and since you've read my thesis, you know that's a thing I know a bit
> about). Simple evaluation, coverage, correctness, nothing dramatically
> fancy. But I am torn about publishing it, because, d'oh, people may
> (with good reasons) dismiss it as being biased. And truth be told -
> the simple fact that I don't know DBpedia as well as I know Wikidata
> and Freebase might indeed have led to errors, mistakes, and stuff I
> missed in the evaluation. But you know what would help?
>
> You.
>
> My suggestion is that I publish my current draft, and then you and I
> work together on it, publicly, in the open, until we reach a state
> we both consider correct enough for publication.
>
> What do you think?


Sure, we are doing statistics at the moment as well. It is a bit hard to
define what DBpedia is nowadays, as we are rebranding the remixed
datasets now that we can pick up links and other data from the Databus.
It might not even be a real dataset anymore, but rather glue between
datasets, focusing on the speed of integration and ease of quality
improvement. We are also still working on the concrete Sync Targets for
GlobalFactSync
(https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE).


One question I have is whether Wikidata is effective/efficient, or rather
where it is effective and where it could use improvement, as a chance for
collaboration.


So yes, any time.

-- Sebastian



> Cheers,
> Denny
>
> P.S.: I am travelling the next week, so I may ask for patience
>
> On Fri, Sep 20, 2019 at 8:11 AM Thad Guidry wrote:
>
>> Thank you for sharing your opinions, Sebastian.
>>
>> Cheers,
>> Thad
>> https://www.linkedin.com/in/thadguidry/
>>
>> On Fri, Sep 20, 2019 at 9:43 AM Sebastian Hellmann
>> <hellm...@informatik.uni-leipzig.de> wrote:
>>
>>> Hi Thad,
>>>
>>> On 20.09.19 15:28, Thad Guidry wrote:
>>>
>>>> With my tech evangelist hat on...
>>>>
>>>> Google's philanthropy is nearly boundless when it comes to
>>>> the promotion of knowledge.  Why? Because indeed it's in
>>>> their best interest otherwise no one can prosper without
>>>> knowledge.  They aggregate knowledge for the benefit of
>>>> mankind, and then make a profit through advertising ... all
>>>> while making that knowledge extremely easy to be found for
>>>> the world.
>>>
>>> I am neither pro-Google or anti-Google per se. Maybe skeptical
>>> and interested in what is the truth behind the truth. Google
>>> is not synonym to philanthropy. Wikimedia is or at least I
>>> think they are doing many things right. Google is a platform,
>>> so primarily they "aggregate knowledge for their benefit"
>>> while creating enough incentives in form of accessibility for
>>> users to add the user's knowledge to theirs. It is not about
>>> what Google offers, but what it takes in return. 20% of
>>> employees time is also an investment in the skill of the
>>> employee, a Google asset called Human Capital and also leads
>>> to me and Denny from Google discussing whether
>>> https://en.wikipedia.org/wiki/Talk:Knowledge_Graph is

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Denny Vrandečić
Sebastian,

"I don't want to facilitate conspiracy theories, but ..."
"[I am] interested in what is the truth behind the truth"

I am sorry, I truly am, but this *is* the language I know from conspiracy
theorists. And given that, I cannot imagine that there is anything I can
say that could convince you otherwise. Therefore there is no real point for
me in engaging with this conversation on these terms, I cannot see how it
would turn constructive.

The answers to many of your questions are public and on the record. Others
tried to point you to them (thanks), but you dismiss them as not fitting
your narrative.

So here's a suggestion, which I think might be much more constructive and
forward-looking:

I have been working on a comparison of DBpedia, Wikidata, and Freebase (and
since you've read my thesis, you know that's a thing I know a bit about).
Simple evaluation, coverage, correctness, nothing dramatically fancy. But I
am torn about publishing it, because, d'oh, people may (with good reasons)
dismiss it as being biased. And truth be told - the simple fact that I
don't know DBpedia as well as I know Wikidata and Freebase might indeed
have led to errors, mistakes, and stuff I missed in the evaluation. But
you know what would help?

You.

My suggestion is that I publish my current draft, and then you and I work
together on it, publicly, in the open, until we reach a state we both
consider correct enough for publication.

What do you think?

Cheers,
Denny

P.S.: I am travelling the next week, so I may ask for patience


On Fri, Sep 20, 2019 at 8:11 AM Thad Guidry  wrote:

> Thank you for sharing your opinions, Sebastian.
>
> Cheers,
> Thad
> https://www.linkedin.com/in/thadguidry/
>
>
> On Fri, Sep 20, 2019 at 9:43 AM Sebastian Hellmann <
> hellm...@informatik.uni-leipzig.de> wrote:
>
>> Hi Thad,
>> On 20.09.19 15:28, Thad Guidry wrote:
>>
>> With my tech evangelist hat on...
>>
>> Google's philanthropy is nearly boundless when it comes to the promotion
>> of knowledge.  Why? Because indeed it's in their best interest otherwise no
>> one can prosper without knowledge.  They aggregate knowledge for the
>> benefit of mankind, and then make a profit through advertising ... all
>> while making that knowledge extremely easy to be found for the world.
>>
>>
>> I am neither pro-Google or anti-Google per se. Maybe skeptical and
>> interested in what is the truth behind the truth. Google is not synonym to
>> philanthropy. Wikimedia is or at least I think they are doing many things
>> right. Google is a platform, so primarily they "aggregate knowledge for
>> their benefit" while creating enough incentives in form of accessibility
>> for users to add the user's knowledge to theirs. It is not about what
>> Google offers, but what it takes in return. 20% of employees time is also
>> an investment in the skill of the employee, a Google asset called Human
>> Capital and also leads to me and Denny from Google discussing whether
>> https://en.wikipedia.org/wiki/Talk:Knowledge_Graph is content marketing
>> or knowledge (@Denny: no offense, legit arguments, but no agenda to resolve
>> the stalled discussion there). Except I don't have 20% time to straighten
>> the view into what I believe would be neutral, so pushing it becomes a
>> resource issue.
>>
>> I found the other replies much more realistic and the perspective is yet
>> unclear. Maybe Mozilla wasn't so much frenemy with Google and got removed
>> from the browser market for it. I am also thinking about Linked Open Data.
>> Decentralisation is quite weak, individually. I guess spreading all the
>> Wikibases around to super-nodes is helpful unless it prevents the formation
>> of a stronger lobby of philanthropists or competition to BigTech. Wikidata
>> created some pressure on DBpedia as well (also opportunities), but we are
>> fine since we can simply innovate. Others might not withstand. Microsoft
>> seems to favor OpenStreetMaps so I am just asking to which degree Open
>> Source and Open Data is being instrumentalised by BigTech.
>>
>> Hence my question, whether it is compromise or be removed. (Note that
>> states are also platforms, which measure value in GDP and make laws and
>> roads and take VAT on transactions. Sometimes, they even don't remove
>> opposition.)
>>
>> --
>> All the best,
>> Sebastian Hellmann
>>
>> Director of Knowledge Integration and Linked Data Technologies (KILT)
>> Competence Center
>> at the Institute for Applied Informatics (InfAI) at Leipzig University
>> Executive Director of the DBpedia Association
>> Projects: http://dbpedia.org, http://nlp2rdf.org,
>> http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
>> 
>> Homepage: http://aksw.org/SebastianHellmann
>> Research Group: http://aksw.org
>>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Thad Guidry
Thank you for sharing your opinions, Sebastian.

Cheers,
Thad
https://www.linkedin.com/in/thadguidry/


On Fri, Sep 20, 2019 at 9:43 AM Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Hi Thad,
> On 20.09.19 15:28, Thad Guidry wrote:
>
> With my tech evangelist hat on...
>
> Google's philanthropy is nearly boundless when it comes to the promotion
> of knowledge.  Why? Because indeed it's in their best interest otherwise no
> one can prosper without knowledge.  They aggregate knowledge for the
> benefit of mankind, and then make a profit through advertising ... all
> while making that knowledge extremely easy to be found for the world.
>
>
> I am neither pro-Google or anti-Google per se. Maybe skeptical and
> interested in what is the truth behind the truth. Google is not synonym to
> philanthropy. Wikimedia is or at least I think they are doing many things
> right. Google is a platform, so primarily they "aggregate knowledge for
> their benefit" while creating enough incentives in form of accessibility
> for users to add the user's knowledge to theirs. It is not about what
> Google offers, but what it takes in return. 20% of employees time is also
> an investment in the skill of the employee, a Google asset called Human
> Capital and also leads to me and Denny from Google discussing whether
> https://en.wikipedia.org/wiki/Talk:Knowledge_Graph is content marketing
> or knowledge (@Denny: no offense, legit arguments, but no agenda to resolve
> the stalled discussion there). Except I don't have 20% time to straighten
> the view into what I believe would be neutral, so pushing it becomes a
> resource issue.
>
> I found the other replies much more realistic and the perspective is yet
> unclear. Maybe Mozilla wasn't so much frenemy with Google and got removed
> from the browser market for it. I am also thinking about Linked Open Data.
> Decentralisation is quite weak, individually. I guess spreading all the
> Wikibases around to super-nodes is helpful unless it prevents the formation
> of a stronger lobby of philanthropists or competition to BigTech. Wikidata
> created some pressure on DBpedia as well (also opportunities), but we are
> fine since we can simply innovate. Others might not withstand. Microsoft
> seems to favor OpenStreetMaps so I am just asking to which degree Open
> Source and Open Data is being instrumentalised by BigTech.
>
> Hence my question, whether it is compromise or be removed. (Note that
> states are also platforms, which measure value in GDP and make laws and
> roads and take VAT on transactions. Sometimes, they even don't remove
> opposition.)
>
> --
> All the best,
> Sebastian Hellmann
>
> Director of Knowledge Integration and Linked Data Technologies (KILT)
> Competence Center
> at the Institute for Applied Informatics (InfAI) at Leipzig University
> Executive Director of the DBpedia Association
> Projects: http://dbpedia.org, http://nlp2rdf.org,
> http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
> 
> Homepage: http://aksw.org/SebastianHellmann
> Research Group: http://aksw.org
>


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Sebastian Hellmann

Hi Thad,

On 20.09.19 15:28, Thad Guidry wrote:

With my tech evangelist hat on...

Google's philanthropy is nearly boundless when it comes to the 
promotion of knowledge.  Why? Because indeed it's in their best 
interest otherwise no one can prosper without knowledge.  They 
aggregate knowledge for the benefit of mankind, and then make a profit 
through advertising ... all while making that knowledge extremely easy 
to be found for the world.


I am neither pro-Google nor anti-Google per se. Maybe skeptical and 
interested in what is the truth behind the truth. Google is not synonymous 
with philanthropy. Wikimedia is, or at least I think they are doing many 
things right. Google is a platform, so primarily they "aggregate 
knowledge for their benefit" while creating enough incentives in the form of 
accessibility for users to add the user's knowledge to theirs. It is not 
about what Google offers, but what it takes in return. 20% of employees' 
time is also an investment in the skill of the employee, a Google asset 
called Human Capital, and it also leads to me and Denny from Google 
discussing whether https://en.wikipedia.org/wiki/Talk:Knowledge_Graph is 
content marketing or knowledge (@Denny: no offense, legit arguments, but 
no agenda to resolve the stalled discussion there). Except I don't have 
20% time to straighten the view into what I believe would be neutral, so 
pushing it becomes a resource issue.


I found the other replies much more realistic, and the perspective is yet 
unclear. Maybe Mozilla wasn't so much of a frenemy with Google and got 
removed from the browser market for it. I am also thinking about Linked 
Open Data. Decentralisation is quite weak, individually. I guess 
spreading all the Wikibases around to super-nodes is helpful unless it 
prevents the formation of a stronger lobby of philanthropists or 
competition to BigTech. Wikidata created some pressure on DBpedia as 
well (also opportunities), but we are fine since we can simply innovate. 
Others might not withstand it. Microsoft seems to favor OpenStreetMap, so I 
am just asking to which degree Open Source and Open Data are being 
instrumentalised by BigTech.


Hence my question, whether it is compromise or be removed. (Note that 
states are also platforms, which measure value in GDP and make laws and 
roads and take VAT on transactions. Sometimes, they even don't remove 
opposition.)


--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) 
Competence Center

at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, 
http://linguistics.okfn.org, https://www.w3.org/community/ld4lt 


Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org


Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Thad Guidry
With my tech evangelist hat on...

Google's philanthropy is nearly boundless when it comes to the promotion of
knowledge.  Why? Because indeed it's in their best interest otherwise no
one can prosper without knowledge.  They aggregate knowledge for the
benefit of mankind, and then make a profit through advertising ... all
while making that knowledge extremely easy to be found for the world.

Nothing in this world is entirely free (servers must spin, cooling must be
provided, bugs squashed...).
To that end, Google and others understand this and help defray
substantial costs of providing free knowledge in multiple domains
especially in those domains that contribute to tech and human goodwill
(science & medicine).
Sometimes with direct cash donations to WMF, even just this year with $2
million being decided by Google employees to give to WMF!!!


Other times it's with talent from interns they pay for during the summer,
or tech knowledge exchanges to help tackle problems we have.
Still other times it's just their 20% employee time helping the world keep
Open Source libraries up to date or giving the world Open Source tools that
we ourselves use across WMF every minute of the day.
Then there are all the trickle-down benefits (increasing privacy, REALLY?,
yes Really!, reducing security risks, better performance, etc.) from those
Open Source libraries & tools with things like ClusterFuzz, TensorFlow, Go,
Kubernetes, and 1000's of others.
https://opensource.google.com/
https://github.com/google

Thad
https://www.linkedin.com/in/thadguidry/


On Fri, Sep 20, 2019 at 5:25 AM Luca Martinelli 
wrote:

> Hi Sebastian,
>
> I'll try to take on some of your doubts, hopefully helping you to
> solve them, or at least to give you some starting points.
>
> On Fri, 20 Sep 2019 at 10:48, Sebastian Hellmann
> wrote:
> > 1. there was a Knowledge Engine Project which failed, but in principle
> had the right idea:
> https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)
> >
> > This was aimed to "democratize the discovery of media, news and
> information", in particular counter-moving the traffic sink by Google
> providing Wikipedia's information in Google Search.
>
> I don't remember/know much about the Knowledge Engine (KE), but to
> quote Liam Wyatt/User:Wittylama, "the crime wasn't thinking about it,
> it was the cover-up".
>
> In other words, and based on what I remember and know, the Wikipedia
> internal search engine always sucked, and KE was an hypothesis of
> solving this problem. The main problems were:
> 1) an overall sensation - I repeat: SENSATION - that WMF was ready to
> compete with Google on the "search engine market", something that was
> never discussed within and/or with the community;
> 2) that this project was pushed in a very "secretive" way, i.e. it was
> discovered by chance with an announcement of WMF winning a grant from
> [I don't remember which institution, sorry], and the more questions
> were raised about it, the less answers the then-Executive Director
> seemed to be willing to give.
>
> IMHO, having an internal engine that helps people getting what they're
> looking for is a great idea, and the way it was conducted was indeed a
> crime, because (again IMHO) we lost a good opportunity to start our
> work several years in advance. What makes me still angry about it was
> the way the whole thing was conducted: we still lack most pieces of
> the whole thing, and this may fuel non-NPOV reconstructions as well as
> unnecessary spin-off discussions that bring us further away from the
> solution we were trying to achieve.
>
> > Now that there is Wikidata, this is much better for Google because they
> can take the CC-0 data as they wish.
>
> KE and Wikidata are two separate issues. I'm sure Wikidata would have
> played a role in KE, given its important role in linking concepts and
> items, but they're still two separate things.
>
> As for Google picking data from Wikidata, they do the same from
> countless databases (disregarding of their license), so all I can say
> is that, if I were Google, I'd do the very same thing. The difference
> between Google and Wikidata, and the reason why I still think Wikidata
> is better, is that the latter releases its data to *everybody*, while
> the former keeps it only to itself.
>
> And I want to stress that "everybody" part: when we do synchronisation
> with a GLAM database, we give them back an extremely valuable
> feedback, in terms of link to other databases they can freely access,
> as well as in terms of hints for data clean-up - which, again, is
> something that Google doesn't provide at all.
>
> > 3. I was under the impression that Google bought Freebase and then
> started Wikidata 

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Luca Martinelli
Hi Sebastian,

I'll try to take on some of your doubts, hopefully helping you to
solve them, or at least to give you some starting points.

On Fri, 20 Sep 2019 at 10:48, Sebastian Hellmann
wrote:
> 1. there was a Knowledge Engine Project which failed, but in principle had 
> the right idea: 
> https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)
>
> This was aimed to "democratize the discovery of media, news and information", 
> in particular counter-moving the traffic sink by Google providing Wikipedia's 
> information in Google Search.

I don't remember/know much about the Knowledge Engine (KE), but to
quote Liam Wyatt/User:Wittylama, "the crime wasn't thinking about it,
it was the cover-up".

In other words, and based on what I remember and know, the Wikipedia
internal search engine always sucked, and KE was a hypothesis for
solving this problem. The main problems were:
1) an overall sensation - I repeat: SENSATION - that WMF was ready to
compete with Google on the "search engine market", something that was
never discussed within and/or with the community;
2) that this project was pushed in a very "secretive" way, i.e. it was
discovered by chance with an announcement of WMF winning a grant from
[I don't remember which institution, sorry], and the more questions
were raised about it, the fewer answers the then-Executive Director
seemed to be willing to give.

IMHO, having an internal engine that helps people get what they're
looking for is a great idea, and the way it was conducted was indeed a
crime, because (again IMHO) we lost a good opportunity to start our
work several years in advance. What makes me still angry about it was
the way the whole thing was conducted: we still lack most pieces of
the whole thing, and this may fuel non-NPOV reconstructions as well as
unnecessary spin-off discussions that bring us further away from the
solution we were trying to achieve.

> Now that there is Wikidata, this is much better for Google because they can 
> take the CC-0 data as they wish.

KE and Wikidata are two separate issues. I'm sure Wikidata would have
played a role in KE, given its important role in linking concepts and
items, but they're still two separate things.

As for Google picking data from Wikidata, they do the same from
countless databases (regardless of their license), so all I can say
is that, if I were Google, I'd do the very same thing. The difference
between Google and Wikidata, and the reason why I still think Wikidata
is better, is that the latter releases its data to *everybody*, while
the former keeps it only to itself.

And I want to stress that "everybody" part: when we do synchronisation
with a GLAM database, we give them back extremely valuable
feedback, in terms of links to other databases they can freely access,
as well as in terms of hints for data clean-up - which, again, is
something that Google doesn't provide at all.

> 3. I was under the impression that Google bought Freebase and then started 
> Wikidata as a non-threatening model to the data they have in their Knowledge 
> Graph
>
> Could someone give me some pointers about the financial connections of Google 
> and Wikimedia (this should be transparent, right?) and also who pushed the 
> Wikidata movement into life in 2012?

Wikidata started as an independent project by some of the people who
worked on Semantic MediaWiki (there are so many of them I fear I might
miss some of them, and that would be embarrassing for me), not as a
Google project.

It was originally financed *also* by Google, yes, but it was a small
part compared to the aid from other institutions, such as the Allen
Institute for Artificial Intelligence, the Gordon and Betty Moore
Foundation, the Wikimedia Foundation itself, and others.

> Google was also mentioned in 
> https://blog.wikimedia.org/2017/10/30/wikidata-fifth-birthday/ but while it 
> reads "Freebase, was discontinued because of the superiority of Wikidata’s 
> approach and active community." I know the story as: Google didn't want its 
> competitors to have the data and the service. Not much of Freebase did end up 
> in Wikidata.

I remember the story as "Google couldn't make any more money out of
Freebase, which was also being superseded by other internal systems
*and* Wikidata, so Denny pushed Google to donate Freebase's triples to
Wikidata".

This is basically the same thing (well, with due proportions) that happened
with OpenRefine, which was originally called Google Refine and
was discontinued because Google couldn't make any profit with it, and
which is now one of the most valuable tools that we can use to clean up and
reconcile data with Wikidata.

As for the integration of the data, I don't have any precise data
about it, but I'm sure that a fair part of Freebase did end up in
Wikidata, just as much as many other big databases did.

> As I said, I don't want to push any opinions in any directions. I am more 
> asking for more information about the 

Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Federico Leva (Nemo)

Sebastian Hellmann, 20/09/19 11:22:
> Maybe somebody could enlighten me about the overall strategy and 
> connections here.


You can add more links to grants and other Wikimedia pages on 
.


Google and the Wikimedia movement are on opposite sides for most things, 
but occasionally some of their employees (or algorithms!) happen to be 
interested in the same things as us, so we end up doing things together 
and a few breadcrumbs travel towards WMF. What matters to me is that 
they don't abuse our brands.


Sadly WMF is not always careful about communication, for instance 
 still has an appalling 
sentence "Working with partners like Google" right under the heading 
"Partner for change".


Federico



Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Nicolas VIGNERON
Hi,

You can already find some information here:
https://en.wikipedia.org/wiki/Wikidata#Development_history (including
finance details if you follow the sources).

For the "How intertwined is Google" question, it's a long and complex story;
it goes back at least to 2005 (Wikipedia probably wouldn't exist today - or
only in a drastically different way - if the search engine hadn't favoured
Wikipedia since then).
As a non-answer, I would say that Wikidata is as intertwined with Google as
any major website is intertwined with Google.

Regards, ~nicolas

On Fri, 20 Sep 2019 at 10:48, Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Dear all,
>
> personally I am quite happy that Denny can contribute more to Wikidata and
> Wikipedia. No personal criticism there, I read his thesis and I am
> impressed by his work and contributions.
>
> I don't want to facilitate any conspiracy theories here, but I am
> wondering about where Wikidata is going, especially with respect to Google.
>
> Note that Chrome/Chromium being Open Source with a twist has already
> pushed Firefox from the market, but now there is this controversy about
> what is being tracked server side by Google Analytics and Client side by
> cookies and also the current discussion about Ad Blocker removal from
> Chrome:
> https://www.wired.com/story/google-chrome-ad-blockers-extensions-api/
>
> Maybe somebody could enlighten me about the overall strategy and
> connections here.
>
> 1. there was a Knowledge Engine Project which failed, but in principle had
> the right idea:
> https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)
>
> This was aimed to "democratize the discovery of media, news and
> information", in particular counter-moving the traffic sink by Google
> providing Wikipedia's information in Google Search. Now that there is
> Wikidata, this is much better for Google because they can take the CC-0
> data as they wish.
>
> 2. there are some very widely used terms like "Knowledge Graph", which
> seems to be blocked by Google: https://www.wikidata.org/wiki/Q648625 and
> https://en.wikipedia.org/wiki/Knowledge_Graph without a neutral point of
> view like the German WP adopted:
> https://de.wikipedia.org/wiki/Google#Knowledge_Graph
>
> 3. I was under the impression that Google bought Freebase and then started
> Wikidata as a non-threatening model to the data they have in their
> Knowledge Graph
>
> Could someone give me some pointers about the financial connections of
> Google and Wikimedia (this should be transparent, right?) and also who
> pushed the Wikidata movement into life in 2012?
>
> Google was also mentioned in
> https://blog.wikimedia.org/2017/10/30/wikidata-fifth-birthday/ but while
> it reads "Freebase, was
> discontinued because of the superiority of Wikidata’s approach and active
> community." I know the story as: Google didn't want its competitors to have
> the data and the service. Not much of Freebase did end up in Wikidata.
>
> As I said, I don't want to push any opinions in any directions. I am more
> asking for more information about the connection of Google to Wikidata
> (financially), then Google to WMF and also I am asking about any strategic
> advantages for Google in relation to their competition.
>
> Please don't answer with "How great Wikidata is", I already know that and
> this is also not in the scope of my "How intertwined is Google with
> Wikidata / WMF?" question. Can't mention this enough: also not against
> Denny.
> It is a request for better information as I can't seem to find clear
> answers here.
>
> --
> All the best,
> Sebastian Hellmann
>
> Director of Knowledge Integration and Linked Data Technologies (KILT)
> Competence Center
> at the Institute for Applied Informatics (InfAI) at Leipzig University
> Executive Director of the DBpedia Association
> Projects: http://dbpedia.org, http://nlp2rdf.org,
> http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
> 
> Homepage: http://aksw.org/SebastianHellmann
> Research Group: http://aksw.org


[Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-20 Thread Sebastian Hellmann

Dear all,

personally I am quite happy that Denny can contribute more to Wikidata 
and Wikipedia. No personal criticism there, I read his thesis and I am 
impressed by his work and contributions.


I don't want to facilitate any conspiracy theories here, but I am 
wondering about where Wikidata is going, especially with respect to Google.


Note that Chrome/Chromium being Open Source with a twist has already 
pushed Firefox from the market, but now there is this controversy about 
what is being tracked server side by Google Analytics and Client side by 
cookies and also the current discussion about Ad Blocker removal from 
Chrome: 
https://www.wired.com/story/google-chrome-ad-blockers-extensions-api/


Maybe somebody could enlighten me about the overall strategy and 
connections here.


1. there was a Knowledge Engine Project which failed, but in principle 
had the right idea: 
https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)


This was aimed to "democratize the discovery of media, news and 
information", in particular counter-moving the traffic sink by Google 
providing Wikipedia's information in Google Search. Now that there is 
Wikidata, this is much better for Google because they can take the CC-0 
data as they wish.


2. there are some very widely used terms like "Knowledge Graph", which 
seems to be blocked by Google: https://www.wikidata.org/wiki/Q648625 and 
https://en.wikipedia.org/wiki/Knowledge_Graph without a neutral point of 
view like the German WP adopted: 
https://de.wikipedia.org/wiki/Google#Knowledge_Graph


3. I was under the impression that Google bought Freebase and then 
started Wikidata as a non-threatening model to the data they have in 
their Knowledge Graph


Could someone give me some pointers about the financial connections of 
Google and Wikimedia (this should be transparent, right?) and also who 
pushed the Wikidata movement into life in 2012?


Google was also mentioned in 
https://blog.wikimedia.org/2017/10/30/wikidata-fifth-birthday/ but while 
it reads "Freebase, was 
discontinued because of the superiority of Wikidata’s approach and 
active community." I know the story as: Google didn't want its 
competitors to have the data and the service. Not much of Freebase did 
end up in Wikidata.


As I said, I don't want to push any opinions in any directions. I am 
more asking for more information about the connection of Google to 
Wikidata (financially), then Google to WMF and also I am asking about 
any strategic advantages for Google in relation to their competition.


Please don't answer with "How great Wikidata is", I already know that 
and this is also not in the scope of my "How intertwined is Google with 
Wikidata / WMF?" question. Can't mention this enough: also not against 
Denny.


It is a request for better information as I can't seem to find clear 
answers here.


--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) 
Competence Center

at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, 
http://linguistics.okfn.org, https://www.w3.org/community/ld4lt 


Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org