[Wikidata] New Wikidata 2024 Snapshot deployment using Virtuoso

2024-05-10 Thread Kingsley Idehen via Wikidata

All,

We are pleased to announce a new edition of the Wikidata Knowledge Graph 
that we host. This latest release includes entity relationships 
represented as RDF triples, covering Wikidata updates up to April 2024. 
It is deployed via an instance of Virtuoso Open Source (VOS) Edition.


Live Instance Configuration.


Item   | Value
CPU    | 2x Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Cores  | 24
Memory | 378 GB
SSD    | 4x Crucial M4 SSD 500 GB


Graph Stats.

Graph                            | Description                  | Triple Count
http://www.wikidata.org/         | the Wikidata dataset         | 19,524,435,178
http://www.wikidata.org/lexemes/ | the Wikidata Lexemes dataset | 169,677,210
urn:wikidata:labels              | calculated Wikidata labels   | 1,454,421

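For example, queries can be scoped to any of the named graphs above using a 
GRAPH clause. A minimal sketch against the Lexemes graph (graph IRI taken 
from the table above), runnable at the SPARQL endpoint linked below:

# Return a small sample of triples from the Wikidata Lexemes graph
SELECT ?lexeme ?p ?o
WHERE {
  GRAPH <http://www.wikidata.org/lexemes/> {
    ?lexeme ?p ?o .
  }
}
LIMIT 10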

Links:

1. 
https://community.openlinksw.com/t/loading-the-wikidata-2024-04-dataset-into-virtuoso-open-source/4413 
-- Announcement Page


2. https://wikidata.demo.openlinksw.com/sparql/ -- SPARQL Endpoint

3. https://wikidata.demo.openlinksw.com/fct/ -- Faceted Search & 
Browsing Service Endpoint



--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/GDH5GDL7NSC5KPZOSH6XD2CW7TLF243C/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Confusing Blog Post

2023-04-14 Thread Kingsley Idehen via Wikidata

All,

I stumbled upon a Wikidata Scalability blog post by Magnus Manske (cc'd 
on this message) [1] that contains the following excerpt:


"One of theproposed alternatives 
<https://commons.wikimedia.org/wiki/File:WDQS_Backend_Alternatives_working_paper.pdf>to 
blazegraph is Virtuoso. It models the graph structure of Wikidata in 
MySQL. Even before I read about that, I had a similar thought. But 
unlike Virtuoso, which is a general solution, Wikidata is a special 
case, with optimization potential. So I went to do a bit of tinkering. 
The following was done in the last few days, so it’s not exactly pretty, 
but initial results are promising."


That's incorrect. Virtuoso is a multi-model DBMS with its own SQL and 
RDF engines that have nothing to do with MySQL.


Magnus:

Uniprot hosts 100 billion+ triples, accessed 24/7 globally, handling 
unpredictable query complexity and unpredictable solution sizes, using a 
single-server instance of the open source edition of Virtuoso. The same 
applies to the vast majority of nodes in the massive LOD Cloud.


Scalability has never been a Virtuoso issue since the very inception of 
the LOD Cloud bootstrapped by DBpedia (yet another Virtuoso instance).


Links:

[1] http://magnusmanske.de/wordpress/?p=691

--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this


___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/L3NI6FI6JILKYESJXIBDJ6WIZ3LLY4G4/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] New Virtuoso-hosted Wikidata Instance from the AWS Cloud

2023-03-09 Thread Kingsley Idehen via Wikidata

All,

We are pleased to announce the immediate availability of a new Wikidata 
Snapshot from the Amazon Web Services (AWS) cloud [1][2].


This new release comprises a Virtuoso 08.03.3327 instance preloaded with 
the Wikidata Knowledge Graph constructed from the December 2022 dataset 
dump, and mirroring what's also served to the public via our Wikidata 
Query Services Endpoint [3].


Who is this for?

1. Anyone seeking a preloaded and pre-configured Wikidata instance for 
personal or service-specific use


2. Anyone seeking SQL and GraphQL access to Wikidata (see the SQL-side 
sketch after this list)

3. Anyone seeking ODBC or JDBC access to Wikidata via client tools and 
development environments that support those data access protocols

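Regarding item 2 above: in a Virtuoso iSQL session (or via ODBC/JDBC), a 
SPARQL query can be issued from the SQL side by prefixing it with the SPARQL 
keyword. A minimal sketch, assuming a connected session to the instance (the 
query itself is illustrative only):

-- count instances of human (wd:Q5), issued from iSQL or an ODBC/JDBC client
SPARQL
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd:  <http://www.wikidata.org/entity/>
SELECT (COUNT(*) AS ?humans)
WHERE { ?s wdt:P31 wd:Q5 . } ;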

Links:

[1] 
https://community.openlinksw.com/t/wikidata-snapshot-virtuoso-pago-ebs-backed-ec2-ami/3243 
-- detailed release information
[2] https://aws.amazon.com/marketplace/pp/prodview-fi6y6lnzvs6vc -- AWS 
Marketplace Page
[3] https://wikidata.demo.openlinksw.com/sparql -- SPARQL Endpoint 
deployed using Virtuoso Open Source Edition


--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/ROAGHBYMKMKXR375TQK6RIWKBVH2XKBD/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: Inconsistencies on WDQS data - data reload on WDQS

2023-02-27 Thread Kingsley Idehen via Wikidata


On 2/27/23 10:15 AM, Guillaume Lederrey wrote:
On Fri, 24 Feb 2023 at 19:31, Kingsley Idehen via Wikidata wrote:



On 2/24/23 5:59 AM, Guillaume Lederrey wrote:

On Thu, 23 Feb 2023 at 22:56, Kingsley Idehen wrote:


On 2/23/23 3:09 PM, Guillaume Lederrey wrote:

On Thu, 23 Feb 2023 at 16:39, Kingsley Idehen wrote:


On 2/22/23 3:28 AM, Guillaume Lederrey wrote:

On Wed, 22 Feb 2023 at 00:03, Kingsley Idehen via Wikidata wrote:


On 2/21/23 4:05 PM, Guillaume Lederrey wrote:
> Hello all!
>
> TL;DR: We expect to successfully complete the
recent data reload on
> Wikidata Query Service soon, but we've
encountered multiple failures
> related to the size of the graph, and anticipate
that this issue may
> worsen in the future. Although we succeeded this
time, we cannot
> guarantee that future reload attempts will be
successful given the
> current trend of the data reload process. Thank
you for your
> understanding and patience..
>
> Longer version:
>
> WDQS is updated from a stream of recent changes
on Wikidata, with a
> maximum delay of ~2 minutes. This process was
improved as part of the
> WDQS Streaming Updater project to ensure data
coherence[1] . However,
> the update process is still imperfect and can
lead to data
> inconsistencies in some cases[2][3]. To address
this, we reload the
> data from dumps a few times per year to
reinitialize the system from a
> known good state.
>
> The recent reload of data from dumps started in
mid-December and was
> initially met with some issues related to
download and instabilities
> in Blazegraph, the database used by WDQS[4].
Loading the data into
> Blazegraph takes a couple of weeks due to the
size of the graph, and
> we had multiple attempts where the reload failed
after >90% of the
> data had been loaded. Our understanding of the
issue is that a "race
> condition" in Blazegraph[5], where subtle timing
changes lead to
> corruption of the journal in some rare cases, is
to blame.[6]
>
> We want to reassure you that the last reload job
was successful on one
> of our servers. The data still needs to be copied
over to all of the
> WDQS servers, which will take a couple of weeks,
but should not bring
> any additional issues. However, reloading the
full data from dumps is
> becoming more complex as the data size grows, and
we wanted to let you
> know why the process took longer than expected.
We understand that
> data inconsistencies can be problematic, and we
appreciate your
> patience and understanding while we work to
ensure the quality and
> consistency of the data on WDQS.
>
> Thank you for your continued support and
understanding!
>
>
>     Guillaume
>
>
> [1] https://phabricator.wikimedia.org/T244590
> [2] https://phabricator.wikimedia.org/T323239
> [3] https://phabricator.wikimedia.org/T322869
> [4] https://phabricator.wikimedia.org/T323096
> [5]
https://en.wikipedia.org/wiki/Race_condition#In_software
> [6] https://phabricator.wikimedia.org/T263110
>
Hi Guillaume,

Are there plans to decouple WDQS from the back-end
database? Doing that
provides more resilient architecture for Wikidata
as a whole since you
will be able to swap and interchange
SPARQL-compliant backends.


It depends what you mean by decoupling. The coupling
points as I see them are:

* update process
   

[Wikidata] Re: Inconsistencies on WDQS data - data reload on WDQS

2023-02-24 Thread Kingsley Idehen via Wikidata


On 2/24/23 2:25 PM, Samuel Klein wrote:

This is an important topic.  Let's migrate off of Blazegraph.

No, really: what's the status of WDQS backend updates 
<https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update>, 
like risk projections and timelines for migration? [1 
<https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/thread/B4GTI6TDEKS7Q2OMJR26XWLFYMUXSR6F/#YORJB4AYYSUFSYM7H3VTZSOZBC4GTEOZ>]



Guillaume Lederrey  wrote:

2. Near real-time data streams usable by 3rd Party Wikidata hosts

 Could you please open a Phabricator task to document what you
would like to see exposed and why it would be useful?


I started a ticket: https://phabricator.wikimedia.org/T330521
Anyone interested, please edit as needed.



Hi Samuel,


Thanks for opening up that ticket!


Kingsley



___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/ACUCRGVTCK6GMLU4GEL7UFCOLRAXFJIJ/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org



--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this


___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/GLCG7XNOTKPHWG2PCJ4HDKZGWOK3JKMV/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: Inconsistencies on WDQS data - data reload on WDQS

2023-02-24 Thread Kingsley Idehen via Wikidata


On 2/24/23 5:59 AM, Guillaume Lederrey wrote:
On Thu, 23 Feb 2023 at 22:56, Kingsley Idehen wrote:



On 2/23/23 3:09 PM, Guillaume Lederrey wrote:

On Thu, 23 Feb 2023 at 16:39, Kingsley Idehen wrote:


On 2/22/23 3:28 AM, Guillaume Lederrey wrote:

On Wed, 22 Feb 2023 at 00:03, Kingsley Idehen via Wikidata wrote:


On 2/21/23 4:05 PM, Guillaume Lederrey wrote:
> Hello all!
>
> TL;DR: We expect to successfully complete the recent
data reload on
> Wikidata Query Service soon, but we've encountered
multiple failures
> related to the size of the graph, and anticipate that
this issue may
> worsen in the future. Although we succeeded this time,
we cannot
> guarantee that future reload attempts will be
successful given the
> current trend of the data reload process. Thank you
for your
> understanding and patience..
>
> Longer version:
>
> WDQS is updated from a stream of recent changes on
Wikidata, with a
> maximum delay of ~2 minutes. This process was improved
as part of the
> WDQS Streaming Updater project to ensure data
coherence[1] . However,
> the update process is still imperfect and can lead to
data
> inconsistencies in some cases[2][3]. To address this,
we reload the
> data from dumps a few times per year to reinitialize
the system from a
> known good state.
>
> The recent reload of data from dumps started in
mid-December and was
> initially met with some issues related to download and
instabilities
> in Blazegraph, the database used by WDQS[4]. Loading
the data into
> Blazegraph takes a couple of weeks due to the size of
the graph, and
> we had multiple attempts where the reload failed after
>90% of the
> data had been loaded. Our understanding of the issue
is that a "race
> condition" in Blazegraph[5], where subtle timing
changes lead to
> corruption of the journal in some rare cases, is to
blame.[6]
>
> We want to reassure you that the last reload job was
successful on one
> of our servers. The data still needs to be copied over
to all of the
> WDQS servers, which will take a couple of weeks, but
should not bring
> any additional issues. However, reloading the full
data from dumps is
> becoming more complex as the data size grows, and we
wanted to let you
> know why the process took longer than expected. We
understand that
> data inconsistencies can be problematic, and we
appreciate your
> patience and understanding while we work to ensure the
quality and
> consistency of the data on WDQS.
>
> Thank you for your continued support and understanding!
>
>
>     Guillaume
>
>
> [1] https://phabricator.wikimedia.org/T244590
> [2] https://phabricator.wikimedia.org/T323239
> [3] https://phabricator.wikimedia.org/T322869
> [4] https://phabricator.wikimedia.org/T323096
> [5]
https://en.wikipedia.org/wiki/Race_condition#In_software
> [6] https://phabricator.wikimedia.org/T263110
>
Hi Guillaume,

Are there plans to decouple WDQS from the back-end
database? Doing that
provides more resilient architecture for Wikidata as a
whole since you
will be able to swap and interchange SPARQL-compliant
backends.


It depends what you mean by decoupling. The coupling points
as I see them are:

* update process
* UI
* exposed SPARQL endpoint

The update process is mostly decoupled from the backend. It
is producing a stream of RDF updates that is backend
independent, with a very thin Blazegraph-specific adapter to
load the data into Blazegraph.



Does that mean that we could integrate the RDF stream into
our setup re keeping our Wikidata instance up to date, for
instance?

That data stream isn't exposed publicly. There are a few tricky
parts about the stream needing to be synchronized with a specific
Wikidata dump that make it not entirely trivial to reuse outside
of our internal use case.

[Wikidata] Re: Inconsistencies on WDQS data - data reload on WDQS

2023-02-24 Thread Kingsley Idehen via Wikidata


On 2/23/23 4:17 PM, James Heald wrote:

On 23/02/2023 20:08, Kingsley Idehen via Wikidata wrote:


On 2/23/23 12:19 PM, James Heald wrote:


I have to say I am a bit concerned by this talk, since some of 
Blazegraph's "features and quirks" can be exceedingly useful.



That isn't justification for tightly-coupling a Query Tool to a Query 
Service Endpoint, especially when an open standard (in the form of 
SPARQL) exists.




Of course it's a good thing to be able to swap out the back-end and to 
be able to run essentially the same queries against other realisations 
of the database.


It's also a good thing to be able to clone the user interface and use 
essentially the same UI with a different back-end.  (As I understand 
it, this should be very possible).



Good to hear, since that's my fundamental point re loosely-coupled 
architecture enabled by open standards.







But. There are features which have been listed in the desiderata for 
WDQS from the very start, that go beyond what the out-of-the-box 
SPARQL 1.1 standard offers.



Therein lies the problem. A standards-based client can include 
extensions for a specific back-end in configurable form, based on 
loose-coupling principles. Doing it otherwise produces what's generally 
known as a leaky abstraction, which ultimately racks up technical debt.


An example of technical debt that's manifesting right now is an 
inability to diffuse the costs of the Wikidata Knowledge Graph across a 
federation of SPARQL query service providers. This doesn't have to be 
the case at all, bearing in mind the nature of SPARQL and structured 
data represented using RDF.

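To make the federation point concrete, a minimal sketch using the standard 
SPARQL 1.1 SERVICE clause to combine answers from two independently operated 
endpoints (both referenced in this thread); wd:Q42 is used purely for 
illustration:

PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?wdqsLabel ?virtuosoLabel
WHERE {
  SERVICE <https://query.wikidata.org/sparql> {
    wd:Q42 rdfs:label ?wdqsLabel .
    FILTER(LANG(?wdqsLabel) = "en")
  }
  SERVICE <https://wikidata.demo.openlinksw.com/sparql> {
    wd:Q42 rdfs:label ?virtuosoLabel .
    FILTER(LANG(?virtuosoLabel) = "en")
  }
}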




Most notable among these is the ability to retrieve items with 
coordinates close to a particular point on the earth's surface. 
(Something which, as the Blazegraph developers discovered, can be 
implemented fairly easily if you add a "Z-order curve" index on 
coordinate values  https://en.wikipedia.org/wiki/Z-order_curve ).



None of that would be lost in a WDQS instance configured to discover the 
SPARQL query endpoint and associated capabilities.





Not all users will have an interest in geographical objects. Those who 
don't will lose little if they hook up a back-end that doesn't provide 
this, because presumably they won't be running queries which require 
it. But those who do need this functionality need this indexing.



See my comment above.




Given that this was something the Blazegraph developers (all 3 of 
them) found they could add relatively easily; and given that it seems 
to me that any database back-end would gain considerable cachet by 
being able to run wikidata queries, it seems to me not unreasonable to 
approach potential alternative back-ends and see how easily they too 
might be able to add a Z-order curve index for coordinate values, plus 
basic functionality to make use of it. (Where wikibase:box and 
wikibase:around are about as basic as it gets).

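For reference, a wikibase:around query in standard WDQS syntax -- a minimal 
sketch, with Berlin (wd:Q64) and a 10 km radius chosen purely for 
illustration:

PREFIX wd:       <http://www.wikidata.org/entity/>
PREFIX wdt:      <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd:       <http://www.bigdata.com/rdf#>

SELECT ?place ?location
WHERE {
  wd:Q64 wdt:P625 ?center .                  # Berlin's coordinate location
  SERVICE wikibase:around {
    ?place wdt:P625 ?location .              # items with coordinates
    bd:serviceParam wikibase:center ?center .
    bd:serviceParam wikibase:radius "10" .   # radius in kilometres
  }
}
LIMIT 20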

Andrea suggested a more GeoSPARQL-orientated solution ( 
https://wikitech.wikimedia.org/wiki/User:AndreaWest/Blazegraph_Features_and_Capabilities#SPARQL_Functional_Extensions 
), but that seems to me a much much bigger ask; I do suspect that (for 
almost all contending projects) the simple wikibase:box and 
wikibase:around services would be a lot more easily implemented, to 
free us from our tight-coupling to Blazegraph, yet still provide this 
functionality, which I do believe is a needed requirement.




As for named subqueries, as well as making queries much more readable, 
IMO they may be particularly valuable as a way to specify particular 
optimisations (ie sequencing of query execution, that may be 
absolutely *crucial* if a query is to run) in a particularly readable 
and **portable** way -- certainly when compared to optimiser "hint" 
syntaxes, that may be tied *very* specifically to a particular back-end.


Why do I think named subqueries are so portable, if they are not part 
of the SPARQL 1.1 standard, and most providers don't support them ?


The answer is because if necessary it would require only a fairly 
simple pre-processor script to turn them into inline sub-queries, 
which *are* supported by the standard.

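To illustrate that rewrite, a minimal sketch: the commented form uses 
Blazegraph's WITH ... AS %name / INCLUDE syntax for named sub-queries, and 
the live query below it is the standard SPARQL 1.1 inline equivalent such a 
pre-processor would produce:

# Blazegraph named-subquery form (non-standard):
#   SELECT ?human ?label
#   WITH {
#     SELECT ?human WHERE { ?human wdt:P31 wd:Q5 } LIMIT 100
#   } AS %humans
#   WHERE {
#     INCLUDE %humans .
#     ?human rdfs:label ?label . FILTER(LANG(?label) = "en")
#   }

# Standard SPARQL 1.1 inline sub-query:
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?human ?label
WHERE {
  { SELECT ?human WHERE { ?human wdt:P31 wd:Q5 } LIMIT 100 }
  ?human rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}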

Named sub-queries have the advantage, though, of making the query a 
lot more readable; and they can be useful to indicate to the back-end that 
the sub-query need only be retrieved once, rather than repeatedly each 
time it is referenced (which may be helpful for some back-ends).



These implementation details aren't really relevant to the fundamental 
point I am trying to make about the virtues of loosely-coupled 
architecture facilitated by existing open standards (e.g., SPARQL).







So: I don't disagree that it would be useful if WDQS was less tightly 
dependent on Blazegraph.


But: rather than going straight to removing good features, I think 
there is a lot of scope for seeing whether the dev teams for other 
back-ends could be persuaded to match the features on tho

[Wikidata] Re: Inconsistencies on WDQS data - data reload on WDQS

2023-02-23 Thread Kingsley Idehen via Wikidata


On 2/23/23 3:09 PM, Guillaume Lederrey wrote:
On Thu, 23 Feb 2023 at 16:39, Kingsley Idehen wrote:



On 2/22/23 3:28 AM, Guillaume Lederrey wrote:

On Wed, 22 Feb 2023 at 00:03, Kingsley Idehen via Wikidata wrote:


On 2/21/23 4:05 PM, Guillaume Lederrey wrote:
> Hello all!
>
> TL;DR: We expect to successfully complete the recent data
reload on
> Wikidata Query Service soon, but we've encountered multiple
failures
> related to the size of the graph, and anticipate that this
issue may
> worsen in the future. Although we succeeded this time, we
cannot
> guarantee that future reload attempts will be successful
given the
> current trend of the data reload process. Thank you for your
> understanding and patience..
>
> Longer version:
>
> WDQS is updated from a stream of recent changes on
Wikidata, with a
> maximum delay of ~2 minutes. This process was improved as
part of the
> WDQS Streaming Updater project to ensure data coherence[1]
. However,
> the update process is still imperfect and can lead to data
> inconsistencies in some cases[2][3]. To address this, we
reload the
> data from dumps a few times per year to reinitialize the
system from a
> known good state.
>
> The recent reload of data from dumps started in
mid-December and was
> initially met with some issues related to download and
instabilities
> in Blazegraph, the database used by WDQS[4]. Loading the
data into
> Blazegraph takes a couple of weeks due to the size of the
graph, and
> we had multiple attempts where the reload failed after >90%
of the
> data had been loaded. Our understanding of the issue is
that a "race
> condition" in Blazegraph[5], where subtle timing changes
lead to
> corruption of the journal in some rare cases, is to blame.[6]
>
> We want to reassure you that the last reload job was
successful on one
> of our servers. The data still needs to be copied over to
all of the
> WDQS servers, which will take a couple of weeks, but should
not bring
> any additional issues. However, reloading the full data
from dumps is
> becoming more complex as the data size grows, and we wanted
to let you
> know why the process took longer than expected. We
understand that
> data inconsistencies can be problematic, and we appreciate
your
> patience and understanding while we work to ensure the
quality and
> consistency of the data on WDQS.
>
> Thank you for your continued support and understanding!
>
>
>     Guillaume
>
>
> [1] https://phabricator.wikimedia.org/T244590
> [2] https://phabricator.wikimedia.org/T323239
> [3] https://phabricator.wikimedia.org/T322869
> [4] https://phabricator.wikimedia.org/T323096
> [5] https://en.wikipedia.org/wiki/Race_condition#In_software
> [6] https://phabricator.wikimedia.org/T263110
>
Hi Guillaume,

Are there plans to decouple WDQS from the back-end database?
Doing that
provides more resilient architecture for Wikidata as a whole
since you
will be able to swap and interchange SPARQL-compliant backends.


It depends what you mean by decoupling. The coupling points as I
see them are:

* update process
* UI
* exposed SPARQL endpoint

The update process is mostly decoupled from the backend. It is
producing a stream of RDF updates that is backend independent,
with a very thin Blazegraph-specific adapter to load the data
into Blazegraph.



Does that mean that we could integrate the RDF stream into our
setup re keeping our Wikidata instance up to date, for instance?

That data stream isn't exposed publicly. There are a few tricky parts 
about the stream needing to be synchronized with a specific Wikidata 
dump that make it not entirely trivial to reuse outside of our 
internal use case. But if there is enough interest, we could 
potentially work on making that stream public.



I suspect there's broad interest in this matter since it contributes to 
the overarching issue of loose-coupling re Wikidata's underlying 
infrastructure.


For starters, offering a public stream would be very useful to 3rd party 
Wikidata hosts.

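As a purely hypothetical sketch of what consuming such a stream could look 
like, a third-party host might apply each change event to its copy as a 
SPARQL 1.1 Update (the entity and label values here are illustrative, not 
actual stream payloads):

PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# remove the superseded triple carried by the change event ...
DELETE DATA {
  GRAPH <http://www.wikidata.org/> {
    wd:Q42 rdfs:label "Old label"@en .
  }
};
# ... then assert its replacement
INSERT DATA {
  GRAPH <http://www.wikidata.org/> {
    wd:Q42 rdfs:label "New label"@en .
  }
}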




The UI is mostly backend independent. It relies on Search for
some features. And of course, the queries themselves might depend
on Blazegraph-specific features.

[Wikidata] Re: Inconsistencies on WDQS data - data reload on WDQS

2023-02-23 Thread Kingsley Idehen via Wikidata


On 2/22/23 3:28 AM, Guillaume Lederrey wrote:
On Wed, 22 Feb 2023 at 00:03, Kingsley Idehen via Wikidata wrote:



On 2/21/23 4:05 PM, Guillaume Lederrey wrote:
> Hello all!
>
> TL;DR: We expect to successfully complete the recent data reload on
> Wikidata Query Service soon, but we've encountered multiple
failures
> related to the size of the graph, and anticipate that this issue
may
> worsen in the future. Although we succeeded this time, we cannot
> guarantee that future reload attempts will be successful given the
> current trend of the data reload process. Thank you for your
> understanding and patience..
>
> Longer version:
>
> WDQS is updated from a stream of recent changes on Wikidata, with a
> maximum delay of ~2 minutes. This process was improved as part
of the
> WDQS Streaming Updater project to ensure data coherence[1] .
However,
> the update process is still imperfect and can lead to data
> inconsistencies in some cases[2][3]. To address this, we reload the
> data from dumps a few times per year to reinitialize the system
from a
> known good state.
>
> The recent reload of data from dumps started in mid-December and
was
> initially met with some issues related to download and
instabilities
> in Blazegraph, the database used by WDQS[4]. Loading the data into
> Blazegraph takes a couple of weeks due to the size of the graph,
and
> we had multiple attempts where the reload failed after >90% of the
> data had been loaded. Our understanding of the issue is that a
"race
> condition" in Blazegraph[5], where subtle timing changes lead to
> corruption of the journal in some rare cases, is to blame.[6]
>
> We want to reassure you that the last reload job was successful
on one
> of our servers. The data still needs to be copied over to all of
the
> WDQS servers, which will take a couple of weeks, but should not
bring
> any additional issues. However, reloading the full data from
dumps is
> becoming more complex as the data size grows, and we wanted to
let you
> know why the process took longer than expected. We understand that
> data inconsistencies can be problematic, and we appreciate your
> patience and understanding while we work to ensure the quality and
> consistency of the data on WDQS.
>
> Thank you for your continued support and understanding!
>
>
>     Guillaume
>
>
> [1] https://phabricator.wikimedia.org/T244590
> [2] https://phabricator.wikimedia.org/T323239
> [3] https://phabricator.wikimedia.org/T322869
> [4] https://phabricator.wikimedia.org/T323096
> [5] https://en.wikipedia.org/wiki/Race_condition#In_software
> [6] https://phabricator.wikimedia.org/T263110
>
Hi Guillaume,

Are there plans to decouple WDQS from the back-end database? Doing
that
provides more resilient architecture for Wikidata as a whole since
you
will be able to swap and interchange SPARQL-compliant backends.


It depends what you mean by decoupling. The coupling points as I see 
them are:


* update process
* UI
* exposed SPARQL endpoint

The update process is mostly decoupled from the backend. It is 
producing a stream of RDF updates that is backend independent, with a 
very thin Blazegraph-specific adapter to load the data into Blazegraph.



Does that mean that we could integrate the RDF stream into our setup re 
keeping our Wikidata instance up to date, for instance?





The UI is mostly backend independent. It relies on Search for some 
features. And of course, the queries themselves might depend on 
Blazegraph specific features.



Can WDQS, based on what's stated above, work with a generic SPARQL 
back-end like Virtuoso, for instance? By that I mean dispatching SPARQL 
queries entered by a user (without alteration) to the server for processing?





The exposed SPARQL endpoint is at the moment a direct exposition of 
the Blazegraph endpoint, so it does expose all the Blazegraph specific 
features and quirks.



Is there a Query Service that's separated from the Blazegraph endpoint? 
The crux of the matter here is that WDQS benefits more by being loosely 
bound to endpoints rather than tightly bound to the Blazegraph endpoint.






What we would like to do at some point (this is not more than a rough 
idea at this point) is to add a proxy in front of the SPARQL endpoint, 
that would filter specific SPARQL features, so that we limit what is 
available to a standard set of features available across most 
potential backends. This would help reduce the coupling of queries 
with the backend. Of course, this would have the drawback of limiting 
the feature s

[Wikidata] Re: Inconsistencies on WDQS data - data reload on WDQS

2023-02-21 Thread Kingsley Idehen via Wikidata


On 2/21/23 4:05 PM, Guillaume Lederrey wrote:

Hello all!

TL;DR: We expect to successfully complete the recent data reload on 
Wikidata Query Service soon, but we've encountered multiple failures 
related to the size of the graph, and anticipate that this issue may 
worsen in the future. Although we succeeded this time, we cannot 
guarantee that future reload attempts will be successful given the 
current trend of the data reload process. Thank you for your 
understanding and patience..


Longer version:

WDQS is updated from a stream of recent changes on Wikidata, with a 
maximum delay of ~2 minutes. This process was improved as part of the 
WDQS Streaming Updater project to ensure data coherence[1] . However, 
the update process is still imperfect and can lead to data 
inconsistencies in some cases[2][3]. To address this, we reload the 
data from dumps a few times per year to reinitialize the system from a 
known good state.


The recent reload of data from dumps started in mid-December and was 
initially met with some issues related to download and instabilities 
in Blazegraph, the database used by WDQS[4]. Loading the data into 
Blazegraph takes a couple of weeks due to the size of the graph, and 
we had multiple attempts where the reload failed after >90% of the 
data had been loaded. Our understanding of the issue is that a "race 
condition" in Blazegraph[5], where subtle timing changes lead to 
corruption of the journal in some rare cases, is to blame.[6]


We want to reassure you that the last reload job was successful on one 
of our servers. The data still needs to be copied over to all of the 
WDQS servers, which will take a couple of weeks, but should not bring 
any additional issues. However, reloading the full data from dumps is 
becoming more complex as the data size grows, and we wanted to let you 
know why the process took longer than expected. We understand that 
data inconsistencies can be problematic, and we appreciate your 
patience and understanding while we work to ensure the quality and 
consistency of the data on WDQS.


Thank you for your continued support and understanding!


    Guillaume


[1] https://phabricator.wikimedia.org/T244590
[2] https://phabricator.wikimedia.org/T323239
[3] https://phabricator.wikimedia.org/T322869
[4] https://phabricator.wikimedia.org/T323096
[5] https://en.wikipedia.org/wiki/Race_condition#In_software
[6] https://phabricator.wikimedia.org/T263110


Hi Guillaume,

Are there plans to decouple WDQS from the back-end database? Doing that 
provides more resilient architecture for Wikidata as a whole since you 
will be able to swap and interchange SPARQL-compliant backends.


BTW -- we are going to make AWS and even Azure hosted instances (offered 
on a PAGO basis) of our Virtuoso-hosted edition of Wikidata (which we 
recently reloaded) available.


--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/6ND7MOVXL3F73SR37MBWEIT5CCOK2EES/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: Weekly Summary #556

2023-02-17 Thread Kingsley Idehen via Wikidata


On 2/7/23 1:57 PM, Samuel Klein wrote:
Ok, so I finally read Tpt's rundown of Blazegraph alternatives, and 
it's great.  Thank you!



  Press, articles, blog posts, videos

  * Blogs
  o Is there something better than Blazegraph for Wikidata?
<https://thomas.pellissier-tanon.fr/blog/2023-01-15-wdqs.html>


What's the latest on benchmarking alternatives and future migration?

- WDBench seems actively maintained 
<https://github.com/MillenniumDB/WDBench>; any narrative update since 
last year's paper? Is something else used internally?
- The WDQS backend updates 
<https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update> 
last year were informative.  Are there updated risk projections + 
timelines?



___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/GRV5GEK4KD6YDPV6JW6N6BERRFK6G75U/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org



All,

For the record, the following is inaccurate:

Virtuoso is a SPARQL implementation developed for more than a decade by 
a small company, OpenLink Software. It is at its core a SQL database 
targeting OLAP workloads with a layer on top converting SPARQL to SQL. 
It seems to provide great performances, powering very large endpoints 
like Uniprot. *However, according to WDQS Backend Alternative work, 
Virtuoso is also tuned for bulk-load with high-frequency read, and not 
for read/write.* But, this is also the case with Blazegraph. So, it 
might be interesting to do a good benchmark to see if it actually 
outperforms Blazegraph or not.


Virtuoso is a high-performance and scalable multi-model DBMS with a very 
strong OLTP pedigree -- FWIW.


We will soon be releasing pre-loaded and pre-configured Wikidata 
instance editions for both the AWS and Azure clouds that anyone can test 
for themselves.


--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this


___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/YORJB4AYYSUFSYM7H3VTZSOZBC4GTEOZ/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: Announce: New OpenLink Virtuoso hosted Wikidata Knowledge Graph Release

2023-01-12 Thread Kingsley Idehen via Wikidata


On 1/12/23 3:39 AM, Larry Gonzalez wrote:

Dear Kingsley,

Let me start by saying that I appreciate, and am thankful for, the effort 
of loading complete Wikidata into a graph database and making a SPARQL 
endpoint available. I know it is not an easy task to do.


I just tried out the new Virtuoso-hosted SPARQL endpoint with some 
queries. My experiments are not exhaustive at all, but I just wanted 
to raise two concerns that I detected.


Considering a (very simple) query that counts all humans:

'''
SELECT (count(?human) as ?c)
WHERE
{
  ?human wdt:P31 wd:Q5 .
}
'''

I get a result of 10396057, which is ok considering the dataset that 
you are using


But if we try to export all instances of human (on a tsv file) with 
the following query:


'''
SELECT ?human
WHERE
{
  ?human wdt:P31 wd:Q5 .
}
'''

Then I only get 10 results. Is there a limit on the number of 
results that a query can have?



Yes, because these services are primarily for ad-hoc querying rather 
than wholesale data exports. If you want to export massive amounts of 
data then you can do so using OFFSET and LIMIT.

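A minimal pagination sketch (prefixes as in the queries above; the page size 
is illustrative, and a stable ORDER BY keeps successive pages disjoint, 
subject to the endpoint's timeout policy):

SELECT ?human
WHERE { ?human wdt:P31 wd:Q5 . }
ORDER BY ?human
LIMIT 100000
OFFSET 0        # then 100000, 200000, ... on subsequent requests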

Alternatively, you can instantiate your own instance in the Azure or AWS 
cloud and use it as you see fit.


Like what we provide regarding DBpedia, there's a server side 
configuration in place for enforcing a "fair use" policy :)






Furthermore, if we want to get all humans ordered by id, then the 
endpoint times out. The following is the query:


'''
SELECT ?human
WHERE
{
  ?human wdt:P31 wd:Q5 .
}
ORDER BY DESC(?human)
'''



If you set the query timeout to a value over 1000 msecs, the Virtuoso 
Anytime Query feature will provide you with a partial solution which you 
can use in conjunction with OFFSET and LIMIT to create an interactive 
cursor (or scrollable cursor). Beyond that, it's back to the "fair use" 
policy and the option to instantiate your own service-specific instance 
using our cloud offerings.



Regards,

Kingsley




Thank you again for all your efforts. I am looking forward to seeing how 
this new endpoint works :)


Are you planning to update the dataset regularly?

All the best!
Larry

https://iccl.inf.tu-dresden.de/web/Larry_Gonzalez



On 11.01.23 21:51, Kingsley Idehen via Wikidata wrote:

All,

We are pleased to announce immediate availability of a new 
Virtuoso-hosted Wikidata instance based on the most recent datasets. 
This instance comprises 17 billion+ RDF triples.


Host Machine Info:

Item   | Value
CPU    | 2x Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Cores  | 24
Memory | 378 GB
SSD    | 4x Crucial M4 SSD 500 GB


Cloud related costs for a self-hosted variant, assuming:

  * dedicated machine for 1 year without upfront costs
  * 128 GiB memory
  * 16 cores or more
  * 512GB SSD for the database
  * 3T outgoing internet traffic (based on our DBpedia statistics)


vendor | machine type | memory  | vCPUs | monthly machine | monthly disk | monthly network | monthly total
Amazon | r5a.4xlarge  | 128 GiB | 16    | $479.61         | $55.96       | $276.48         | $812.05
Google | e2highmem-16 | 128 GiB | 16    | $594.55         | $95.74       | $255.00         | $945.30
Azure  | D32a         | 128 GiB | 32    | $769.16         | $38.40       | $252.30         | $1,060.06


SPARQL Query and Full Text Search service endpoints:

  * https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query Services Endpoint
  * https://wikidata.demo.openlinksw.com/fct -- Faceted Search & Browsing



Additional Information

  * Loading the Wikidata dataset 2022/12 into Virtuoso Open Source -
    Announcements - OpenLink Software Community (openlinksw.com)
    <https://community.openlinksw.com/t/loading-the-wikidata-dataset-2022-12-into-virtuoso-open-source/3580>


Happy New Year!

--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this


___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/TI7U5Q6ZBEEPCNSTZ2KYLEXEDO4E4GMG/

To unsubscribe send an email to wikidata-le...@lists.wikimedia.org

[Wikidata] Announce: New OpenLink Virtuoso hosted Wikidata Knowledge Graph Release

2023-01-11 Thread Kingsley Idehen via Wikidata

All,

We are pleased to announce immediate availability of a new 
Virtuoso-hosted Wikidata instance based on the most recent datasets. 
This instance comprises 17 billion+ RDF triples.


Host Machine Info:

Item   | Value
CPU    | 2x Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Cores  | 24
Memory | 378 GB
SSD    | 4x Crucial M4 SSD 500 GB


Cloud related costs for a self-hosted variant, assuming:

  * dedicated machine for 1 year without upfront costs
  * 128 GiB memory
  * 16 cores or more
  * 512GB SSD for the database
  * 3T outgoing internet traffic (based on our DBpedia statistics)


vendor | machine type | memory  | vCPUs | monthly machine | monthly disk | monthly network | monthly total
Amazon | r5a.4xlarge  | 128 GiB | 16    | $479.61         | $55.96       | $276.48         | $812.05
Google | e2highmem-16 | 128 GiB | 16    | $594.55         | $95.74       | $255.00         | $945.30
Azure  | D32a         | 128 GiB | 32    | $769.16         | $38.40       | $252.30         | $1,060.06


SPARQL Query and Full Text Search service endpoints:

  * https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query Services Endpoint
  * https://wikidata.demo.openlinksw.com/fct -- Faceted Search & Browsing


Additional Information

  * Loading the Wikidata dataset 2022/12 into Virtuoso Open Source -
    Announcements - OpenLink Software Community (openlinksw.com)
    <https://community.openlinksw.com/t/loading-the-wikidata-dataset-2022-12-into-virtuoso-open-source/3580>


Happy New Year!

--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/TI7U5Q6ZBEEPCNSTZ2KYLEXEDO4E4GMG/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Announce: New OpenLink Wikidata Snapshot released to AWS Cloud

2022-07-18 Thread Kingsley Idehen via Wikidata

All,

We've released a copy of the Wikidata Snapshot that we host to the AWS 
Cloud. Our hosted Wikidata Snapshot access points include:


 * https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query Services
   Endpoint
 * https://wikidata.demo.openlinksw.com/fct -- Faceted Search & Browsing

The AWS snapshot release enables the following:

1. Immediate instantiation of a preloaded and pre-configured Wikidata
   instance for personal-, project-, or service-specific use
2. SPARQL (via SPARQL Query Service Endpoint, Jena or RDF4J providers)
   and SQL (via iSQL, ODBC, or JDBC) Query Access
3. Native Faceted Search & Exploration
4. Built-in DBpedia cross-references via owl:sameAs relations (see the
   sketch below)

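A minimal sketch of item 4, assuming the cross-references are asserted as 
owl:sameAs links from Wikidata entities (wd:Q42 is purely illustrative):

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX wd:  <http://www.wikidata.org/entity/>

SELECT ?dbpediaRef
WHERE { wd:Q42 owl:sameAs ?dbpediaRef . }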
*Additional Information*

 * AWS Marketplace Page for Wikidata Snapshot Virtual Machine
   <https://aws.amazon.com/marketplace/pp/prodview-fi6y6lnzvs6vc>
 * Wikidata Snapshot (Virtuoso PAGO) EBS-backed EC2 AMI
   
<https://community.openlinksw.com/t/wikidata-snapshot-virtuoso-pago-ebs-backed-ec2-ami/3243>
 * Twitter Announcement Thread
   <https://twitter.com/OpenLink/status/1547992642364903428>
 * LinkedIn Post
   
<https://www.linkedin.com/posts/kidehen_wikidata-knowledgegraph-virtuosordbms-activity-6953752893724254208-Oj4u>

--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/IX25M5OOADK5GSQMCMJFSGX664RH33V5/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: Help make this Property Query faster

2021-11-03 Thread Kingsley Idehen via Wikidata

On 10/29/21 10:11 AM, Thad Guidry wrote:

Hi David and team,

In Yi Liu's tool, Wikidata Property Explorer, I noticed that the query 
performance could ideally be better. Currently the query takes about 
9 seconds, and I'm asking if there might be anything to help reduce 
that considerably? Refactoring the query for optimization, backend 
changes, anything you can think of, David?


SELECT DISTINCT ?prop ?label ?desc ?type (GROUP_CONCAT(DISTINCT 
?alias; SEPARATOR = " | ") AS ?aliases) WHERE {

  ?prop (wdt:P31/(wdt:P279*)) wd:Q18616576;
    wikibase:propertyType ?type.
  OPTIONAL {
    ?prop rdfs:label ?label.
    FILTER((LANG(?label)) = "en")
  }
  OPTIONAL {
    ?prop schema:description ?desc.
    FILTER((LANG(?desc)) = "en")
  }
  OPTIONAL {
    ?prop skos:altLabel ?alias.
    FILTER((LANG(?alias)) = "en")
  }
}
GROUP BY ?prop ?label ?desc ?type

Thad
https://www.linkedin.com/in/thadguidry/
https://calendly.com/thadguidry/

___
Wikidata mailing list -- wikidata@lists.wikimedia.org
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org



Hi Thad,

Don't know what your expectations are, but here are results from our 
Wikidata instance:


 * Query Solution Page with "Anytime Query" Feature Enabled
   
<https://wikidata.demo.openlinksw.com/sparql?default-graph-uri=http%3A%2F%2Fwww.wikidata.org%2F=SELECT+DISTINCT+%3Fprop+%3Flabel+%3Fdesc+%3Ftype+%28GROUP_CONCAT%28DISTINCT+%3Falias%3B+SEPARATOR+%3D+%22+%7C+%22%29+AS+%3Faliases%29+WHERE+%7B%0D%0A++%3Fprop+%28wdt%3AP31%2F%28wdt%3AP279*%29%29+wd%3AQ18616576%3B%0D%0Awikibase%3ApropertyType+%3Ftype.%0D%0A++OPTIONAL+%7B%0D%0A%3Fprop+rdfs%3Alabel+%3Flabel.%0D%0AFILTER%28%28LANG%28%3Flabel%29%29+%3D+%22en%22%29%0D%0A++%7D%0D%0A++OPTIONAL+%7B%0D%0A%3Fprop+schema%3Adescription+%3Fdesc.%0D%0AFILTER%28%28LANG%28%3Fdesc%29%29+%3D+%22en%22%29%0D%0A++%7D%0D%0A++OPTIONAL+%7B%0D%0A%3Fprop+skos%3AaltLabel+%3Falias.%0D%0AFILTER%28%28LANG%28%3Falias%29%29+%3D+%22en%22%29%0D%0A++%7D%0D%0A%7D%0D%0AGROUP+BY+%3Fprop+%3Flabel+%3Fdesc+%3Ftype%0D%0A=text%2Fx-html%2Btr=36_void=on_unconnected=on>
 * Query Solution Page with "Anytime Query" Feature Disabled
   
<https://wikidata.demo.openlinksw.com/sparql?default-graph-uri=http%3A%2F%2Fwww.wikidata.org%2F=SELECT+DISTINCT+%3Fprop+%3Flabel+%3Fdesc+%3Ftype+%28GROUP_CONCAT%28DISTINCT+%3Falias%3B+SEPARATOR+%3D+%22+%7C+%22%29+AS+%3Faliases%29+WHERE+%7B%0D%0A++%3Fprop+%28wdt%3AP31%2F%28wdt%3AP279*%29%29+wd%3AQ18616576%3B%0D%0Awikibase%3ApropertyType+%3Ftype.%0D%0A++OPTIONAL+%7B%0D%0A%3Fprop+rdfs%3Alabel+%3Flabel.%0D%0AFILTER%28%28LANG%28%3Flabel%29%29+%3D+%22en%22%29%0D%0A++%7D%0D%0A++OPTIONAL+%7B%0D%0A%3Fprop+schema%3Adescription+%3Fdesc.%0D%0AFILTER%28%28LANG%28%3Fdesc%29%29+%3D+%22en%22%29%0D%0A++%7D%0D%0A++OPTIONAL+%7B%0D%0A%3Fprop+skos%3AaltLabel+%3Falias.%0D%0AFILTER%28%28LANG%28%3Falias%29%29+%3D+%22en%22%29%0D%0A++%7D%0D%0A%7D%0D%0AGROUP+BY+%3Fprop+%3Flabel+%3Fdesc+%3Ftype%0D%0A=text%2Fx-html%2Btr=0_void=on_unconnected=on>

 * Ditto, but using GRAPH CLAUSE
   
<https://wikidata.demo.openlinksw.com/sparql?default-graph-uri=http%3A%2F%2Fwww.wikidata.org%2F=SELECT+DISTINCT+%3Fprop+%3Flabel+%3Fdesc+%3Ftype+%28GROUP_CONCAT%28DISTINCT+%3Falias%3B+SEPARATOR+%3D+%22+%7C+%22%29+AS+%3Faliases%29+%0D%0AWHERE+%7B%0D%0A+++GRAPH+%3Fg+%7B%0D%0A%3Fprop+%28wdt%3AP31%2F%28wdt%3AP279*%29%29+wd%3AQ18616576%3B%0D%0Awikibase%3ApropertyType+%3Ftype.%0D%0AOPTIONAL+%7B%0D%0A%3Fprop+rdfs%3Alabel+%3Flabel.%0D%0AFILTER%28%28LANG%28%3Flabel%29%29+%3D+%22en%22%29%0D%0A%7D%0D%0AOPTIONAL+%7B%0D%0A%3Fprop+schema%3Adescription+%3Fdesc.%0D%0AFILTER%28%28LANG%28%3Fdesc%29%29+%3D+%22en%22%29%0D%0A%7D%0D%0AOPTIONAL+%7B%0D%0A%3Fprop+skos%3AaltLabel+%3Falias.%0D%0AFILTER%28%28LANG%28%3Falias%29%29+%3D+%22en%22%29%0D%0A%7D%0D%0A+++%7D%0D%0A%7D%0D%0AGROUP+BY+%3Fprop+%3Flabel+%3Fdesc+%3Ftype%0D%0A=text%2Fx-html%2Btr=0_void=on_unconnected=on>
   -- this might not make a difference here since all the data is in a
   single Named Graph

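Written out, the GRAPH-clause variant behind the third link is simply the 
original query wrapped in GRAPH ?g:

SELECT DISTINCT ?prop ?label ?desc ?type
       (GROUP_CONCAT(DISTINCT ?alias; SEPARATOR = " | ") AS ?aliases)
WHERE {
  GRAPH ?g {
    ?prop (wdt:P31/(wdt:P279*)) wd:Q18616576 ;
          wikibase:propertyType ?type .
    OPTIONAL { ?prop rdfs:label ?label . FILTER(LANG(?label) = "en") }
    OPTIONAL { ?prop schema:description ?desc . FILTER(LANG(?desc) = "en") }
    OPTIONAL { ?prop skos:altLabel ?alias . FILTER(LANG(?alias) = "en") }
  }
}
GROUP BY ?prop ?label ?desc ?type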
Hope this helps.

--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages

[Wikidata] Re: Degraded service availability for WDQS codfw

2021-09-09 Thread Kingsley Idehen via Wikidata
On 9/8/21 11:55 PM, Ryan Kemper wrote:
> We noticed a user who was responsible for the most requests by far
> (albeit still not a large percentage of total requests) and banned
> them, and that immediately restored full service availability
> (following another quick round of blazegraph restarts to get the
> deadlocked blazegraph processes back up and running properly).
>
> This problem is resolved (for now at least). I'll be sending an e-mail
> out to the user we banned informing them of the user agent ban.


Note my Phabricator response at:
https://phabricator.wikimedia.org/T206560#7342750.

It covers the "Anytime Query" functionality in Virtuoso, i.e., one of the
built-in features you can use to protect against attacks (intentional or
inadvertent).

Sometimes folks don't have a clear sense of the impact of their queries
relative to the usage needs of others. There are other occasions where
they just want to download everything, etc.

In some cases, you may have to ban an account.  Historically though, the
"Anytime Query" has kept the "Fair Use" rules of DBpedia intact [1].

[1] https://www.dbpedia.org/resources/sparql/ -- search on "Fair Use"

Kingsley

>
>
> On Wed, Sep 8, 2021 at 8:03 PM Ryan Kemper <rkem...@wikimedia.org> wrote:
>
> Our WDQS backend servers (in CODFW only) have incredibly patchy
> availability currently.
>
> As a result a sizeable portion of queries made to
> query.wikidata.org <http://query.wikidata.org> are failing or
> taking unusually long.
>
> We're doing our best to isolate a cause (basically a user or
> user(s) submitting particularly expensive or error-generating
> queries). Until we succeed in that service availability is likely
> to be quite poor.
>
> Note that we currently have a mitigation in place where we're
> restarting blazegraph across the affected hosts (codfw) hourly,
> but that mitigation is insufficient currently.
>
> You can see the current status of wdqs backend server availability
> here:
> 
> https://grafana.wikimedia.org/d/00489/wikidata-query-service?viewPanel=7=now-1h=now=1m
> 
> <https://grafana.wikimedia.org/d/00489/wikidata-query-service?viewPanel=7=now-1h=now=1m>
>
> ^ This is a graph of our total triple count (i.e. not explicitly a
> graph of service availability), but servers affected by the
> blazegraph deadlock issue that we're experiencing fail to report
> metrics while they're affected. So the presence or absence of RDF
> triple counts for a given host corresponds to its uptime
>
>
> _______
> Wikidata mailing list -- wikidata@lists.wikimedia.org
> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org




[Wikidata] Re: Wikidata Query Service scaling update Aug 2021

2021-09-01 Thread Kingsley Idehen via Wikidata
On 9/1/21 8:38 AM, Samuel Klein wrote:
> I like the idea of comparing live instances; could we pose a
> test-instance challenge, with some benchmarks, and invite different
> communities to take it up, hosting their own demos of what a
> well-tuned instance of WD could look like?  (Could also be hosted by
> us / spun up by advocates for a tool in our community; could also spur
> some kaggle interest)
>
> The size of the community actively interested in the health of
> Wikidata seems complementary information; alongside overall community
> size/health (which appears on the existing metrics list).   //S


Yes, but for best effect, and in line with the ultimate goal, it should
progress in stages:

[1] A basic Wikidata instance and SPARQL Query Service endpoint -- one
that provides users and user-agents with 24/7 query capability

[2] Specific Query Related Challenges

There should be an open invite as part of an effort to move Wikidata
forward in light of its current scaling-related challenges.

If we reach 3 stage-1 participants that would be awesome!


Related Links:

[1] https://wikidata.demo.openlinksw.com/ -- our live instance and its
free text query interface that's a segue into Faceted Search & Browsing

[2] https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query Services
Endpoint

[3]
https://community.openlinksw.com/t/loading-wikidata-into-virtuoso-open-source-or-enterprise-edition/2717
-- Loading Wikidata into a Virtuoso Open Source Edition instance

[4] https://github.com/openlink/virtuoso-opensource -- Virtuoso Open
Source Edition Github Repo
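A minimal stage-1 smoke test could be as simple as an entity lookup
against the candidate endpoint (illustrative; Q42 is an arbitrary sample
item):

SELECT ?p ?o
WHERE { <http://www.wikidata.org/entity/Q42> ?p ?o }
LIMIT 10

Any instance that can serve this kind of lookup reliably, around the
clock, is a reasonable stage-1 baseline.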


Kingsley

>
>
> On Fri, Aug 27, 2021 at 10:19 AM Kingsley Idehen via Wikidata
> <wikidata@lists.wikimedia.org> wrote:
>
> On 8/25/21 3:17 PM, Mike Pham wrote:
>>
>> Thanks for all suggestions, and general enthusiasm in helping
>> scale WDQS! A number of you have suggested various graph backends
>> to consider moving to from Blazegraph, and I wanted to take a
>> minute to respond more generically.
>>
>> There are several criteria we need to consider for a Blazegraph
>> alternative. Ideally we would have this list of criteria ready
>> and available to share, so that the community can help vet
>> alternatives with us. Unfortunately, we do not currently have a
>> full list of these criteria. While the criteria we judged
>> candidate graph backends on are available here
>> 
>> <https://docs.google.com/spreadsheets/d/1MXikljoSUVP77w7JKf9EXN40OB-ZkMqT8Y5b2NYVKbU/edit?usp=sharing>,
>> it is highly unlikely these will be the exact set we will use in
>> this next stage of scaling, and should only be used as a
>> historical reference.
>>
>> It is likely that there is no silver bullet solution that will
>> satisfy every criteria. We will probably need to make compromises
>> in some areas in order to optimize for others. This is a primary
>> reason for conducting the WDQS user survey
>> 
>> <https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2021/08#Wikidata_Query_Service_(WDQS)_User_Survey_2021>:
>> we would like a better understanding of what the overall
>> community priorities are, including from those who may be less
>> vocal in existing discussions. These priorities will then be a
>> major component in distilling the criteria (and weights) for a
>> new graph backend.
>>
>> The current plan is to share the (most up to date as we can)
>> survey results at WikidataCon
>> <https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021> this
>> year. I appreciate the discussion around potential candidates so
>> far, and welcome the continued insight/help, but wanted to also
>> be clear that we will not be making any decisions about a new
>> graph backend, or have a complete list of criteria or testing
>> process, at the moment — WikidataCon will be the next strategic
>> check-in point.
>>
>> As always, your patience is appreciated, and I’m looking forward
>> to the continuing discussions and collaboration!
>>
>> Best,
>> Mike
>>
>>
>>
>>
>> —
>>
>> *Mike Pham* (he/him)
>> Sr Product Manager, Search
>> Wikimedia Foundation <https://wikimediafoundation.org/>
>
>
> Hi Mike,
>
> Here's a suggestion regarding this important matter, circa 2021:
>
> At the very least, a candidate platform should be able to deliver
> on a live instance of the Wikidata dataset accessible for
> interaction via SPARQL Query Services Endpoint.
>
> Based on the in

[Wikidata] Re: Wikidata Query Service scaling update Aug 2021

2021-08-27 Thread Kingsley Idehen via Wikidata
>> One of the things we did differently than WDQS is to introduce a
>> controlled layer between the "public" and the "database".
>> To allow things like query rewriting/redirection upon data model
>> changes, as well as rewriting some schema rediscovery queries to
>> a known
>> faster query. We also parse the queries with RDF4J before handing
>> them
>> to virtuoso. This makes sure that the queries that we accept are
>> only
>> valid SPARQL 1.1, avoiding users getting used to almost-SPARQL
>> dialects
>> (i.e. retaining the flexibility to move to a different endpoint). We
>> are in
>> the process of updating this code and contributing it to RDF4J,
>> with the
>> first contribution in the develop/4.0.0 branch
>>
>> I think a number of current customizations in WDQS can be moved to a
>> front RDF4J layer. Then the RDF4J sail/repository layer can be
>> used to
>> preserve flexibility. So that WDQS can more easily switch between
>> backend databases in the future.
>>
>> One large difference between UniProt and WDQS is that WikiData is
>> continually updated while UniProt is batch released a few times a
>> year.
>> WDQS is somewhat easier in some areas and more difficult in others
>> because of that.
>>
>> Regards,
>> Jerven
>>
>> [1] No database is perfect, but it does scale a lot better than
>> Blazegraph (which we also evaluated in the past) did. There is still a
>> lot of potential in Virtuoso to scale even better in the future.
>>
>>
>>
>>
>>
>> On 23/08/2021 21:36, Samuel Klein wrote:
>> > Ah, that's lovely.  Thanks for the update, Kingsley!  Uniprot
>> is a good
>> > parallel to keep in mind.
>> >
>> > For Egon, Andra, others who work with them: Is there someone you'd
>> > recommend chatting with at uniprot?
>> > "scaling alongside uniprot" or at least engaging them on how to
>> solve
>> > shared + comparable issues (they also offer authentication-free
>> SPARQL
>> > querying) sounds like a compelling option.
>> >
>> > S.
>> >
>> > On Thu, Aug 19, 2021 at 4:32 PM Kingsley Idehen via Wikidata
>> > <wikidata@lists.wikimedia.org> wrote:
>> >
>> >     On 8/18/21 5:07 PM, Mike Pham wrote:
>> >>
>> >>     Wikidata community members,
>> >>
>> >>
>> >>     Thank you for all of your work helping Wikidata grow and
>> improve
>> >>     over the years. In the spirit of better communication, we
>> would
>> >>     like to take this opportunity to share some of the current
>> >>     challenges Wikidata Query Service (WDQS) is facing, and some
>> >>     strategies we have for dealing with them.
>> >>
>> >>
>> >>     WDQS currently risks failing to provide acceptable service
>> quality
>> >>     due to the following reasons:
>> >>
>> >>     1.
>> >>
>> >>         Blazegraph scaling
>> >>
>> >>         1.
>> >>
>> >>             Graph size. WDQS uses Blazegraph as our graph backend.
>> >>             While Blazegraph can theoretically support 50 billion
>> >>             edges <https://blazegraph.com/>, in reality Wikidata is
>> >>             the largest graph we know of running on Blazegraph (~13
>> >>             billion triples
>> >>             <https://grafana.wikimedia.org/d/00489/wikidata-query-service?viewPanel=7&orgId=1&refresh=1m>),
>> >>             and there is a risk that we will reach a size limit
>> >>             <https://www.w3.org/wiki/LargeTripleStores#Bigdata.28R.29_.2812.7B.29>
>> >>             of what it can realistically support
>> >>             <https://

[Wikidata] Re: Wikidata Query Service scaling update Aug 2021

2021-08-19 Thread Kingsley Idehen via Wikidata
On 8/18/21 5:07 PM, Mike Pham wrote:
>
> Wikidata community members,
>
>
> Thank you for all of your work helping Wikidata grow and improve over
> the years. In the spirit of better communication, we would like to
> take this opportunity to share some of the current challenges Wikidata
> Query Service (WDQS) is facing, and some strategies we have for
> dealing with them.
>
>
> WDQS currently risks failing to provide acceptable service quality due
> to the following reasons:
>
> 1.
>
> Blazegraph scaling
>
> 1.
>
> Graph size. WDQS uses Blazegraph as our graph backend. While
> Blazegraph can theoretically support 50 billion edges
> <https://blazegraph.com/>, in reality Wikidata is the largest
> graph we know of running on Blazegraph (~13 billion triples
> <https://grafana.wikimedia.org/d/00489/wikidata-query-service?viewPanel=7&orgId=1&refresh=1m>),
> and there is a risk that we will reach a size limit
> <https://www.w3.org/wiki/LargeTripleStores#Bigdata.28R.29_.2812.7B.29>
> of what it can realistically support
> <https://phabricator.wikimedia.org/T213210>. Once Blazegraph
> is maxed out, WDQS can no longer be updated. This will also
> break Wikidata tools that rely on WDQS.
>
> 2.
>
> Software support. Blazegraph is end of life software, which is
> no longer actively maintained, making it an unsustainable
> backend to continue moving forward with long term.  
>
>
> Blazegraph maxing out in size poses the greatest risk for catastrophic
> failure, as it would effectively prevent WDQS from being updated
> further, and inevitably fall out of date. Our long term strategy to
> address this is to move to a new graph backend that best meets our
> WDQS needs and is actively maintained, and begin the migration off of
> Blazegraph as soon as a viable alternative is identified
> <https://phabricator.wikimedia.org/T206560>. 
>

Hi Mike,

Do bear in mind that, both pre- and post-selection of Blazegraph for Wikidata,
we've always offered an RDF-based DBMS that can handle current and
future requirements for Wikidata, just as we do DBpedia.

At the time of our first rendezvous, handling 50 billion triples would
typically have required our Cluster Edition, which is a commercial-only
offering -- basically, that was the deal-breaker back then.

Anyway, in recent times, our Open Source Edition has evolved to handle
some 80 billion+ triples (exemplified by the live Uniprot instance),
where performance and scale are primarily a function of available memory.

I hope this helps.

Related:

[1] https://wikidata.demo.openlinksw.com/sparql -- our live Wikidata
SPARQL Query Endpoint

[2] https://docs.google.com/spreadsheets/d/15AXnxMgKyCvLPil_QeGC0DiXOP-Hu8Ln97fZ683ZQF0/edit#gid=0
-- Google Spreadsheet about various Virtuoso configurations associated
with some well-known public endpoints

[3] https://t.co/EjAAO73wwE -- this query doesn't complete with the
current Blazegraph-based Wikidata endpoint

[4] https://t.co/GTATPPJNBI -- the same query completing when applied to
the Virtuoso-based endpoint

[5] https://t.co/X7mLmcYC69 -- about loading Wikidata's datasets into a
Virtuoso instance

[6] https://twitter.com/search?q=%23Wikidata%20%23VirtuosoRDBMS%20%40kidehen&src=typed_query&f=live
-- various demos shared via Twitter over the years regarding Wikidata



Re: [Wikidata] AfroCine: Conclusion of the 2020 Months of African Cinema Global Contest

2021-04-28 Thread Kingsley Idehen via Wikidata
On 4/26/21 7:12 AM, Sam Oyeyele wrote:
>
> Greetings!
>
>
> The Months of African Cinema Global Edit-a-thon was concluded on 30
> November 2020,[1] and we want to send a big thank you to all the
> participants who helped make it a success! Over 3,200 articles were
> created across 19 language Wikipedias, surpassing all expectations and
> placing the contest firmly as one of the most successful
> article-writing contests on Wikipedia.
>
>  
>
> We also want to extend our heartfelt gratitude to all those who
> volunteered to be part of the jury team. It was very complicated
> trying to assess the quality of 3,000 articles, but we did it! All our
> winners have now been announced and you can check the complete list
> here.[2]
>
>  
>
> Thank you so much for being part of this global event! Thank you for
> helping to fix African content gaps on Wikipedia! We hope to see more
> of your participation in future AfroCine events and activities. Please
> remember to sign up on the main WikiProject participants’ page,[3] and
> on the meta page[4] to get updated with these activities.
>
>  
>
> Thanks,
>
> Sam,
>
> On behalf of The AfroCine Project team.
>
>
> 1. https://en.wikipedia.org/wiki/Wikipedia:WikiProject_AfroCine/Months_of_African_Cinema
>
> 2. https://en.wikipedia.org/wiki/Wikipedia:WikiProject_AfroCine/Months_of_African_Cinema/Winners
>
> 3. https://en.wikipedia.org/wiki/Wikipedia:WikiProject_AfroCine/Participants
>
> 4. https://meta.wikimedia.org/wiki/The_AfroCine_Project
>

Hi Sam,

Great news!

Is this page [1] the best starting point for exploring content-enhancement
contributions from this competition?


Links:

[1]
https://en.wikipedia.org/wiki/Wikipedia:WikiProject_AfroCine/Article_Suggestions



Re: [Wikidata] Delta Dumps Production?

2021-02-26 Thread Kingsley Idehen via Wikidata
On 2/26/21 3:46 AM, Guillaume Lederrey wrote:
> Hello!
>
> We are working on a new update process for WDQS, based on a stream of
> changes [1]. While not exactly the solution you are looking for, this
> might be a building block for differential dumps. For example by
> aggregating the stream of changes over a period of time.
>
> Note that at this point, the stream of changes that we construct is
> published to an internal Kafka that isn't exposed to the internet. If
> there is enough interest, we might be able to expose it in some form.
>
> Have fun!
>
>    Guillaume
>
>
>
> [1] https://phabricator.wikimedia.org/T244590
> <https://phabricator.wikimedia.org/T244590>


Hi Guillaume,

I am very interested in having it exposed right now, since we are trying
to maintain an up-to-date mirror of Wikidata.

We can discuss offline if you like.


Kingsley

>
>
> On Fri, Feb 26, 2021 at 8:49 AM Federico Leva (Nemo)
> <nemow...@gmail.com> wrote:
>
> Kingsley Idehen via Wikidata, 25/02/21 19:26:
> > Is there a mechanism in place for producing and publishing
> delta-centric
> > dumps for Wikidata?
>
> There's
> https://phabricator.wikimedia.org/T72246
>
> Magnus Manske used to maintain some biweekly dumps as part of his WDQ
> service, IIRC.
>
> Federico
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
>
>
>
> -- 
>   *Guillaume Lederrey* (he/him)
> Engineering Manager
> Wikimedia Foundation <https://wikimediafoundation.org/>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata




[Wikidata] Delta Dumps Production?

2021-02-25 Thread Kingsley Idehen via Wikidata
Hi Everyone,

Is there a mechanism in place for producing and publishing delta-centric
dumps for Wikidata?

A delta-centric dump would comprise new triples for relevant Wikipedia
pages that can be applied progressively to existing Wikidata instances.
For instance, we maintain a Wikidata instance [1][2] that we would like
to keep up to date by applying deltas rather than performing wholesale
instance reloads.
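For illustration, a consumer could then apply each delta as paired
SPARQL 1.1 Update operations (a sketch; the graph IRI, item, and label
values are placeholders):

PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

DELETE DATA { GRAPH <http://www.wikidata.org/> { wd:Q42 rdfs:label "old label"@en } } ;
INSERT DATA { GRAPH <http://www.wikidata.org/> { wd:Q42 rdfs:label "new label"@en } }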

Looking forward to any insights regarding this important matter.

Related Links

[1] https://wikidata.demo.openlinksw.com/fct

[2] https://wikidata.demo.openlinksw.com/sparql



Re: [Wikidata] Identifiers for WDQS queries

2021-02-18 Thread Kingsley Idehen via Wikidata
ve had to copy in was a short identifier.
>
>
> Of course to some extent the URL-shortener does this, but there are some
> issues:
>
> 1) The maintainers point-blank refuse to let it allow URLs of more than
> 2000 characters. (https://phabricator.wikimedia.org/T220703)
> Gnarly WDQS queries can
> often be longer than this, sometimes a lot longer.
>
> 2) The short URL could be for anything on any wiki site -- the EMEW
> site can't be sure that it corresponds to a SPARQL query
>
> 3) The short URL needs to be adjusted, to turn it from a WDQS url
> that's a link to the query in the GUI into a WDQS url that's an
> external request for query results. This is not straightforward.
>
>
> A short identifier for a WDQS query would get round all these things.
>
> It also might be one step forwards towards creating a place like Quarry
> (https://quarry.wmflabs.org/) where users
> could save their queries, share them, document them, see other
> people's shared queries, and come
> back to them later. But that's another ticket
> (https://phabricator.wikimedia.org/T104762, open since July 2015).
>
> All I am suggesting, first, is an identifier.
>
>
> One objection that I thought of might be that if identifiers were
> automatically assigned, without having to actually request them, then
> people might be able to "spy" on what other queries people happened to
> be writing at any one time. I don't know how serious an objection this
> is - it doesn't seem to be a problem for Quarry - but could largely be
> avoided if the query-number was hashed to make the sequence less
> predictable.
>
> (Or alternatively, query-numbers could just be issued on request).
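For what it's worth, a content-derived identifier would sidestep the
sequence-guessing concern entirely (a sketch, not a concrete proposal;
any stable digest would do):

query='SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 10'
printf '%s' "$query" | sha1sum | cut -c1-10

The same query text always maps to the same short ID, and the ID space
reveals nothing about what other users happen to be writing.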
>
>
> Anyway, just putting this out here, for thoughts.
>
> Best wishes to everybody,
>
> James.
>
>
>
>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata




[Wikidata] RDF embedded in Wikipedia pages Discussion Space?

2020-09-07 Thread Kingsley Idehen via Wikidata
All,

Is there a designated discussion space for opening and discussing
matters related to RDF embedded in Wikipedia docs, using <script>-based
structured data islands?

I asked the same question on twitter earlier on today [1].

Links:

[1] https://twitter.com/kidehen/status/1303031433188048897
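For concreteness, by "structured data islands" I mean blocks along these
lines embedded in a page (a hypothetical sketch; the vocabulary and
values are placeholders):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@id": "http://www.wikidata.org/entity/Q42",
  "@type": "Person",
  "name": "Douglas Adams"
}
</script>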



Re: [Wikidata] WDQS outage - 2020/07/23

2020-07-26 Thread Kingsley Idehen
On 7/26/20 1:44 PM, Egon Willighagen wrote:
>
> it's at the end of the page. Hard to miss, I thought :/
>
> SELECT DISTINCT ?government_governmental_jurisdiction_governing_officials 
> ?government_governmental_jurisdiction_governing_officials 
> ?government_government_position_held_office_holder_inverse 
> ?government_government_position_held_appointed_by 
> ?government_government_position_held_basic_title_inverse
> WHERE {
>   VALUES ?government_governmental_jurisdiction_governing_officials { 
> wdt:P17 p:P17 wdt:P36 p:P36 wdt:P47 p:P47 wdt:P6 p:P6 wdt:P138 p:P138 wdt:P37 
> p:P37 wdt:P150 p:P150 wdt:P1313 p:P1313 wdt:P625 p:P625 wdt:P35 p:P35 } .
>   ?c ?government_governmental_jurisdiction_governing_officials ?k .
>   VALUES ?government_governmental_jurisdiction_governing_officials { 
> wdt:P17 p:P17 wdt:P36 p:P36 wdt:P47 p:P47 wdt:P6 p:P6 wdt:P138 p:P138 wdt:P37 
> p:P37 wdt:P150 p:P150 wdt:P1313 p:P1313 wdt:P625 p:P625 wdt:P35 p:P35 } .
>   ?c ?government_governmental_jurisdiction_governing_officials ?y .
>   VALUES ?government_government_position_held_office_holder_inverse { 
> wdt:P19 p:P19 wdt:P103 p:P103 wdt:P69 p:P69 wdt:P937 p:P937 wdt:P106 p:P106 
> wdt:P26 p:P26 wdt:P27 p:P27 wdt:P166 p:P166 wdt:P39 p:P39 wdt:P607 p:P607 
> wdt:P735 p:P735 wdt:P21 p:P21 wdt:P551 p:P551 wdt:P102 p:P102 wdt:P1412 
> p:P1412 wdt:P6886 p:P6886 wdt:P2163 p:P2163 } .
>   wd:Q15029 ?government_government_position_held_office_holder_inverse ?y 
> .
>   VALUES ?government_government_position_held_appointed_by { wdt:P991 
> p:P991 wdt:P1308 p:P1308 wdt:P112 p:P112 wdt:P710 p:P710 wdt:P542 p:P542 
> wdt:P488 p:P488 wdt:P726 p:P726 wdt:P40 p:P40 wdt:P6 p:P6 wdt:P138 p:P138 
> wdt:P1365 p:P1365 wdt:P921 p:P921 wdt:P35 p:P35 wdt:P748 p:P748 wdt:P1366 
> p:P1366 wdt:P22 p:P22 wdt:P3373 p:P3373 wdt:P50 p:P50 wdt:P26 p:P26 } .
>   ?k ?government_government_position_held_appointed_by wd:Q19211 .
>   VALUES ?government_government_position_held_basic_title_inverse { 
> wdt:P19 p:P19 wdt:P88 p:P88 wdt:P1193 p:P1193 wdt:P3616 p:P3616 wdt:P205 
> p:P205 wdt:P106 p:P106 wdt:P6161 p:P6161 wdt:P1331 p:P1331 wdt:P279 p:P279 
> wdt:P677 p:P677 wdt:P6828 p:P6828 wdt:P131 p:P131 wdt:P1001 p:P1001 wdt:P2099 
> p:P2099 } .
>   wd:Q30461 ?government_government_position_held_basic_title_inverse ?y .
>   VALUES ?c { wd:Q148 } .
> }  LIMIT 100
>
I am currently getting an empty solution for the query above from both
our endpoint and the WDQS endpoint:

[1] https://tinyurl.com/y2t5nwc9 -- results page

[2] https://tinyurl.com/y66n9bk5 -- query editor page

[3] https://tinyurl.com/yxo5kr4w -- WDQS endpoint


What did I miss?


Kingsley


> On Sun, Jul 26, 2020 at 7:24 PM Kingsley Idehen
> <kide...@openlinksw.com> wrote:
>
> On 7/26/20 1:00 PM, Egon Willighagen wrote:
>>
>> See https://phabricator.wikimedia.org/T242453 linked on the
>> report page
>
>
>     That doesn't take directly to a SPARQL query. I just want the
> SPARQL query.
>
>
> Kingsley
>
>>
>> On Sun, Jul 26, 2020 at 6:55 PM Kingsley Idehen
>> <kide...@openlinksw.com> wrote:
>>
>> On 7/24/20 3:18 PM, Ryan Kemper wrote:
>> > Hi all,
>> >
>> > We experienced WDQS service disruptions on 2020/07/23. As a
>> result
>> > there was a full outage (inability to respond to all
>> queries) for a
>> > period of several minutes, and a more extended period of
>> > intermittently degraded service (inability to respond to a
>> subset of
>> > queries) for 1-2 hours.
>> >
>> > The full incident report is available here:
>> >
>> 
>> https://wikitech.wikimedia.org/wiki/Incident_documentation/20200723-wdqs-outage
>> >
>> > Ultimately, we traced the proximate cause to a series of
>> > non-performant queries, which caused a deadlock in
>> blazegraph, the
>> > backend for WDQS. We have placed a temporary block on the
>> IP address
>> > in question and are taking steps to better define service
>> availability
>> > expectations as well as processes to make detection of
>> these events
>> > more streamlined going forward.
>>
>>
>> What was the problem query?
>>
>> I ask because I would like to try it against our Wikidata
>> endpoint at:
>> https://wikidata.demo.openlinksw.com/sparql .
>>
>> We have an "Any

Re: [Wikidata] WDQS outage - 2020/07/23

2020-07-26 Thread Kingsley Idehen
On 7/24/20 3:18 PM, Ryan Kemper wrote:
> Hi all,
>
> We experienced WDQS service disruptions on 2020/07/23. As a result
> there was a full outage (inability to respond to all queries) for a
> period of several minutes, and a more extended period of
> intermittently degraded service (inability to respond to a subset of
> queries) for 1-2 hours.
>
> The full incident report is available here:
> https://wikitech.wikimedia.org/wiki/Incident_documentation/20200723-wdqs-outage
>
> Ultimately, we traced the proximate cause to a series of
> non-performant queries, which caused a deadlock in blazegraph, the
> backend for WDQS. We have placed a temporary block on the IP address
> in question and are taking steps to better define service availability
> expectations as well as processes to make detection of these events
> more streamlined going forward.


What was the problem query?

I ask because I would like to try it against our Wikidata endpoint at:
https://wikidata.demo.openlinksw.com/sparql .

We have an "Anytime Query" feature designed for these kinds of problems,
hence the  vested interest in these kinds of problem queries.
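For anyone who wants to experiment, the endpoint also accepts a
per-request execution budget (a sketch; the timeout value is in
milliseconds, and the query is just a stand-in):

curl -G https://wikidata.demo.openlinksw.com/sparql \
  -H 'Accept: application/sparql-results+json' \
  --data-urlencode 'query=SELECT ?p ?o WHERE { <http://www.wikidata.org/entity/Q42> ?p ?o } LIMIT 10' \
  --data 'timeout=30000'

When the budget is exhausted, the server returns the partial solution
computed so far rather than deadlocking or erroring out.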



Re: [Wikidata] WDQS outage - 2020/07/23

2020-07-26 Thread Kingsley Idehen
On 7/26/20 1:00 PM, Egon Willighagen wrote:
>
> See https://phabricator.wikimedia.org/T242453 linked on the report page


That doesn't take me directly to a SPARQL query. I just want the SPARQL query.


Kingsley

>
> On Sun, Jul 26, 2020 at 6:55 PM Kingsley Idehen
> <kide...@openlinksw.com> wrote:
>
> On 7/24/20 3:18 PM, Ryan Kemper wrote:
> > Hi all,
> >
> > We experienced WDQS service disruptions on 2020/07/23. As a result
> > there was a full outage (inability to respond to all queries) for a
> > period of several minutes, and a more extended period of
> > intermittently degraded service (inability to respond to a subset of
> > queries) for 1-2 hours.
> >
> > The full incident report is available here:
> >
> 
> https://wikitech.wikimedia.org/wiki/Incident_documentation/20200723-wdqs-outage
> >
> > Ultimately, we traced the proximate cause to a series of
> > non-performant queries, which caused a deadlock in blazegraph, the
> > backend for WDQS. We have placed a temporary block on the IP address
> > in question and are taking steps to better define service
> availability
> > expectations as well as processes to make detection of these events
> > more streamlined going forward.
>
>
> What was the problem query?
>
> I ask because I would like to try it against our Wikidata endpoint at:
> https://wikidata.demo.openlinksw.com/sparql .
>
> We have an "Anytime Query" feature designed for these kinds of
> problems,
> hence the  vested interest in these kinds of problem queries.
>
> -- 
> Regards,
>
> Kingsley Idehen       
> Founder & CEO
> OpenLink Software   
> Home Page: http://www.openlinksw.com
> Community Support: https://community.openlinksw.com
> Weblogs (Blogs):
> Company Blog: https://medium.com/openlink-software-blog
> Virtuoso Blog: https://medium.com/virtuoso-blog
> Data Access Drivers Blog:
> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>
> Personal Weblogs (Blogs):
> Medium Blog: https://medium.com/@kidehen
> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>               http://kidehen.blogspot.com
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
>         :
> 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
> -- 
> Hi, do you like citation networks? Already 51% of all citations are
> available <https://i4oc.org/> for innovative new uses
> <https://twitter.com/hashtag/acs2ioc>. Join me in asking the American
> Chemical Society to join the Initiative for Open Citations too
> <https://www.change.org/p/asking-the-american-chemical-society-to-join-the-initiative-for-open-citations>.
>  SpringerNature,
> the RSC and many others already did <https://i4oc.org/#publishers>.
>
> -
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286>
> ImpactStory: https://impactstory.org/u/egonwillighagen
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata



Re: [Wikidata] 2 million queries against a Wikidata instance

2020-07-13 Thread Kingsley Idehen
On 7/13/20 1:41 PM, Adam Sanchez wrote:
> Hi,
>
> I have to launch 2 million queries against a Wikidata instance.
> I have loaded Wikidata in Virtuoso 7 (512 GB RAM, 32 cores, SSD disks
> with RAID 0).
> The queries are simple, just 2 types.
>
> select ?s ?p ?o {
> ?s ?p ?o.
> filter (?s = ?param)
> }
>
> select ?s ?p ?o {
> ?s ?p ?o.
> filter (?o = ?param)
> }
>
> If I use a Java ThreadPoolExecutor takes 6 hours.
> How can I speed up the queries processing even more?
>
> I was thinking :
>
> a) to implement a Virtuoso cluster to distribute the queries or
> b) to load Wikidata in a Spark dataframe (since Sansa framework is
> very slow, I would use my own implementation) or
> c) to load Wikidata in a Postgresql table and use Presto to distribute
> the queries or
> d) to load Wikidata in a PG-Strom table to use GPU parallelism.
>
> What do you think? I am looking for ideas.
> Any suggestion will be appreciated.
>
> Best,


Hi Adam,

You need to increase the memory available to Virtuoso. If you are at
your limits, that's when the Cluster Edition comes in handy, i.e.,
enabling you to build a large pool of memory from a sharded DB
horizontally partitioned over a collection of commodity computers.

There is a public Google Spreadsheet covering a variety of public
Virtuoso instances that should aid you in this process [1].

Links:

[1]
https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit#gid=812792186
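For reference, the knobs in question live in the [Parameters] section of
virtuoso.ini (a sketch following the stock sizing guidance; derive exact
values from the RAM you can dedicate to Virtuoso):

[Parameters]
; roughly 66% of dedicated RAM, expressed as 8KB buffers; e.g., for a 64 GB budget:
NumberOfBuffers = 5450000
; conventionally about 3/4 of NumberOfBuffers
MaxDirtyBuffers = 4000000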



Re: [Wikidata] Accessing tabular data from SPARQL?

2020-07-07 Thread Kingsley Idehen
On 6/3/19 2:00 PM, Yuri Astrakhan wrote:
> Kingsley, I'm not sure I understood the question. Please take a look
> at the phabricator ticket that describes the implementation approach.
> The code is currently sitting in a branch, and can be easily merged
> with the sophox's master branch, and enabled on sophox endpoint.


Yuri,

I was just seeking the endpoint URL for the SPARQL Query Service. I
think it is https://sophox.org/ .


Kingsley

>
> On Mon, Jun 3, 2019 at 8:38 PM Kingsley Idehen <kide...@openlinksw.com> wrote:
>
> On 5/31/19 11:28 AM, Yuri Astrakhan wrote:
>> I actually already implemented support in SPARQL for that, but it
>> needs a bit more work to get it properly merged with the
>> Blazegraph code.  I had it working for a while as part of Sophox
>> (OSM Sparql).
>>
>> *
>> docs:  https://wiki.openstreetmap.org/wiki/Sophox#External_Data_Sources
>> *
>> code:  
>> https://github.com/Sophox/wikidata-query-rdf/compare/master...Sophox:tabular
>> (see Tabular* files)
>> * phabricator discussion about the above
>> code:  https://phabricator.wikimedia.org/T181319
>>
>> Tabular support allows any CSV-style tables to be treated as
>> federated sources. With minor changes it should be possible to
>> use mediawiki's .tab pages too.
>
>
> Hi Yuri,
>
>
> What is the SPARQL Query Service endpoint? Basically, the
> equivalent of : http://query.wikidata.org/sparql ??
>
>
> Kingsley
>
>>
>> On Fri, May 31, 2019 at 6:01 PM Daniel Mietchen
>> > <mailto:daniel.mietc...@googlemail.com>> wrote:
>>
>> Hi,
>> I'm looking into ways to use tabular data like
>> https://commons.wikimedia.org/wiki/Data:Zika-institutions-test.tab
>> in SPARQL queries but could not find anything on that.
>>
>> My motivation here is in part coming from the time out
>> limits, and the
>> basic idea here would be to split queries that typically time
>> out into
>> sets of queries that do not time out and - if their results were
>> aggregated - would yield the results that would be expected
>> for the
>> original query would it not time out.
>>
>> The second line of motivation here is that of keeping track
>> of how
>> things develop over time, which would be interesting for both
>> content
>> and maintenance queries as well as usage of things like classes,
>> references, lexemes or properties.
>>
>> I would appreciate any pointers or thoughts on the matter.
>>
>> Thanks,
>>
>> Daniel
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> <mailto:Wikidata@lists.wikimedia.org>
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
>   

Re: [Wikidata] Partial RDF dumps

2020-05-01 Thread Kingsley Idehen
On 5/1/20 11:53 AM, Isaac Johnson wrote:
> If the challenge is downloading large files, you can also get local
> access to all of the dumps (wikidata, wikipedia, and more) through the
> PAWS <https://wikitech.wikimedia.org/wiki/PAWS> (Wikimedia-hosted
> Jupyter notebooks) and Toolforge
> <https://wikitech.wikimedia.org/wiki/Help:Toolforge> (more
> general-purpose Wikimedia hosting environment). From Toolforge, you
> could run the Wikidata toolkit (Java) that Denny mentions. I'm
> personally more familiar with Python, so my suggestion is to use
> Python code to filter down the dumps to what you desire. Below is an
> example Python notebook that will do this on PAWS, though the PAWS
> environment is not set up for these longer running jobs and will
> probably die before the process is complete, so I'd highly recommend
> converting it into a script that can run on Toolforge (see
> https://wikitech.wikimedia.org/wiki/Help:Toolforge/Dumps).
>
> PAWS example:
> https://paws-public.wmflabs.org/paws-public/User:Isaac_(WMF)/Simplified_Wikidata_Dumps.ipynb
>
> Best,
> Isaac
>
That isn't my challenge.

I wanted to know why the WDQS UI doesn't provide an option for CONSTRUCT
and DESCRIBE query solutions using a variety of document types.

See: https://wikidata.demo.openlinksw.com/sparql to see what I mean.
Ditto any DBpedia endpoint.


Kingsley

>
> On Thu, Apr 30, 2020 at 1:33 AM raffaele messuti <raffa...@docuver.se> wrote:
>
> On 27/04/2020 18:02, Kingsley Idehen wrote:
> >> [1] https://w.wiki/PBi
> >>
> > Do these CONSTRUCT queries return any of the following document
> content-types?
> >
> > RDF-Turtle, RDF-XML, JSON-LD ?
>
> you can use content negotiation on the sparql endpoint
>
> ~ query="CONSTRUCT { ... }"
> ~ curl -H "Accept: application/rdf+xml"
> https://query.wikidata.org/sparql --data-urlencode query=$query
> ~ curl -H "Accept: text/turtle" -G
> https://query.wikidata.org/sparql --data-urlencode query=$query
>
>
>
> --
> raffa...@docuver.se <mailto:raffa...@docuver.se>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
> -- 
> Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata




Re: [Wikidata] Partial RDF dumps

2020-05-01 Thread Kingsley Idehen
On 4/30/20 2:32 AM, raffaele messuti wrote:
> On 27/04/2020 18:02, Kingsley Idehen wrote:
>>> [1] https://w.wiki/PBi
>>>
>> Do these CONSTRUCT queries return any of the following document
>> content-types?
>>
>> RDF-Turtle, RDF-XML, JSON-LD ?
>
> you can use content negotiation on the sparql endpoint
>
> ~ query="CONSTRUCT { ... }"
> ~ curl -H "Accept: application/rdf+xml"
> https://query.wikidata.org/sparql --data-urlencode query=$query
> ~ curl -H "Accept: text/turtle" -G https://query.wikidata.org/sparql
> --data-urlencode query=$query
>

Okay.

Any reason why that isn't a UI option also re WDQS?

We have an instance at: https://wikidata.demo.openlinksw.com/sparql that
does support DESCRIBE and CONSTRUCT queries that return solutions using
a variety of document types (RDF-Turtle, RDF-XML, JSON-LD, etc.), hence
my question.
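To make the comparison concrete, the following works against our
instance (illustrative; Q42 is just a sample entity, and any of the
listed content types can be substituted in the Accept header):

curl -H 'Accept: application/ld+json' -G https://wikidata.demo.openlinksw.com/sparql \
  --data-urlencode 'query=DESCRIBE <http://www.wikidata.org/entity/Q42>'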



Re: [Wikidata] Partial RDF dumps

2020-04-27 Thread Kingsley Idehen
On 4/27/20 5:48 AM, Andra Waagmeester wrote:
> You can use CONSTRUCT queries for this. In this example [1] you'll get
> that subgraph for all items that also have a musicBrainz ID property
>
> [1] https://w.wiki/PBi
>

Hi Andra,

Do these CONSTRUCT queries return any of the following document
content-types?

RDF-Turtle, RDF-XML, JSON-LD?
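For instance (a sketch; the LIMIT is only there to keep the example
cheap), an up-to-date "instance of" subgraph can be requested in Turtle
like so:

curl -H 'Accept: text/turtle' -G https://query.wikidata.org/sparql \
  --data-urlencode query='
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
CONSTRUCT { ?item wdt:P31 ?class }
WHERE { ?item wdt:P31 ?class }
LIMIT 1000'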


Kingsley

> On Mon, Apr 27, 2020 at 11:36 AM Ece Toprak  <mailto:ece.topr...@gmail.com>> wrote:
>
> Hi,
>
> I am currently working on a NER project at school and would like
> to know if there is a way to generate RDF dumps that only contain
> "instance of" or "subclass of" relations.
>  I have found these dumps:
> RDF Exports from Wikidata
> 
> <https://tools.wmflabs.org/wikidata-exports/rdf/exports/20160801/dump_download.html>
> Here, under "simplified and derived dumps" taxonomy and instances
> dumps are very useful for me but unfortunately very old. 
> It would be great if I could generate up to date dumps. 
>
> Thank You,
> Alkım Ece Toprak
> Bogazici University
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata




Re: [Wikidata] WDQS and SPARQL Endpoint Compatibility

2020-03-30 Thread Kingsley Idehen
On 3/30/20 4:41 PM, Lucas Werkmeister wrote:
> The current whitelist is documented at
> https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/SPARQL_Federation_endpoints
> and new additions can be proposed at
> https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input.
>
> Cheers,
> Lucas


Entry for our URIBurner Endpoint added.

Kingsley

> On 30.03.20 20:31, Kingsley Idehen wrote:
>> All,
>>
>> I am opening up this thread to discuss the generic support of SPARQL
>> endpoints by WDQS. Correct me if I am wrong, but right now it can use
>> SPARQL-FED against a select number of registered endpoints?
>>
>> As you all know, the LOD Cloud Knowledge Graph is a powerful repository
>> of loosely coupled data, information, and knowledge, one that could
>> really help humans and software agents in the collective quest to defeat
>> the COVID-19 disease.
>>
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata




[Wikidata] WDQS and SPARQL Endpoint Compatibility

2020-03-30 Thread Kingsley Idehen
All,

I am opening up this thread to discuss the generic support of SPARQL
endpoints by WDQS. Correct me if I am wrong, but right now it can use
SPARQL-FED against a select number of registered endpoints?

As you all know, the LOD Cloud Knowledge Graph is a powerful repository
of loosely coupled data, information, and knowledge, one that could
really help humans and software agents in the collective quest to defeat
the COVID-19 disease.
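For illustration, generic support would let any WDQS query reach out to
a whitelisted endpoint such as DBpedia via a standard SERVICE clause (a
sketch; the endpoint and pattern are placeholders):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?label
WHERE {
  SERVICE <https://dbpedia.org/sparql> {
    <http://dbpedia.org/resource/COVID-19> rdfs:label ?label .
  }
}
LIMIT 5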



Re: [Wikidata-tech] [Wikidata] Future-proof WDQS (Was: Re: [ANN] nomunofu v0.1.0)

2020-01-01 Thread Kingsley Idehen
>>> The disk is SATA, and the CPU is dubbed: Intel(R) Xeon(R) CPU
>>> E3-1220 V2 @ 3.10GHz
>>>
>>> I imported latest-lexeme.nt (6GB) using guile-nomunofu, chez-nomunofu
>>> and Virtuoso:
>>>
>>> - Chez takes 40 minutes to import 6GB
>>> - Chez is 3 to 5 times faster than Guile
>>> - Chez is 11% faster than Virtuoso
>>
>> How did you load the data?  Did you use Virtuoso's bulk-load
>> facilities?  This is the recommended method [6].
>>
>> [6] http://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader
>>
>>
>>> Regarding query time, Chez is still faster than Virtuoso with or
>>> without cache.  The query I am testing is the following:
>>>
>>> SELECT ?s ?p ?o
>>> FROM <http://fu>
>>> WHERE {
>>>  ?s <http://purl.org/dc/terms/language> 
>>> <http://www.wikidata.org/entity/Q150> .
>>>  ?s <http://wikiba.se/ontology#lexicalCategory>
>>> <http://www.wikidata.org/entity/Q1084> .
>>>  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?o
>>> };
>>>
>>> Virtuoso first query takes: 1295 msec.
>>> The second query takes: 331 msec.
>>> Then it stabilizes around: 200 msec.
>>>
>>> chez nomunofu takes around 200ms without cache.
>>>
>>> There is still an optimization I can do to speed up nomunofu a little.
>>>
>>>
>>> Happy hacking!
>>
>> I'll be interested to hear your new results, with a current build,
>> and with proper INI tuning in place.
> What will be the INI options I need to use? Thanks!
>
>> Regards,
>>
>> Ted
> Regards,
>
>
> Amirouche ~ zig ~ https://hyper.dev




Re: [Wikidata] Future-proof WDQS (Was: Re: [ANN] nomunofu v0.1.0)

2019-12-29 Thread Kingsley Idehen
>>> The hard disk is SATA, and the CPU is dubbed: Intel(R) Xeon(R) CPU
>>> E3-1220 V2 @ 3.10GHz
>>>
>>> I imported latest-lexeme.nt (6GB) using guile-nomunofu, chez-nomunofu
>>> and Virtuoso:
>>>
>>> - Chez takes 40 minutes to import 6GB
>>> - Chez is 3 to 5 times faster than Guile
>>> - Chez is 11% faster than Virtuoso
>>
>> How did you load the data?  Did you use Virtuoso's bulk-load
>> facilities?  This is the recommended method [6].
>>
>> [6] http://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader
>>
>>
>>> Regarding query time, Chez is still faster than Virtuoso with or
>>> without cache.  The query I am testing is the following:
>>>
>>> SELECT ?s ?p ?o
>>> FROM <http://fu>
>>> WHERE {
>>>  ?s <http://purl.org/dc/terms/language> 
>>> <http://www.wikidata.org/entity/Q150> .
>>>  ?s <http://wikiba.se/ontology#lexicalCategory>
>>> <http://www.wikidata.org/entity/Q1084> .
>>>  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?o
>>> };
>>>
>>> Virtuoso first query takes: 1295 msec.
>>> The second query takes: 331 msec.
>>> Then it stabilizes around: 200 msec.
>>>
>>> chez nomunofu takes around 200ms without cache.
>>>
>>> There is still an optimization I can do to speed up nomunofu a little.
>>>
>>>
>>> Happy hacking!
>>
>> I'll be interested to hear your new results, with a current build,
>> and with proper INI tuning in place.
> What will be the INI options I need to use? Thanks!
>
>> Regards,
>>
>> Ted
> Regards,
>
>
> Amirouche ~ zig ~ https://hyper.dev




[Wikidata] Contd: [ANN] nomunofu v0.1.0

2019-12-22 Thread Kingsley Idehen
On 12/22/19 4:17 PM, Kingsley Idehen wrote:
> On 12/22/19 3:17 PM, Amirouche Boubekki wrote:
>> Hello all ;-)
>>
>>
>> I ported the code to Chez Scheme to do an apple-to-apple comparison
>> between GNU Guile and Chez and took the time to launch a few queries
>> against Virtuoso available in Ubuntu 18.04 (LTS).
>>
>> Spoiler: the new code is always faster.
>>
>> The hard disk is SATA, and the CPU is dubbed: Intel(R) Xeon(R) CPU
>> E3-1220 V2 @ 3.10GHz
>>
>> I imported latest-lexeme.nt (6GB) using guile-nomunofu, chez-nomunofu
>> and Virtuoso:
>>
>> - Chez takes 40 minutes to import 6GB
>> - Chez is 3 to 5 times faster than Guile
>> - Chez is 11% faster than Virtuoso
>>
>> Regarding query time, Chez is still faster than Virtuoso with or
>> without cache.  The query I am testing is the following:
>>
>> SELECT ?s ?p ?o
>> FROM <http://fu>
>> WHERE {
>>   ?s <http://purl.org/dc/terms/language> 
>> <http://www.wikidata.org/entity/Q150> .
>>   ?s <http://wikiba.se/ontology#lexicalCategory>
>> <http://www.wikidata.org/entity/Q1084> .
>>   ?s <http://www.w3.org/2000/01/rdf-schema#label> ?o
>> };
>>
>> Virtuoso first query takes: 1295 msec.
>> The second query takes: 331 msec.
>> Then it stabilizes around: 200 msec.
>>
>> chez nomunofu takes around 200ms without cache.
>>
>> There is still an optimization I can do to speed up nomunofu a little.
>>
>>
>> Happy hacking!
> If you are going to make claims about Virtuoso, please shed light on
> your Virtuoso configuration and host machine.
>
> How much memory do you have on this machine? What is the CPU affinity
> relative to the CPUs available?
>
> Is there a URL for sample data used in your tests?


Looking at
https://ark.intel.com/content/www/us/en/ark/products/65734/intel-xeon-processor-e3-1220-v2-8m-cache-3-10-ghz.html,
your Virtuoso INI settings are even more important, given that a CPU
affinity of only 4 cores is in play; i.e., you need to configure Virtuoso
such that it optimizes its behavior for this setup.
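For reference, the relevant knobs live in the [Parameters] section of
virtuoso.ini. A minimal sketch, assuming (say) 32 GB of RAM dedicated to
Virtuoso and the 4 cores of this CPU; the figures follow the usual rule
of thumb from the Virtuoso documentation of roughly 680,000 buffers per
8 GB of RAM, and are illustrative rather than prescriptive:

[Parameters]
; ~32 GB of RAM for the buffer pool: 4 x 680,000 / 4 x 500,000
NumberOfBuffers      = 2720000
MaxDirtyBuffers      = 2000000
; Parallelism matched to the 4 cores actually available
ThreadsPerQuery      = 4
AsyncQueueMaxThreads = 8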





Re: [Wikidata] [ANN] nomunofu v0.1.0

2019-12-22 Thread Kingsley Idehen
On 12/22/19 3:17 PM, Amirouche Boubekki wrote:
> Hello all ;-)
>
>
> I ported the code to Chez Scheme to do an apple-to-apple comparison
> between GNU Guile and Chez and took the time to launch a few queries
> against Virtuoso available in Ubuntu 18.04 (LTS).
>
> Spoiler: the new code is always faster.
>
> The hard disk is SATA, and the CPU is dubbed: Intel(R) Xeon(R) CPU
> E3-1220 V2 @ 3.10GHz
>
> I imported latest-lexeme.nt (6GB) using guile-nomunofu, chez-nomunofu
> and Virtuoso:
>
> - Chez takes 40 minutes to import 6GB
> - Chez is 3 to 5 times faster than Guile
> - Chez is 11% faster than Virtuoso
>
> Regarding query time, Chez is still faster than Virtuoso with or
> without cache.  The query I am testing is the following:
>
> SELECT ?s ?p ?o
> FROM <http://fu>
> WHERE {
>   ?s <http://purl.org/dc/terms/language> 
> <http://www.wikidata.org/entity/Q150> .
>   ?s <http://wikiba.se/ontology#lexicalCategory>
> <http://www.wikidata.org/entity/Q1084> .
>   ?s <http://www.w3.org/2000/01/rdf-schema#label> ?o
> };
>
> Virtuoso first query takes: 1295 msec.
> The second query takes: 331 msec.
> Then it stabilizes around: 200 msec.
>
> chez nomunofu takes around 200ms without cache.
>
> There is still an optimization I can do to speed up nomunofu a little.
>
>
> Happy hacking!


If you are going to make claims about Virtuoso, please shed light on
your Virtuoso configuration and host machine.

How much memory do you have on this machine? What is the CPU affinity
relative to the CPUs available?

Is there a URL for sample data used in your tests?
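For anyone reproducing such a load, the recommended bulk-load sequence
(per the VirtBulkRDFLoader page cited elsewhere in this thread) is driven
from the isql client along these lines. The directory, file pattern, and
target graph IRI below are illustrative, and the directory must be listed
under DirsAllowed in virtuoso.ini:

-- Register the dump file(s) with the loader
ld_dir ('/data/wikidata', 'latest-lexeme.nt', 'http://www.wikidata.org/');

-- Run the loader; for large loads, several of these can be run in
-- parallel, roughly one per core
rdf_loader_run ();

-- Persist the loaded state
checkpoint;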



Re: [Wikidata] Full-text / autocomplete search on labels

2019-10-04 Thread Kingsley Idehen
On 10/4/19 10:57 AM, Kingsley Idehen wrote:
> On 10/4/19 3:58 AM, Thomas Francart wrote:
>> Hello
>>
>> I understand the wikidata SPARQL label service only fetches the
>> labels, but does not allow to search/filter on them; labels are also
>> available in regular rdfs:label on which a FILTER can be made.
>> However I would like to do full-text search over labels, to e.g. feed
>> an autocomplete search field, actually just like the usual top-right
>> wikidata search field does. I would also be interested to combine
>> this with a criteria on "instance of", to search only on instances of
>> a given class.
>>
>> Can I do that efficiently using the Wikidata SPARQL service? Or is
>> there a separate API I could use? (example welcome)
>>
>> Thanks
>> Thomas
>
>
> Hi Thomas,
>
> Remember, we also publish a SPARQL Endpoint for Wikidata access [1].
> You can repeat your tests there too.
>
> Example (note that the retry loop indicates resource limits of this
> particular instance setup):
>
> Search on "Paris"
>
> http://wikidata.demo.openlinksw.com/fct/facet.vsp?qxml=%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22UTF-8%22%20%3F%3E%3Cquery%20inference%3D%22%22%20same-as%3D%22%22%20view3%3D%22%22%20s-term%3D%22%22%20c-term%3D%22%22%3E%3Ctext%3EParis%3C%2Ftext%3E%3Cview%20type%3D%22text-d%22%20limit%3D%2220%22%20offset%3D%22%22%20%2F%3E%3C%2Fquery%3E
> <http://lod.openlinksw.com/fct/facet.vsp?qxml=%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22UTF-8%22%20%3F%3E%3Cquery%20inference%3D%22%22%20same-as%3D%22%22%20view3%3D%22%22%20s-term%3D%22%22%20c-term%3D%22%22%3E%3Ctext%3EParis%3C%2Ftext%3E%3Cview%20type%3D%22text-d%22%20limit%3D%2220%22%20offset%3D%22%22%20%2F%3E%3C%2Fquery%3E>
>
> Config:
>
> OpenLink Virtuoso version 08.03.3315 as of Sep 4 2019, on Linux
> (x86_64-generic-linux-glibc25), Single-Server Edition (378 GB total
> memory)
>
> Same thing using the LOD Cloud cache instance, where data is a little
> out of date also:
>
> http://lod.openlinksw.com/fct/facet.vsp?qxml=%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22UTF-8%22%20%3F%3E%3Cquery%20inference%3D%22%22%20same-as%3D%22%22%20view3%3D%22%22%20s-term%3D%22%22%20c-term%3D%22%22%3E%3Ctext%3EParis%3C%2Ftext%3E%3Cview%20type%3D%22text-d%22%20limit%3D%2220%22%20offset%3D%22%22%20%2F%3E%3C%2Fquery%3E
>
> Config:
>
> OpenLink Virtuoso version 07.20.3224 as of Dec 19 2017, on Linux
> (i686-generic-linux-glibc212-64), Cluster Edition (4 server processes,
> 756 GB total memory)
>
> Links
>
> [1] http://wikidata.demo.openlinksw.com/fct
>
> [2] http://wikidata.demo.openlinksw.com/sparql
>

SPARQL Query variants, against the Wikidata SPARQL endpoint at
http://wikidata.demo.openlinksw.com/sparql :

SELECT ?s ?o
WHERE {
  GRAPH ?g {
    ?s rdfs:label ?o .
    ?o bif:contains '"Paris"' .
    FILTER (lang(?o) = "en")
  }
}
LIMIT 100
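For the "instance of" requirement in the original question, the same
full-text predicate combines naturally with a wdt:P31 pattern. A sketch
(untested; bif:contains is Virtuoso-specific, and city (Q515) merely
stands in for whatever class applies):

PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s ?label
WHERE {
  ?s wdt:P31 wd:Q515 .              # instance of: city
  ?s rdfs:label ?label .
  ?label bif:contains '"Paris"' .   # Virtuoso full-text match
  FILTER (lang(?label) = "en")
}
LIMIT 20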



Re: [Wikidata] Full-text / autocomplete search on labels

2019-10-04 Thread Kingsley Idehen
On 10/4/19 3:58 AM, Thomas Francart wrote:
> Hello
>
> I understand the wikidata SPARQL label service only fetches the
> labels, but does not allow to search/filter on them; labels are also
> available in regular rdfs:label on which a FILTER can be made.
> However I would like to do full-text search over labels, to e.g. feed
> an autocomplete search field, actually just like the usual top-right
> wikidata search field does. I would also be interested to combine this
> with a criteria on "instance of", to search only on instances of a
> given class.
>
> Can I do that efficiently using the Wikidata SPARQL service? Or is
> there a separate API I could use? (example welcome)
>
> Thanks
> Thomas
>


Hi Thomas,

Remember, we also publish a SPARQL Endpoint for Wikidata access [1]. You
can repeat your tests there too.

Example (note that the retry loop indicates resource limits of this
particular instance setup):

Search on "Paris"

http://wikidata.demo.openlinksw.com/fct/facet.vsp?qxml=%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22UTF-8%22%20%3F%3E%3Cquery%20inference%3D%22%22%20same-as%3D%22%22%20view3%3D%22%22%20s-term%3D%22%22%20c-term%3D%22%22%3E%3Ctext%3EParis%3C%2Ftext%3E%3Cview%20type%3D%22text-d%22%20limit%3D%2220%22%20offset%3D%22%22%20%2F%3E%3C%2Fquery%3E
<http://lod.openlinksw.com/fct/facet.vsp?qxml=%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22UTF-8%22%20%3F%3E%3Cquery%20inference%3D%22%22%20same-as%3D%22%22%20view3%3D%22%22%20s-term%3D%22%22%20c-term%3D%22%22%3E%3Ctext%3EParis%3C%2Ftext%3E%3Cview%20type%3D%22text-d%22%20limit%3D%2220%22%20offset%3D%22%22%20%2F%3E%3C%2Fquery%3E>

Config:

OpenLink Virtuoso version 08.03.3315 as of Sep 4 2019, on Linux
(x86_64-generic-linux-glibc25), Single-Server Edition (378 GB total memory)

Same thing using the LOD Cloud cache instance, where data is a little
out of date also:

http://lod.openlinksw.com/fct/facet.vsp?qxml=%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22UTF-8%22%20%3F%3E%3Cquery%20inference%3D%22%22%20same-as%3D%22%22%20view3%3D%22%22%20s-term%3D%22%22%20c-term%3D%22%22%3E%3Ctext%3EParis%3C%2Ftext%3E%3Cview%20type%3D%22text-d%22%20limit%3D%2220%22%20offset%3D%22%22%20%2F%3E%3C%2Fquery%3E

Config:

OpenLink Virtuoso version 07.20.3224 as of Dec 19 2017, on Linux
(i686-generic-linux-glibc212-64), Cluster Edition (4 server processes,
756 GB total memory)

Links

[1] http://wikidata.demo.openlinksw.com/fct

[2] http://wikidata.demo.openlinksw.com/sparql




Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Kingsley Idehen
On 9/22/19 11:55 AM, Kingsley Idehen wrote:
> On 9/21/19 6:30 PM, Kingsley Idehen wrote:
>> On 9/20/19 1:31 PM, Denny Vrandečić wrote:
>>> Yes, you're touching exactly on the problems I had during the
>>> evaluation - I couldn't even figure out what DBpedia is.
>> Hi Denny and Sebastian,
>>
>> To reiterate and/or clarify.
>>
>> DBpedia is a community project comprising RDF datasets constructed from
>> Wikipedia content that's deployed using Linked Data principles.
>
>
> A little clearer, as the definition above was a little too concise:
>
> DBpedia is a community project comprising a variety of data curation tools, 
> services (Linked Data lookup and SPARQL), and RDF datasets constructed from 
> Wikipedia, deployed using Linked Data principles, and cross-referenced 
> with other data sources as illustrated in the Linked Open Data Cloud (the 
> world's largest Knowledge Graph) [1][2].
>
> This project has recently spawned a Databus effort which addresses historic 
> challenges associated with dataset curation, publication, discovery, and 
> monetization [3].
>
> [1] https://lod-cloud.ne2
>
> [2] 
> https://medium.com/virtuoso-blog/what-is-the-linked-open-data-cloud-and-why-is-it-important-1901a7cb7b1f
>  -- what is the LOD Cloud and why is it important? 
>
> [3] https://databus.dbpedia.org/ -- Databus 
>

TypoFix:


[1] https://lod-cloud.net



Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Kingsley Idehen
On 9/21/19 6:30 PM, Kingsley Idehen wrote:
> On 9/20/19 1:31 PM, Denny Vrandečić wrote:
>> Yes, you're touching exactly on the problems I had during the
>> evaluation - I couldn't even figure out what DBpedia is.
> Hi Denny and Sebastian,
>
> To reiterate and/or clarify.
>
> DBpedia is a community project comprising RDF datasets constructed from
> Wikipedia content that's deployed using Linked Data principles.


A little clearer, as the definition above was a little too concise:

DBpedia is a community project comprising a variety of data curation tools, 
services (Linked Data lookup and SPARQL), and RDF datasets constructed from 
Wikipedia, deployed using Linked Data principles, and cross-referenced 
with other data sources as illustrated in the Linked Open Data Cloud (the 
world's largest Knowledge Graph) [1][2].

This project has recently spawned a Databus effort which addresses historic 
challenges associated with dataset curation, publication, discovery, and 
monetization [3].

[1] https://lod-cloud.ne2

[2] 
https://medium.com/virtuoso-blog/what-is-the-linked-open-data-cloud-and-why-is-it-important-1901a7cb7b1f
 -- what is the LOD Cloud and why is it important? 

[3] https://databus.dbpedia.org/ -- Databus 



Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Kingsley Idehen
On 9/22/19 1:34 AM, hellm...@informatik.uni-leipzig.de wrote:
> Hi Kingsley,
>
> that describes the core of the glue that DBpedia is. The definition
> leads to people downloading the EN DBpedia dataset and running
> statistics that will only discover what data is wrong or missing in
> the smallest parts of DBpedia.


The question was "What is DBpedia?". What is misleading about it being
about Wikipedia content transformed into RDF and deployed using Linked
Data principles?

>
> What happened to "LOD is the largest knowledge graph on earth"? 


The question wasn't "What is the LOD Cloud?", or am I missing something
here?


> Querying more Freebase data from DBpedia via Linked Data is a use case
> since over 10 years now using ontologies as a GPS.


Freebase is yet another derivative of Wikipedia content, isn't it?


>
> Also the definition you give limits the community to people who have
> edited 10 Scala Classes in the extraction framework, which is probably
> 10 people altogether.


Look, can't you simply make a clear statement of what is missing from my
definition of DBpedia? I sense you are talking about all the other
utilities the project has developed beyond dataset production, e.g.,
services like DBpedia Spotlight?


>
> So this is the most exclusionist view I can think of.
>
> What you wrote here is adequate:
> https://medium.com/openlink-software-blog/what-is-dbpedia-and-why-is-it-important-d306b5324f90
>
> What you wrote in your email as a summary is very narrow and
> misleading, see Markus Kroetzsch's email. People will continue to
> measure DBpedia by exactly the part of the data that is loaded in the
> Virtuoso SPARQL endpoint unless we make the derivatives downloadable
> outside of HTTP LD requests.


You really have to try using a slightly better tone when communicating.

You could simply say:

Kingsley, here are some things that could be overlooked based on the
description you presented:

Item 1..N.

I'll just fix it, or worst case agree to disagree.


Kingsley

>
> -- Sebastian
>
>
> On September 22, 2019 12:30:24 AM GMT+02:00, Kingsley Idehen wrote:
>
> On 9/20/19 1:31 PM, Denny Vrandečić wrote:
>
> Yes, you're touching exactly on the problems I had during the
> evaluation - I couldn't even figure out what DBpedia is. 
>
>
> Hi Denny and Sebastian,
>
> To reiterate and/or clarify.
>
> DBpedia is a community project comprising RDF datasets constructed from
> Wikipedia content that's deployed using Linked Data principles.
>
> The description above implies the following re focus breakdown:
>
> [1] Dataset creation -- this cannot be created in line with Linked Data
> principles without the items that follow
>
> [2] Linked Data Deployment -- without this there is nothing to look-up
> re follow-your-nose exploration
>
> [3] SPARQL Query Services  -- without this there is nothing to query
>
> Over the years I've written a number of posts addressing the key
> question "what is DBpedia?"
>
> [1]
> 
> https://medium.com/openlink-software-blog/what-is-dbpedia-and-why-is-it-important-d306b5324f90
> -- What is DBpedia, and why is it important?
>
> [2]
> 
> https://medium.com/virtuoso-blog/on-the-mutually-beneficial-nature-of-dbpedia-and-wikidata-5fb2b9f22ada
> -- Mutually beneficial nature of Wikidata and DBpedia
>
>




Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-22 Thread Kingsley Idehen
On 9/21/19 7:35 PM, Andra Waagmeester wrote:
> Agree, I am also interested in seeing this. I recently did a small
> comparison on science awards on coverage of laureates in both DBpedia
> and Wikidata and came to the same conclusion. The difference sometimes
> was quite substantial in favour of Wikidata. 


Are you not able to share SPARQL Query Results page links for this?



Re: [Wikidata] Google's stake in Wikidata and Wikipedia

2019-09-21 Thread Kingsley Idehen
On 9/20/19 1:31 PM, Denny Vrandečić wrote:
> Yes, you're touching exactly on the problems I had during the
> evaluation - I couldn't even figure out what DBpedia is.

Hi Denny and Sebastian,

To reiterate and/or clarify.

DBpedia is a community project comprising RDF datasets constructed from
Wikipedia content that's deployed using Linked Data principles.

The description above implies the following focus breakdown:

[1] Dataset creation -- this cannot be done in line with Linked Data
principles without the items that follow

[2] Linked Data Deployment -- without this there is nothing to look up
via follow-your-nose exploration

[3] SPARQL Query Services -- without this there is nothing to query

Over the years I've written a number of posts addressing the key
question "what is DBpedia?"

[1]
https://medium.com/openlink-software-blog/what-is-dbpedia-and-why-is-it-important-d306b5324f90
-- What is DBpedia, and why is it important?

[2]
https://medium.com/virtuoso-blog/on-the-mutually-beneficial-nature-of-dbpedia-and-wikidata-5fb2b9f22ada
-- Mutually beneficial nature of Wikidata and DBpedia




Re: [Wikidata] SPARQL search for Wikipedia Link

2019-09-17 Thread Kingsley Idehen
On 9/17/19 4:27 PM, Olaf Simons wrote:
> Exactly, perfect - and good script to know...
>
> thanks,
> Olaf


Another example, using the SPARQL endpoint at:
http://wikidata.demo.openlinksw.com/sparql :

[1] Editor Page -- https://tinyurl.com/yytx7ad9

[2] Results Page -- https://tinyurl.com/y342mewo


Tweaked query:

SELECT ?Masonic_Lodge ?Masonic_LodgeTitle (sample(?Masonic_LodgeTitle)
as ?wpTitle) WHERE {

  ?Masonic_Lodge wdt:P31 wd:Q1454597.
  OPTIONAL {
    ?article schema:about ?Masonic_Lodge ;
             schema:name ?Masonic_LodgeTitle .
    # Keep the language filter inside the OPTIONAL so lodges without an
    # English title are still returned.
    FILTER (LANG(?Masonic_LodgeTitle) = "en")
  }

} GROUP BY ?Masonic_Lodge ?Masonic_LodgeTitle


Kingsley

>
>
>> Thomas Douillard wrote on 17 September 2019 at 22:15:
>>
>>
>> Is it what you need?
>> <https://query.wikidata.org/#SELECT%20%3FMasonic_Lodge%20%3FMasonic_LodgeLabel%20%28sample%28%3FMasonic_LodgeTitle%29%20as%20%3FwpTitle%29%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%3FMasonic_Lodge%20wdt%3AP31%20wd%3AQ1454597.%0A%20%20OPTIONAL%20%7B%20%20%0A%20%20%20%20%3Farticle%20schema%3Aabout%20%3FMasonic_Lodge%20%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20schema%3Aname%20%3FMasonic_LodgeTitle%0A%20%20%7D%0A%7D%20group%20by%20%3FMasonic_Lodge%20%3FMasonic_LodgeLabel%20>
>>
>> Le mar. 17 sept. 2019 à 22:03, Olaf Simons 
>> a écrit :
>>
>>> Hi,
>>>
>>> I am trying to bring some light into this query:
>>>
>>> https://w.wiki/8Xv
>>>
>>> Many of the listings have no labels in any language - is there a simple
>>> way to get the Wikipedia-title in whatever wp with the q-Number?
>>>
>>> cheers
>>> Olaf




Re: [Wikidata] Virtuoso hosted Wikidata Instance

2019-09-02 Thread Kingsley Idehen
On 9/2/19 3:51 PM, Adam Sanchez wrote:
> Hi 
>
> I was able to reduce the load time to approx. 9.1 hours (32,890,338 msec)
> in Virtuoso 7.
> I used 6 SSD disks of 1T each with RAID 0 (mdadm software RAID; I have
> not tried hardware RAID).
> The virtuoso.ini for 256G RAM is
> https://gist.github.com/asanchez75/58d5aed504051c7fbf9af0921c3c9130
> I downloaded the dump from
> https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.ttl.gz
> on August 30th.
> The size is 387G uncompressed, and the final virtuoso.db file is
> 362G. The total number of triples is 9,470,700,617.
> Have a look at the simple patch here (it is just a workaround):
> https://github.com/asanchez75/virtuoso-opensource/commit/5d7b1b9b29e53cb8a25bed69f512a150f9f05d50
> You can create your own Docker image with that patch using
> https://github.com/asanchez75/docker-virtuoso/tree/brendan
> Check the Dockerfile, which retrieves the patch from my forked Virtuoso
> git repository:
> https://github.com/asanchez75/docker-virtuoso/blob/brendan/Dockerfile
>
>
> Best,


Great job!

I've granted access to you via your email address so that you can update
the Google Spreadsheet containing configuration details for the sample
Virtuoso instances [1]. You can put your data in the Wikidata worksheet [2].

Links:

[1]
https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit

[2]
https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit#gid=1799898600=D4



Kingsley

>
>
>
>
> On Sun, Sep 1, 2019 at 13:38, Edgar Meij wrote:
>
> Thanks for this, Kingsley.
>
> Based on
> 
> https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit#gid=1799898600
> (copy-pasted below), it seems that it takes 43 hours to load, is
> that correct?
>
> Also, what is the "patch for geometry" mentioned there? I'm
> assuming that is the patch meant to address
> https://github.com/openlink/virtuoso-opensource/issues/295 and
> https://community.openlinksw.com/t/non-terrestrial-geo-literals/359,
> correct? Is it simply disabling the data validation code? Can you
> share the patch?
>
> Thanks,
> Edgar
>
>
> Other Information:
>
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Byte Order: Little Endian
> CPU(s): 12
> On-line CPU(s) list: 0-11
> Thread(s) per core: 2
> Core(s) per socket: 6
> Socket(s): 1
> NUMA node(s): 1
> Vendor ID: GenuineIntel
> CPU family: 6
> Model: 63
> Model name: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
> Stepping: 2
> CPU MHz: 1,199.92
> CPU max MHz: 3,800.00
> CPU min MHz: 1,200.00
> BogoMIPS: 6,984.39
> Virtualization: VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 256K
> L3 cache: 15360K
> NUMA node0 CPU(s): 0-11
> RAM: 128G
>
> wikidata-20190610-all-BETA.ttl: 383G
> Virtuoso version: 07.20.3230 (with patch for geometry)
> Time to load: 43 hours
> virtuoso.db: 340G
>
>
> On Wed, Aug 14, 2019 at 12:10 AM Kingsley Idehen wrote:
>
> Hi Everyone,
>
> A little FYI.
>
> We have loaded Wikidata into a Virtuoso instance accessible
> via SPARQL [1]. One benefit is helping to understand Wikidata
> using our Faceted Browsing Interface for Entity Relationship
> Types [2][3].
>
> Links:
>
> [1] http://wikidata.demo.openlinksw.com/sparql -- SPARQL endpoint
>
> [2] http://wikidata.demo.openlinksw.com/fct -- Faceted
> Browsing Interface
>
> [3] About New York
> 
> <https://wikidata.demo.openlinksw.com/describe/?url=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ60=16==940=IFP_OFF=SAME_AS_OFF=1>
>
>
> Enjoy!
>
> Feedback always welcome too :)
>

Re: [Wikidata] Virtuoso hosted Wikidata Instance

2019-09-02 Thread Kingsley Idehen
On 9/1/19 5:14 AM, Edgar Meij wrote:
> Thanks for this, Kingsley.
>
> Based on
> https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit#gid=1799898600
> (copy-pasted below), it seems that it takes 43 hours to load, is that
> correct?


Yes, for that particular single-server instance configuration.


>
> Also, what is the "patch for geometry" mentioned there? I'm assuming
> that is the patch meant to address
> https://github.com/openlink/virtuoso-opensource/issues/295 and
> https://community.openlinksw.com/t/non-terrestrial-geo-literals/359,
> correct? Is it simply disabling the data validation code? Can you
> share the patch?


Best we move this particular item to our community forum [1].

Links:

[1] https://community.openlinksw.com


Kingsley

>
> Thanks,
> Edgar
>
>
> Other Information:
>
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Byte Order: Little Endian
> CPU(s): 12
> On-line CPU(s) list: 0-11
> Thread(s) per core: 2
> Core(s) per socket: 6
> Socket(s): 1
> NUMA node(s): 1
> Vendor ID: GenuineIntel
> CPU family: 6
> Model: 63
> Model name: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
> Stepping: 2
> CPU MHz: 1,199.92
> CPU max MHz: 3,800.00
> CPU min MHz: 1,200.00
> BogoMIPS: 6,984.39
> Virtualization: VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 256K
> L3 cache: 15360K
> NUMA node0 CPU(s): 0-11
> RAM: 128G
>
> wikidata-20190610-all-BETA.ttl: 383G
> Virtuoso version: 07.20.3230 (with patch for geometry)
> Time to load: 43 hours
> virtuoso.db: 340G
>
>
> On Wed, Aug 14, 2019 at 12:10 AM Kingsley Idehen wrote:
>
> Hi Everyone,
>
> A little FYI.
>
> We have loaded Wikidata into a Virtuoso instance accessible via
> SPARQL [1]. One benefit is helping to understand Wikidata using
> our Faceted Browsing Interface for Entity Relationship Types [2][3].
>
> Links:
>
> [1] http://wikidata.demo.openlinksw.com/sparql -- SPARQL endpoint
>
> [2] http://wikidata.demo.openlinksw.com/fct -- Faceted Browsing
> Interface
>
> [3] About New York
> 
> <https://wikidata.demo.openlinksw.com/describe/?url=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ60=16==940=IFP_OFF=SAME_AS_OFF=1>
>
>
> Enjoy!
>
> Feedback always welcome too :)
>



Re: [Wikidata] Virtuoso hosted Wikidata Instance

2019-08-14 Thread Kingsley Idehen
On 8/14/19 4:35 PM, Egon Willighagen wrote:
> On Wed, Aug 14, 2019 at 1:10 AM Kingsley Idehen  
> wrote:
>> We have loaded Wikidata into a Virtuoso instance accessible via SPARQL [1]. 
>> One benefit is helping to understand Wikidata using our Faceted Browsing 
>> Interface for Entity Relationship Types [2][3].
> Awesome!
>
> I've started seeing how much of Scholia can run on it, and opened a
> ticket: https://github.com/fnielsen/scholia/issues/809 It's great the
> Wikidata namespaces are loaded. I only had to add the 'bd' prefix to
> the Scholia SPARQL. And, the sections that use the WDQS graphical
> views, obviously cannot use the VOS instance yet.
>
> So, do you plan to run a WDQS instance on top of your EP? :)
>
> Egon
>

I am hoping that WDQS will be encouraged to become more loosely coupled,
with SPARQL as the open standard for its data access.

There are lots of tools from this community that would benefit immensely
from loose coupling, IMHO.

We need to demonstrate to the world that the LOD Cloud is its most
powerful and accessible Knowledge Graph :)



Re: [Wikidata] Virtuoso hosted Wikidata Instance

2019-08-14 Thread Kingsley Idehen
On 8/14/19 6:07 AM, Jérémie Roquet wrote:
> Hi!
>
> On Wed, Aug 14, 2019 at 01:10, Kingsley Idehen wrote:
>> We have loaded Wikidata into a Virtuoso instance accessible via SPARQL [1]. 
>> One benefit is helping to understand Wikidata using our Faceted Browsing 
>> Interface for Entity Relationship Types [2][3].
> That's great news, thanks!


Hi Jérémie,


You are welcome!  See responses below.


>
>> Feedback always welcome too :)
> So, I've eagerly tried a very simple SPARQL query with a huge result
> set, the complete version of which¹ I've known for several years to
> timeout in both the official Blazegraph instance and a personal
> Blazegraph instance with supposedly all time limits removed:
>
>   PREFIX wd: <http://www.wikidata.org/entity/>
>   PREFIX wdt: <http://www.wikidata.org/prop/direct/>
>
>   SELECT ?person WHERE {
> ?person wdt:P31 wd:Q5
>   }
>
> … and while the Virtuoso instance manages to answer pretty quickly, it
> seems that it's cutting the result set at 100k triples. Is it the
> expected behavior? 


Yes.


> If so, I suggest you show that in the UI because
> apart from the improbable round number of triples, it's not obvious
> that the result set is incomplete (in this case, the LDF endpoint
> tells us that there should be around 5.4M triples²).
>
> Thanks again!
>
> ¹ i.e. using the wikibase:label service
> ² 
> https://query.wikidata.org/bigdata/ldf?subject=&predicate=wdt%3AP31&object=wd%3AQ5


If you open up your browser's inspector you will see:

cache-control: max-age=3600
content-encoding: gzip
content-type: text/html; charset=UTF-8
date: Wed, 14 Aug 2019 16:47:47 GMT
expires: Wed, 14 Aug 2019 17:47:47 GMT
server: Virtuoso/08.03.3315 (Linux) x86_64-generic-linux-glibc25 VDB
status: 200
strict-transport-security: max-age=15768000
vary: Accept-Encoding
x-sparql-default-graph: http://www.wikidata.org/
x-sparql-maxrows: 100000


In addition, note that Virtuoso has an "Anytime Query" feature [1][2]
that it uses to drive a "Fair Use" policy that ensures an endpoint is
able to handle a cocktail of query types from users and bots. This is
also how we handle DBpedia and DBpedia-Live instances [3]. Naturally,
HTTP response metadata will also inform you when this kicks in.


[1] http://docs.openlinksw.com/virtuoso/anytimequeries/

[2]
http://vos.openlinksw.com/owiki/wiki/VOS/VirtTipsAndTricksAnytimeSPARQLQuery

[3] https://wiki.dbpedia.org/public-sparql-endpoint
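As a practical aside, one way to walk a large result set past such a cap
is the usual ORDER BY plus LIMIT/OFFSET paging pattern; the ORDER BY is
what makes successive pages deterministic. A sketch, assuming the
100,000-row cap seen above:

PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?person
WHERE { ?person wdt:P31 wd:Q5 }
ORDER BY ?person
LIMIT 100000
OFFSET 100000    # second page; advance in steps of 100,000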




Re: [Wikidata] Virtuoso hosted Wikidata Instance

2019-08-13 Thread Kingsley Idehen
On 8/13/19 7:20 PM, Denny Vrandečić wrote:
> That is really cool! Thanks and congratulations! I will certainly play
> with it.
>
> Is it in some way synced or is it a static snapshot?


At this juncture, it is a snapshot, but ultimately we want something
that's kept in sync, just like DBpedia-Live.


Kingsley

>
> On Tue, Aug 13, 2019 at 4:10 PM Kingsley Idehen wrote:
>
> Hi Everyone,
>
> A little FYI.
>
> We have loaded Wikidata into a Virtuoso instance accessible via
> SPARQL [1]. One benefit is helping to understand Wikidata using
> our Faceted Browsing Interface for Entity Relationship Types [2][3].
>
> Links:
>
> [1] http://wikidata.demo.openlinksw.com/sparql -- SPARQL endpoint
>
> [2] http://wikidata.demo.openlinksw.com/fct -- Faceted Browsing
> Interface
>
> [3] About New York
> 
> <https://wikidata.demo.openlinksw.com/describe/?url=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ60=16==940=IFP_OFF=SAME_AS_OFF=1>
>
>
>     Enjoy!
>
> Feedback always welcome too :)
>




[Wikidata] Virtuoso hosted Wikidata Instance

2019-08-13 Thread Kingsley Idehen
Hi Everyone,

A little FYI.

We have loaded Wikidata into a Virtuoso instance accessible via SPARQL
[1]. One benefit is helping to understand Wikidata using our Faceted
Browsing Interface for Entity Relationship Types [2][3].

Links:

[1] http://wikidata.demo.openlinksw.com/sparql -- SPARQL endpoint

[2] http://wikidata.demo.openlinksw.com/fct -- Faceted Browsing Interface

[3] About New York
<https://wikidata.demo.openlinksw.com/describe/?url=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ60=16==940=IFP_OFF=SAME_AS_OFF=1>


Enjoy!

Feedback always welcome too :)

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] minimal hardware requirements for loading wikidata dump in Blazegraph

2019-06-20 Thread Kingsley Idehen
>    The actual hardware requirements will depend on your use case. But
> for comparison, our production servers are:
> * 16 cores (hyper-threaded, 32 threads)
> * 128G RAM
> * 1.5T of SSD storage
>
> The downloaded dump file wikidata-20190513-all-BETA.ttl is 379G. The
> bigdata.jnl file, which stores all the triples data in Blazegraph, is
> 478G but still growing. I had a 1T disk but it is almost full now.
>
> The current size of our jnl file in production is ~670G. Hope that
> helps! Guillaume
>
> Thanks, Adam
>
> -- Guillaume Lederrey Engineering Manager,
> Search Platform Wikimedia Foundation UTC+2 / CEST
>
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Scaling Wikidata Query Service

2019-06-14 Thread Kingsley Idehen
On 6/13/19 7:55 PM, Stas Malyshev wrote:
> Hi!
>
>> Data living in an RDBMS engine distinct from Virtuoso is handled via the
>> engine's Virtual Database module, i.e., you can build powerful RDF Views
>> over ODBC- or JDBC-accessible data using Virtuoso. These views also have
>> the option of being materialized, etc.
> Yes, but the way the data are stored now is JSON blob within a text
> field in MySQL. I do not see how RDF View over ODBC would help it any -
> of course Virtuoso would be able to fetch JSON text for a single item,
> but then what? We'd need to run queries across millions of items,
> fetching and parsing JSON for every one of them every time is
> unfeasible. Not to mention this JSON is not an accurate representation
> of the RDF data model. So I don't think it is worth spending time in
> this direction... I just don't see how any query engine could work with
> that storage.
> -- Stas Malyshev smalys...@wikimedia.org


The point I am trying to make is that Virtuoso can integrate data from
external DBMS systems in a variety of ways.

ODBC and JDBC are simply APIs for accessing external DBMS systems.

What you really need here is a clear project definition and a discussion
with us about how it would be implemented.

Despite the fact that Virtuoso is a hardcore DBMS, its also a hardcore
Data Virtualization platform for handling relations represented in a
variety of ways using a plethora of protocols.

I am email away if you want to explore this further.

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Scaling Wikidata Query Service

2019-06-13 Thread Kingsley Idehen
On 6/12/19 1:11 PM, Stas Malyshev wrote:
>> That will be vendor lock-in for wikidata and wikimedia along all the
>> poor souls that try to interop with it.
> Since Virtuoso is using standard SPARQL, it won't be too much of a
> vendor lock in, though of course the standard does not cover all, so
> some corners are different in all SPARQL engines. This is why even
> migration between SPARQL engines, even excluding operational aspects, is
> non-trivial. Of course, migration to any non-SPARQL engine would be
> order of magnitude more disruptive, so right now we do not seriously
> consider doing that.
>

Hi Stas,

Yes, Virtuoso supports W3C SPARQL and ANSI SQL standards. The most
important aspect of Virtuoso's design and vision boils down to using
open standards on the front- and back-ends to enable maximum flexibility
for its users.

There is nothing more important to us than open standards. For instance,
we even extend SQL using SPARQL before entering the realm of
non-standard extensions.
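
A hedged illustration of that extension point (via Virtuoso's iSQL; the
query is the "trademark" one mentioned elsewhere on this list): any SQL
client can run SPARQL through the same engine by prefixing the statement
with the SPARQL keyword.

-- SPARQL embedded in a SQL session; returns distinct classes in use
SPARQL SELECT DISTINCT ?Concept WHERE { [] a ?Concept } LIMIT 10;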


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Scaling Wikidata Query Service

2019-06-12 Thread Kingsley Idehen
On 6/11/19 12:06 PM, Andra Waagmeester wrote:
>
>
> On Tue, Jun 11, 2019 at 11:23 AM Jerven Bolleman et al wrote:
>
>
> >>  So we are playing the game since ten years now: Everybody
> tries other databases, but then most people come back to virtuoso. 
>
>
> Nothing bad about virtuoso, on the contrary, they are a prime
> infrastructure provider (Except maybe their trademark SPARQL query:
> "select distinct ?Concept where {[] a ?Concept}" ;). But I personally
> think that replacing the current WDS with virtuoso would be a bad
> idea. Not from a performance perspective, but more from the signal it
> gives. If indeed as you state virtuoso is the only viable solution in
> the field, this field is nothing more than a niche. We really need
> more competition to get things done.  
> Since both DBpedia and UniProt are indeed already running on Virtuoso
> - where it is doing a prime job -, having Wikidata running on another
> vendor's infrastructure does provide us with the so needed benchmark.
> The benchmark seems to be telling some of us already that there is
> room for other alternatives. So it is fulfilling its benchmarks role.
> Is there really no room for improvement with Blazegraph? How about
> graphDB?
>  
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata


Hi Andra,

The goal is to provide a solution to a problem. Unfortunately, it has
ended up in a product debate. I've struggled with the logic of a
demonstrable solution being challenged by a lack of alternatives.

The fundamental goal of Linked Data is to enable Data Publication and
Access that applies capabilities delivered by HTTP to modern Data
Access, Integration, and Management.

The Linked Data meme outlining Linked Data principles has existed since
2006. Like others, we digested the meme and applied it to our existing
SQL RDBMS en route to producing a solution that made the vision in the
paper reality, as demonstrated by DBpedia, DBpedia-Live, Uniprot, our
LOD Cloud Cache, and many other nodes in the massive LOD Cloud [1].

Virtuoso's role in the LOD Cloud is an example of what happens when open
standards are understood and appropriately applied to a problem, with a
little innovation.

Links:

[1]
https://medium.com/virtuoso-blog/what-is-the-linked-open-data-cloud-and-why-is-it-important-1901a7cb7b1f
-- What is the LOD Cloud, and why is it important?

[2]
https://medium.com/virtuoso-blog/what-is-small-data-and-why-is-it-important-fbf5f267884
-- What is Small Data, and why is it important?

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Scaling Wikidata Query Service

2019-06-11 Thread Kingsley Idehen
On 6/10/19 4:25 PM, Stas Malyshev wrote:
>> Just a note here: Virtuoso is also a full RDBMS, so you could probably
>> keep wikibase db in the same cluster and fix the asynchronicity. That is
> Given how the original data is stored (JSON blob inside mysql table) it
> would not be very useful. In general, graph data model and Wikitext data
> model on top of which Wikidata is built are very, very different, and
> expecting same storage to serve both - at least without very major and
> deep refactoring of the code on both sides - is not currently very
> realistic. And of course moving any of the wiki production databases to
> Virtuoso would be a non-starter. Given than original Wikidata database
> stays on Mysql - which I think is a reasonable assumption - there would
> need to be a data migration pipeline for data to come from Mysql to
> whatever is the WDQS NG storage.
>

Hi Stas,

Data living in an RDBMS engine distinct from Virtuoso is handled via the
engine's Virtual Database module, i.e., you can build powerful RDF Views
over ODBC- or JDBC-accessible data using Virtuoso. These views also have
the option of being materialized, etc.

[1]
https://medium.com/virtuoso-blog/conceptual-data-virtualization-for-sql-and-rdf-using-open-standards-24520925c7ce
-- Conceptual Data Virtualization using Virtuoso

[2]
https://medium.com/virtuoso-blog/generate-relational-tables-to-rdf-relational-graphs-mappings-using-virtuosos-rdb2rdf-wizard-c4b83402599a
-- RDF Views generation over SQL RDBMS data sources using the Virtuoso
Wizard
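
To give a flavor of what the wizard in [2] produces: it emits W3C R2RML
mappings of roughly the following shape. This is a sketch only -- the
"items" table, its columns, and all example.com IRIs are hypothetical.

@prefix rr:   <http://www.w3.org/ns/r2rml#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.com/ns#> .

# Map each row of items(id, label) to an ex:Item with an rdfs:label
<#ItemMap>
    rr:logicalTable [ rr:tableName "items" ] ;
    rr:subjectMap   [ rr:template "http://example.com/item/{id}" ;
                      rr:class ex:Item ] ;
    rr:predicateObjectMap [
        rr:predicate rdfs:label ;
        rr:objectMap [ rr:column "label" ]
    ] .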


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Scaling Wikidata Query Service

2019-06-11 Thread Kingsley Idehen
On 6/10/19 4:46 PM, Stas Malyshev wrote:
> Hi!
>
>> thanks for the elaboration. I can understand the background much better.
>> I have to admit, that I am also not a real expert, but very close to the
>> real experts like Vidal and Rahm who are co-authors of the SWJ paper or
>> the OpenLink devs.
> If you know anybody at OpenLink that would be interested in trying to
> evaluate such thing (i.e. how Wikidata could be hosted on Virtuso) and
> provide support for this project, it would be interesting to discuss it.
> While open-source thing is still a barrier and in general the
> requirements are different, at least discussing it and maybe getting
> some numbers might be useful.
>
> Thanks,
> -- Stas Malyshev smalys...@wikimedia.org


I am listening.

I am only a ping away.

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Scaling Wikidata Query Service

2019-06-11 Thread Kingsley Idehen
 this service provides Linked
Data transformation combined with an ability to de-ref URI-variables and
URI-constants in the body of a query as part of the solution production
pipeline; it also includes a service that adds image processing to the
aforementioned pipeline via the PivotViewer module for data visualization

[5]
https://medium.com/virtuoso-blog/what-is-small-data-and-why-is-it-important-fbf5f267884
-- About Small Data (use of URI-dereference to tackle thorny data access
challenges by leveraging the power of HTTP URIs as Super Keys)
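
To make the de-ref point concrete, a hedged sketch: Virtuoso accepts
query pragmas such as get:soft (per its Sponger documentation) that tell
the engine to fetch the named document over HTTP before matching the
pattern; the target IRI below is just an example.

DEFINE get:soft "soft"
SELECT ?p ?o
WHERE { <http://dbpedia.org/resource/New_York_City> ?p ?o }
LIMIT 25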


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Scaling Wikidata Query Service

2019-06-11 Thread Kingsley Idehen
On 6/10/19 10:54 AM, Guillaume Lederrey wrote:
>> - Virtuoso has proven quite useful. I don't want to advertise here, but the 
>> thing they have going for DBpedia uses ridiculous hardware, i.e. 64GB RAM 
>> and it is also the OS version, not the professional with clustering and 
>> repartition capability. So we are playing the game since ten years now: 
>> Everybody tries other databases, but then most people come back to virtuoso. 
>> I have to admit that OpenLink is maintaining the hosting for DBpedia 
>> themselves, so they know how to optimise. They normally do large banks as 
>> customers with millions of write transactions per hour. In LOD2 they also 
>> implemented column store features with MonetDB and repartitioning in 
>> clusters.
> I'm not entirely sure how to read the above (and a quick look at
> virtuoso website does not give me the answer either), but it looks
> like the sharding / partitioning options are only available in the
> enterprise version. That probably makes it a non starter for us.
>

Virtuoso Cluster Edition is as described by Sebastian in an earlier post
to this thread [1]. Online, it's what's behind our LOD Cloud Cache, which
hosts 40 Billion+ triples while still using ridiculously cheap hardware
for the shared-nothing cluster.

As Jerven has already articulated [2], the single-server open source
edition of Virtuoso can also scale to 40 Billion+ triples as
demonstrated by Uniprot amongst others.

There's a publicly available Google Spreadsheet that provides insights
into a variety of Virtuoso configurations that you can also look at
regarding resource requirements [3].

Bottom line, Virtuoso has no fundamental issues with performance, scale,
or security (most haven't hit this bump yet, but it's coming!) regarding
RDF data deployed in line with Linked Data principles.

We are always open to collaboration with anyone (or any group) seeking to
fully exploit the power and promise of a Semantic Web derived from
Linked Data :)

Links:

[1] https://lists.wikimedia.org/pipermail/wikidata/2019-June/013132.html
-- Sebastian Hellman comment

[2] https://lists.wikimedia.org/pipermail/wikidata/2019-June/013143.html
-- Jerven Bolleman comment

[3]
https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/
-- Virtuoso configurations sample spreadsheet

[4] https://hub.docker.com/u/openlink/ -- Docker Hub offerings

[5] https://aws.amazon.com/marketplace/pp/B00ZWMSNOG -- Amazon
Marketplace BYOL Edition

[6] https://aws.amazon.com/marketplace/pp/B011VMCZ8K -- Amazon
Marketplace PAGO Edition

[7] https://github.com/openlink/virtuoso-opensource -- Github

[8] http://download.openlinksw.com -- Download Site


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Accessing tabular data from SPARQL?

2019-06-03 Thread Kingsley Idehen
On 5/31/19 11:28 AM, Yuri Astrakhan wrote:
> I actually already implemented support in SPARQL for that, but it
> needs a bit more work to get it properly merged with the Blazegraph
> code.  I had it working for a while as part of Sophox (OSM Sparql).
>
> * docs:  https://wiki.openstreetmap.org/wiki/Sophox#External_Data_Sources
> *
> code:  
> https://github.com/Sophox/wikidata-query-rdf/compare/master...Sophox:tabular
> (see Tabular* files)
> * phabricator discussion about the above
> code:  https://phabricator.wikimedia.org/T181319
>
> Tabular support allows any CSV-style tables to be treated as federated
> sources. With minor changes it should be possible to use mediawiki's
> .tab pages too.


Hi Yuri,


What is the SPARQL Query Service endpoint? Basically, the equivalent of
http://query.wikidata.org/sparql?


Kingsley

>
> On Fri, May 31, 2019 at 6:01 PM Daniel Mietchen
>  <mailto:daniel.mietc...@googlemail.com>> wrote:
>
> Hi,
> I'm looking into ways to use tabular data like
> https://commons.wikimedia.org/wiki/Data:Zika-institutions-test.tab
> in SPARQL queries but could not find anything on that.
>
> My motivation here is in part coming from the time out limits, and the
> basic idea here would be to split queries that typically time out into
> sets of queries that do not time out and - if their results were
> aggregated - would yield the results that would be expected for the
> original query would it not time out.
>
> The second line of motivation here is that of keeping track of how
> things develop over time, which would be interesting for both content
> and maintenance queries as well as usage of things like classes,
> references, lexemes or properties.
>
> I would appreciate any pointers or thoughts on the matter.
>
> Thanks,
>
> Daniel
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: 
https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
  http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Problematic SPARQL-FED Query

2017-12-06 Thread Kingsley Idehen
Hi Everyone,

Does anyone know why the SPARQL-FED query at
http://tinyurl.com/ycc2tkp3 is failing?

Query Text:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX schema: <http://schema.org/>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX psn: <http://www.wikidata.org/prop/statement/value-normalized/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dct: <http://purl.org/dc/terms/>


SELECT ?item ?dbpediaID  ?label ?subjectName ?prominence ?image
WHERE
{
 {
        SELECT ?item ?itemLabel ?coord ?prominence ?layer ?image
        WHERE {
                  ?item wdt:P31 wd:Q8502. # a mountain
                  ?item wdt:P625 ?coord.
                  ?item wdt:P17 wd:Q39. # in Switzerland
                  ?item wdt:P2660 ?prominence .
                  BIND(
                    IF(?prominence < 1000, "<1000 metres",
                    IF(?prominence < 2000, "1000 - 2000 metres",
                    IF(?prominence < 3000, "2000 - 3000 metres",
                    IF(?prominence < 4000, "3000 - 4000 metres",
                    "> 4000 metres"))))
                    AS ?layer).
                  OPTIONAL {?item wdt:P18 ?image.}
                  SERVICE wikibase:label { bd:serviceParam
wikibase:language "[AUTO_LANGUAGE],en". }
            }
           
        LIMIT 200 }
   
   
    SERVICE <http://dbpedia.org/sparql>
         {
            SELECT DISTINCT ?dbpediaID ?name ?label ?subjectName
            WHERE {
                    ?dbpediaID owl:sameAs ?item ;
                    rdfs:label ?label ;
                    dct:subject ?subject.
                    FILTER (LANG(?label) = "en")
                   
                    ?subject rdfs:label ?subjectName .
   
                  }

          }     

}
   

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Input needed on Wikidata:Schema.org

2017-05-05 Thread Kingsley Idehen
On 5/5/17 10:35 AM, David Cuenca Tudela wrote:
> Hi all,
>
> In case you didn't notice Denny posted a great RFC yesterday.
> Apparently Schema.org is considering to encourage the use of Wikidata
> as a common entity base for the target of the sameAs relation.
>
> To read more and to give feedback, check:
> https://www.wikidata.org/wiki/Wikidata:Schema.org
>
> In my opinion this is great news. Thanks for reaching out, Denny! :)
>
> Cheers,
> Micru
>

Hi Micru,

Are you referring to owl:sameAs or schema:sameAs? Those entity
relationship types are utterly unalike, hence my quest for clarification :)
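
For context, the distinction in a nutshell -- a minimal Turtle sketch
(the subject IRIs are just examples):

@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix schema: <http://schema.org/> .
@prefix wd:     <http://www.wikidata.org/entity/> .
@prefix dbr:    <http://dbpedia.org/resource/> .

# owl:sameAs asserts identity: both IRIs denote one and the same thing,
# so a reasoner may merge everything said about either of them.
dbr:New_York_City owl:sameAs wd:Q60 .

# schema:sameAs merely points at a reference page that unambiguously
# indicates the item; no identity-merging entailment follows.
dbr:New_York_City schema:sameAs <https://en.wikipedia.org/wiki/New_York_City> .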


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wiki PageID

2017-05-05 Thread Kingsley Idehen
On 5/4/17 8:29 PM, Denny Vrandečić wrote:
> Aren't both ... uhm ... "use cases" supported by dbpedia proper anyway?
>

Yes, but "best practices" in a single place don't solve the problems
associated with mass appreciation and adoption. Basically, the more the
merrier!

Why?  Because I don't believe in the power of one, with regards to a
Semantic Web of Linked Data.


Kingsley
>
> On Thu, May 4, 2017 at 3:40 AM Kingsley Idehen <kide...@openlinksw.com
> <mailto:kide...@openlinksw.com>> wrote:
>
> On 5/3/17 3:37 PM, Nicholas Humfrey wrote:
> >
> > On 26/04/2017, 15:41, "Wikidata on behalf of Kingsley Idehen"
> > <wikidata-boun...@lists.wikimedia.org
> <mailto:wikidata-boun...@lists.wikimedia.org> on behalf of
> kide...@openlinksw.com <mailto:kide...@openlinksw.com>>
> > wrote:
> >
> >> Hi Nick,
> >>
> >> Please don't decommission dbpedialite, it does provide utility
> on other
> >> fronts too :)
> >>
> >
> > Hi Kingsley,
> >
> > Could you elaborate? I was planning on turning dbpedialite into 301
> > redirects to Wikidata for a period of time before switching it off.
> >
> >
> > nick.
> >
> >
> >
> > -
> > http://www.bbc.co.uk
> >
>
> Nick,
>
> dbpedialite provides a "best practices" demonstration for:
>
> 1. Linked Data Deployment -- i.e., it supports both
> content-negotiation
> and metadata embedded in HTML deployment options
>
> 2. Bridging across DBpedia, Wikidata, and Wikipedia -- this also
> provides value to DBpedia and Wikidata with regards to cross-reference
> reconciliation.
>
> I believe the items above remain important :)
>
>
> --
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software   (Home Page: http://www.openlinksw.com)
>
> Weblogs (Blogs):
> Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
> <http://www.openlinksw.com/blog/%7Ekidehen/>
> Blogspot Blog: http://kidehen.blogspot.com
> Medium Blog: https://medium.com/@kidehen
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
> :
> 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wiki PageID

2017-04-26 Thread Kingsley Idehen
On 4/24/17 10:51 AM, Nicholas Humfrey wrote:
> Hi,
>
> A number of years ago I was having some very frustrating times with
> the identifier instability in dbpedia. Two people looking up an
> identifier for the same concept at different times ended up with
> different identifiers.
>
> So I created a proof of concept, dbpedialite, which uses Wikipedia
> PageIDs:
> http://www.dbpedialite.org/things/87851
> (At the time there was a page title edit war between Stoat and Ermine)
>
>
> But now we have Wikidata – which solves this problem much better – so
> I should really get on and decommission dbpedialite.
>
> What are you using Wikipedia Page IDs for?  Might it be better to
> store the Wikidata ID and then lookup the Wikipedia page on demand?
>
>
> nick.
>

Hi Nick,

Please don't decommission dbpedialite, it does provide utility on other
fronts too :)
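
As an aside, re. "store the Wikidata ID and then lookup the Wikipedia
page on demand": that lookup is a one-pattern SPARQL query against
http://query.wikidata.org/sparql (a sketch, using Q60 as the stored ID):

PREFIX wd:     <http://www.wikidata.org/entity/>
PREFIX schema: <http://schema.org/>

SELECT ?article
WHERE {
  ?article schema:about wd:Q60 ;
           schema:isPartOf <https://en.wikipedia.org/> .
}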


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Correction: Federation in the Wikidata Query Service

2017-03-31 Thread Kingsley Idehen
On 3/31/17 7:24 PM, Jan Macura wrote:
> On 31 March 2017 at 23:40, Kingsley Idehen <kide...@openlinksw.com
> <mailto:kide...@openlinksw.com>> wrote:
>
> Are controlled by us, and they are CC-BY-SA.  That needs to be
> fixed in the endpoint skin (UI) and metadata. I'll have that fixed.
>
>
> Thanks for your concern!
> I can't see any license information at the endpoint page. SPOI has the
> license info in both its metadata (DOAP
> <http://sdi4apps.eu/spoi/doap-spoi.rdf> & VoID
> <http://sdi4apps.eu/spoi/void-spoi.ttl>). It's not sufficient?
>
>  Jan

That's sufficient.

Note, this is a reply to a mail that should have been replaced by my
correction mail :)
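
For anyone wiring this up for their own endpoint: VoID license metadata
of the kind SPOI publishes boils down to a few triples (a sketch; the
dataset IRI is illustrative).

@prefix void:    <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://example.org/dataset/spoi> a void:Dataset ;
    void:sparqlEndpoint <http://data.plan4all.eu/sparql> ;
    dcterms:license <http://opendatacommons.org/licenses/odbl/1.0/> .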


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Federation in the Wikidata Query Service

2017-03-31 Thread Kingsley Idehen
On 3/31/17 4:46 PM, Jan Macura wrote:
> That is a great news!
>
> How about adding an ODbL licensed service? Would it be possible? I am
> thinking about SPOI <http://sdi4apps.eu/spoi/> and their SPARQL
> endpoint <http://data.plan4all.eu/sparql>.
>
> Thanks
>  Jan

Jan,

As an aside, for those who might be interested, here's a link to a Tweet
re. analyzing licenses associated with content loaded into our
URIBurner service:
https://twitter.com/kidehen/status/847934563036909569

Kingsley
>
> On 31 March 2017 at 21:19, Kingsley Idehen <kide...@openlinksw.com
> <mailto:kide...@openlinksw.com>> wrote:
>
> On 3/31/17 3:12 PM, Stas Malyshev wrote:
>> Hi!
>>
>>> What is the process for getting a query service added? I would like to
>>> add the Librarybase endpoint, http://sparql.librarybase.wmflabs.org/
>>> <http://sparql.librarybase.wmflabs.org/>,
>>> but I want to make sure that it meets all the requirements.
>> Right now it's
>> https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input
>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input>. After
>> we're done with CC-By ones, I'll probably clean up that page a bit and
>> update it for more long-term structure but it seems something like that
>> would work.
>
> Stas,
>
> You can also add the SPARQL endpoint associated with our URIBurner
> service. This service will also allow you to nest SPARQL-FED Queries.
>
> [1] http://linkeddata.uriburner.com <http://linkeddata.uriburner.com>
>
> [2] http://linkeddata.uriburner.com/sparql
> <http://linkeddata.uriburner.com/sparql>
>
> -- 
> Regards,
>
> Kingsley Idehen 
> Founder & CEO 
> OpenLink Software   (Home Page: http://www.openlinksw.com)
>
> Weblogs (Blogs):
> Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
> <http://www.openlinksw.com/blog/%7Ekidehen/>
> Blogspot Blog: http://kidehen.blogspot.com
> Medium Blog: https://medium.com/@kidehen
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> <https://www.pinterest.com/kidehen/>
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> <https://www.quora.com/profile/Kingsley-Uyi-Idehen>
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> <https://plus.google.com/+KingsleyIdehen/about>
> LinkedIn: http://www.linkedin.com/in/kidehen
> <http://www.linkedin.com/in/kidehen>
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
> <http://kingsley.idehen.net/dataspace/person/kidehen#this>
> : 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
> 
> <http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this>
>
> ___ Wikidata mailing
> list Wikidata@lists.wikimedia.org
> <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
> <https://lists.wikimedia.org/mailman/listinfo/wikidata> 
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Correction: Federation in the Wikidata Query Service

2017-03-31 Thread Kingsley Idehen
On 3/31/17 4:46 PM, Jan Macura wrote:
> That is a great news!
>
> How about adding an ODbL licensed service? Would it be possible? I am
> thinking about SPOI <http://sdi4apps.eu/spoi/> and their SPARQL
> endpoint <http://data.plan4all.eu/sparql>.
>
> Thanks
>  Jan

Correction,

[1] http://linkeddata.uriburner.com <http://linkeddata.uriburner.com>

[2] http://linkeddata.uriburner.com/sparql
<http://linkeddata.uriburner.com/sparql>

Are controlled by us, and they are CC-BY-SA.  That needs to be fixed in
the endpoint skin (UI) and metadata. I'll have that fixed.

Kingsley
>
> On 31 March 2017 at 21:19, Kingsley Idehen <kide...@openlinksw.com
> <mailto:kide...@openlinksw.com>> wrote:
>
> On 3/31/17 3:12 PM, Stas Malyshev wrote:
>> Hi!
>>
>>> What is the process for getting a query service added? I would like to
>>> add the Librarybase endpoint, http://sparql.librarybase.wmflabs.org/
>>> <http://sparql.librarybase.wmflabs.org/>,
>>> but I want to make sure that it meets all the requirements.
>> Right now it's
>> https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input
>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input>. After
>> we're done with CC-By ones, I'll probably clean up that page a bit and
>> update it for more long-term structure but it seems something like that
>> would work.
>
> Stas,
>
> You can also add the SPARQL endpoint associated with our URIBurner
> service. This service will also allow you to nest SPARQL-FED Queries.
>
> [1] http://linkeddata.uriburner.com <http://linkeddata.uriburner.com>
>
> [2] http://linkeddata.uriburner.com/sparql
> <http://linkeddata.uriburner.com/sparql>
>
> -- 
> Regards,
>
> Kingsley Idehen 
> Founder & CEO 
> OpenLink Software   (Home Page: http://www.openlinksw.com)
>
> Weblogs (Blogs):
> Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
> <http://www.openlinksw.com/blog/%7Ekidehen/>
> Blogspot Blog: http://kidehen.blogspot.com
> Medium Blog: https://medium.com/@kidehen
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> <https://www.pinterest.com/kidehen/>
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> <https://www.quora.com/profile/Kingsley-Uyi-Idehen>
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> <https://plus.google.com/+KingsleyIdehen/about>
> LinkedIn: http://www.linkedin.com/in/kidehen
> <http://www.linkedin.com/in/kidehen>
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
> <http://kingsley.idehen.net/dataspace/person/kidehen#this>
> : 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
> 
> <http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this>
>
> ___ Wikidata mailing
> list Wikidata@lists.wikimedia.org
> <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
> <https://lists.wikimedia.org/mailman/listinfo/wikidata> 
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Federation in the Wikidata Query Service

2017-03-31 Thread Kingsley Idehen
On 3/31/17 4:46 PM, Jan Macura wrote:
> That is a great news!
>
> How about adding an ODbL licensed service? Would it be possible? I am
> thinking about SPOI <http://sdi4apps.eu/spoi/> and their SPARQL
> endpoint <http://data.plan4all.eu/sparql>.
>
> Thanks
>  Jan

We don't control that service. By default, Virtuoso Open Source Edition
SPARQL endpoints are typically CC-BY-SA.

Kingsley
>
> On 31 March 2017 at 21:19, Kingsley Idehen <kide...@openlinksw.com
> <mailto:kide...@openlinksw.com>> wrote:
>
> On 3/31/17 3:12 PM, Stas Malyshev wrote:
>> Hi!
>>
>>> What is the process for getting a query service added? I would like to
>>> add the Librarybase endpoint, http://sparql.librarybase.wmflabs.org/
>>> <http://sparql.librarybase.wmflabs.org/>,
>>> but I want to make sure that it meets all the requirements.
>> Right now it's
>> https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input
>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input>. After
>> we're done with CC-By ones, I'll probably clean up that page a bit and
>> update it for more long-term structure but it seems something like that
>> would work.
>
> Stas,
>
> You can also add the SPARQL endpoint associated with our URIBurner
> service. This service will also allow you to nest SPARQL-FED Queries.
>
> [1] http://linkeddata.uriburner.com <http://linkeddata.uriburner.com>
>
> [2] http://linkeddata.uriburner.com/sparql
> <http://linkeddata.uriburner.com/sparql>
>
> -- 
> Regards,
>
> Kingsley Idehen 
> Founder & CEO 
> OpenLink Software   (Home Page: http://www.openlinksw.com)
>
> Weblogs (Blogs):
> Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
> <http://www.openlinksw.com/blog/%7Ekidehen/>
> Blogspot Blog: http://kidehen.blogspot.com
> Medium Blog: https://medium.com/@kidehen
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> <https://www.pinterest.com/kidehen/>
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> <https://www.quora.com/profile/Kingsley-Uyi-Idehen>
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> <https://plus.google.com/+KingsleyIdehen/about>
> LinkedIn: http://www.linkedin.com/in/kidehen
> <http://www.linkedin.com/in/kidehen>
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
> <http://kingsley.idehen.net/dataspace/person/kidehen#this>
> : 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
> 
> <http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this>
>
> ___ Wikidata mailing
> list Wikidata@lists.wikimedia.org
> <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
> <https://lists.wikimedia.org/mailman/listinfo/wikidata> 
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Federation in the Wikidata Query Service

2017-03-31 Thread Kingsley Idehen
On 3/31/17 3:12 PM, Stas Malyshev wrote:
> Hi!
>
>> What is the process for getting a query service added? I would like to
>> add the Librarybase endpoint, http://sparql.librarybase.wmflabs.org/,
>> but I want to make sure that it meets all the requirements.
> Right now it's
> https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input. After
> we're done with CC-By ones, I'll probably clean up that page a bit and
> update it for more long-term structure but it seems something like that
> would work.

Stas,

Some sources of SPARQL Endpoints that could be of interest:

[1] https://del.icio.us/kidehen/sparql_endpoint

[2]
http://linkeddata.uriburner.com/sparql?default-graph-uri=urn%3Asparql%3Aendpoint%3Alist=select+distinct+*++where+%7B%3FendPoint+a+%3FentityType%7D==text%2Fx-html%2Btr_redir_for_subjs=121_redir_for_hrefs==3000
 
-- SPARQL Query Results page listing some of the endpoints in the LOD Cloud
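
For readability, the query packed into [2] decodes approximately as
follows (some URL parameters were mangled in transit, so this is a
best-effort reconstruction):

SELECT DISTINCT *
FROM <urn:sparql:endpoint:list>
WHERE { ?endPoint a ?entityType }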

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Federation in the Wikidata Query Service

2017-03-31 Thread Kingsley Idehen
On 3/31/17 3:12 PM, Stas Malyshev wrote:
> Hi!
>
>> What is the process for getting a query service added? I would like to
>> add the Librarybase endpoint, http://sparql.librarybase.wmflabs.org/,
>> but I want to make sure that it meets all the requirements.
> Right now it's
> https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input. After
> we're done with CC-By ones, I'll probably clean up that page a bit and
> update it for more long-term structure but it seems something like that
> would work.

Stas,

You can also add the SPARQL endpoint associated with our URIBurner
service. This service will also allow you to nest SPARQL-FED Queries.


[1] http://linkeddata.uriburner.com

[2] http://linkeddata.uriburner.com/sparql
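
By "nest SPARQL-FED Queries" I mean patterns of the following shape,
where the endpoint you hit forwards a SERVICE block whose body itself
contains another SERVICE block. An illustrative sketch only -- whether
the intermediate endpoint honors the inner hop depends on its federation
policy.

PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?label
WHERE {
  SERVICE <http://dbpedia.org/sparql> {
    SERVICE <https://query.wikidata.org/sparql> {
      wd:Q60 rdfs:label ?label .
      FILTER (LANG(?label) = "en")
    }
  }
}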


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] wikibase:directClaim predicate?

2017-03-27 Thread Kingsley Idehen
On 3/27/17 11:42 AM, Markus Kroetzsch wrote:
> On 27.03.2017 15:13, Kingsley Idehen wrote:
>> On 3/18/17 6:15 PM, Daniel Kinzler wrote:
> ...
>>
>> Daniel,
>>
>> I see Wikidata is a collection of reified RDF Statements. I don't see
>> how this model differs from RDF's model. It just so happens (in my eyes)
>> that Wikidata includes description of statements about things which
>> provides rich metadata, in line with the goals of Wikidata.
>
> Kingsley,
>
> Wikidata is not a collection of reified RDF statements, but it can
> partially be captured by such a collection. There are many ways of
> doing this; see [1] for a comparison of some of the more prominent
> approaches. All of these encodings can "capture" Wikidata in some way,
> but they are not equivalent in terms of RDF or SPARQL. It would
> therefore be wrong to claim that any of these possible encodings "is"
> Wikidata.
>
> Different RDF encodings are not only non-equivalent but also behave
> very differently in practice. Some queries that work well for one
> model are very slow or outright impossible to express in another model
> [1]. One can therefore not say that the encoding is just a detail and
> that Wikidata somehow "in principle" is RDF anyway.
>
> This said, the current Wikidata RDF export should serve the needs of
> most people who want to work with an RDF toolchain while having access
> to most Wikidata content. One cannot get all details from this
> projection, but one can do most practically useful things.
>
> Regards,
>
> Markus
>
> [1] Daniel Hernández, Aidan Hogan, Markus Krötzsch
> Reifying RDF: What Works Well With Wikidata?
> In Thorsten Liebig and Achille Fokoue, eds., Proceedings of the 11th
> International Workshop on Scalable Semantic Web Knowledge Base
> Systems, volume 1457 of CEUR Workshop Proceedings, 32-47, 2015.
> CEUR-WS.org
> https://iccl.inf.tu-dresden.de/web/Inproceedings3037

Markus,

Let's agree to disagree, for now :)

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] wikibase:directClaim predicate?

2017-03-27 Thread Kingsley Idehen
On 3/27/17 10:42 AM, Daniel Kinzler wrote:
> Am 27.03.2017 um 15:13 schrieb Kingsley Idehen:
>> I see Wikidata is a collection of reified RDF Statements. I don't see how
>> this model differs from RDF's model. It just so happens (in my eyes) that
>> Wikidata includes description of statements about things which provides
>> rich metadata, in line with the goals of Wikidata.
> It's a matter of perspective.
>
> I agree that Wikidata can be *represented* as a collection of reified RDF
> Statements. That's what we do for the query service. But I do not agree that
> this is what Wikidata *is*.

My point is that your model boils down to treating statements as "first
class citizens" so to speak. If true, then it is as I described i.e.,
still an RDF model, but with emphasis on statement reification, which is
actually a good thing.
>
> RDF and the Wikibase model are quite different conceptually.

RDF doesn't exclude reification. Put differently, using reification
doesn't amount to a new model different from that of RDF.
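
For instance, classic RDF reification -- a minimal RDF-Turtle sketch using
hypothetical identifiers (the ex: terms) -- makes the statement itself a
describable subject:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:  <http://example.org/> .

ex:stmt1 a rdf:Statement ;
         rdf:subject   ex:Q42 ;
         rdf:predicate ex:occupation ;
         rdf:object    ex:writer ;
         ex:statedIn   ex:someReference .  # description of the statement itself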

>  But they are of
> equal power and thus formally equivalent: one can be represented using the
> other. Just because a Turing Machine is computationally equivalent to lambda
> calculus, that does not mean they are the same thing.

I am not implying that.

>  Understanding one in terms
> of the other may be helpful in some context, and irrelevant in another.
>
> There is nothing special about the relationship between Wikibase/Wikidata and
> RDF; Wikibase has an RDF binding, but it is not defined in terms of RDF, its
> specification does not rely on RDF concepts.

RDF concepts boil down to the use of sentences to describe anything,
just as we do everyday in the so-called real world.

>  The Wikibase model can just as well
> (or perhaps more easily) be understood and represented in terms of the Topic
> Maps model (ISO 13250).
>
> Academically, the Wikibase model could perhaps be described as an extended 
> model
> logic with reasoning rules for provenance. I think W. Stelzner explored 
> related
> ideas in the 80s. Maybe one day I'll find the time to dig into this some more.

I think we can just agree to disagree for now, since nothing you've
stated is fundamentally contrary to my view of RDF --  as a Language for
describing anything (including statements)  :)
>
>


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Documentation sprint for Wikidata during the Wikimedia Hackathon

2017-03-17 Thread Kingsley Idehen
On 3/15/17 12:15 AM, Rick Labs wrote:
>
> Kingsley,
>
> Wanted to thank you very much for your valuable post! It's a great
> introduction to making the transition from a table/Excel/spreadsheet
> view of data over to, as you say, "a
> collection of RDF statements grouped by statement Predicate"
>
> Those of us working on the Company Data project typically come with
> that  table orientation background. Having a "learning path" laid out
> transitioning to the SPARQL world is very helpful.
>
> I'm very fuzzy on basic "inheritance" here at Wikidata.
>
> For example Company->Financial Statements->Income Statement for
> 2016Q4->total revenue->some number
>
>   * Total revenue needs the time period attached to it (here start
> and end dates for the quarter); others need point-in-time
> measurements, e.g. as of 12/31/2016)
>   * The total revenue needs to have an associated currency
> attached to it.
>   * The Income Statement for 2016Q4 needs to have a specific
> accounting standard attached to it (for example US GAAP 2017,
> IFRS 2016, more at
> https://www.sec.gov/info/edgar/edgartaxonomies.shtml, and more
> outside the U.S.. The accounting standard followed in preparing
> the numbers must be very specific to help with concordance across
> different standards (especially across countries)
>   * The company needs to have a "dominant" or "default" industry
> code attached to it. Wikidata might best go with 56 industries
> classified according to the '''International Standard Industrial
> Classification revision 4 (ISIC Rev. 4)'''. This is the set used
> by the World Input-Output tables http://www.wiod.org/home. They
> take data from all 28 EU countries and 15 other major countries in
> the world and transform it to be comparable using these
> industries. Its the broadest "nearly global" coverage I can find.
> It would be also advisable to accommodate multiple industry
> assignments per entity / establishment, each with the standard and
> year which were followed, applied from a specifically enumerated
> list. For example in North America data will often be available
> according to the most current, and highly granular 2017 NAICS
> system https://www.census.gov/eos/www/naics/ and there are
> concordances between versions see:
> https://www.census.gov/eos/www/naics/concordances/concordances.html
> and https://unstats.un.org/unsd/cr/registry/isic-4.asp. Looking
> towards the future where large amounts of company data are machine
> imported it would be best to preserve the original, most detailed
> industry codes available (such as the 6-digit NAICS code) and
> preserve the standard and year associated with that assigned
> code(s). Given the year and the detail the concordances can later
> be used to machine add different codes as needed. Granular users
> are then accommodated, and people looking to do cross country /
> global analysis (at the 56 industry level) are also accommodated.
>
> When I look at the above challenge I think of your prescription of how
> to make RDF collections easier to read.
>
> 1. Addition of annotation relations esp., the likes of rdfs:label,
> skos:prefLabel, skos:altLabel, schema:name, foaf:name, rdfs:comment,
> schema:description etc..
>
> 2. Addition (where possible) use of relations such as foaf:depiction,
> schema:image etc..
>
> Adhering to the above leads to RDF statement collections that are easier
> to read, without the confusing nature of the term "graph"
> getting in the
> way. At the end of the day, RDF is simply an abstract language for
> creating structured data using a variety of notations (RDF-Turtle,
> RDF-NTriples, JSON-LD, RDF-XML etc..). It isn't a format, but sadly
> that's how it is still perceived by most circa 2017 (even
> though the
> initial RDF definition snafu on this front occurred around 2000). 
>
> And I can't help but be intensely curious as to what happened in that
> 2000 initial RDF definition snafu?
>

Creating and perpetuating the misconception that RDF/XML == RDF. That
was compounded by a Layer Cake diagram that actually depicted the
misconception that RDF was built atop XML.

Today folks still get distracted by JSON-LD vs RDF-Turtle vs RDF-XML vs
RDFa vs Microdata notations for constructing RDF Language
sentences/statements. Net effect, unleashing the real power behind a
Semantic Web continues to hit unnecessary hiccups.


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Re: [Wikidata] Documentation sprint for Wikidata during the Wikimedia Hackathon

2017-03-02 Thread Kingsley Idehen
On 3/2/17 11:48 AM, Rick Labs wrote:
> Perhaps high quality documentation already exists?  Would be great to
> have at least a syllabus (learn this first, then move on to this, then
> on to...  Might be good to also have common / high value "use-case"
> scenarios with pointers to documentation/tutorials that cover it.
> Existing example queries are very helpful but many are complex. For
> training purposes we need a graduated set of examples, that are
> designed step-by-step to teach how to construct queries. 

The trouble here isn't really SQL to SPARQL etc.. In my experience, it's
more to do with understanding what data is and the nature of data
representation. Having arrived at the aforementioned conclusion over the
years, I published a presentation titled "Understanding Data" as an aid
in this area [1].

SQL and SPARQL aren't very good starting points because literature
associated with both assume some fundamental understanding about the
nature of data (relations) against which they operate.

If one starts the journey with data representation comprehension
combined with clarity about RDF as a language, my hope is that folks
reach a point where creating RDF statements always includes (so SPARQL
compliant servers don't need to inject workarounds for label injection
into query solutions):

1. Addition of annotation relations esp., the likes of rdfs:label,
skos:prefLabel, skos:altLabel, schema:name, foaf:name, rdfs:comment,
schema:description etc..

2. Addition (where possible) use of relations such as foaf:depiction,
schema:image etc..

Adhering to the above leads to RDF statement collections that are easier
to read, without the confusing nature of the term "graph" getting in the
way. At the end of the day, RDF is simply an abstract language for
creating structured data using a variety of notations (RDF-Turtle,
RDF-NTriples, JSON-LD, RDF-XML etc..). It isn't a format, but sadly
that's how it is still perceived by most circa 2017 (even though the
initial RDF definition snafu on this front occurred around 2000).
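
For instance, a minimal RDF-Turtle sketch (hypothetical subject URI) of a
description enriched per points 1 and 2 above:

@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix schema: <http://schema.org/> .
@prefix foaf:   <http://xmlns.com/foaf/0.1/> .

<http://example.org/entity/Q42>
    rdfs:label         "Douglas Adams"@en ;
    skos:prefLabel     "Douglas Adams"@en ;
    skos:altLabel      "Douglas Noel Adams"@en ;
    schema:description "English writer and humorist"@en ;
    foaf:depiction     <http://example.org/img/douglas-adams.jpg> .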

SPARQL is a Query Language for operating on data represented as a
collection of RDF statements grouped by statement Predicate, as opposed
to SQL which is oriented towards data represented as Records grouped by
Table.

Links:

[1] https://www.slideshare.net/kidehen/understanding-29894555 --
Understanding Data

[2] http://www.openlinksw.com/data/turtle/general/GlossaryOfTerms.ttl 
-- Glossary that might also help with terminology

[3]
https://www.quora.com/What-is-the-Semantic-Web/answer/Kingsley-Uyi-Idehen


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Label gaps on Wikidata - (SPARQL help needed. SERVICE wikibase:label)

2017-02-23 Thread Kingsley Idehen
On 2/23/17 12:59 PM, Stas Malyshev wrote:
> Hi!
>
> On 2/23/17 7:20 AM, Thad Guidry wrote:
>> In Freebase we had a parameter %lang=all
>>
>> Does the SPARQL label service have something similar ?
> Not as such, but you don't need it if you want all the labels, just do:
>
> ?item rdfs:label ?label
>
> and you'd get all labels. No need to invoke service for that, the
> service is for when you have specific set of languages you're interested
> in.

Yep.

Example at: http://tinyurl.com/h2sbvhd
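
That is, something along these lines (a sketch; Q42 is just an illustrative
item):

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?label (LANG(?label) AS ?lang)
WHERE { wd:Q42 rdfs:label ?label }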

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Resolver for Sqid?

2017-02-17 Thread Kingsley Idehen
On 2/17/17 4:51 PM, Magnus Manske wrote:
>
>
> On Fri, Feb 17, 2017 at 8:57 PM Kingsley Idehen
> <kide...@openlinksw.com> wrote:
>
> On 2/16/17 5:52 PM, Magnus Manske wrote:
> > I have extended the resolver to include squid and reasonator as
> targets:
> >
> >
> 
> https://tools.wmflabs.org/wikidata-todo/resolver.php?quick=VIAF:12307054=sqid
> >
> >
> 
> https://tools.wmflabs.org/wikidata-todo/resolver.php?quick=VIAF:12307054=reasonator
>
> Very cool!
>
> Question: What would it take to have DBpedia URIs added to the 100
> external identifier cross references? As you know, there are
> owl:sameAs
> relations in DBpedia that have Wikidata URIs as objects. We should
> really make the mutually beneficial nature of DBpedia and Wikidata
> clearer [1][2][3], at every turn :)
>
>
> Not quite sure I follow. Do you want to
> * query Wikidata but open the respective DBpedia page instead?
> * query DBpedia but open the respective Wikidata page instead?
> * query and open DBpedia?

Much simpler than all of that.

My point is that pages like https://www.wikidata.org/wiki/Q44461 don't
have any DBpedia cross references in the Identifiers section. Once those
relations are added to Wikidata I would expect DBpedia URIs to simply
show up in the pages emitted by Wikidata tools (e.g., Reasonator, Sqid
etc..).

Naturally, if there were an option for Reasonator, Sqid, etc. to work
with SPARQL endpoints, generically, that would be a huge bonus too :)
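
For context, such cross references already exist on the DBpedia side as
owl:sameAs relations, e.g. (RDF-Turtle):

@prefix owl: <http://www.w3.org/2002/07/owl#> .

<http://dbpedia.org/resource/Berlin>
    owl:sameAs <http://www.wikidata.org/entity/Q64> .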

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Resolver for Sqid?

2017-02-17 Thread Kingsley Idehen
On 2/16/17 5:52 PM, Magnus Manske wrote:
> I have extended the resolver to include squid and reasonator as targets:
>
> https://tools.wmflabs.org/wikidata-todo/resolver.php?quick=VIAF:12307054=sqid
>
> https://tools.wmflabs.org/wikidata-todo/resolver.php?quick=VIAF:12307054=reasonator

Very cool!

Question: What would it take to have DBpedia URIs added to the 100
external identifier cross references? As you know, there are owl:sameAs
relations in DBpedia that have Wikidata URIs as objects. We should
really make the mutually beneficial nature of DBpedia and Wikidata
clearer [1][2][3], at every turn :)

Links:

[1]
https://medium.com/virtuoso-blog/on-the-mutually-beneficial-nature-of-dbpedia-and-wikidata-5fb2b9f22ada#.a8wyt3mab
-- On the Mutually Beneficial Nature of DBpedia and Wikidata

[2]
http://linkeddata.uriburner.com/HtmlPivotViewer/edit.vsp?url=http%3A%2F%2Flinkeddata.uriburner.com%2Fc%2F8ITAH3%23%24view%24%3D1%26%24selection%24%3D75
-- PivotViewer Query Edit Mode view of Federated SPARQL Query with exits
to Wikidata


[3]
http://linkeddata.uriburner.com/HtmlPivotViewer/edit.vsp?url=http%3A%2F%2Flinkeddata.uriburner.com%2Fc%2F8ITAES%23%24view%24%3D1%26%24facet0%24%3Dvirtcxml%3AFacetSubjecttext
-- PivotViewer Query Edit Mode view of Federated SPARQL Query with exits
to DBpedia

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikiopinion - Structured opinions

2017-01-23 Thread Kingsley Idehen
On 1/21/17 12:42 PM, Quico Prol wrote:
> Maybe you would like to take a look to https://lib.reviews a wiki for
> reviewing all kind of things using wikidata items

Given the document at:
https://lib.reviews/review/ac478596-e571-4891-b882-5508aea9d9bf

How do I make the connection with what you state above re. wikidata items?

Kingsley
>
> 2017-01-05 20:39 GMT+01:00 Hector Perez <h...@hectorperezarenas.com
> <mailto:h...@hectorperezarenas.com>>:
>
> Kingsley, thanks for sending it to DBpedia's list too.
>
> I'll have a look at your Linked Data Middleware!
>
> On Wed, Jan 4, 2017 at 8:36 PM, Kingsley Idehen
> <kide...@openlinksw.com> wrote:
>
> On 1/4/17 3:34 AM, Hector Perez wrote:
>>
>> To sum up, we think that a social network that challenges
>> what you post and organises who agrees on what and why would
>> complement Wikipedia and the traditional story telling. What
>> do you think? Would you like to join us? Should this project
>> be non-profit or for-profit? Would you donate or help us to
>> fund raise?
>>
>> Kind regards,
>>
>> Hector
>>
>> [1]. Original post:
>> 
>> https://medium.com/@HectorPerez/wikipedias-social-network-578b0257b8ae
>
> Nice idea! I've copied in the DBpedia list, as this would be
> of interest to that community also.
>
> I passed your Medium post through our Linked Data Middleware
> service en route to demonstrating what might complement your
> ultimate goal. Here are the results:
>
> [1]
> 
> http://linkeddata.uriburner.com/about/html/https/medium.com/@HectorPerez/wikipedias-social-network-578b0257b8ae#.bvs2iko2w
>
> [2]
> 
> http://linkeddata.uriburner.com/describe/?url=http%3A%2F%2Flinkeddata.uriburner.com%2Fabout%2Fid%2Fentity%2Fhttps%2Fmedium.com%2F@HectorPerez%2Fwikipedias-social-network-578b0257b8ae=1
>
> Fundamentally, what you see is the effect of loosely-coupled
> NLP, AI, and Machine Learning oriented services that
> collectively contribute to a final Linked Open Data graph that
> represents a variety of entity relationships and entity
> relationship types :)
>
>
> -- 
> Regards,
>
> Kingsley Idehen 
> Founder & CEO 
> OpenLink Software   (Home Page: http://www.openlinksw.com)
>
> Weblogs (Blogs):
> Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
> Blogspot Blog: http://kidehen.blogspot.com
> Medium Blog: https://medium.com/@kidehen
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
> : 
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
> ___ Wikidata
> mailing list Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
> ___ Wikidata mailing
> list Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata

Re: [Wikidata] Wikiopinion - Structured opinions

2017-01-04 Thread Kingsley Idehen
On 1/4/17 3:34 AM, Hector Perez wrote:
>
> To sum up, we think that a social network that challenges what you
> post and organises who agrees on what and why would complement
> Wikipedia and the traditional story telling. What do you think? Would
> you like to join us? Should this project be non-profit or for-profit?
> Would you donate or help us to fund raise?
>
> Kind regards,
>
> Hector
>
> [1]. Original post:
> https://medium.com/@HectorPerez/wikipedias-social-network-578b0257b8ae
> <https://medium.com/@HectorPerez/wikipedias-social-network-578b0257b8ae>

Nice idea! I've copied in the DBpedia list, as this would be of interest
to that community also.

I passed your Medium post through our Linked Data Middleware service en
route to demonstrating what might complement your ultimate goal. Here
are the results:

[1]
http://linkeddata.uriburner.com/about/html/https/medium.com/@HectorPerez/wikipedias-social-network-578b0257b8ae#.bvs2iko2w

[2]
http://linkeddata.uriburner.com/describe/?url=http%3A%2F%2Flinkeddata.uriburner.com%2Fabout%2Fid%2Fentity%2Fhttps%2Fmedium.com%2F@HectorPerez%2Fwikipedias-social-network-578b0257b8ae=1

Fundamentally, what you see is the effect of loosely-coupled NLP, AI,
and Machine Learning oriented services that collectively contribute to a
final Linked Open Data graph that represents a variety of entity
relationships and entity relationship types :)


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] I'm calling it. We made it ;-)

2017-01-01 Thread Kingsley Idehen
On 12/31/16 4:20 PM, Kingsley Idehen wrote:
> On 12/31/16 5:57 AM, Lydia Pintscher wrote:
>> Folks,
>>
>> We're now officially mainstream ;-)
>> https://www.buzzfeed.com/katiehasty/song-ends-melody-lingers-in-2016?utm_term=.nszJxrKqR#.sknE4nVAg
>>
>>
>> Cheers
>> Lydia
>>
> Hi Lydia,
>
> Nice to see mainstream appreciation for sure.
>
> BTW -- Are you able to ping the author of the spreadsheet with regards
> to including Wikidata URIs in the dataset? Naturally, that provides
> powerful attribution and lookup functionality via a single link :)

Here's a tweaked spreadsheet I've quickly knocked up that includes
DBpedia URIs. Why? Because, I can sorta cheat with URI construction
based on the pattern used by DBpedia. Hopefully, someone could derive a
variant with a column for Wikidata URIs, and then share accordingly.
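
(The "cheat", for the record: a cell value like "Purple Rain" maps to
http://dbpedia.org/resource/Purple_Rain i.e., spaces in the Wikipedia page
title become underscores under the http://dbpedia.org/resource/ namespace.
Titles that deviate from their Wikipedia page titles would, of course, need
manual correction.)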


Link:

[1]
https://docs.google.com/spreadsheets/d/1Il_zNNG7m9YcRILHrbGwT6EmCpgGR68ZT2YpuvQrzwY/edit#gid=1644504213
-- Google Spreadsheet

[2]
https://docs.google.com/spreadsheets/d/1Il_zNNG7m9YcRILHrbGwT6EmCpgGR68ZT2YpuvQrzwY/export?format=csv
-- CSV rendition

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] I'm calling it. We made it ;-)

2016-12-31 Thread Kingsley Idehen
On 12/31/16 5:57 AM, Lydia Pintscher wrote:
> Folks,
>
> We're now officially mainstream ;-)
> https://www.buzzfeed.com/katiehasty/song-ends-melody-lingers-in-2016?utm_term=.nszJxrKqR#.sknE4nVAg
>
>
> Cheers
> Lydia
>
Hi Lydia,

Nice to see mainstream appreciation for sure.

BTW -- Are you able to ping the author of the spreadsheet with regards
to including Wikidata URIs in the dataset? Naturally, that provides
powerful attribution and lookup functionality via a single link :)


-- 
Happy New Year!

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-22 Thread Kingsley Idehen
On 12/22/16 3:37 AM, Ruben Verborgh wrote:
> Hi Kingsley,
>
>>> will see a substantial increase in server costs
>>> when they try to host that same data as a public SPARQL HTTP service.
>> Again subjective.
> No, that's not subjective, that's perfectly measurable.
> And that's exactly what we did in our research.

That doesn't negate the fact that your world view is subjective. You've
started this whole thing on a fuzzy premise. For instance, why do you
think SPARQL exists, and how have you arrived at the conclusion that it
is some kind of Semantic Web frontier?

SPARQL Query Services are just one of many data definition and
manipulation services available to HTTP network users (public or
private) working with RDF relations.

In some cases, service providers use SPARQL to facilitate and/or
complement Linked Open Data publishing efforts.
>
> The problem with the SPARQL protocol as an API
> is that the per-request cost is a) higher
> and b) much more variable than any other API.

A Protocol isn't the same thing as an Application Programming Interface
(API), in my world view. APIs provide interaction abstraction over
protocols.

ODBC and JDBC are APIs for building applications against RDBMS
applications that interact with relations represented as Tables, using
SQL (and in the case of Virtuoso, SQL, SPARQL, and the SPASQL hybrid).
Those APIs include abstractions over TCP/IP and other protocols. Jena,
Sesame, Redland, and others do provide APIs that offer similar
functionality to the aforementioned, with regards to RDF triple and quad
stores.

The SPARQL Protocol extends HTTP with an ability to include SPARQL
queries and solutions as part of its request and response payloads.
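
In concrete terms, a protocol exchange is just an HTTP request carrying a
query, and a response carrying a solution, e.g. (a sketch):

GET /sparql?query=SELECT%20%2A%20WHERE%20%7B%20%3Fs%20%3Fp%20%3Fo%20%7D%20LIMIT%201 HTTP/1.1
Host: example.org
Accept: application/sparql-results+json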

IMHO, your position is based on a claim that isn't being made by SPARQL
compliant product providers. I continue to sense some confusion about
how it has been used and spoken about, with regards to the early days of
the LOD community i.e., there's no Linked Data without a SPARQL
endpoint, or use of SPARQL etc..

SPARQL Query Language, Protocol, and Results Serialization Formats are
simply tools, like many others, that can be used solve a variety of
problems. Nobody ever claimed (as far as I know) that the SPARQL
composite is (or was) a "silver bullet" .

>
> Everywhere else on the Web,
> APIs shield data consumers from the backend,
> limiting the per-request complexity.
> That's why they thrive and SPARQL endpoints don't.

See my comment above. Your characterization is inaccurate.

>
> Don't get me wrong, I'm happy with every
> highly available SPARQL endpoint out there.
> Wikidata and DBpedia are awesome.
> It's just that there are too few
> and I see cost as a major factor there.

It's hard to understand the statement above. Fundamentally, Wikidata &
DBpedia have addressed specific challenges, and any inability of others to
emulate them (in your world view) has little to do with SPARQL and everything
to do with motivation, engineering capability, and general experience
with RDBMS technology.

>
>> You are implying that cost vs benefit analysis don't
>> drive decisions to put services on the Web, of course they do.
> Quite the contrary, I am arguing that—and this is subjective—
> because cost/benefit analyses drive decisions on the Web,
> we will never have substantially more SPARQL endpoints
> on the public Web than we have now. They're just too expensive.

Like the statement you made prior, I am struggling to understand your
point. You can't simply throw "too expensive" at something, and decide
that's definitive for everyone. That simply isn't the route to a
coherent pitch.

You are taking the world view of a niche and declaring it universal.
What entity (in this case: Person or Organization) profile  would find
this endeavor expensive? A student, academic institution, commercial
company, government?


>
>>> Federation is where I think public SPARQL endpoints will fail,
>>> so it will be worthwhile to see what happens.
>> Really, then you will ultimately be surprised on that front too!
> I really really hope so.
> If one day, machines can execute queries on the Web
> as well as we can, I'd be really happy.

I still don't really understand what you mean by "as well as we can".
All I've seen thus far is a pitch about availability that is justifiably
slow, combined with an inability to deal with complex queries. I also
notice that you don't say much about:

1. change sensitivity and ;
2. actual data loading and deployment time, in a rapidly changing world
increasingly driven by data.

> My way to reach that is lightweight interfaces,
> but if it is possible with heavyweight interfaces,
> all the better.

Again, heavyweight and lightweight are totally subjective
characterizations :)

>
> Best,
>
> Ruben
>
>


-- 
Regards,

Kingsley Idehen   

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Kingsley Idehen
On 12/21/16 4:57 PM, Ruben Verborgh wrote:
> Hi Kingsley,
>
>> The Semantic Web community hasn't focused exclusively on query execution
>> speed.
> Let me clarify myself:
> the scientific SemWeb community mostly focused on speed,
> as is apparent from publications about SPARQL query execution
> (and, from personal experience, many researchers and reviewers
>  still having trouble to understand why speed is not our main focus).

Research papers and conference workshops have focused on these matters,
for a variety of reasons.

As I said, the driver for this focus, in reality, is the 250 msec
response time, which is a key threshold for human attention when working
with solutions (on or offline).

>
>> Anyone that encounters a service (Web or Semantic Web) expects results
>> in acceptable timeframes (typically <= 250ms) , that's a function of
>> user behavior on the Web or anywhere else.
> Yes, and it is my opinion that public SPARQL endpoints overpromise in that 
> regard.

They don't.

SPARQL endpoints exist, and experience varies. Ditto motivations behind
the endpoints.

> The whole public SPARQL endpoint discourse has made us believe
> that it is actually realistic to have free+fast+high availability,
> as is the case for any other Web service.

There is no such thing as a free+fast+high availability solution that
costs the solution provider $0.00. That simply doesn't exist!!

> But given that SPARQL is more expressive per request
> than any other Web service I know, this cannot hold.
>
> In simple terms: SPARQL is a very expressive
> and hence very expensive API.
>
> In technical terms: show me any other API
> that exposes a PSPACE-complete interface.

SPARQL is a Query Language that includes an HTTP API accessible via
SPARQL endpoints.

What you are not accepting is the notion of queries that complete, in a
configurable query completion timeframe.
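
As an illustration of what I mean, Virtuoso-style endpoints expose an
"anytime query" timeout, e.g. (a sketch; parameter value in milliseconds):

http://lod.openlinksw.com/sparql?timeout=3000&query=SELECT%20%2A%20WHERE%20%7B%20%3Fs%20%3Fp%20%3Fo%20%7D

Here the server returns whatever (partial) solution is ready within 3,000
milliseconds, instead of failing outright.
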
>
>> You will find that Wikidata, is doing the very same thing, but with much
>> more hardware at their disposal, since they have more funding than
>> DBpedia, at this point in time.
> Indeed.
>
>> Your "Simply doesn't work on the public Web" claim is subjective
> Let me clarify "simply doesn't work":
> companies/institutions that host their data in any other API on the Web
> will see a substantial increase in server costs
> when they try to host that same data as a public SPARQL HTTP service.

Again subjective. You are implying that cost vs benefit analysis don't
drive decisions to put services on the Web, of course they do.
> My claim is that this increase is so substantial,
> that SPARQL endpoints cannot become a reality on the public Web
> at the same customer cost (= often free) of any other API on that same Web,
> and hence will not become a reality.

The costs are not prohibitive. This is where I utterly and completely
disagree with you.
> Concretely, for most institutions that want to make their data queryable for 
> free,
> the SPARQL protocol will simply be too expensive for their budgets.

Academic institutions, maybe. For the rest of the world, it's basic economics,
value propositions, and business models.

> Alternatives, like dumps, LD documents, TPF, might be feasible,
> but they all come at another cost.
> No silver bullet.

Has anyone told you that SPARQL Endpoints are a silver bullet?


Kingsley
>
> So far, that claim has not been proven wrong.
>
> Best,
>
> Ruben
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Kingsley Idehen
On 12/21/16 4:13 PM, Ruben Verborgh wrote:
> Federation is where I think public SPARQL endpoints will fail,
> so it will be worthwhile to see what happens.

Really, then you will ultimately be surprised on that front too!

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Kingsley Idehen
On 12/21/16 2:52 PM, Ruben Verborgh wrote:
> For this, I'd like to point to the overall aim of the LDF project,
> as documented on our website and papers.
> Summarizing: the SemWeb community has almost exclusively
> cared about speed so far concerning query execution.
> This has resulted in super-fast, but super-expensive services,
> which simply don't work on the public Web.
> More than half of all public SPARQL endpoints
> are down for more than 1.5 days each month [1].

Ruben,

The Semantic Web community hasn't focused exclusively on query execution
speed.

Anyone that encounters a service (Web or Semantic Web) expects results
in acceptable timeframes (typically <= 250ms) , that's a function of
user behavior on the Web or anywhere else. Thus, a less overarching
characterization would be as follows: The Linked Open Data community, a
sub segment of the Semantic Web community, has focused on providing
solutions that work, a prominent example (that I know well) is DBpedia,
and many bubbles around it in the LOD Cloud.

You will find that Wikidata, is doing the very same thing, but with much
more hardware at their disposal, since they have more funding than
DBpedia, at this point in time.

The basic response-time expectations of users drive everything, all
the time.

The key issue here is all about what method a given service provider
chooses en route to addressing the expectations of users, as I've
outlined above. Fundamentally, each service provider will use a variety
of solution deployment techniques that boil down to:

1. Massive Server Clusters (sharded) and Proxies

2. Fast multi-threaded instances (no sharding but via replication
topologies) behind proxies (functioning as cops, so to speak).

Your "Simply doesn't work on the public Web" claim is subjective, I've
told you that repeatedly. I am sure others will ultimately tell you the
very same thing :)

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] demand for .nt dumps?

2016-11-08 Thread Kingsley Idehen
On 11/4/16 12:55 PM, Stas Malyshev wrote:
> Hi!
>
>> > BTW -- does the Wikidata SPARQL endpoint currently support CONSTRUCT and
>> > DESCRIBE queries? I ask because you can use that as one of the many
>> > options for dump production.
> Yes, such queries are supported, but using this for dump production is
> both not possible - since dumps are what the data in SPARQL service is
> loaded from - and very inefficient, we don't really need SPARQL server
> for that, it can be done with much simpler code.

I was trying to explain to you that you can use CONSTRUCT to produce a
one-off dump, per whatever you have in the dataset modifier (or body)
part of the SPARQL query.
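
For example, a minimal sketch of that idea (bounded here purely for
illustration):

CONSTRUCT { ?s ?p ?o }
WHERE     { ?s ?p ?o }
LIMIT 100000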

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] demand for .nt dumps?

2016-11-04 Thread Kingsley Idehen
On 11/4/16 5:43 AM, Lydia Pintscher wrote:
> Hey :)
>
> We are currently discussing if we should also offer .nt dumps:
> https://phabricator.wikimedia.org/T144103
> Since we'd need to set that up and maintain it I want to make sure
> there is actually demand for it. So if you'd like to have it and use
> it please let me know.
>
>
> Cheers
> Lydia
>

Lydia,

It is an important "best practice" to release RDF dumps e.g.,
RDF-Ntriples, RDF-Turtle, RDF-XML file collections.

BTW -- does the Wikidata SPARQL endpoint currently support CONSTRUCT and
DESCRIBE queries? I ask because you can use that as one of the many
options for dump production.
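
For instance (an illustrative one-liner), a DESCRIBE against a single entity
already yields a mini dump of its description:

DESCRIBE <http://www.wikidata.org/entity/Q42>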

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Help with SPARQL or API or something to get subcategories

2016-10-17 Thread Kingsley Idehen
On 10/17/16 3:03 PM, Thad Guidry wrote:
> Kingsley,
>
> The http://pending.schema.org namespace will show up as this in RDF Turtle
>
> ns1:Q1251750  owl:equivalentClass ns3:Distillery .
> and this way in JSON-LD
>
> {
>   "@id": "http://www.wikidata.org/entity/Q1251750",
>   "equivalentClass": "http://pending.schema.org/Distillery"
> },
> Thad
> +ThadGuidry <https://www.google.com/+ThadGuidry>

Thad,

I assume you will have @context in the JSON-LD doc preamble with regards
to owl: namespace mapping to "equivalentClass"?
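
i.e., something along these lines (a minimal sketch):

{
  "@context": {
    "equivalentClass": {
      "@id": "http://www.w3.org/2002/07/owl#equivalentClass",
      "@type": "@id"
    }
  },
  "@id": "http://www.wikidata.org/entity/Q1251750",
  "equivalentClass": "http://pending.schema.org/Distillery"
}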

Anyway, here are some live examples of the effects of mappings (classes
and properties) across wikidata, schema.org, and dbpedia vocabularies
using the latest edition of DBpedia.


[1]
http://dbpedia.org/describe/?url=http%3A%2F%2Fdbpedia.org%2Fontology%2FAirport=urn%3Adbpedia%3Awikidata%3Aschema%3Amapping%3Ainference%3Arules=1
-- With Inference Context

[2]
http://dbpedia.org/describe/?url=http%3A%2F%2Fdbpedia.org%2Fontology%2FAirport=1
-- Without Inference Context

[3]
http://dbpedia.org/describe/?url=http%3A%2F%2Fdbpedia.org%2Fontology%2FBook=urn%3Adbpedia%3Awikidata%3Aschema%3Amapping%3Ainference%3Arules=1
-- With Inference Context

[4]
http://dbpedia.org/describe/?url=http%3A%2F%2Fdbpedia.org%2Fontology%2FBook=1
-- Without Inference Context
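
For the curious, the "With Inference Context" variants above boil down to
prepending Virtuoso's inference pragma to an otherwise ordinary query, e.g.,
a sketch using the rule set named in the links above:

DEFINE input:inference "urn:dbpedia:wikidata:schema:mapping:inference:rules"
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT DISTINCT ?airport
WHERE { ?airport a dbo:Airport }
LIMIT 10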

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Help with SPARQL or API or something to get subcategories

2016-10-17 Thread Kingsley Idehen
## Federated query returning Wikidata's wdt:P1709 (equivalent class) relations

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?s ?equivalentClass ?o
WHERE { SERVICE <http://query.wikidata.org/sparql> 
 { 
SELECT ?s ?equivalentClass ?o
WHERE { ?s wdt:P1709 ?o .
BIND ( wdt:P1709 as ?equivalentClass)
FILTER( REGEX(STR(?o), "schema.org"))
  }
 }
}
   
## Explicit Remapping to owl:equivalentClass relations
## which enables better reasoning and inference oriented data

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

CONSTRUCT {?s owl:equivalentClass ?o} 
WHERE { SERVICE <http://query.wikidata.org/sparql> 
 { 
SELECT ?s ?equivalentClass ?o
WHERE { ?s wdt:P1709 ?o .
BIND ( wdt:P1709 as ?equivalentClass)
FILTER( REGEX(STR(?o), "schema.org"))
  }
 }   
}
   
Live Links:

[1]
http://linkeddata.uriburner.com/sparql?default-graph-uri==PREFIX+wdt%3A+%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E+%0D%0A%0D%0ACONSTRUCT+%7B%3Fs+owl%3AequivalentClass+%3Fo%7D++%0D%0AWHERE+%7B+SERVICE+%3Chttp%3A%2F%2Fquery.wikidata.org%2Fsparql%3E++%0D%0A+%7B++%0D%0A%09%09%09%09SELECT+%3Fs+%3FequivalentClass+%3Fo%0D%0A%09%09%09%09WHERE+%7B+%3Fs+wdt%3AP1709+%3Fo+.+%0D%0A%09%09%09%09%09%09BIND+%28+wdt%3AP1709+as+%3FequivalentClass%29%0D%0A%09%09%09%09FILTER%28+REGEX%28STR%28%3Fo%29%2C+%22schema.org%22%29%29+%0D%0A%09%09%09%09%09++%7D%0D%0A+%7D%0D%0A%7D==application%2Fld%2Bjson_redir_for_subjs=121_redir_for_hrefs==3000
-- JSON-LD

[2]
http://linkeddata.uriburner.com/sparql?default-graph-uri==PREFIX+wdt%3A+%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E+%0D%0A%0D%0ACONSTRUCT+%7B%3Fs+owl%3AequivalentClass+%3Fo%7D++%0D%0AWHERE+%7B+SERVICE+%3Chttp%3A%2F%2Fquery.wikidata.org%2Fsparql%3E++%0D%0A+%7B++%0D%0A%09%09%09%09SELECT+%3Fs+%3FequivalentClass+%3Fo%0D%0A%09%09%09%09WHERE+%7B+%3Fs+wdt%3AP1709+%3Fo+.+%0D%0A%09%09%09%09%09%09BIND+%28+wdt%3AP1709+as+%3FequivalentClass%29%0D%0A%09%09%09%09FILTER%28+REGEX%28STR%28%3Fo%29%2C+%22schema.org%22%29%29+%0D%0A%09%09%09%09%09++%7D%0D%0A+%7D%0D%0A%7D==text%2Fturtle_redir_for_subjs=121_redir_for_hrefs==3000
-- RDF-Turtle

[3]
http://linkeddata.uriburner.com/sparql?default-graph-uri==PREFIX+wdt%3A+%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E+%0D%0A%0D%0ACONSTRUCT+%7B%3Fs+owl%3AequivalentClass+%3Fo%7D++%0D%0AWHERE+%7B+SERVICE+%3Chttp%3A%2F%2Fquery.wikidata.org%2Fsparql%3E++%0D%0A+%7B++%0D%0A%09%09%09%09SELECT+%3Fs+%3FequivalentClass+%3Fo%0D%0A%09%09%09%09WHERE+%7B+%3Fs+wdt%3AP1709+%3Fo+.+%0D%0A%09%09%09%09%09%09BIND+%28+wdt%3AP1709+as+%3FequivalentClass%29%0D%0A%09%09%09%09FILTER%28+REGEX%28STR%28%3Fo%29%2C+%22schema.org%22%29%29+%0D%0A%09%09%09%09%09++%7D%0D%0A+%7D%0D%0A%7D==text%2Fhtml_redir_for_subjs=121_redir_for_hrefs==3000
 
-- HTML+Microdata

[4]
http://linkeddata.uriburner.com/sparql?default-graph-uri==PREFIX+wdt%3A+%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E+%0D%0A%0D%0ACONSTRUCT+%7B%3Fs+owl%3AequivalentClass+%3Fo%7D++%0D%0AWHERE+%7B+SERVICE+%3Chttp%3A%2F%2Fquery.wikidata.org%2Fsparql%3E++%0D%0A+%7B++%0D%0A%09%09%09%09SELECT+%3Fs+%3FequivalentClass+%3Fo%0D%0A%09%09%09%09WHERE+%7B+%3Fs+wdt%3AP1709+%3Fo+.+%0D%0A%09%09%09%09%09%09BIND+%28+wdt%3AP1709+as+%3FequivalentClass%29%0D%0A%09%09%09%09FILTER%28+REGEX%28STR%28%3Fo%29%2C+%22schema.org%22%29%29+%0D%0A%09%09%09%09%09++%7D%0D%0A+%7D%0D%0A%7D==application%2Fxhtml%2Bxml_redir_for_subjs=121_redir_for_hrefs==3000
-- XHTML+RDFa

Links:

[1] DBpedia 2016-04 Announcement

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Help with SPARQL or API or something to get subcategories

2016-10-12 Thread Kingsley Idehen
On 10/11/16 2:23 PM, Thad Guidry wrote:
> Kingsley,
>
> You might ask others on the list who could help more with providing
> the WD - Schema.org mappings or get them into a format that you
> want...or an RDF dump file.
>
> I just don't have the knowledge to assist you with that via SPARQL.
>  (i.e., I am not a SPARQL guru)

Thad,

I assumed an understanding of how Wikidata models properties. Anyway,
I'll take a look, as what I requested should be possible, assuming there
isn't some new bent on RDF data modeling at work in this data.

I'll take it from here.

Thanks for your assistance :)

Kingsley
>
>
> On Tue, Oct 11, 2016 at 1:11 PM Kingsley Idehen
> <kide...@openlinksw.com> wrote:
>
> On 10/10/16 5:31 PM, Thad Guidry wrote:
>> Kingsley,
>>
>> use shortcut syntax instead. Also look at how the many examples
>> show doing things.
>> http://tinyurl.com/hkv8z7m
>
> Thad,
>
> Not getting the point you are trying to articulate i.e., the
> solution isn't comprised strictly of
> <https://www.wikidata.org/wiki/Property:P1709> relations.
>
> I am assuming <https://www.wikidata.org/wiki/Property:P1709> identifies
> equivalentClass relations i.e.,
> <https://www.wikidata.org/wiki/Property:P1709>
> owl:equivalentProperty owl:equivalentClass .
>
> You should be able to articulate the equivalent of the following,
> using SPARQL against Wikidata:
>
> SELECT DISTINCT ?s ?relation ?o
> WHERE { ?s owl:equivalentClass ?o.
> BIND (owl:equivalentClass AS ?relation)
> FILTER (isIRI(?s))
> FILTER (isIRI(?o))
>   }
> LIMIT 100
>
> Live Solution Link:
> 
> http://lod.openlinksw.com/sparql?default-graph-uri==SELECT+DISTINCT+%3Fs+%3Frelation+%3Fo%0D%0AWHERE+%7B+%3Fs+owl%3AequivalentClass+%3Fo.%0D%0ABIND+%28owl%3AequivalentClass+AS+%3Frelation%29%0D%0AFILTER+%28isIRI%28%3Fs%29%29+%0D%0A++++FILTER+%28isIRI%28%3Fo%29%29%0D%0A++%7D%0D%0ALIMIT+100=text%2Fx-html%2Btr_redir_for_subjs=121_redir_for_hrefs==3=on
>
>
> Kingsley
>>
>> On Mon, Oct 10, 2016 at 1:50 PM Kingsley Idehen
>> <kide...@openlinksw.com> wrote:
>>
>> On 10/9/16 1:34 PM, Thad Guidry wrote:
>>> Kingsley,
>>>
>>> The mappings are already in Wikidata (we still have a few
>>> properties left to map, but the classes are done, except for
>>> a few in pending.schema.org. 
>>> You can just query them in some fashion such as
>>> this: http://tinyurl.com/h9vjqd8
>>
>> Thad,
>>
>> Shouldn't the following query return all Classes
>> participating in an
>> <https://www.wikidata.org/wiki/Property:P1709> relation? :
>>
>> SELECT ?s ?relation ?o
>> WHERE { ?s <https://www.wikidata.org/wiki/Property:P1709> ?o.
>> BIND (<https://www.wikidata.org/wiki/Property:P1709> AS ?relation)
>>   }
>> LIMIT 100
>>
>> Query Link:
>> 
>> https://query.wikidata.org/#%0A%0ASELECT%20%3Fs%20%3Frelation%20%3Fo%0AWHERE%20%7B%20%3Fs%20%3Chttps%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP1709%3E%20%3Fo.%20%0A%20%20%20%20%20%20%20%20BIND%20%28%3Chttps%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP1709%3E%20AS%20%3Frelation%29%20%0A%20%20%20%20%20%20%7D%20%0ALIMIT%20100%0A
>>
>> If not, can you formulate something closer to that output
>> which makes matters easier should you not have an RDF dump
>> available etc..
>>
>>
>> Kingsley
>>
>>>
>>> On Sun, Oct 9, 2016 at 11:09 AM Kingsley Idehen
>>> <kide...@openlinksw.com> wrote:
>>>
>>> On 10/8/16 5:04 PM, Thad Guidry wrote:
>>>>
>>>> Class mappings are almost complete Kingsley. We were
>>>> trying to get Wikidata to the point where it is easy
>>>> for folks to do trivial queries right inside WDQS as part of WD
>>>> Goals for 2016.

Re: [Wikidata] Help with SPARQL or API or something to get subcategories

2016-10-11 Thread Kingsley Idehen
On 10/10/16 5:31 PM, Thad Guidry wrote:
> Kingsley,
>
> use shortcut syntax instead. Also look at how the many examples show
> doing things.
> http://tinyurl.com/hkv8z7m

Thad,

Not getting the point you are trying to articulate i.e., the solution
isn't comprised strictly of
<https://www.wikidata.org/wiki/Property:P1709> relations.

I am assuming <https://www.wikidata.org/wiki/Property:P1709> identifies
equivalentClass relations i.e.,
<https://www.wikidata.org/wiki/Property:P1709> owl:equivalentProperty
owl:equivalentClass .

You should be able to articulate the equivalent of the following, using
SPARQL against Wikidata:

SELECT DISTINCT ?s ?relation ?o
WHERE { ?s owl:equivalentClass ?o.
BIND (owl:equivalentClass AS ?relation)
FILTER (isIRI(?s))
FILTER (isIRI(?o))
  }
LIMIT 100

Live Solution Link:
http://lod.openlinksw.com/sparql?default-graph-uri==SELECT+DISTINCT+%3Fs+%3Frelation+%3Fo%0D%0AWHERE+%7B+%3Fs+owl%3AequivalentClass+%3Fo.%0D%0ABIND+%28owl%3AequivalentClass+AS+%3Frelation%29%0D%0AFILTER+%28isIRI%28%3Fs%29%29+%0D%0AFILTER+%28isIRI%28%3Fo%29%29%0D%0A++%7D%0D%0ALIMIT+100=text%2Fx-html%2Btr_redir_for_subjs=121_redir_for_hrefs==30000=on

Kingsley
>
> On Mon, Oct 10, 2016 at 1:50 PM Kingsley Idehen
> <kide...@openlinksw.com> wrote:
>
> On 10/9/16 1:34 PM, Thad Guidry wrote:
>> Kingsley,
>>
>> The mappings are already in Wikidata (we still have a few
>> properties left to map, but the classes are done, except for a
>> few in pending.schema.org.  You can
>> just query them in some fashion such as
>> this: http://tinyurl.com/h9vjqd8
>
> Thad,
>
> Shouldn't the following query return all Classes participating in
> an <https://www.wikidata.org/wiki/Property:P1709> relation? :
>
> SELECT ?s ?relation ?o
> WHERE { ?s <https://www.wikidata.org/wiki/Property:P1709> ?o.
> BIND (<https://www.wikidata.org/wiki/Property:P1709> AS ?relation)
>   }
> LIMIT 100
>
> Query Link:
> 
> https://query.wikidata.org/#%0A%0ASELECT%20%3Fs%20%3Frelation%20%3Fo%0AWHERE%20%7B%20%3Fs%20%3Chttps%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP1709%3E%20%3Fo.%20%0A%20%20%20%20%20%20%20%20BIND%20%28%3Chttps%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP1709%3E%20AS%20%3Frelation%29%20%0A%20%20%20%20%20%20%7D%20%0ALIMIT%20100%0A
>
> If not, can you formulate something closer to that output which
> makes matters easier should you not have an RDF dump available etc..
>
>
> Kingsley
>
>>
>> On Sun, Oct 9, 2016 at 11:09 AM Kingsley Idehen
>> <kide...@openlinksw.com> wrote:
>>
>> On 10/8/16 5:04 PM, Thad Guidry wrote:
>>>
>>> Class mappings are almost complete Kingsley. We were trying
>>> to get Wikidata to the point where it is easy for folks to
>>> do trivial queries right inside WDQS as part of WD Goals for
>>> 2016.
>>>
>>
>> Thad,
>>
>> Here is a post (brought up to date in last day) that
>> demonstrates owl:equivalentClass reasoning and inference.
>> Once I have a link to the new mappings I will integrate into
>> this demo.
>>
>> Note, this particular LOD Cloud Cache instance has 30
>> Billion+ triples against which the owl:equivalentClass
>> inference is being applied, at query evaluation time.
>>
>> [1]
>> 
>> http://kidehen.blogspot.com/2014/02/class-equivalence-based-reasoning.html
>>
>>
>> Kingsley
>>>
>>> On Sat, Oct 8, 2016, 2:38 PM Kingsley Idehen
>>> <kide...@openlinksw.com <mailto:kide...@openlinksw.com>> wrote:
>>>
>>> On 10/6/16 2:57 PM, Thad Guidry wrote:
>>>> Hello team :)
>>>>
>>>> So while I'm helping with the Wikidata - Schema.org
>>>> mappings, a request came in to expose subcategories of
>>>> an existing Wikipedia category.
>>>>
>>>> For example, say I start with this
>>>> topic: https://www.wikidata.org/wiki/Q27119725  Parking facilities

Re: [Wikidata] Help with SPARQL or API or something to get subcategories

2016-10-10 Thread Kingsley Idehen
On 10/9/16 1:34 PM, Thad Guidry wrote:
> Kingsley,
>
> The mappings are already in Wikidata (we still have a few properties
> left to map, but the classes are done, except for a few in
> pending.schema.org <http://pending.schema.org>).  You can just query
> them in some fashion such as this: http://tinyurl.com/h9vjqd8

Thad,

Shouldn't the following query return all Classes participating in an
<https://www.wikidata.org/wiki/Property:P1709> relation? :

SELECT ?s ?relation ?o
WHERE { ?s <https://www.wikidata.org/wiki/Property:P1709> ?o .
        BIND (<https://www.wikidata.org/wiki/Property:P1709> AS ?relation)
      }
LIMIT 100

Query Link:
https://query.wikidata.org/#%0A%0ASELECT%20%3Fs%20%3Frelation%20%3Fo%0AWHERE%20%7B%20%3Fs%20%3Chttps%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP1709%3E%20%3Fo.%20%0A%20%20%20%20%20%20%20%20BIND%20%28%3Chttps%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP1709%3E%20AS%20%3Frelation%29%20%0A%20%20%20%20%20%20%7D%20%0ALIMIT%20100%0A
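
As an aside, purely a hedged sketch: the query above uses the property's
wiki page URL as the predicate, whereas the WDQS RDF mapping puts
truthy-statement predicates in the http://www.wikidata.org/prop/direct/
namespace. The "shortcut syntax" Thad suggests would therefore look
roughly like:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?s ?relation ?o
WHERE { ?s wdt:P1709 ?o .
        BIND (wdt:P1709 AS ?relation)
      }
LIMIT 100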

If not, can you formulate something closer to that output? That would make
matters easier when an RDF dump isn't available.

Kingsley

>
> On Sun, Oct 9, 2016 at 11:09 AM Kingsley Idehen
> <kide...@openlinksw.com <mailto:kide...@openlinksw.com>> wrote:
>
> On 10/8/16 5:04 PM, Thad Guidry wrote:
>>
>> Class mappings are almost complete Kingsley. We were trying to
>> get Wikidata to the point where it is easy for folks to do
>> trivial queries right inside WDQS as part of WD Goals for 2016.
>>
>
> Thad,
>
> Here is a post (brought up to date in the last day) that demonstrates
> owl:equivalentClass reasoning and inference. Once I have a link to
> the new mappings I will integrate into this demo.
>
> Note, this particular LOD Cloud Cache instance has 30 Billion+
> triples against which the owl:equivalentClass inference is being
> applied, at query evaluation time.
>
> [1]
> http://kidehen.blogspot.com/2014/02/class-equivalence-based-reasoning.html
>
>
> Kingsley
>>
>> On Sat, Oct 8, 2016, 2:38 PM Kingsley Idehen
>> <kide...@openlinksw.com <mailto:kide...@openlinksw.com>> wrote:
>>
>> On 10/6/16 2:57 PM, Thad Guidry wrote:
>>> Hello team :)
>>>
>>> So while I'm helping with the Wikidata - Schema.org
>>> mappings, a request came in to expose subcategories of an
>>> existing Wikipedia category.
>>>
>>> For example, say I start with this
>>> topic: https://www.wikidata.org/wiki/Q27119725  Parking
>>> facilities
>>>
>>> The topic's main category is shown as "Category:Parking
>>> facilities" and that has links to Wikipedia, specifically a
>>> Wikipedia category link, and where the WP category page has
>>> subcategories that I would like to expose somehow in
>>> whichever way is *easiest* currently with our tools, apis, etc.
>>>
>>> Can it all be done in SPARQL against some services that
>>> already expose WP subcategories given a specific category?
>>> Or is there an API that does this already? Other tools
>>> that might expose WP categories?
>>>
>>> The IDEAL GOAL is to query 'equivalent class' =
>>> schema.org/ParkingFacility
>>> <http://schema.org/ParkingFacility> and get back the WP
>>> categories *in one shot or query or api call.*
>>>
>>> http://schema.org/ParkingFacility
>>>
>>>  * Parking facilities in India
>>>    <https://en.wikipedia.org/wiki/Category:Parking_facilities_in_India>
>>>  * Parking facilities in the United States
>>>    <https://en.wikipedia.org/wiki/Category:Parking_facilities_in_the_United_States>
>>>  * Aircraft hangars
>>>    <https://en.wikipedia.org/wiki/Category:Aircraft_hangars>
>>>  * Garages (parking)
>>>    <https://en.wikipedia.org/wiki/Category:Garages_%28parking%29>
>>>  * Railway depots
>>>    <https://en.wikipedia.org/wiki/Category:Railway_depots>
>>>
>>>
>>> Any gurus ?
>>
>> Hi Thad,
>>
>> If there are owl:equivalentClass mappings in some Linked Data Space,
>> and the SPARQL service associated with said Data Space supports
>> owl:equivalentClass reasoning, then the answer to your question is yes.

Re: [Wikidata] Help with SPARQL or API or something to get subcategories

2016-10-08 Thread Kingsley Idehen
On 10/6/16 2:57 PM, Thad Guidry wrote:
> Hello team :)
>
> So while I'm helping with the Wikidata - Schema.org mappings, a
> request came in to expose subcategories of an existing Wikipedia category.
>
> For example, say I start with this
> topic: https://www.wikidata.org/wiki/Q27119725  Parking facilities
>
> The topic's main category is shown as "Category:Parking facilities"
> and that has links to Wikipedia, specifically a Wikipedia category
> link, and where the WP category page has subcategories that I would
> like to expose somehow in whichever way is *easiest* currently with
> our tools, apis, etc.
>
> Can it all be done in SPARQL against some services that already expose
> WP subcategories given a specific category? Or is there an API that
> does this already? Other tools that might expose WP categories?
>
> The IDEAL GOAL is to query 'equivalent class' =
> schema.org/ParkingFacility <http://schema.org/ParkingFacility> and get
> back the WP categories *in one shot or query or api call.*
>
> http://schema.org/ParkingFacility
>
>  * Parking facilities in India
>    <https://en.wikipedia.org/wiki/Category:Parking_facilities_in_India>
>  * Parking facilities in the United States
>    <https://en.wikipedia.org/wiki/Category:Parking_facilities_in_the_United_States>
>  * Aircraft hangars
>    <https://en.wikipedia.org/wiki/Category:Aircraft_hangars>
>  * Garages (parking)
>    <https://en.wikipedia.org/wiki/Category:Garages_%28parking%29>
>  * Railway depots
>    <https://en.wikipedia.org/wiki/Category:Railway_depots>
>
>
> Any gurus ?

Hi Thad,

If there are owl:equivalentClass mappings in some Linked Data Space, and
the SPARQL service associated with said Data Space supports
owl:equivalentClass reasoning, then the answer to your question is yes.

What's unknown right now are the class mappings between Wikidata and
Schema.org. If a dump of those exists, the rest is trivial :)
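
As a minimal sketch, assuming a Virtuoso endpoint with owl:equivalentClass
reasoning enabled and the mappings loaded, a query like the following could
then bridge the two vocabularies. The inference rule-set name and the
dct:subject category predicate (borrowed from DBpedia's conventions) are
illustrative assumptions, not the actual setup:

DEFINE input:inference 'urn:owl:equivalence'  # assumed rule-set name
PREFIX schema: <http://schema.org/>
PREFIX dct:    <http://purl.org/dc/terms/>

SELECT DISTINCT ?category
WHERE {
  # With equivalence reasoning on, instances typed with any class mapped
  # to schema:ParkingFacility are matched here too.
  ?facility a schema:ParkingFacility .
  # Assumption: category membership is exposed via dct:subject.
  ?facility dct:subject ?category .
}
LIMIT 100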

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: 
http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Photographers' Identities Catalog (& WikiData)

2016-04-02 Thread Kingsley Idehen
On 3/25/16 5:00 PM, David Lowe wrote:
> Hi all,
>
> This is an old thread now, but I thought I'd update you that NYPL has
> now launched Photographers' Identities Catalog. Read about it here
> <http://www.nypl.org/blog/2016/03/25/introducing-pic>, or skip
> straight to the site at pic.nypl.org <http://pic.nypl.org> .
> I hope it may be of interest to some of you.
> Thanks,
> David

David,

Very nice!

I had to search a bit, but eventually found: http://on.nypl.org/25DhGDm .


Kingsley
>
>
> On Tue, Dec 15, 2015 at 3:55 PM, Gerard Meijssen
> <gerard.meijs...@gmail.com <mailto:gerard.meijs...@gmail.com>> wrote:
>
> Hoi,
> Sorry, I understand sarcasm but I do not understand what it is
> based upon.
> Thanks,
>  GerardM
>
> On 15 December 2015 at 20:10, John Erling Blad <jeb...@gmail.com
> <mailto:jeb...@gmail.com>> wrote:
>
> There are some pretty good methods for optimizing the match
> process, but I have not seen any implementation for that
> against Wikidata items. Only things I've seen are some
> opportunistic methods. Duck tests gone wrong, or "Darn it was
> a platypus!"
>
> On Mon, Dec 14, 2015 at 11:19 PM, André Costa
> <andre.co...@wikimedia.se <mailto:andre.co...@wikimedia.se>>
> wrote:
>
> I'm planning to bring a few of the datasets into
> mix'n'match (@Magnus this is the one I asked about on
> Twitter) in January, but not all of them are suitable, and I
> believe separating KulturNav into multiple datasets on
> mix'n'match makes more sense and makes it more likely that
> they get matched.
>
> Some of the early adopters of KulturNav have been working
> with WMSE to facilitate bi-directional matching. This is
> done on a dataset-by-dataset level since different
> institutions are responsible for different datasets. My
> hope is that mix'n'match will help in this area as well,
> even as a tool for the institutions own staff who are
> often interested in matching entries to Wikipedia (which
> most of the time means wikidata).
>
> @John: There are processes for matching kulturnav
> identifiers to wikidata entities. Only afterwards are
> details imported. Mainly to source statements [1] and [2].
> There is some (not so user friendly) stats at [3].
>
> Cheers,
> André
>
> 
> [1]https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/L_PBot_2
> 
> [2]https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/L_PBot_3
> [3] https://tools.wmflabs.org/lp-tools/misc/data/
> --
> André Costa
> GLAM developer
> Wikimedia Sverige
>
> Magnus Manske, 13/12/2015 11:24:
>
> >
> > Since no one mentioned it, there is a tool to do the
> matching to WD much
> > more efficiently:
> > https://tools.wmflabs.org/mix-n-match/
> <https://tools.wmflabs.org/mix-n-match/>
>
> +1
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
>     _______
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software 
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this

Re: [Wikidata] SPARQL CONSTRUCT results truncated

2016-02-16 Thread Kingsley Idehen
On 2/13/16 6:29 PM, Markus Kroetzsch wrote:
> On 13.02.2016 23:56, Kingsley Idehen wrote:
>> On 2/13/16 4:56 PM, Markus Kroetzsch wrote:
> ...
>>
>> For a page-size of 20 (covered by LIMIT) you can move through offsets of
>> 20 via:
>
> To clarify: I just added the LIMIT to prevent unwary readers from
> killing their browser on a 100MB HTML result page. The server does not
> need it at all and can give you all results at once. Online
> applications may still want to scroll results, I agree, but for the OP
> it would be more useful to just download one file here.
>
> Markus

Scrolling or paging through query solutions is a technique that benefits
both clients and servers. Understanding the concept has to be part of the
narrative for working with SPARQL query solutions.

This is about flexibility through use of the full functionality of SPARQL,
since most developers and users simply execute queries without factoring in
these techniques or their queries' impact on other users of the system.

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software 
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] SPARQL CONSTRUCT results truncated

2016-02-16 Thread Kingsley Idehen
On 2/13/16 6:26 PM, Markus Kroetzsch wrote:
> On 13.02.2016 23:50, Kingsley Idehen wrote:
> ...
>> Markus and others interested in this matter,
>>
>> What about using OFFSET and LIMIT to address this problem? That's what
>> we advise users of the DBpedia endpoint (and others we publish) to do.
>>
>> We have to educate people about query implications and options. Even
>> after that, you have the issue of timeouts (which aren't part of the
>> SPARQL spec) that can be used to produce partial results (notified via
>> HTTP headers), but that's something that comes after the basic scrolling
>> functionality of OFFSET and LIMIT is understood.
>
> I think this does not help here. If I only ask for part of the data
> (see my previous email), I can get all 300K results in 9.3sec. The
> size of the result does not seem to be the issue. If I add further
> joins to the query, the time needed seems to go above 10sec (timeout)
> even with a LIMIT. Note that you need to order results for using LIMIT
> in a reliable way, since the data changes by the minute and the
> "natural" order of results would change as well. I guess with a
> blocking operator like ORDER BY in the equation, the use of LIMIT does
> not really save much time (other than for final result serialisation
> and transfer, which seems pretty quick).
>
> Markus

Markus,

LIMIT isn't the key element in my example since all it does is set
cursor size. It's the use of OFFSET to move the cursor through positions
in the solution that's key here.

Fundamentally, this is about using HTTP GET requests to page through the
data if a single query solution is either too large or its preparation
exceeds underlying DBMS timeout settings.

Ultimately, developers have to understand these time-tested techniques
for working with data.
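
A minimal sketch of the pattern (the owl:equivalentClass predicate is just
illustrative here; note the ORDER BY, which keeps pages stable even while
the underlying data changes):

PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT ?s ?o
WHERE { ?s owl:equivalentClass ?o . }
ORDER BY ?s ?o   # deterministic order => stable pages across requests
LIMIT 1000       # cursor (page) size
OFFSET 2000      # cursor position: the third page of 1000

# Each HTTP GET re-issues the query with OFFSET advanced by LIMIT
# (0, 1000, 2000, ...) until a page returns fewer than LIMIT rows.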

Kingsley
>
>>
>> [1]
>> http://stackoverflow.com/questions/20937556/how-to-get-all-companies-from-dbpedia
>>
>> [2] https://sourceforge.net/p/dbpedia/mailman/message/29172307/
>>
>>
>>
>> _______
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
>


-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software 
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] SPARQL CONSTRUCT results truncated

2016-02-13 Thread Kingsley Idehen
On 2/11/16 9:25 AM, Markus Krötzsch wrote:
> On 11.02.2016 15:01, Gerard Meijssen wrote:
>> Hoi,
>> What I hear is that the intentions were wrong in that you did not
>> anticipate people to get actual meaningful requests out of it.
>>
>> When you state "we have two choices", you imply that it is my choice as
>> well. It is not. The answer that I am looking for is yes, it does not
>> function as we would like, we are working on it and in the mean time we
>> will ensure that toolkit is available on Labs for the more complex
>> queries.
>>
>> Wikidata is a service and the service is in need of being better.
>
> Gerard, do you realise how far away from technical reality your wishes
> are? We are far ahead of the state of the art in what we already have
> for Wikidata: two powerful live query services + a free toolkit for
> batch analyses + several Web APIs for live lookups. I know of no site
> of this scale that is anywhere near this in terms of functionality.
> You can always ask for more, but you should be a bit reasonable too,
> or people will just ignore you.
>
> Markus 

Markus and others interested in this matter,

What about using OFFSET and LIMIT to address this problem? That's what
we advise users of the DBpedia endpoint (and others we publish) to do.

We have to educate people about query implications and options. Even
after that, you have the issue of timeouts (which aren't part of the
SPARQL spec) that can be used to produce partial results (notified via
HTTP headers), but that's something that comes after the basic scrolling
functionality of OFFSET and LIMIT is understood.

[1]
http://stackoverflow.com/questions/20937556/how-to-get-all-companies-from-dbpedia
[2] https://sourceforge.net/p/dbpedia/mailman/message/29172307/

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software 
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Use of Sparql service is going through the roof

2015-11-06 Thread Kingsley Idehen
On 11/6/15 9:27 AM, James Heald wrote:
> Does anyone know what's going on with the Sparql service?
>
> Up until a couple of days ago, the most hits ever in one day was about
> 6000.
>
> But according to
>  http://searchdata.wmflabs.org/wdqs/
>
> two days ago suddenly there were 6.77 *million* requests, and
> yesterday over 21 million.
>
> Does anyone know what sort of requests these are, and whether they are
> all coming from the same place?
>
>-- James. 

Look up #SPARQL on Twitter :)

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software 
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Use of Sparql service is going through the roof

2015-11-06 Thread Kingsley Idehen
On 11/6/15 1:04 PM, Mikhail Popov wrote:
> Hi! We looked at the logs. 21,740,641 requests are coming from a
> single IP without a user agent that we can't geolocate because it's in
> the 10.x.x.x (private) range.
>
> Looking into the actual queries revealed that it's probably a broken
> bot. Stas said "the query makes no sense and is broken" and that it
> "looks like somebody trying to download whole DB in very weird way but
> is doing it all wrong."
>
> We are investigating the issue.
>
> – *Mikhail Popov*// Data Analyst, Discovery

That will always happen; folks always want to dump the entire DB.

It takes a while for clarity to arise.

This has been the DBpedia experience for years.

[1]
https://docs.google.com/document/d/12VljKl-yDNBoMGb_FnQWiXDAaZC3VnQHqy-E9iD8Mz4/edit
-- DBpedia Usage Report

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software 
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this



smime.p7s
Description: S/MIME Cryptographic Signature
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


  1   2   >