Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread sushil dutt
Please remove me from these conversations.

On Wed, Nov 1, 2017 at 5:02 AM, Stas Malyshev wrote:

> Hi!
>
> > OK. I wonder though, if it would be possible to setup a regular HDT
> > dump alongside the already regular dumps. Looking at the dumps page,
> > https://dumps.wikimedia.org/wikidatawiki/entities/, it looks like a
> > new dump is generated once a week more or less. So if a HDT dump
> > could
>
> True, the dumps run weekly. A "more or less" situation can arise only if
> one of the dumps fails (either due to a bug or some sort of external
> force majeure).
> --
> Stas Malyshev
> smalys...@wikimedia.org
>



-- 
Regards,
Sushil Dutt
8800911840


[Wikidata-bugs] [Maniphest] [Claimed] T179228: geoPrecision exported to RDF as decimal, but is in fact float

2017-10-31 Thread Smalyshev
Smalyshev claimed this task.
Task: https://phabricator.wikimedia.org/T179228


[Wikidata-bugs] [Maniphest] [Changed Project Column] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread Smalyshev
Smalyshev moved this task from Needs triage to Current work on the Discovery-Search board. Smalyshev edited projects: added Discovery-Search (Current work); removed Discovery-Search.
Task: https://phabricator.wikimedia.org/T179428
Workboard: https://phabricator.wikimedia.org/project/board/1849/


[Wikidata-bugs] [Maniphest] [Claimed] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread Smalyshev
Smalyshev claimed this task.
Task: https://phabricator.wikimedia.org/T179428


[Wikidata-bugs] [Maniphest] [Claimed] T179061: pasting full URL into entity selector no longer works

2017-10-31 Thread Smalyshev
Smalyshev claimed this task.
Task: https://phabricator.wikimedia.org/T179061


[Wikidata-bugs] [Maniphest] [Updated] T179453: Wikibase prefix search does not find other language label

2017-10-31 Thread gerritbot
gerritbot added a project: Patch-For-Review.
Task: https://phabricator.wikimedia.org/T179453


[Wikidata-bugs] [Maniphest] [Commented On] T179453: Wikibase prefix search does not find other language label

2017-10-31 Thread gerritbot
gerritbot added a comment.
Change 387758 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/Wikibase@master] Use "should" instead of "must" in query

https://gerrit.wikimedia.org/r/387758

Task: https://phabricator.wikimedia.org/T179453


[Wikidata-bugs] [Maniphest] [Triaged] T179453: Wikibase prefix search does not find other language label

2017-10-31 Thread Smalyshev
Smalyshev triaged this task as "Normal" priority.
Task: https://phabricator.wikimedia.org/T179453


[Wikidata-bugs] [Maniphest] [Created] T179453: Wikibase prefix search does not find other language label

2017-10-31 Thread Smalyshev
Smalyshev created this task. Smalyshev added projects: Wikidata, Discovery-Search (Current work). Herald added a subscriber: Aklapper.
Task description: Prefix search for "Instytut Matematyczny" (even in English) should find Q11713106, since the Polish label is "Instytut Matematyczny PAN". That does not happen.

Task: https://phabricator.wikimedia.org/T179453


[Wikidata-bugs] [Maniphest] [Changed Subscribers] T179453: Wikibase prefix search does not find other language label

2017-10-31 Thread Smalyshev
Smalyshev added a subscriber: dcausse. Smalyshev added a comment.
@dcausse's comment:

This is because the list of scoring clauses is inside a MUST:

query: {
  query: {
    bool: {
      should: [
        {
          bool: {
            filter: [
              {
                match: {
                  labels_all.prefix: "Instytut Matematyczny"
                }
              }
            ],
            must: [          <= this should be a SHOULD
              {
                dis_max: {

Task: https://phabricator.wikimedia.org/T179453
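To make the suggested change concrete, here is a minimal sketch (a Python dict in Elasticsearch query-DSL shape, not the actual Wikibase PHP code) of the structure the fix in change 387758 aims for: the prefix match stays a filter, while the scoring clauses sit under "should" so they only affect ranking and no longer exclude matches such as the Polish label above. The dis_max body is a placeholder.

```python
# Hypothetical sketch of the corrected query shape described above: the
# prefix match remains a filter, and the scoring clauses move from
# "must" to "should" so they boost results without filtering them out.
# Field name and search text are taken from the snippet above; the
# dis_max body is a placeholder, not the real Wikibase query.
corrected_query = {
    "query": {
        "bool": {
            "should": [
                {
                    "bool": {
                        "filter": [
                            {"match": {"labels_all.prefix": "Instytut Matematyczny"}}
                        ],
                        "should": [  # was: "must"
                            {"dis_max": {"queries": []}}  # per-language label scorers (placeholder)
                        ],
                    }
                }
            ]
        }
    }
}
```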


[Wikidata-bugs] [Maniphest] [Commented On] T178661: Drop wb_entity_per_page views in Wiki Replicas

2017-10-31 Thread bd808
bd808 added a comment.
Actually we probably just need maintain-views --databases wikidatawiki --clean rather than all wikis.

Task: https://phabricator.wikimedia.org/T178661


[Wikidata-bugs] [Maniphest] [Updated] T178661: Drop wb_entity_per_page views in Wiki Replicas

2017-10-31 Thread bd808
bd808 added a comment.
The view was removed by rOPUPb787dc9b4410: labs: do not replicate wb_entity_per_page table.

(u3518@wikidatawiki.labsdb) [wikidatawiki_p]> describe wb_entity_per_page;
ERROR 1356 (HY000): View 'wikidatawiki_p.wb_entity_per_page' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them

I think this means we need to run maintain-views --all-databases --clean to drop views that are no longer defined in the config. The script currently does not support dropping a single view.

Task: https://phabricator.wikimedia.org/T178661


[Wikidata-bugs] [Maniphest] [Changed Status] T168214: Updater misses updates when two updates happen very close to each other

2017-10-31 Thread Smalyshev
Smalyshev changed the task status from "Open" to "Stalled". Smalyshev lowered the priority of this task from "High" to "Normal".
Task: https://phabricator.wikimedia.org/T168214


[Wikidata-bugs] [Maniphest] [Commented On] T179228: geoPrecision exported to RDF as decimal, but is in fact float

2017-10-31 Thread gerritbot
gerritbot added a comment.
Change 387754 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/Wikibase@master] Type GlobeCoordinate values as floats which they are

https://gerrit.wikimedia.org/r/387754

Task: https://phabricator.wikimedia.org/T179228


[Wikidata-bugs] [Maniphest] [Raised Priority] T179228: geoPrecision exported to RDF as decimal, but is in fact float

2017-10-31 Thread Smalyshev
Smalyshev raised the priority of this task from "Low" to "Normal". Smalyshev added a comment.
I disagree with "low" priority - producing invalid RDF is bad.

Task: https://phabricator.wikimedia.org/T179228
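To illustrate the decimal-versus-float distinction being raised here, a hypothetical rdflib snippet (not the actual Wikibase RDF builder; the value node URI and the precision value are invented for the example) showing the same number typed as xsd:decimal versus xsd:double:

```python
# Hypothetical illustration of the datatype issue in T179228 - not the
# actual Wikibase RDF export code. The value node URI and the precision
# value below are invented for the example.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import XSD

WIKIBASE = Namespace("http://wikiba.se/ontology#")
g = Graph()
g.bind("wikibase", WIKIBASE)
value_node = URIRef("http://www.wikidata.org/value/example")

# What the dump currently emits (decimal) vs. the float typing the task asks for:
g.add((value_node, WIKIBASE.geoPrecision, Literal("0.000277778", datatype=XSD.decimal)))
g.add((value_node, WIKIBASE.geoPrecision, Literal("0.000277778", datatype=XSD.double)))

print(g.serialize(format="turtle"))
```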


[Wikidata-bugs] [Maniphest] [Updated] T179228: geoPrecision exported to RDF as decimal, but is in fact float

2017-10-31 Thread gerritbot
gerritbot added a project: Patch-For-Review.
Task: https://phabricator.wikimedia.org/T179228


[Wikidata-bugs] [Maniphest] [Edited] T176593: Reload WDQS dataset

2017-10-31 Thread Smalyshev
Smalyshev updated the task description.
Changes to task description: After T121274 and T175948 are merged and deployed, we will need to reload the data for all servers. If we want to do any other data modifications, we may attach them to this ticket too. Dependencies:

[x] {T121274} - in 1.31-wmf.5
[ ] {T175948}
[ ] {T179228}

Task: https://phabricator.wikimedia.org/T176593


[Wikidata-bugs] [Maniphest] [Unblock] T179091: Add case-insensitive title match capability for Wikidata search

2017-10-31 Thread Smalyshev
Smalyshev closed subtask T179045: Wikibase prefix search for IDs is case sensitive as "Resolved".
Task: https://phabricator.wikimedia.org/T179091


[Wikidata-bugs] [Maniphest] [Block] T174519: [epic] SDoC: Determine baseline for metrics

2017-10-31 Thread debt
debt created subtask T179450: Documentation of findings.
Task: https://phabricator.wikimedia.org/T174519


[Wikidata-bugs] [Maniphest] [Created] T179450: Documentation of findings

2017-10-31 Thread debt
debt created this task. debt added projects: Wikidata, Discovery, Structured-Data-Commons, Discovery-Analysis (Current work).
Task description: Once we finish with all the baseline metrics for SDoC, we'll want to document them. @Nuria has suggested that we post them here: https://meta.wikimedia.org/wiki/Structured_Data_on_Commons.

When we're ready to post, we'll reach out to @Ramsey-WMF to confirm the final location of our findings.

Task: https://phabricator.wikimedia.org/T179450


Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Stas Malyshev
Hi!

> OK. I wonder though, if it would be possible to setup a regular HDT
> dump alongside the already regular dumps. Looking at the dumps page,
> https://dumps.wikimedia.org/wikidatawiki/entities/, it looks like a
> new dump is generated once a week more or less. So if a HDT dump
> could

True, the dumps run weekly. A "more or less" situation can arise only if
one of the dumps fails (either due to a bug or some sort of external
force majeure).
-- 
Stas Malyshev
smalys...@wikimedia.org



[Wikidata-bugs] [Maniphest] [Commented On] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread gerritbot
gerritbot added a comment.
Change 387749 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/Wikidata@wmf/1.31.0-wmf.6] Allow turning Cirrus usage off from query

https://gerrit.wikimedia.org/r/387749

Task: https://phabricator.wikimedia.org/T179428


[Wikidata-bugs] [Maniphest] [Commented On] T112715: Enable different URL shorteners for WDQS

2017-10-31 Thread Framawiki
Framawiki added a comment.
A cheer to the person who set up the tinyurl.com links - a domain blacklisted on Wikipedia. It seems terribly logical that it's still active...

Task: https://phabricator.wikimedia.org/T112715


Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Jérémie Roquet
2017-10-31 21:27 GMT+01:00 Laura Morales :
>> I've just loaded the provided hdt file on a big machine (32 GiB wasn't
>> enough to build the index but ten times this is more than enough)
> Could you please share a bit about your setup? Do you have a machine with 
> 320GB of RAM?

It's a machine with 378 GiB of RAM and 64 threads running Scientific
Linux 7.2, that we use mainly for benchmarks.

Building the index was really all about memory, because the CPUs actually
have lower per-thread performance (2.30 GHz vs 3.5 GHz) than those of my
regular workstation, which was unable to build it.

> Could you please also try to convert wikidata.ttl to hdt using "rdf2hdt"? I'd 
> be interested to read your results on this too.

As I'm also looking for up-to-date results, I plan to do it with the
latest Turtle dump as soon as I have a time slot for it; I'll let you
know about the outcome.

>> I'll try to run a few queries to see how it behaves.
>
> I don't think there is a command-line tool to parse SPARQL queries, so you 
> probably have to setup a Fuseki endpoint which uses HDT as a data source.

You're right. The limited query language of hdtSearch is closer to
grep than to SPARQL.

Thank you for pointing out Fuseki, I'll have a look at it.

-- 
Jérémie



[Wikidata-bugs] [Maniphest] [Commented On] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread gerritbot
gerritbot added a comment.
Change 387662 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/Wikibase@wmf/1.31.0-wmf.6] Allow turning Cirrus usage off from query

https://gerrit.wikimedia.org/r/387662

Task: https://phabricator.wikimedia.org/T179428


Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Laura Morales
> I've just loaded the provided hdt file on a big machine (32 GiB wasn't
> enough to build the index but ten times this is more than enough)


Could you please share a bit about your setup? Do you have a machine with 320GB 
of RAM?
Could you please also try to convert wikidata.ttl to hdt using "rdf2hdt"? I'd 
be interested to read your results on this too.
Thank you!


> I'll try to run a few queries to see how it behaves.


I don't think there is a command-line tool to parse SPARQL queries, so you 
probably have to setup a Fuseki endpoint which uses HDT as a data source.



Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Ghislain ATEMEZING
Hello,
Please don't get me wrong, and don't read anything into my question.
Since the beginning of this thread, I have also been trying to push the use of HDT
here. For example, I was the one contacting the HDT gurus to fix the dataset error
on Twitter, and so on...

Sorry if Laura or anyone else thought I was giving "some lessons here". I don't
have a supercomputer either, nor am I a member of the Wikidata team. Just a "data
consumer", like many here...

Best,
Ghislain 

Sent from my iPhone, may include typos

> On Oct 31, 2017 at 20:44, Luigi Assom  wrote:
> 
> Doh what's wrong with asking for supporting own user case "UC" ?
> 
> I think it is a totally legit question to ask, and that's why this thread 
> exists.
> 
> Also, I do support for possibility to help access to data that would be hard 
> to process from "common" hardware. Especially in the case of open data.
> They exists to allow someone take them and build them - amazing if can 
> prototype locally, right?
> 
> I don't like the use case where a data-scientist-or-IT show to the other 
> data-scientist-or-IT own work looking for emotional support or praise.
> I've seen that, not here, and I hope this attitude stays indeed out from 
> here..
> 
> I do like when the work of data-scientist-or-IT ignites someone else's 
> creativity - someone who is completely external - , to say: hey your work is 
> cool and I wanna use it for... my use case!
> That's how ideas go around and help other people build complexity over them, 
> without constructing not necessary borders.
> 
> About a local version of compressed, index RDF - I think that if was 
> available, more people yes probably would use it.
> 
> 
> 
>> On Tue, Oct 31, 2017 at 4:03 PM, Laura Morales  wrote:
>> I feel like you are misrepresenting my request, and possibly trying to 
>> offend me as well.
>> 
>> My "UC" as you call it, is simply that I would like to have a local copy of 
>> wikidata, and query it using SPARQL. Everything that I've tried so far 
>> doesn't seem to work on commodity hardware since the database is so large. 
>> But HDT could work. So I asked if a HDT dump could, please, be added to 
>> other dumps that are periodically generated by wikidata. I also told you 
>> already that *I AM* trying to use the 1 year old dump, but in order to use 
>> the HDT tools I'm told that I *MUST* generate some other index first which 
>> unfortunately I can't generate for the same reasons that I can't convert the 
>> Turtle to HDT. So what I was trying to say is, that if wikidata were to add 
>> any HDT dump, this dump should contain both the .hdt file and .hdt.index in 
>> order to be useful. That's about it, and it's not just about me. Anybody who 
>> wants to have a local copy of wikidata could benefit from this, since 
>> setting up a .hdt file seems much easier than a Turtle dump. And I don't 
>> understand why you're trying to blame me for this?
>> 
>> If you are part of the wikidata dev team, I'd greatly appreciate a 
>> "can/can't" or "don't care" response rather than playing the 
>> passive-aggressive game that you displayed in your last email.
>> 
>> 
>> > Let me try to understand ...
>> > You are a "data consumer" with the following needs:
>> >   - Latest version of the data
>> >   - Quick access to the data
>> >   - You don't want to use the current ways to access the data by the 
>> > publisher (endpoint, ttl dumps, LDFragments)
>> >  However, you ask for a binary format (HDT), but you don't have enough 
>> > memory to set up your own environment/endpoint due to lack of memory.
>> > For that reason, you are asking the publisher to support both .hdt and 
>> > .hdt.index files.
>> >
>> > Do you think there are many users with your current UC?
>> 


[Wikidata-bugs] [Maniphest] [Updated] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread ReleaseTaggerBot
ReleaseTaggerBot added a project: MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)).
TASK DETAILhttps://phabricator.wikimedia.org/T179428EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ReleaseTaggerBotCc: gerritbot, EBernhardson, dcausse, Aklapper, Smalyshev, Lahi, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, EBjune, Lewizho99, Maathavan, Wikidata-bugs, aude, Mbch331___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Luigi Assom
Doh, what's wrong with asking for support for one's own use case ("UC")?

I think it is a totally legit question to ask, and that's why this thread
exists.

Also, I do support the idea of helping people access data that would be
hard to process on "common" hardware, especially in the case of open data.
Open data exists so that someone can take it and build on it - amazing if
you can prototype locally, right?

I don't like the use case where a data scientist or IT person shows their
own work to other data scientists or IT people just looking for emotional
support or praise.
I've seen that - not here - and I hope that attitude does stay out of
here.

I do like it when the work of a data scientist or IT person ignites someone
else's creativity - someone who is completely external - so they say: hey,
your work is cool and I want to use it for... my use case!
That's how ideas go around and help other people build complexity on top of
them, without erecting unnecessary borders.

About a local version of compressed, indexed RDF - I think that if it were
available, more people probably would use it.



On Tue, Oct 31, 2017 at 4:03 PM, Laura Morales  wrote:

> I feel like you are misrepresenting my request, and possibly trying to
> offend me as well.
>
> My "UC" as you call it, is simply that I would like to have a local copy
> of wikidata, and query it using SPARQL. Everything that I've tried so far
> doesn't seem to work on commodity hardware since the database is so large.
> But HDT could work. So I asked if a HDT dump could, please, be added to
> other dumps that are periodically generated by wikidata. I also told you
> already that *I AM* trying to use the 1 year old dump, but in order to use
> the HDT tools I'm told that I *MUST* generate some other index first which
> unfortunately I can't generate for the same reasons that I can't convert the
> Turtle to HDT. So what I was trying to say is, that if wikidata were to add
> any HDT dump, this dump should contain both the .hdt file and .hdt.index in
> order to be useful. That's about it, and it's not just about me. Anybody
> who wants to have a local copy of wikidata could benefit from this, since
> setting up a .hdt file seems much easier than a Turtle dump. And I don't
> understand why you're trying to blame me for this?
>
> If you are part of the wikidata dev team, I'd greatly appreciate a
> "can/can't" or "don't care" response rather than playing the
> passive-aggressive game that you displayed in your last email.
>
>
> > Let me try to understand ...
> > You are a "data consumer" with the following needs:
> >   - Latest version of the data
> >   - Quick access to the data
> >   - You don't want to use the current ways to access the data by the
> publisher (endpoint, ttl dumps, LDFragments)
> >  However, you ask for a binary format (HDT), but you don't have enough
> memory to set up your own environment/endpoint due to lack of memory.
> > For that reason, you are asking the publisher to support both .hdt and
> .hdt.index files.
> >
> > Do you think there are many users with your current UC?
>


[Wikidata-bugs] [Maniphest] [Commented On] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread gerritbot
gerritbot added a comment.
Change 387631 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Allow turning Cirrus usage off from query

https://gerrit.wikimedia.org/r/387631

Task: https://phabricator.wikimedia.org/T179428


[Wikidata-bugs] [Maniphest] [Lowered Priority] T172246: Partial support to rollout of editing wikidata descriptions on Android for en.wp

2017-10-31 Thread Elitre
Elitre lowered the priority of this task from "Normal" to "Lowest".
Task: https://phabricator.wikimedia.org/T172246


[Wikidata-bugs] [Maniphest] [Raised Priority] T172246: Partial support to rollout of editing wikidata descriptions on Android for en.wp

2017-10-31 Thread Elitre
Elitre moved this task from October to November on the Community-Liaisons (Oct-Dec 2017) board. Elitre raised the priority of this task from "Lowest" to "Normal".

Task: https://phabricator.wikimedia.org/T172246
Workboard: https://phabricator.wikimedia.org/project/board/2706/


[Wikidata-bugs] [Maniphest] [Commented On] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread gerritbot
gerritbot added a comment.
Change 387631 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/Wikibase@master] Allow turning Cirrus usage off from query

https://gerrit.wikimedia.org/r/387631

Task: https://phabricator.wikimedia.org/T179428


[Wikidata-bugs] [Maniphest] [Updated] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread gerritbot
gerritbot added a project: Patch-For-Review.
Task: https://phabricator.wikimedia.org/T179428


[Wikidata-bugs] [Maniphest] [Commented On] T177486: [Tracking] Wikidata entity dumpers need to cope with the immense Wikidata growth recently

2017-10-31 Thread ArielGlenn
ArielGlenn added a comment.

In T177486#3724106, @hoo wrote:
(Probably) due to the DataModel updates the current JSON dump was created in just 25 hours, compared to ~34-35h last week. (This is data from one run only, so not overly reliable… but the difference is huge)


If all future runs turn out that way, this is very good news! Looking forward to the other optimizations too.

Task: https://phabricator.wikimedia.org/T177486


[Wikidata-bugs] [Maniphest] [Created] T179428: Can not enable old SQL prefix search mode on wikidata

2017-10-31 Thread Smalyshev
Smalyshev created this task. Smalyshev added projects: Wikidata, Discovery-Search. Herald added a subscriber: Aklapper.
Task description: When useCirrus is set to true in the Wikidata config, it is not possible to call the old SQL search with an explicit useCirrus=false on the command line. I think we should keep this capability, at least for a while, e.g. to check for differences between the old code and the new, and maybe for BC cases.

Task: https://phabricator.wikimedia.org/T179428


Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Jérémie Roquet
2017-10-31 14:56 GMT+01:00 Laura Morales :
> 1. I have downloaded it and I'm trying to use it, but the HDT tools (eg. 
> query) require to build an index before I can use the HDT file. I've tried to 
> create the index, but I ran out of memory again (even though the index is 
> smaller than the .hdt file itself). So any Wikidata dump should contain both 
> the .hdt file and the .hdt.index file unless there is another way to generate 
> the index on commodity hardware

I've just loaded the provided hdt file on a big machine (32 GiB wasn't
enough to build the index but ten times this is more than enough), so
here are a few interesting metrics:
 - the index alone is ~14 GiB uncompressed, ~9 GiB gzipped and ~6.5 GiB xzipped;
 - once loaded in hdtSearch, Wikidata uses ~36 GiB of virtual memory;
 - right after index generation, it includes ~16 GiB of anonymous memory (with no memory pressure, that's ~26 GiB resident)…
 - …but after a reload, the index is memory mapped as well, so it only includes ~400 MiB of anonymous memory (and a mere ~1.2 GiB resident).

Looks like a good candidate for commodity hardware, indeed. It loads
in less than one second on a 32 GiB machine. I'll try to run a few
queries to see how it behaves.
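For anyone who wants to poke at the file programmatically rather than through hdtSearch, a minimal sketch, assuming the pyHDT ("hdt") Python bindings and a local wikidata.hdt file; the file name and example entity are illustrative, not from this thread. As far as I know, pyHDT generates the side-car .hdt.index on first load if it is missing, which is the memory-hungry step discussed above.

```python
# Minimal sketch using the pyHDT bindings (assumption: the "hdt" package
# is installed and wikidata.hdt sits in the working directory). On first
# load the missing .hdt.index is generated, which is the memory-hungry
# step discussed in this thread; later loads just memory-map both files.
from hdt import HDTDocument

doc = HDTDocument("wikidata.hdt")

# Triple-pattern lookup (closer to grep than to SPARQL, as noted in this
# thread): every triple whose subject is Douglas Adams (Q42).
triples, cardinality = doc.search_triples(
    "http://www.wikidata.org/entity/Q42", "", "")
print("matching triples:", cardinality)
for s, p, o in triples:
    print(s, p, o)
```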

FWIW, my use case is very similar to yours, as I'd like to run queries
that are too long for the public SPARQL endpoint and can't dedicate a
powerful machine to this full time (Blazegraph runs fine with 32 GiB,
though — it just takes a while to index, and updating is not as fast as
the changes happening on wikidata.org).

-- 
Jérémie



Re: [Wikidata] Wikimedia Blog - Wikidata at Five

2017-10-31 Thread David McDonell
Great article! Thank you, Andrew and Rob!!

On Tue, Oct 31, 2017 at 11:41 AM Andrew Lih  wrote:

> Here’s a piece I wrote with Rob Fernandez for the Wikimedia blog about
> Wikidata at five and Wikidatacon.
>
> https://blog.wikimedia.org/2017/10/30/wikidata-fifth-birthday/
>
> -Andrew
>
-- 
David McDonell
Co-founder & CEO, ICONICLOUD, Inc. - "Illuminating the cloud"
M: 703-864-1203  EM: da...@iconicloud.com  URL: http://iconicloud.com


[Wikidata-bugs] [Maniphest] [Commented On] T177486: [Tracking] Wikidata entity dumpers need to cope with the immense Wikidata growth recently

2017-10-31 Thread hoo
hoo added a comment.
(Probably) due to the DataModel updates the current JSON dump was created in just 25 hours, compared to ~34-35h last week. (This is data from one run only, so not overly reliable… but the difference is huge)

Task: https://phabricator.wikimedia.org/T177486


[Wikidata-bugs] [Maniphest] [Commented On] T121274: Provide an RDF mapping for external identifiers

2017-10-31 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.
Ah, I didn’t think about formatter URLs, which pose the same problem for HTML renderings. Good point.

Task: https://phabricator.wikimedia.org/T121274


[Wikidata-bugs] [Maniphest] [Updated] T121274: Provide an RDF mapping for external identifiers

2017-10-31 Thread hoo
hoo added a comment.

In T121274#3722773, @Ladsgroup wrote:
In order to update that we need to run rebuildPropertyInfo script I guess.


That should in fact happen on edit.

There are no short/mid term plans to purge the cache in these cases… see also T112081: [Story] purge cached renderings of IDs when the formatter URL changes. If such a URL changes, this would show up in the next dump (this is not cached) and on Special:EntityData (about ~24h after the change). But the query service might have the old one in certain places for quite a while :/

Task: https://phabricator.wikimedia.org/T121274


[Wikidata-bugs] [Maniphest] [Commented On] T179312: robots.txt prevents indexing of Special:EntityData

2017-10-31 Thread hoo
hoo added a comment.
Well, we could allow this, I guess… but we should at least set a canonical URL (or one per output?) as header (we can't put it in the html here, as there's none).

This is probably interesting especially as we already put the various EntityData URLs into our regular URLs as .

Task: https://phabricator.wikimedia.org/T179312


[Wikidata-bugs] [Maniphest] [Edited] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-31 Thread hoo
hoo updated the task description.
Changes to task description: ...[x] Re-enable ORES on wikidata (T179107)...

Task: https://phabricator.wikimedia.org/T179156


[Wikidata] Wikimedia Blog - Wikidata at Five

2017-10-31 Thread Andrew Lih
Here’s a piece I wrote with Rob Fernandez for the Wikimedia blog about
Wikidata at five and Wikidatacon.

https://blog.wikimedia.org/2017/10/30/wikidata-fifth-birthday/

-Andrew


Re: [Wikidata] WDCM: Wikidata usage in Wikivoyage

2017-10-31 Thread Yaroslav Blanter
Thank you Goran.

Cheers
Yaroslav

On Tue, Oct 31, 2017 at 3:28 PM, Goran Milovanovic <
goran.milovanovic_...@wikimedia.de> wrote:

> Hi,
>
> responding to Yaroslav Blanter's following observation on this mailing
> list:
>
> "However, when I look at the statistics of usage,
> http://wdcm.wmflabs.org/WDCM_UsageDashboard/ I see that Wikivoyage
> allegedly uses, in particular, genes, humans (quite a lot, actually), and
> scientific articles. How could this be? I am pretty sure it does not use
> any of these."
>
> Please note that The *Wikidata item usage per semantic category in each
> project type* chart that you have referred to in a later message has a
> logarithmic y-scale (there's a Note explaining this immediately below the
> title of the chart). Also, even from the chart that you were referring to
> you can see that Wikivoyage projects taken together make no use of the
> categories Gene an Scientific Article. The usage of the logarithmic y-axis
> there is a necessity, otherwise we could not offer a comparison across the
> project types (because the differences in usage statistics are huge).
>
> Here's my suggestion on how to obtain a more readable (and more precise)
> information:
>
> - go to the WDCM Usage Dashboard: http://wdcm.wmflabs.org/WDCM_UsageDashboard/
> - Tab: Dashboard, and then Tab: Tabs/Crosstabs
> - Enter only: _Wikivoyage in the "Search projects:" field, and select all
> semantic categories in the "Search categories:" field
> - Click "Apply Selection"
>
> What you should be able to learn from the results is that on all
> Wikivoyage projects taken together the total usage of Q5 (Human) is 26, and
> that no items from the Gene (Q7187) or Scientific Article (Q13442814)
> category are used there at all.
>
> Important reminder. The usage statistic in WDCM has the following
> semantics:
>
> - pick an item;
> - count on how many pages in a particular project is that item used;
> - sum up the counts to obtain the usage statistic for that particular item
> in the particular project.
>
> All WDCM Dashboards have a section titled "Description" which provides
> this and similarly important definitions, as well as (hopefully) simple
> descriptions of the respective dashboard's functionality.
>
> Hope this helps.
>
> Best,
> Goran
>
>
>
>
> Goran S. Milovanović, PhD
> Data Analyst, Software Department
> Wikimedia Deutschland
> 
> "It's not the size of the dog in the fight,
> it's the size of the fight in the dog."
> - Mark Twain
> 
>


[Wikidata] WDCM: Wikidata usage in Wikivoyage

2017-10-31 Thread Goran Milovanovic
Hi,

responding to Yaroslav Blanter's following observation on this mailing list:

"However, when I look at the statistics of usage,
http://wdcm.wmflabs.org/WDCM_UsageDashboard/ I see that Wikivoyage
allegedly uses, in particular, genes, humans (quite a lot, actually), and
scientific articles. How could this be? I am pretty sure it does not use
any of these."

Please note that the *Wikidata item usage per semantic category in each
project type* chart that you referred to in a later message has a
logarithmic y-scale (there's a note explaining this immediately below the
title of the chart). Also, even from the chart you were referring to,
you can see that the Wikivoyage projects taken together make no use of the
categories Gene and Scientific Article. The logarithmic y-axis there is a
necessity; otherwise we could not offer a comparison across the
project types (because the differences in usage statistics are huge).

Here's my suggestion on how to obtain more readable (and more precise)
information:

- go to the WDCM Usage Dashboard:
http://wdcm.wmflabs.org/WDCM_UsageDashboard/
- Tab: Dashboard, and then Tab: Tabs/Crosstabs
- Enter only: _Wikivoyage in the "Search projects:" field, and select all
semantic categories in the "Search categories:" field
- Click "Apply Selection"

What you should be able to learn from the results is that on all Wikivoyage
projects taken together the total usage of Q5 (Human) is 26, and that no
items from the Gene (Q7187) or Scientific Article (Q13442814) category are
used there at all.

Important reminder. The usage statistic in WDCM has the following semantics
(a small illustrative sketch follows below):

- pick an item;
- count on how many pages in a particular project that item is used;
- sum up the counts to obtain the usage statistic for that particular item
in that particular project.
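Here is that small sketch of the computation (the usage rows below are invented; the real WDCM pipeline reads the wikis' entity-usage tables):

```python
# Invented sample data: (item, project, page) rows recording where an
# item is used. The real WDCM pipeline reads the wikis' entity-usage
# tables; this is only meant to illustrate the statistic defined above.
from collections import Counter

usage_rows = {
    ("Q5",  "enwikivoyage", "Berlin"),
    ("Q5",  "enwikivoyage", "Paris"),
    ("Q5",  "frwikivoyage", "Berlin"),
    ("Q64", "enwikivoyage", "Berlin"),
}

# For each item and project, count the pages the item is used on;
# that count is the item's usage statistic in that project.
usage_stat = Counter((item, project) for item, project, _page in usage_rows)

print(usage_stat[("Q5", "enwikivoyage")])  # -> 2 (used on two pages)
```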

All WDCM Dashboards have a section titled "Description" which provides this
and similarly important definitions, as well as (hopefully) simple
descriptions of the respective dashboard's functionality.

Hope this helps.

Best,
Goran




Goran S. Milovanović, PhD
Data Analyst, Software Department
Wikimedia Deutschland

"It's not the size of the dog in the fight,
it's the size of the fight in the dog."
- Mark Twain



Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Ghislain ATEMEZING
Interesting use case Laura! Your UC is rather "special" :)
Let me try to understand ...
You are a "data consumer" with the following needs:
  - Latest version of the data
  - Quick access to the data
  - You don't want to use the current ways the publisher offers to access the
data (endpoint, ttl dumps, LDFragments)
 However, you ask for a binary format (HDT), but you don't have enough
memory to set up your own environment/endpoint.
For that reason, you are asking the publisher to support both .hdt and
.hdt.index files.

Do you think there are many users with your current UC?


On Tue, Oct 31, 2017 at 14:56, Laura Morales ()
wrote:

> > @Laura: I suspect Wouter wants to know if he "ignores" the previous
> errors and proposes a rather incomplete dump (just for you) or waits for
> Stas' feedback.
>
>
> OK. I wonder though, if it would be possible to setup a regular HDT dump
> alongside the already regular dumps. Looking at the dumps page,
> https://dumps.wikimedia.org/wikidatawiki/entities/, it looks like a new
> dump is generated once a week more or less. So if a HDT dump could be added
> to the schedule, it should show up with the next dump and then so forth
> with the future dumps. Right now even the Turtle dump contains the bad
> triples, so adding a HDT file now would not introduce more inconsistencies.
> The problem will be fixed automatically with the future dumps once the
> Turtle is fixed (because the HDT is generated from the .ttl file anyway).
>
>
> > Btw why don't you use the oldest version in HDT website?
>
>
> 1. I have downloaded it and I'm trying to use it, but the HDT tools (eg.
> query) require to build an index before I can use the HDT file. I've tried
> to create the index, but I ran out of memory again (even though the index
> is smaller than the .hdt file itself). So any Wikidata dump should contain
> both the .hdt file and the .hdt.index file unless there is another way to
> generate the index on commodity hardware
>
> 2. because it's 1 year old :)
>
-- 
---
"Love all, trust a few, do wrong to none" (W. Shakespeare)
Web: http://atemezing.org


Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Laura Morales
> @Laura: I suspect Wouter wants to know if he "ignores" the previous errors 
> and proposes a rather incomplete dump (just for you) or waits for Stas' 
> feedback.


OK. I wonder though, if it would be possible to set up a regular HDT dump 
alongside the already regular dumps. Looking at the dumps page, 
https://dumps.wikimedia.org/wikidatawiki/entities/, it looks like a new dump is 
generated once a week, more or less. So if an HDT dump could be added to the 
schedule, it should show up with the next dump and then so forth with the 
future dumps. Right now even the Turtle dump contains the bad triples, so 
adding an HDT file now would not introduce more inconsistencies. The problem 
will be fixed automatically with the future dumps once the Turtle is fixed 
(because the HDT is generated from the .ttl file anyway).


> Btw why don't you use the oldest version in HDT website?


1. I have downloaded it and I'm trying to use it, but the HDT tools (e.g. query) 
require building an index before I can use the HDT file. I've tried to create 
the index, but I ran out of memory again (even though the index is smaller than 
the .hdt file itself). So any Wikidata dump should contain both the .hdt file 
and the .hdt.index file, unless there is another way to generate the index on 
commodity hardware.

2. because it's 1 year old :)



[Wikidata-bugs] [Maniphest] [Commented On] T121274: Provide an RDF mapping for external identifiers

2017-10-31 Thread Ladsgroup
Ladsgroup added a comment.
In order to update that, we need to run the rebuildPropertyInfo script, I guess.

Task: https://phabricator.wikimedia.org/T121274


Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Ghislain ATEMEZING
@Laura: I suspect Wouter wants to know whether he should "ignore" the previous
errors and propose a rather incomplete dump (just for you), or wait for Stas'
feedback.
Btw, why don't you use the oldest version on the HDT website?

On Tue, Oct 31, 2017 at 7:53, Laura Morales ()
wrote:

> @Wouter
>
> > Thanks for the pointer!  I'm downloading from
> https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.ttl.gz now.
>
> Any luck so far?
>
-- 
---
"Love all, trust a few, do wrong to none" (W. Shakespeare)
Web: http://atemezing.org


[Wikidata-bugs] [Maniphest] [Commented On] T173710: Job queue is increasing non-stop

2017-10-31 Thread elukey
elukey added a comment.

In T173710#3720358, @EBernhardson wrote:
All jobs have a requestId parameter, which is passed down through the execution chain. This is the same as the reqId field in logstash. Basically this means if the originating request logged anything to logstash, you should be able to find it with the query type:mediawiki reqId:x and looking for the very first message. That assumes of course the initial request logged anything.


Thanks! I tried to spot-check in logstash, but I am only able to see the request that starts from the jobrunner (the one executing the job), not much more .. :(

https://gerrit.wikimedia.org/r/#/c/385248 is really, really promising; not sure when it will be deployed, but it would surely help in quickly finding a massive template change or similar.

Task: https://phabricator.wikimedia.org/T173710


Re: [Wikidata] Wikidata HDT dump

2017-10-31 Thread Laura Morales
@Wouter

> Thanks for the pointer!  I'm downloading from 
> https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.ttl.gz now.

Any luck so far?



[Wikidata-bugs] [Maniphest] [Updated] T179241: Enable the ArticlePlaceholder for Northern Sami (sewiki)

2017-10-31 Thread Dereckson
Dereckson added a project: Wikimedia-Site-requests.
Task: https://phabricator.wikimedia.org/T179241