[Wikidata] Blazegraph

2015-10-25 Thread Gerard Meijssen
Hoi,
Arguments have been raised where Blazegraph was key to the problem. It is
however a server-based tool. Would someone please install it on Labs,
thereby making it available to all of us?

That way the argument becomes one that is relevant to all of us. At this
stage it is very much a niche issue.
Thanks,
  GerardM
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata-bugs] [Maniphest] [Updated] T113959: Test everything

2015-10-25 Thread Liuxinyu970226
Liuxinyu970226 set Security to None.

TASK DETAIL
  https://phabricator.wikimedia.org/T113959

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Lucie, Liuxinyu970226
Cc: Ricordisamoa, Aklapper, Lucie, Wikidata-bugs, aude



___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T113957: [Story] Set the image Property id in configurations

2015-10-25 Thread Liuxinyu970226
Liuxinyu970226 set Security to None.

TASK DETAIL
  https://phabricator.wikimedia.org/T113957

To: Lucie, Liuxinyu970226
Cc: Lydia_Pintscher, Lucie, gerritbot, Aklapper, hoo, Wikidata-bugs, aude, 
Ricordisamoa





[Wikidata-bugs] [Maniphest] [Commented On] T109584: [Bug] wbsearchentities API defaults to something else than the given language parameter

2015-10-25 Thread Addshore
Addshore added a comment.

So this is due to language fallback.
DE falls back to EN.
Again I don't see anything in the comment that we did not explicitly design 
into this version of the API module.

You search in a language (with fallback)
The matched result is always returned to you, with what the result is, the 
language it is in, and the actual result.
You are given the entity ID, the concept and url, as well as the title and 
pageid.

For convenience you are then also given a label and description to display in 
the current user language, or language specified with uselang (which again has 
fallback)
This is, as said, provided for convenience...
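The fallback behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual Wikibase code: the function name `pick_label` and the fallback table are hypothetical, and the only fact taken from this comment is that DE falls back to EN.

```python
# Hypothetical fallback chains; only "de falls back to en" comes from the thread.
FALLBACKS = {"de": ["de", "en"], "en": ["en"]}


def pick_label(labels, language, fallbacks=FALLBACKS):
    """Return (language_code, label) for the first language in the chain that has a label."""
    for code in fallbacks.get(language, [language]):
        if code in labels:
            return code, labels[code]
    return None  # nothing available in the requested language or its fallbacks


# An entity with only an English label, requested in German:
print(pick_label({"en": "Berlin"}, "de"))
```

The caller gets back the language the label is actually in, which is what lets a client tell a fallback result apart from an exact-language one.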

If we excluded the label and description fields from the result, would you still 
see the module as broken?


TASK DETAIL
  https://phabricator.wikimedia.org/T109584

To: Addshore
Cc: nichtich, Lydia_Pintscher, Jonas, Addshore, thiemowmde, Aklapper, 
Wikidata-bugs, aude





Re: [Wikidata] Announcing Wikidata Taxonomy Browser (beta)

2015-10-25 Thread Markus Krötzsch

On 25.10.2015 02:18, Kingsley Idehen wrote:

On 10/24/15 10:51 AM, Markus Krötzsch wrote:

On 24.10.2015 12:29, Martynas Jusevičius wrote:

I don't see how cycle queries can be a requirement for SPARQL engines if
they are not part of SPARQL spec? The closest thing you have is property
paths.


We were talking about *cyclic data* not cyclic queries (which you can
also create easily using BGPs, but that's unrelated here). Apparently,
BlazeGraph has performance issues when computing a path expression
over a cyclic graph.

Markus


Markus,

Out of curiosity, can you share a SPARQL query example (text or query
results url) that demonstrates your point?


You mean a query with BlazeGraph having performance issues? That problem 
was reported by Stas. He should have examples. In any case, it is always 
a combination of query and data.


Markus




[Wikidata-bugs] [Maniphest] [Commented On] T116547: try computing certains wikidata stats via hadoop (e.g. spark) instead of query.w.o (blazegraph)

2015-10-25 Thread Addshore
Addshore added a comment.

> certains wikidata stats


Could you elaborate?


TASK DETAIL
  https://phabricator.wikimedia.org/T116547

To: Addshore
Cc: Addshore, Christopher, JanZerebecki, Lydia_Pintscher, Aklapper, 
Ricordisamoa, Wikidata-bugs, aude





[Wikidata-bugs] [Maniphest] [Commented On] T105126: Evaluate pattern constraints (safely)

2015-10-25 Thread Popcorndude
Popcorndude added a comment.

this matches the constraints I suggested:

  
^(?!.*?\.(\+|\*|\{\d+,\})\()(\\.|[^()\\\[\]]|\[([^\\\[\]]|\\.)*\]|\((?!\?)(\\.|[^()\\]|\[([^\\\[\]]|\\.)*\])*\))+$

these convert infinite repetition (other than ##.*()##, ##.+()##, and 
##.{n,}()##) to atomic groups:

| ##(\[([^\[\]]|\\.)*\])(\+|\*|\{\d+,\})## | ##(?>\1\2)##|
| ##((?\1\2)##|
| ##((?\1\2)##|
| ##(\(([^()]|\\.)*\))(\+|\*|\{\d+,\})##   | ##(?>\1\2)##|
| ##\.(\+|\*|\{\d+,\})(\\.|[^()\[\]\\])##  | ##(?>[^\2]\1)\2##   |
| ##\.(\+|\*|\{\d+,\})\[(([^\\\[\]]|\\.)*)\]## | ##(?>[^\2]\1)[\2]## |

(incidentally, this accepts 669/715 (93%) of the current constraints)


TASK DETAIL
  https://phabricator.wikimedia.org/T105126

To: Popcorndude
Cc: Nikki, Popcorndude, Aklapper, daniel, Wikidata-bugs, aude, GWicke, csteipp





Re: [Wikidata] Blazegraph

2015-10-25 Thread James Heald
Hi Gerard.  Blazegraph is the name of the open-source SPARQL engine 
being used to provide the Wikidata SPARQL service.


So Blazegraph *is* available to all of us, at 
https://query.wikidata.org/ , via both the query editor, and the SPARQL 
API endpoint.


It's convenient to describe some issues with the SPARQL service 
as "Blazegraph issues" if the issues appear to lie with the query 
engine.


Other query engines that other people might be running may 
have their own specific issues, e.g. "Virtuoso issues".  But it is Blazegraph 
that the Discovery team and Wikidata have decided to go with.


Hope that helps,

All best, James.




On 25/10/2015 07:28, Gerard Meijssen wrote:

Hoi,
Arguments have been raised where Blazegraph was key to the problem. It is
however a server-based tool. Would someone please install it on Labs,
thereby making it available to all of us?

That way the argument becomes one that is relevant to all of us. At this
stage it is very much a niche issue.
Thanks,
   GerardM





[Wikidata-bugs] [Maniphest] [Retitled] T116298: SPARQL endpoint should gracefully handle cycles and loops in transitive properties

2015-10-25 Thread daniel
daniel changed the title from "SPARQL endpoint should gracefully handle loops 
in transitive properties" to "SPARQL endpoint should gracefully handle cycles 
and loops in transitive properties".
daniel edited the task description.
daniel set Security to None.

TASK DETAIL
  https://phabricator.wikimedia.org/T116298

To: Smalyshev, daniel
Cc: Aklapper, daniel, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, 
Deskana, Manybubbles





Re: [Wikidata] Announcing Wikidata Taxonomy Browser (beta)

2015-10-25 Thread James Heald

On 25/10/2015 09:31, Markus Krötzsch wrote:

On 25.10.2015 02:18, Kingsley Idehen wrote:

On 10/24/15 10:51 AM, Markus Krötzsch wrote:


We were talking about *cyclic data* not cyclic queries (which you can
also create easily using BGPs, but that's unrelated here). Apparently,
BlazeGraph has performance issues when computing a path expression
over a cyclic graph.

Markus


Markus,

Out of curiosity, can you share a SPARQL query example (text or query
results url) that demonstrates your point?


You mean a query with BlazeGraph having performance issues? That problem
was reported by Stas. He should have examples. In any case, it is always
a combination of query and data.



Hi Kingsley,

I had a problem with Blazegraph queries that had path requirements 
containing a compound path predicate and ending in a variable, e.g.


   wd:Q289 wdt:P31/wdt:P279* ?o.

However, this particular example now appears to work.  (Perhaps because 
of the recent upgrade of the SPARQL endpoint to the latest Blazegraph 
production release?)


On the other hand, it appears that path queries can still fail if they 
involve a variable intended to be a fixed constant set by a BIND 
statement (usually the first thing a query engine will do).


So, for example, a query to count incidences of instances of subclasses 
of painting, where the key requirement statement is


  ?a wdt:P31/wdt:P279* wd:Q3305213

runs in about 0.4 seconds.   However, a very similar query where the 
identity of that target superclass is set using a BIND statement,


   BIND (wd:Q3305213 AS ?class) .
   ?a wdt:P31/wdt:P279* ?class .

times out -- or rather, it ought to report that it has timed out, and 
it used to; now, instead of throwing a "Query Timed Out" error, it 
returns an (incorrect) count of zero after 120 seconds. (An additional, 
new bug.)


Complete versions of these queries can be found at
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/suggestions#Path_assertions_that_end_in_a_variable_can_blow_up

and as a Blazegraph bug at

https://jira.blazegraph.com/browse/BLZG-1543

(although, as with a couple of other issues described on the wiki page 
linked above for which I've filed Blazegraph bugs, there doesn't seem 
to be any indication that anybody has actually read the bug...)



I'm not sure if Stas knows of other current issues with path queries.

I did post a complaint to this list, just after the query service was 
publicly announced, that path queries seemed very slow.  They *are* 
still slower than the equivalent search on WDQ.  But I think it was this 
issue with binding variables that was underlying the worst of what I was 
seeing.


As for cyclical paths, as I posted a couple of days ago, the queries at
https://www.wikidata.org/wiki/Wikidata:WikiProject_Names/given-name_variants
for counting up incidences of given-name variants involve graphs that 
are anything but directed (based on the P460 "said to be the same as" 
property), and Blazegraph seems to handle them without any particular 
difficulty; though it's possible that there may have been earlier 
problems when the service was still at an alpha stage.


  -- James.






[Wikidata-bugs] [Maniphest] [Edited] T116298: SPARQL endpoint should gracefully handle cycles and loops in transitive properties

2015-10-25 Thread daniel
daniel edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T116298

To: Smalyshev, daniel
Cc: Aklapper, daniel, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, 
Deskana, Manybubbles





[Wikidata-bugs] [Maniphest] [Edited] T116298: SPARQL endpoint should gracefully handle cycles and loops in transitive properties

2015-10-25 Thread daniel
daniel edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T116298

To: Smalyshev, daniel
Cc: Aklapper, daniel, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, 
Deskana, Manybubbles





[Wikidata-bugs] [Maniphest] [Updated] T116547: try computing certains wikidata stats via hadoop (e.g. spark) instead of query.w.o (blazegraph)

2015-10-25 Thread JanZerebecki
JanZerebecki added a comment.

This would be an alternative to T115242: Add Munger option to not filter 
uninteresting object type triples.


TASK DETAIL
  https://phabricator.wikimedia.org/T116547

To: JanZerebecki
Cc: Addshore, Christopher, JanZerebecki, Lydia_Pintscher, Aklapper, 
Ricordisamoa, Wikidata-bugs, aude





[Wikidata-bugs] [Maniphest] [Created] T116547: try computing certains wikidata stats via hadoop (e.g. spark) instead of query.w.o (blazegraph)

2015-10-25 Thread JanZerebecki
JanZerebecki created this task.
JanZerebecki added subscribers: Ricordisamoa, Aklapper, Lydia_Pintscher, 
JanZerebecki, Christopher, Addshore.
JanZerebecki added a project: Wikidata.

TASK DESCRIPTION
  Try computing certain Wikidata stats via Hadoop (e.g. Spark) instead of 
query.w.o (Blazegraph). That might be a better fit.

TASK DETAIL
  https://phabricator.wikimedia.org/T116547

To: JanZerebecki
Cc: Addshore, Christopher, JanZerebecki, Lydia_Pintscher, Aklapper, 
Ricordisamoa, Wikidata-bugs, aude





[Wikidata-bugs] [Maniphest] [Updated] T105126: Evaluate pattern constraints (safely)

2015-10-25 Thread Popcorndude
Popcorndude added a comment.

Those criteria accept 62 (8%) of the current constraints.
Adding character classes (\d is everywhere) brings it up to 166 (23%).

I would suggest allowing infinite repetition if the thing being repeated cannot 
overlap with the next thing. To avoid the checking regex having to execute the 
format on itself, though, maybe infinite repetition should only be allowed on 
character classes that don't overlap with the following thing (e.g. \d*[a-z] 
but not \d*0), which would then also be treated as atomic (automatically 
converting \d+ to (?>\d+)).
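The "does not overlap with the following thing" test could be sketched like this. It is a rough illustration only: the function names are made up for the example, it handles single-character tokens (one class or one literal), and it only looks at printable ASCII, so it is not a full answer for Unicode classes such as \p.

```python
import re
import string


def chars_matched(token):
    """Printable ASCII characters accepted by a single-character regex token."""
    compiled = re.compile(token)
    return {c for c in string.printable if compiled.fullmatch(c)}


def may_repeat_unbounded(repeated, following):
    """True if no character can belong to both the repeated token and the next one."""
    return chars_matched(repeated).isdisjoint(chars_matched(following))


print(may_repeat_unbounded(r"\d", "[a-z]"))  # the \d*[a-z] case
print(may_repeat_unbounded(r"\d", "0"))      # the \d*0 case
```

When the two sets are disjoint, the repetition can never give a character back to the next token, which is why treating it as atomic would not change what the pattern accepts.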

It would probably make things a lot easier for the people making the formats if 
each property was allowed to have multiple format strings (match this or this).

As for grouping, maybe allow it but only non-nested atomic, since the 
alternative is to rewrite this (admittedly slightly extreme example) as a flat 
list of alternatives

https?://((academia|android|anime|apple|arduino|astronomy|aviation|beer|bicycles|biology|bitcoin|blender|boardgames|bricks|buddhism|chemistry|chess|chinese|christianity|codegolf|codereview|cogsci|cooking|craftcms|crypto|cs|cstheory|datascience|dba|diy|drupal|dsp|earthscience|ebooks|electronics|ell|emacs|english|expatriates|expressionengine|fitness|freelancing|french|gamedev|gaming|gardening|genealogy|german|gis|graphicdesign|ham|hermeneutics|hinduism|history|homebrew|islam|italian|japanese|joomla|judaism|linguistics|magento|martialarts|math|matheducators|mathematica|mechanics|meta|moderators|money|movies|music|networkengineering|opendata|outdoors|parenting|patents|pets|philosophy|photo|physics|pm|poker|politics|productivity|programmers|puzzling|quant|raspberrypi|reverseengineering|robotics|rpg|russian|salesforce|scicomp|scifi|security|sharepoint|skeptics|softwarerecs|sound|space|spanish|sports|sqa|startups|stats|sustainability|tex|tor|travel|tridion|unix|ux|video|webapps|webmasters|
windowsphone|wordpress|workplace|writers)\.stackexchange\.com|askubuntu\.com|mathoverflow\.net|pt\.stackoverflow\.com|serverfault\.com|stackapps\.com|stackoverflow\.com|superuser\.com)/(tags|questions/tagged)/.*
(P1482)

I'll see if I can make some regexes for this in a bit.


TASK DETAIL
  https://phabricator.wikimedia.org/T105126

To: Popcorndude
Cc: Popcorndude, Aklapper, daniel, Wikidata-bugs, aude, GWicke, csteipp





[Wikidata-bugs] [Maniphest] [Commented On] T105126: Evaluate pattern constraints (safely)

2015-10-25 Thread Nikki
Nikki added a subscriber: Nikki.
Nikki added a comment.

@Popcorndude: I don't really know what's going on here (sorry if I've 
completely misunderstood), are you only looking at a subset of the format 
constraints? I know `P1814` is using \p and `P898` is using \x and you didn't 
mention either of those even though they're using things other than "?*+ [] () 
| {} {,} \d\D\s\S\w .".


TASK DETAIL
  https://phabricator.wikimedia.org/T105126

To: Nikki
Cc: Nikki, Popcorndude, Aklapper, daniel, Wikidata-bugs, aude, GWicke, csteipp





[Wikidata-bugs] [Maniphest] [Commented On] T105126: Evaluate pattern constraints (safely)

2015-10-25 Thread Popcorndude
Popcorndude added a comment.

My apologies. I eliminated those in my initial analysis and forgot to mention 
it. The full list of things with backslashes in front of them:
bdDpsSwx2()[]{}|^\/$?+*,-.


TASK DETAIL
  https://phabricator.wikimedia.org/T105126

To: Popcorndude
Cc: Nikki, Popcorndude, Aklapper, daniel, Wikidata-bugs, aude, GWicke, csteipp





[Wikidata-bugs] [Maniphest] [Commented On] T116485: Echo should not notify the user about his own linking activity

2015-10-25 Thread He7d3r
He7d3r added a comment.

Same problem for this edit on Wikipedia:
https://pt.wikipedia.org/w/index.php?diff=43741618
where I linked
https://pt.wikipedia.org/wiki/Anel_quase_comutativo
to
https://pt.wikipedia.org/wiki/%C3%81lgebra_de_Weyl


TASK DETAIL
  https://phabricator.wikimedia.org/T116485

To: He7d3r
Cc: Sjoerddebruin, He7d3r, Aklapper, Luke081515, Wikidata-bugs, aude





[Wikidata-bugs] [Maniphest] [Updated] T116464: [Task] Refactor gui.js

2015-10-25 Thread gerritbot
gerritbot added a project: Patch-For-Review.

TASK DETAIL
  https://phabricator.wikimedia.org/T116464

To: gerritbot
Cc: gerritbot, Aklapper, Jonas, Smalyshev, JanZerebecki, jkroll, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles





[Wikidata-bugs] [Maniphest] [Commented On] T116080: [Story] SPARQL auto completion for keywords, ?variables and Templates or snippets

2015-10-25 Thread gerritbot
gerritbot added a subscriber: gerritbot.
gerritbot added a comment.

Change 248690 had a related patch set uploaded (by Jonas Kress (WMDE)):
SPARQL auto completion for keywords and ?variables

https://gerrit.wikimedia.org/r/248690


TASK DETAIL
  https://phabricator.wikimedia.org/T116080

To: Jonas, gerritbot
Cc: gerritbot, Bene, Jonas, Aklapper, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles





[Wikidata-bugs] [Maniphest] [Commented On] T116464: [Task] Refactor gui.js

2015-10-25 Thread gerritbot
gerritbot added a subscriber: gerritbot.
gerritbot added a comment.

Change 248689 had a related patch set uploaded (by Jonas Kress (WMDE)):
[WIP] Refactor gui.js

https://gerrit.wikimedia.org/r/248689


TASK DETAIL
  https://phabricator.wikimedia.org/T116464

To: gerritbot
Cc: gerritbot, Aklapper, Jonas, Smalyshev, JanZerebecki, jkroll, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles





[Wikidata-bugs] [Maniphest] [Commented On] T115837: [Story] Query service toolbar

2015-10-25 Thread gerritbot
gerritbot added a subscriber: gerritbot.
gerritbot added a comment.

Change 248692 had a related patch set uploaded (by Jonas Kress (WMDE)):
Created toolbar and cleaned UI

https://gerrit.wikimedia.org/r/248692


TASK DETAIL
  https://phabricator.wikimedia.org/T115837

To: Jonas, gerritbot
Cc: gerritbot, Smalyshev, Jonas, Aklapper, jkroll, Wikidata-bugs, Jdouglas, 
aude, Deskana, Manybubbles





[Wikidata-bugs] [Maniphest] [Updated] T116080: [Story] SPARQL auto completion for keywords, ?variables and Templates or snippets

2015-10-25 Thread gerritbot
gerritbot added a project: Patch-For-Review.

TASK DETAIL
  https://phabricator.wikimedia.org/T116080

To: Jonas, gerritbot
Cc: gerritbot, Bene, Jonas, Aklapper, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles





[Wikidata-bugs] [Maniphest] [Updated] T115837: [Story] Query service toolbar

2015-10-25 Thread gerritbot
gerritbot added a project: Patch-For-Review.

TASK DETAIL
  https://phabricator.wikimedia.org/T115837

To: Jonas, gerritbot
Cc: gerritbot, Smalyshev, Jonas, Aklapper, jkroll, Wikidata-bugs, Jdouglas, 
aude, Deskana, Manybubbles





[Wikidata-bugs] [Maniphest] [Updated] T116485: Echo should not notify the user about his own linking activity

2015-10-25 Thread Sjoerddebruin
Sjoerddebruin added a project: Regression.
Sjoerddebruin added a subscriber: Sjoerddebruin.

TASK DETAIL
  https://phabricator.wikimedia.org/T116485

To: Sjoerddebruin
Cc: Sjoerddebruin, He7d3r, Aklapper, Luke081515, Wikidata-bugs, aude





Re: [Wikidata] Announcing Wikidata Taxonomy Browser (beta)

2015-10-25 Thread James Heald

The standard algorithm for a path search is very simple:

  Keep adding a new generation of links until a generation brings in 
no node that has not already been seen.



This works for graphs of equivalence relations, and it works for 
directed acyclic graphs.
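As a minimal sketch of that algorithm (in Python rather than anything Blazegraph actually runs; the function name and the toy graphs are made up for illustration):

```python
def reachable(start, neighbours):
    """Nodes reachable from start via zero or more links, like a '*' path."""
    seen = {start}
    frontier = {start}
    while frontier:
        next_generation = set()
        for node in frontier:
            for target in neighbours.get(node, ()):
                if target not in seen:  # only genuinely new nodes continue the search
                    seen.add(target)
                    next_generation.add(target)
        frontier = next_generation  # empty once a generation adds nothing new
    return seen


cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}  # a loop, as in an equivalence class
acyclic = {"x": ["y", "z"], "y": ["z"]}        # a small directed acyclic graph
print(sorted(reachable("a", cyclic)))
print(sorted(reachable("x", acyclic)))
```

Because every node enters `seen` at most once, the loop terminates on cyclic data just as it does on acyclic data.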


It's not the /graphs/ that are causing the problem here, because 
Blazegraph can handle either of them by themselves and give the right answer.


Rather, in a query like:

SELECT (COUNT(DISTINCT(?city)) AS ?count) WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 .  # find instances of subclasses of city
  ?city wdt:P131* wd:Q1202 .
}

something is going wrong with the way Blazegraph handles the two 
conditions *together*.



I suspect this may be closely related to whatever is going wrong with a 
query like:


SELECT (COUNT(DISTINCT(?a)) AS ?count) WHERE {
   BIND (wd:Q3305213 AS ?class) .
   ?a wdt:P31/wdt:P279* ?class .
}

which times out.


It's the plan of joins which is going wrong, not whether the graph is 
acyclic or not.


  -- James.



On 25/10/2015 17:53, Daniel Kinzler wrote:

"Said to be the same as" is a good example of a case where cycles are
unavoidable. A possible workaround in this case is to make sure that the
transitive closure of "said to be the same as" is already in the data, such that
the path "P460+" returns the same results as a mere "P460" would. It's not
ideal, but maybe workable.


I think we have to distinguish between different use cases:

1) Antisymmetric transitive relations, like subclass-of or part-of, which should
form an acyclic graph. For these, the "*" notation in sparql can be used to
query a sub-graph, such as all kinds of cars or all places in Idaho. This is our
primary use case for path traversal, I believe.

2) Symmetric transitive relations, such as "said to be the same as". These
(should) form small "islands" of fully connected graphs that are (hopefully)
unconnected to each other. Here, the "*" notation can be used to include the
entire clique instead of only a single node in a query. This might be useful in
some cases, but doesn't strike me as a typical use case.

3) Cycles in non-transitive properties: these are not errors at all, and
problems only arise when such properties are used in a query as if they were
transitive. We could perhaps detect and reject attempts to apply the "*"
notation to properties that are not transitive.

4) Intransitive symmetrical relations (e.g. "spouse of"). Do we need any special
handling for them, or do they just get treated like (3)?


Anyway: we need a solution for (1) that allows transitive queries, and a
solution for (3) that prevents pathological behavior. If we get nice handling
for case (2), that's a bonus, but not a requirement, I think.







[Wikidata-bugs] [Maniphest] [Commented On] T105126: Evaluate pattern constraints (safely)

2015-10-25 Thread daniel
daniel added a comment.

Maybe we should spell out the pattern matching features we actually need to 
cover our primary use case, matching identifiers. I'd suggest:

- Literal, e.g. //abc//
- Character set, e.g. //a[bc]//.
- Character range, e.g. //foo[0-9]//
- Repetition for an exact number of times, e.g. //[ab]{2}//
- Repetition for a range of times, e.g. //[ab]{2,5}//

Expressions are always anchored (have to match the full string), and are always 
case sensitive.

I don't think we need escapes for special characters like \t, and I don't think 
we need character classes like \w. I also don't think we need inverse sets like 
//[^bc]//, or infinite repetition using //a+// or //a*// or even //a{2,}//.
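A rough sketch of how such a restricted subset could be checked before a pattern is ever handed to a regex engine. Everything here is an assumption for illustration: the function names, the exact set of characters allowed as literals, and the use of Python's `re` module are not from the task.

```python
import re

# One atom: a plain literal character, or a bracketed set of single characters
# and a-b ranges (no negation, no backslash escapes). The literal alphabet
# below is an assumption chosen for the example.
_ATOM = r"(?:[A-Za-z0-9 _:/-]|\[(?:[A-Za-z0-9](?:-[A-Za-z0-9])?)+\])"
# Optional bounded repetition: {n} or {n,m}. The open-ended {n,} is excluded.
_REPEAT = r"(?:\{\d+(?:,\d+)?\})?"
_ALLOWED = re.compile(rf"(?:{_ATOM}{_REPEAT})*")


def is_allowed(pattern):
    """True if the pattern uses only the restricted feature set and compiles."""
    if _ALLOWED.fullmatch(pattern) is None:
        return False
    try:
        re.compile(pattern)  # rejects e.g. reversed ranges or {5,2}
    except re.error:
        return False
    return True


def matches(pattern, value):
    """Anchored, case-sensitive match; refuses patterns outside the subset."""
    if not is_allowed(pattern):
        raise ValueError("pattern uses unsupported features: " + pattern)
    return re.fullmatch(pattern, value) is not None


print(is_allowed("[ab]{2,5}"), is_allowed("a+"), is_allowed("a{2,}"))
print(matches("foo[0-9]", "foo7"), matches("foo[0-9]", "FOO7"))
```

With no unbounded repetition and no alternation, the time a match takes is bounded by the pattern's own repeat counts, which is the point of keeping the subset this small.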


TASK DETAIL
  https://phabricator.wikimedia.org/T105126

To: daniel
Cc: Popcorndude, Aklapper, daniel, Wikidata-bugs, aude, GWicke, csteipp





Re: [Wikidata] Announcing Wikidata Taxonomy Browser (beta)

2015-10-25 Thread Daniel Kinzler
Am 25.10.2015 um 19:20 schrieb James Heald:
> It's not the /graphs/ that are causing the problem here, because Blazegraph can
> handle either of them by themselves and give the right answer.

That's an interesting observation, would you add your examples to
?

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



[Wikidata-bugs] [Maniphest] [Commented On] T95686: [Task] write a maintenance script to migrate properties from string to new identifier datatype

2015-10-25 Thread daniel
daniel added a comment.

XML dumps with histories are indeed an issue: the script that builds them will 
take old revisions from old dump files, to avoid re-serializing the data. We 
would need to tell it not to do this, and to re-serialize everything. This is 
not hard, but it causes the dump process to take a long time, which makes 
it more likely to fail. And then we have to start over. A bit annoying.

I can only recommend against using data from XML dumps. We do not give *any* 
guarantees to the format of the content blobs you find in there. They may 
change without notice, and contain serializations in various forms.


TASK DETAIL
  https://phabricator.wikimedia.org/T95686

To: hoo, daniel
Cc: JanZerebecki, jayvdb, gerritbot, MGChecker, daniel, Multichill, 
Ricordisamoa, Liuxinyu970226, Aklapper, Lydia_Pintscher, Wikidata-bugs, aude





[Wikidata-bugs] [Maniphest] [Commented On] T116300: [Bug] Cannot edit references

2015-10-25 Thread daniel
daniel added a comment.

Huh, odd. Need to investigate this.


TASK DETAIL
  https://phabricator.wikimedia.org/T116300

To: daniel
Cc: hoo, Aklapper, daniel, Wikidata-bugs, aude





[Wikidata-bugs] [Maniphest] [Commented On] T105126: Evaluate pattern constraints (safely)

2015-10-25 Thread daniel
daniel added a comment.

How about using (?>) independent sub-expressions to avoid backtracking?

As to P1949, /oai:.*?:.*/ or /oai:[^:]*:.*/ 
would work (since we don't capture anyway).
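A quick way to compare the two alternatives on a few sample strings. The sample identifiers are invented, and Python's `re.fullmatch` stands in for whatever anchored matching the constraint check would really use.

```python
import re

LAZY = re.compile(r"oai:.*?:.*")       # lazy quantifier up to a colon
NEGATED = re.compile(r"oai:[^:]*:.*")  # negated class up to the first colon


def both_match(value):
    """Return (lazy_result, negated_result) for an anchored match of value."""
    return (LAZY.fullmatch(value) is not None,
            NEGATED.fullmatch(value) is not None)


for sample in ["oai:example.org:record/1", "oai:a:b:c",
               "oai:missing-second-colon", "prefix-oai:a:b"]:
    print(sample, both_match(sample))
```

The two agree on these samples. The negated-class form leaves the engine only one place to put the first colon, so it has less to backtrack over. They can differ on input containing newlines, since `.` does not match a newline by default while `[^:]` does.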


TASK DETAIL
  https://phabricator.wikimedia.org/T105126

To: daniel
Cc: Popcorndude, Aklapper, daniel, Wikidata-bugs, aude, GWicke, csteipp





[Wikidata-bugs] [Maniphest] [Updated] T116298: SPARQL endpoint should gracefully handle cycles and loops in transitive properties

2015-10-25 Thread Jheald
Jheald added a subscriber: Jheald.
Jheald added a comment.

Transitive properties aren't necessarily acyclic -- for example, consider 
P460 "said to be the same as", which is transitive and is currently used to 
link given names together if they are considered to be variants of the same 
name in different languages.

The page at

  https://www.wikidata.org/wiki/Wikidata:WikiProject_Names/given-name_variants

includes two columns of queries that find all the variants of each name (i.e. 
extract the whole equivalence class), using a P460* path query from a given 
starting name, and count up the number of occurrences of each one.

P460 is transitive, but it is not directed, and so there are huge numbers of 
loops in the equivalence classes. But this isn't a problem.  The path search 
just keeps going until no new P460 link leads it to a node it has not already 
seen.  So the engine //can// handle path loops without a problem.

Similarly, looking at the query Daniel posted,

  SELECT (COUNT(DISTINCT(?city)) AS ?count) WHERE {
    ?city wdt:P31/wdt:P279* wd:Q515 .  # find instances of subclasses of city
    ?city wdt:P131* wd:Q1202 .
  }

if one comments out either of the two requirement lines, the query runs 
without a hitch.  So I don't think it is the nature of the data (e.g. whether 
it contains cycles or not) that determines whether or not Blazegraph can 
handle it.

The problem appears to arise with how Blazegraph combines the two requirements.

Related to this may be the fact that Blazegraph also fails with path searches 
of the form

  SELECT (COUNT(DISTINCT ?a) AS ?count) WHERE {
    BIND (wd:Q3305213 AS ?class) .  # paintings
    ?a wdt:P31/wdt:P279* ?class .
  }

even though the same search works perfectly if the variable ?class is 
eliminated, and wd:Q3305213 instead specified inline.

I filed the latter with Blazegraph as issue 1543 three weeks ago, but there 
has been no comment on it from them as yet:

  https://jira.blazegraph.com/browse/BLZG-1543


TASK DETAIL
  https://phabricator.wikimedia.org/T116298

To: Smalyshev, Jheald
Cc: Jheald, Aklapper, daniel, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, 
Deskana, Manybubbles





Re: [Wikidata] Announcing Wikidata Taxonomy Browser (beta)

2015-10-25 Thread Daniel Kinzler
On 25.10.2015 at 19:50, Daniel Kinzler wrote:
> On 25.10.2015 at 19:20, James Heald wrote:
>> It's not the /graphs/ that are causing the problem here, because Blazegraph
>> can handle either of them by itself and give the right answer.
> 
> That's an interesting observation, would you add your examples to
> ?

Oh, you just did :) thanks!


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



[Wikidata-bugs] [Maniphest] [Commented On] T115679: Handle property datatype changing gracefully

2015-10-25 Thread daniel
daniel added a subscriber: daniel.
daniel added a comment.

Does pywikibot differentiate between value type and data type? Does pywikibot 
even *use* the data type as such? The value type is not going to change, only 
the data type.

For example, for the "url" data type, the value type is "string". For the new 
ID data type, the value type will remain "string" when the data type is changed 
from "string" to "ID".
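
As a rough illustration of the distinction (a sketch only; the table and the
function name are hypothetical and not part of Wikibase or pywikibot), the
data type can be thought of as a label that maps onto an underlying value
type:

```python
# Hypothetical lookup table: property data type -> underlying value type.
VALUE_TYPE_BY_DATA_TYPE = {
    "string": "string",
    "url": "string",
    "ID": "string",  # the new ID data type mentioned above
}


def value_type_changes(old_data_type, new_data_type):
    """Return True if switching data types also switches the value type."""
    old_value_type = VALUE_TYPE_BY_DATA_TYPE[old_data_type]
    new_value_type = VALUE_TYPE_BY_DATA_TYPE[new_data_type]
    return old_value_type != new_value_type


# Changing a property from "string" to "ID" leaves the value type alone.
changed = value_type_changes("string", "ID")
```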

Also note that this kind of change is not going to happen often. It's a 
breaking change (to the knowledge base, not the software), and will be 
announced as such.


TASK DETAIL
  https://phabricator.wikimedia.org/T115679

To: daniel
Cc: daniel, XZise, Ricordisamoa, hoo, Lydia_Pintscher, Multichill, Legoktm, 
Aklapper, jayvdb, pywikibot-bugs-list, Wikidata-bugs, aude





[Wikidata-bugs] [Maniphest] [Commented On] T115679: Handle property datatype changing gracefully

2015-10-25 Thread jayvdb
jayvdb added a comment.

Pywikibot does use the data type quite extensively, e.g.
https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/page.py#L3951

Data types are mapped to classes, and to value types.

With something like https://gerrit.wikimedia.org/r/247080 , we can easily 
create those mappings dynamically at startup, and provide fallback classes for 
unknown data types.
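
A minimal sketch of what such a dynamic mapping with a fallback might look
like. This is not pywikibot's actual code; every class and function name
here is hypothetical, and the input dictionary stands in for whatever the
server would report at startup.

```python
class StringClaimValue:
    """Handler for values whose value type is "string"."""

    def __init__(self, raw):
        self.raw = raw


class UnknownClaimValue:
    """Fallback handler that keeps the raw value untouched."""

    def __init__(self, raw):
        self.raw = raw


# Handlers are keyed by value type, which stays stable when a property's
# data type is changed.
HANDLER_BY_VALUE_TYPE = {"string": StringClaimValue}


def build_handlers(data_types):
    """Map each data type name to a handler class.

    data_types maps data type name -> value type name (hypothetical input).
    Value types without a known handler get the fallback class.
    """
    return {
        data_type: HANDLER_BY_VALUE_TYPE.get(value_type, UnknownClaimValue)
        for data_type, value_type in data_types.items()
    }


handlers = build_handlers(
    {"string": "string", "ID": "string", "brand-new-type": "mystery"}
)
```

With this shape, a data type that appears after the client was released
still gets a usable (if generic) handler instead of raising an error.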


TASK DETAIL
  https://phabricator.wikimedia.org/T115679

To: jayvdb
Cc: daniel, XZise, Ricordisamoa, hoo, Lydia_Pintscher, Multichill, Legoktm, 
Aklapper, jayvdb, pywikibot-bugs-list, Wikidata-bugs, aude


