[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2020-03-16 Thread Sarahmarie1981
Sarahmarie1981 edited projects, added 
WMDE-Tech-Communication-Source-Code-Berlin; removed Structured-Data-Backlog.
Restricted Application added a project: Structured-Data-Backlog.

TASK DETAIL
  https://phabricator.wikimedia.org/T191633

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Cparle, Sarahmarie1981
Cc: Jheald, Ramsey-WMF, Cparle, Aklapper, NavinRizwi, CBogen, darthmon_wmde, 
Nandana, JKSTNK, Lahi, PDrouin-WMF, Gq86, E1presidente, Anooprao, SandraF_WMF, 
GoranSMilovanovic, QZanden, EBjune, Tramullas, Acer, LawExplorer, Salgo60, 
Silverfish, _jensen, rosalieper, Scott_WUaS, Susannaanas, Abit, Jane023, 
Wikidata-bugs, Base, matthiasmullie, aude, Dinoguy1000, Ricordisamoa, Wesalius, 
Lydia_Pintscher, Fabrice_Florin, Raymond, Jdforrester-WMF, Steinsplitter, 
Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2019-04-01 Thread Ramsey-WMF
Ramsey-WMF edited projects, added SDC Engineering, SDC-Statements 
(Depicts-on-training-wheels); removed SDC Engineering (Depicts and other 
statements on a bicycle), Multimedia-Team-Working-Board.



[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2019-03-27 Thread Cparle
Cparle added a comment.


  Searching for 'depicts' statements using `haswbstatement` is already 
implemented.
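
  A minimal sketch of what such a search could look like through the 
standard MediaWiki search API, assuming P180 is the 'depicts' property and 
that file pages live in namespace 6. The helper name `depicts_search_url` is 
hypothetical, and the Q-number in the usage line is only an example value; 
the code builds the URL without sending a request.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def depicts_search_url(qid: str, limit: int = 10) -> str:
    """Build a search API URL for files with a 'depicts' (P180) statement for qid.

    Hypothetical helper: it only formats the URL, it does not fetch it.
    """
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    params = {
        "action": "query",
        "list": "search",
        # CirrusSearch keyword for exact statement matches
        "srsearch": f"haswbstatement:P180={qid}",
        "srnamespace": 6,  # assumed: the File: namespace on Commons
        "srlimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


print(depicts_search_url("Q80151"))
```

  Note that this matches only the exact Q-number given; it does not follow 
subclass relations.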
  
  Traversing the tree of related statements is covered by T199241, T207863 and T194401.
  
  I propose that this ticket be marked as resolved. Any objections?



[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2019-03-01 Thread MarkTraceur
MarkTraceur edited projects, added SDC Engineering (Depicts and other 
statements on a bicycle); removed SDC Engineering.



[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2019-02-05 Thread Jdforrester-WMF
Jdforrester-WMF edited parent tasks, added: T215305: "Depicts on rollerskates": Qualifiers, and search by depicts statements; removed: T199352:  Deploy Structured Data on Commons with arbitrary Statements.


[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2019-01-10 Thread Jdforrester-WMF
Jdforrester-WMF added a parent task: T199352:  Deploy Structured Data on Commons with arbitrary Statements.


[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2018-09-27 Thread Jdforrester-WMF
Jdforrester-WMF added a project: SDC Engineering.


[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2018-07-14 Thread Ramsey-WMF
Ramsey-WMF added a comment.
@Jheald thanks for the detailed comment. We have indeed worked on a number of the things you've mentioned, but for the sake of brevity I'll focus on the "man with hat" example you gave.

What we're working on now is similar to what you talk about. We aim to introduce an explicit depicts search to supplement searching for plain text strings.

You can see round 1 of designs for this here: https://wikimedia.invisionapp.com/share/B9MYIFJGVX7#/screens/308729068

As you said:

> The user interface will therefore presumably need to lead the user away from searching for "man with hat" towards "human being who is male with hat".

This is exactly what we're attempting to do on search - a new interface that allows users to specify depicts statements and qualifiers that apply to it (we're starting with the default P180 qualifiers now but encourage Commonists to talk about which others they'll need). Additionally, we'll be adding features in UploadWizard, the File page, and other areas to encourage users to add depicts statements with useful qualifiers so that this kind of search can actually work (perhaps with the assistance of suggestions from image content recognition systems).

We are also currently building alpha working versions that allow us to test performance so we know for sure what's feasible in terms of response time, server load, etc. We hope to have this all ready in an update by end of year.


[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2018-07-14 Thread Jheald
Jheald added a comment.
I have now found T198261 and T199119, which investigate how some of the drawbacks above with the simple string-matching approach might be addressed.


[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2018-07-14 Thread Jheald
Jheald added a comment.
The attached subtickets are an interesting read.  They all seem to be based on taking the Q-number value of "depicts", storing it as a string in the text-search index, and then doing an indexed string-match for it.  Of course first baby-steps are important, and this facility will be crucial to be able to confirm correct entry, storage, and direct retrievability of "depicts" values.

But it seems a long way short of the functionality that has generally been assumed for retrieval based on "depicts" values, and that is ultimately going to be needed.

Has the team had any initial thoughts on which strategies are likely to be available, preferred, or required to make such more general retrieval achievable, and what back-end requirements may be needed to make the system acceptably responsive (i.e. near-instant returns) and able to cope with full production load?  Are there tickets open for these questions anywhere?

To give an idea of the sort of issue that's motivating my question, consider a user search for "man with hat".

One of the images we might hope to see included in the set returned to the user might be File:Giovanni Bellini, portrait of Doge Leonardo Loredan.jpg.

But the CommonsData for this file will most probably not include the literal statements "depicts man" or "depicts hat".

Instead, most probably, it would have the statement "depicts Q1759759", the Wikidata item for the painting.

Even if it was described in-situ, it would most probably have the statement "depicts Q250210" (Leonardo Loredan) rather than "depicts man"; and depicts (or P1354 "shown with features") Q1134210 "doge's cap".

A simple string search for depicts Q8441 "man" or depicts Q80151 "hat" is not going to match it.

It would seem that, at the very least, search is going to need to match to any items in the wikidata subclass (P279) trees of the search terms.
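
As a rough illustration of the subclass-tree matching described above, the sketch below builds a SPARQL query string that matches any "depicts" (P180) value lying in the P279 tree of a search term. It assumes the `wd:` and `wdt:` prefixes predefined by the Wikidata query service, and that depicts statements are queryable alongside Wikidata in one store, which is not settled here. The function name is hypothetical.

```python
import re


def depicts_subclass_query(class_qid: str, limit: int = 2000) -> str:
    """Return SPARQL text matching items that depict anything in class_qid's subclass tree.

    Assumes the wd:/wdt: prefixes are predefined by the endpoint.
    """
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError(f"not a Wikidata item ID: {class_qid!r}")
    return (
        "SELECT DISTINCT ?item WHERE {\n"
        "  ?item wdt:P180 ?depicted .\n"
        # zero or more "subclass of" steps up to the search term
        f"  ?depicted wdt:P279* wd:{class_qid} .\n"
        "}\n"
        f"LIMIT {limit}"
    )


# Q80151 is the "hat" item quoted above
print(depicts_subclass_query("Q80151"))
```

With this pattern a value such as Q1134210 "doge's cap" would match a search for "hat" only if a chain of P279 statements actually links the two items.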

(As an aside, it may be worth remembering that Blazegraph can get into difficulties (T116298) if queries combine multiple path expansions (as all of ours would) without careful hinting to explore such expansions in a many-to-few direction.  It also, for some reason, sometimes finds it much quicker to traverse such paths if they are given in reverse (i.e. `?b ^prop* ?a` rather than `?a prop* ?b`), even when a direction hint is given. So this might need care.)
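
To make the two spellings concrete, here is a small sketch that emits both forms of the same path constraint so they can be timed against each other. The helper name is hypothetical, and the sketch says nothing about which form an engine will actually run faster.

```python
def path_patterns(subject: str, prop: str, obj: str) -> tuple[str, str]:
    """Return (forward, reverse) triple patterns for a zero-or-more property path.

    Both patterns express the same constraint; only the written direction differs.
    """
    forward = f"{subject} {prop}* {obj} ."
    # '^' inverts the path, so subject and object swap places
    reverse = f"{obj} ^{prop}* {subject} ."
    return forward, reverse


forward, reverse = path_patterns("?depicted", "wdt:P279", "wd:Q80151")
print(forward)
print(reverse)
```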

A further complication is that male individuals on wikidata are not represented as instances of "man" (Q8441), but rather as instances of "human" (Q5), with property "sex or gender" (P21) = "male" (Q6581097).

The user interface will therefore presumably need to lead the user away from searching for "man with hat" towards "human being who is male with hat".  Similarly, if a user is searching for images that depict "bald politician", wikidata does not represent individuals as instances  (P31) of "politician" (Q82955); instead they are instances of "human" (Q5), with "occupation" (P106) = "politician".  The faceted search UI is going to need to make it a lot easier for people to enter "human" and then "occupation", rather than "politician".
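
The rewriting described above could be sketched as a small lookup table that turns a search concept into the triple patterns Wikidata actually uses. The table, its two entries, and the function name are hypothetical; the item and property numbers are the ones quoted in this comment. Concepts without an entry fall back to an instance-of/subclass-of match, which is an assumption of this sketch rather than a proposal.

```python
# Hypothetical rewrite table: search concept -> (property, value) held by a human (Q5).
REWRITES = {
    "Q8441": ("P21", "Q6581097"),  # "man" -> sex or gender = male
    "Q82955": ("P106", "Q82955"),  # "politician" -> occupation = politician
}


def concept_patterns(qid: str, var: str = "?depicted") -> list[str]:
    """Return SPARQL triple patterns for a search concept.

    Concepts in REWRITES become "human with property = value"; anything else
    is matched as an instance of the concept or of one of its subclasses.
    """
    if qid in REWRITES:
        prop, value = REWRITES[qid]
        return [f"{var} wdt:P31 wd:Q5 .", f"{var} wdt:{prop} wd:{value} ."]
    return [f"{var} wdt:P31/wdt:P279* wd:{qid} ."]


for pattern in concept_patterns("Q8441"):
    print(pattern)
```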

But the big question, at least for me: are the team confident that searches like these are going to be deliverable in reasonable time; and still deliverable in reasonable time at production load?

Is it a concern, for example, that anything that requires materialising the whole set of instances of Q5 "human" is inevitably going to be slow -- the number of such items on Wikidata is currently 4.375 million, and Commons will add even more.  Thus, for example, just to count the number of such items with preferred images (P18) currently takes no less than 50 seconds (tinyurl.com/y7ddk8wm).

Of course, even with searches involving instances of Q5, it might not be necessary to materialise the full set, if the final result-set will ultimately be LIMITed to only, say, 2000 results. (Though beware that currently the labels SERVICE requires the entire results set to be materialised, even with a LIMIT directive).  Also it's true that other clauses in the search (if present) may well be more restrictive, and therefore (if identified by the optimiser) may open up a faster solution path.   Nevertheless, some searches might still require quite deep tree explorations, materialisations of quite big intermediate sets, and quite big merges, even to produce only a few hundred results.

As a team, do we have a target for how quickly results need to start being returned from such a search, if the system is to be considered decently responsive?

Are we confident that this is achievable for the faceted search back-end running in the full context of the entire Wikidata dataset plus a maximal level of image description in CommonsData?

And how many such searches do we anticipate the system will need to be able to field an hour, in full production use?

(A first estimate might perhaps be some multiple of the number of Commons category pages currently served an

[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2018-04-16 Thread Cparle
Cparle added a project: Multimedia-Team-Working-Board.


[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2018-04-06 Thread EBjune
EBjune removed a parent task: T190315: [Epic] Provide search results for media file captions on Commons.


[Wikidata-bugs] [Maniphest] [Updated] T191633: Implement searching of 'depicts' on commons

2018-04-06 Thread EBjune
EBjune added a parent task: T190315: [Epic] Provide search results for media file captions on Commons.