[Wikidata-bugs] [Maniphest] T312781: Cirrus search appears not to be indexing reference URLs from wikidata

2022-08-13 Thread Aklapper
Aklapper added a project: CirrusSearch.

TASK DETAIL
  https://phabricator.wikimedia.org/T312781

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Aklapper
Cc: bking, MPhamWMF, Lydia_Pintscher, Aklapper, Jheald, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Wilmanbeno, CBogen, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, jayvdb, 
Mbch331, jeremyb
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


2022-08-08 Thread MPhamWMF
MPhamWMF added a comment.


  In T312781#8121115, @MPhamWMF wrote:
  
  > talk to lydia. part of wd model is not entirely index: values or properties 
  > and properties. can't ask for all wiki articles with references from nyt. 
  > structured data needed
  
  Oops, please ignore this comment. I was taking notes, and this accidentally 
got saved as a comment that I can't seem to delete.


2022-08-01 Thread Jheald
Jheald added a comment.


  Part of the expectation of an RDF-based system is that it should be easy to 
retrieve URLs of a particular form.
  
  It is not realistic to expect the community to index by hand things that 
lend themselves directly to machine indexing, such as fragments of URLs, and 
domains in particular.
  
  There's simply no way that trying to index the domains of these URLs in a 
community-driven way would be as accurate or as comprehensive as automatic 
indexing -- and it would be pure makework.  This is what computers are for.
  
  There is too great a mountain of work to do or to fix on Wikidata that 
actually requires human judgment to imagine that the community is going to 
waste its time and divert resources to indexing URLs or extracting domain 
parts, when this is so straightforward and done so much better automatically.
  
  Blazegraph (like most triplestores) actually comes with the option to turn on 
full-text indexing for URLs.  But this was not done because, we were told, 
full-text indexing would be done so much more efficiently by Cirrus, which was 
already activated.  Now it turns out that this was only true in certain 
areas.
  
  It's not unreasonable to want to be able to ask how material from a 
particular source is being used -- and SPARQL should be a perfect tool for 
such analyses.  Being able to retrieve references with URLs of a particular 
type by full-text indexing would be best.  But if that is not going to be 
possible, can I suggest at least adding extra triples to the triplestore for 
the domain part of URLs, so that it would at least be possible to pull out the 
URLs from a particular domain quickly (and to count them almost 
instantaneously)?  That at least would open the door to making more 
complicated requirements possible.
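
  As a rough illustration of the "extra triples for the domain part" idea 
above, here is a minimal Python sketch of how a domain could be derived 
mechanically from a reference URL and written out as one additional triple. 
The predicate IRI, the function names and the reference hash in the usage note 
are assumptions made up for this example; nothing here reflects the actual 
Wikidata RDF export or any agreed design.

```python
from urllib.parse import urlsplit

# Hypothetical predicate IRI, for illustration only. It is NOT part of the
# Wikidata RDF model; a real implementation would have to define its own.
DOMAIN_PREDICATE = "http://example.org/prop/referenceUrlDomain"


def url_domain(url: str) -> str:
    """Return the lower-cased host of a URL, without a leading 'www.'.

    Returns an empty string when the input has no recognisable host.
    """
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def domain_triple(reference_iri: str, url: str) -> str:
    """Format one N-Triples line linking a reference node to its URL's domain."""
    return f'<{reference_iri}> <{DOMAIN_PREDICATE}> "{url_domain(url)}" .'
```

  For example, `domain_triple("http://www.wikidata.org/reference/abc123", 
"https://www.nytimes.com/2022/07/11/example.html")` (with a made-up reference 
hash and URL) yields a triple whose object is the plain literal 
`"nytimes.com"`. A query could then match that literal exactly instead of 
scanning every URL string, which is the kind of fast lookup and counting the 
suggestion is after.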


2022-08-01 Thread MPhamWMF
MPhamWMF changed the subtype of this task from "Bug Report" to "Feature 
Request".
MPhamWMF added a comment.


  talk to lydia. part of wd model is not entirely index: values or properties 
and properties. can't ask for all wiki articles with references from nyt. 
structured data needed


2022-08-01 Thread MPhamWMF
MPhamWMF triaged this task as "Medium" priority.


2022-07-25 Thread Gehel
Gehel edited projects, added Discovery-Search; removed Discovery-Search 
(Current work).


2022-07-11 Thread Jheald
Jheald updated the task description.


2022-07-11 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.


  Related: T240334 and T238498


2022-07-11 Thread Jheald
Jheald added projects: Wikidata, Discovery-Search (Current work).
