[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2023-01-05 Thread dcausse
dcausse added a comment. In T302189#8501314, @Nikki wrote: > This report of grammatical features is wrong because it

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2023-01-05 Thread Nikki
Nikki added a comment. This report of grammatical features is wrong because it includes deleted data. Like with the previous queries I mentioned, I'm unable to fix it because that

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-12-10 Thread Nikki
Nikki added a comment. Ran into this again while trying to check whether a property is in use in references, since pr: includes non-existent references - https://query.wikidata.org/#select%20%2a%20%7B%20%3Fs%20pr%3AP2183%20%3Fval%20%7D Here's

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-12-05 Thread Nikki
Nikki added a comment. In T302189#7848616, @MPhamWMF wrote: > Thanks for the clarification. With regard to orphaned nodes throwing off query results, there should be ways to write SPARQL queries in such a way that they ignore these

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-08-09 Thread JFishback_WMF
JFishback_WMF added a project: Privacy Engineering. TASK DETAIL https://phabricator.wikimedia.org/T302189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: JFishback_WMF Cc: HenkvD, MPhamWMF, Bugreporter, dcausse, Aklapper, Yair_rand, Astuthiodit_1,

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-08-06 Thread Bugreporter
Bugreporter added a project: Privacy.

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-12 Thread MPhamWMF
MPhamWMF triaged this task as "Low" priority.

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-12 Thread MPhamWMF
MPhamWMF added a comment. Thanks for the clarification. With regard to orphaned nodes throwing off query results, there should be ways to write SPARQL queries in such a way that they ignore these nodes. In the meantime, we will try to reload Wikidata more often, as discussed above.

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-06 Thread Bugreporter
Bugreporter added a comment. Another possible workflow: (1) once a sitelink/value/reference is unused, it is put into a queue (deduplicated) (2) use something like LDF to quickly filter out values or references that are still used (no need to do this for sitelinks as one sitelink can only be
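Steps (1) and (2) of this queue-based workflow could be sketched roughly as follows. This is only an illustration: the `PurgeQueue` class, the `kind` labels and the `is_still_used` callback are hypothetical names, and the callback stands in for whatever lookup (LDF or otherwise) would answer "is this node still used?". Any later steps of the workflow are not covered.

```python
from typing import Callable, Dict, List


class PurgeQueue:
    """Hypothetical deduplicated queue of nodes that may have become orphaned."""

    def __init__(self) -> None:
        # Maps IRI -> kind ("sitelink", "value" or "reference").
        # A dict keeps insertion order and gives deduplication for free.
        self._pending: Dict[str, str] = {}

    def enqueue(self, iri: str, kind: str) -> None:
        # Step (1): queue the node; enqueueing the same IRI again is a no-op.
        self._pending.setdefault(iri, kind)

    def drain(self, is_still_used: Callable[[str], bool]) -> List[str]:
        # Step (2): drop values and references that are still used.
        # Sitelinks skip the check, as the comment suggests.
        to_delete = [
            iri
            for iri, kind in self._pending.items()
            if kind == "sitelink" or not is_still_used(iri)
        ]
        self._pending.clear()
        return to_delete


queue = PurgeQueue()
queue.enqueue("ref:a", "reference")
queue.enqueue("ref:a", "reference")  # duplicate, ignored
queue.enqueue("v:b", "value")
queue.enqueue("https://en.wikipedia.org/wiki/X", "sitelink")

# Assumption for the demo: only "v:b" is still used by another statement.
candidates = queue.drain(lambda iri: iri == "v:b")
```

Here `candidates` holds the IRIs that would be handed to the actual delete step; the usage check is injected so that the queue logic stays independent of the triplestore.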

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-06 Thread Bugreporter
Bugreporter added a comment. See also: T105427: Need a way for WDQS updater to become aware of suppressed deletes

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-05 Thread MPhamWMF
MPhamWMF added a comment. Has removing sensitive information been a problem so far? Or is it something that is currently an issue? I acknowledge that as an edge case, it should ideally be one that we can take care of quickly. In the past, we've had one-off tickets to clean up data

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-05 Thread Bugreporter
Bugreporter added a comment. A proposed workflow: 1. use some queries to find all orphaned sitelink, value and reference nodes (and record the IRIs) 2. remove them from the triplestore 3. recheck whether they are still unused, and if they are not we need to add them back to the
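The find / remove / recheck steps above can be illustrated with a small in-memory sketch. It is not how WDQS or Blazegraph work internally: the triple store is modelled as a plain Python set, the function names and the `is_shared_node` predicate are hypothetical, and a node counts as orphaned when nothing references it as an object (a simplification that fits value and reference nodes; sitelinks would need a different test). The optional `between_steps` hook only simulates an edit arriving between removal and recheck.

```python
from typing import Callable, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def find_orphans(store: Set[Triple], is_shared_node: Callable[[str], bool]) -> Set[str]:
    # Step 1: shared nodes that appear as a subject but are never referenced.
    subjects = {s for s, _, _ in store if is_shared_node(s)}
    referenced = {o for _, _, o in store}
    return subjects - referenced


def purge_orphans(
    store: Set[Triple],
    is_shared_node: Callable[[str], bool],
    between_steps: Optional[Callable[[Set[Triple]], None]] = None,
) -> Set[str]:
    orphans = find_orphans(store, is_shared_node)
    # Step 2: remove the orphans' triples, remembering them for a possible restore.
    removed = {t for t in store if t[0] in orphans}
    store.difference_update(removed)
    if between_steps is not None:
        between_steps(store)  # simulated concurrent edit
    # Step 3: recheck; anything referenced again gets its triples added back.
    referenced = {o for _, _, o in store}
    restored = {t for t in removed if t[0] in referenced}
    store.update(restored)
    return orphans - {t[0] for t in restored}


def is_ref(node: str) -> bool:
    return node.startswith("ref:")


store: Set[Triple] = {
    ("wd:Q1", "prov:wasDerivedFrom", "ref:a"),
    ("ref:a", "pr:P248", "wd:Q2"),
    ("ref:b", "pr:P248", "wd:Q3"),  # nothing points at ref:b
}
purged = purge_orphans(store, is_ref)
```

In this toy run only `ref:b` is purged. The recheck in step 3 matters because a node that was orphaned when step 1 ran can be reused by a new edit before the removal completes.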

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-05 Thread Bugreporter
Bugreporter added a comment. I don't think "once, twice or four times a year" is enough. If someone added sensitive information into Wikidata, we need to remove it as soon as possible.

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-05 Thread dcausse
dcausse added a comment. The reason is that this data //may// be referenced by other items and thus cannot be deleted blindly without asking Blazegraph: //"is this data used by another item?"// which would be too costly to ask for every edit. Another approach is to reload Blazegraph from the

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-04-05 Thread MPhamWMF
MPhamWMF added a comment. @dcausse, is there a reason these changes are not reflected within the 10min update lag we have for other WD changes to be reflected in WDQS?

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-03-31 Thread Bugreporter
Bugreporter reopened this task as "Open". Bugreporter added a comment. Reopen. Having orphaned sitelink, value or reference nodes means users can add sensitive information into WDQS with no easy way to clean it up.

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2022-03-31 Thread Bugreporter
Bugreporter renamed this task from "Query service retains orphaned sitelinks" to "Regularly purge orphaned sitelink, value and reference nodes".