[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-07-07 Thread diego
diego closed this task as "Resolved".

TASK DETAIL
  https://phabricator.wikimedia.org/T333892

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: diego
Cc: Isaac, achou, Lydia_Pintscher, MunizaA, Aklapper, leila, mrephabricator, 
KinneretG, Astuthiodit_1, YLiou_WMF, karapayneWMDE, Invadibot, Ywats0ns, 
maantietaja, ItamarWMDE, Akuckartz, Nandana, Abdeaitali, Lahi, Gq86, 
GoranSMilovanovic, QZanden, LawExplorer, Avner, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Capt_Swing, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-07-07 Thread diego
diego added a comment.


  **Weekly Updates**
  
  - The Wikidata Revert Risk model is now available for testing on this PAWS 
notebook.
  
  I'm going to resolve this task and add the evaluation and improvements in a 
new ticket.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-06-30 Thread diego
diego added a subscriber: Isaac.
diego added a comment.


  **Weekly Updates**
  
  - @MunizaA has released an alpha version of the evaluation tool. Results for 
the Wikidata model can be found here.
  - For Wikidata Revert Risk, I'm going to upload the training and testing code, 
plus the model, to a public repo, and then open another task for the model's 
evaluation and improvements.
  - Regarding the Item Quality model, I'm going to coordinate with @Isaac for 
the follow-ups on that project.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-06-16 Thread diego
diego added a comment.


  **Weekly updates**
  
  - I'm currently working on the Model Card for this algorithm.
  - @MunizaA, please notify us in this ticket when the annotation tool app is 
ready.
  - We are preparing the code to be shared with @Lydia_Pintscher and (through 
her) with volunteer developers to test the current algorithm on their own 
datasets.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-06-12 Thread diego
diego added a comment.


  **Weekly Updates**
  
  - We have met with Lydia and community developers. We are going to share our 
code with them, and we have also learned about their efforts on automatic 
content patrolling in Wikidata.
  - The evaluation tool code is ready; this week @MunizaA will upload it to 
a public endpoint (Toolforge or wmfcloud).



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-06-02 Thread diego
diego added a comment.


  **Weekly Updates**
  
  - We are still working on the evaluation tool.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-05-26 Thread diego
diego added a comment.


  **Weekly Updates**
  
  - @MunizaA is working on an evaluation tool that will be usable by all the 
Revert Risk models, including the Wikidata one as well as the LA and 
Multilingual models for Wikipedia.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-05-14 Thread diego
diego added a comment.


  **Weekly Updates**
  
  - The model card for the Multilingual model is available here.
  - We are working with Lydia to evaluate the model and update it if needed.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-05-05 Thread diego
diego added a subscriber: achou.
diego added a comment.


  **Weekly Updates**
  
  - The first version of this model is ready to go to LiftWing.
  - @MunizaA has submitted a merge request. @achou is now reviewing the code.
  - I'll be meeting with @Lydia_Pintscher next week to show the results and 
discuss next steps.
  - We are planning to create and upload the model card next week.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-04-28 Thread diego
diego added a comment.


  **Weekly Updates**
  
  - We are finalizing the feature extraction pipeline code and the code to 
serve the model on LiftWing.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-04-20 Thread diego
diego added a comment.


  **Weekly Updates**
  
  - We have developed a meta-model. This model has two main components.
    - The first one is a CatBoost-based classifier, designed to assess the 
Revert Risk for claims set and updates.
    - The second one is a hybrid approach, designed to evaluate Revert Risk 
on Wikidata Item Descriptions. This model uses mBERT.
    - @MunizaA has developed a methodology for creating clean training data for 
the mBERT model.
  - @MunizaA is now working on implementing this model and the feature 
extraction pipeline by updating the Knowledge Integrity repo.
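
  For readers who want a mental picture of the two-component structure described 
above, here is a minimal sketch. It is an illustration only, not the code in the 
Knowledge Integrity repo: it assumes the meta-model routes each edit to one 
component by edit type, and the `Revision` class, the edit-type labels, and both 
scoring functions are hypothetical stand-ins for the CatBoost and mBERT models.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Revision:
    """Hypothetical container for one Wikidata edit."""
    edit_type: str  # assumed labels: "claim" or "description"
    payload: dict


def score_claim_edit(rev: Revision) -> float:
    """Stand-in for the tabular (CatBoost-style) classifier."""
    # Stub heuristic: edits touching more claims get a higher score.
    n_claims = len(rev.payload.get("claims", []))
    return min(1.0, 0.1 * n_claims)


def score_description_edit(rev: Revision) -> float:
    """Stand-in for the text (mBERT-style) model."""
    # Stub heuristic: an all-caps description is treated as suspicious.
    text = rev.payload.get("description", "")
    return 0.9 if text.isupper() else 0.1


# The "meta-model" here is just a dispatch table from edit type to component.
SCORERS: Dict[str, Callable[[Revision], float]] = {
    "claim": score_claim_edit,
    "description": score_description_edit,
}


def revert_risk(rev: Revision) -> float:
    """Return a score in [0, 1] from the component matching the edit type."""
    try:
        scorer = SCORERS[rev.edit_type]
    except KeyError:
        raise ValueError(f"unsupported edit type: {rev.edit_type!r}")
    return scorer(rev)
```

  A call such as `revert_risk(Revision("description", {"description": "SPAM"}))` 
goes to the text component, while a `"claim"` edit goes to the tabular one. In a 
real system each stub would be replaced by a trained model's prediction.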



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-04-14 Thread diego
diego added a comment.


  **Weekly updates**
  
  - @MunizaA has created an efficient pipeline to train HuggingFace 
Transformers, using the GPUs from the stat machines, and data coming from the 
Data Lake.
  - We are experimenting with different LLMs, such as mBERT and RoBERTa, to 
detect vandalism on Item Descriptions.



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-04-07 Thread diego
diego added a subscriber: MunizaA.
diego added a comment.


  **Weekly Updates**
  
  - @MunizaA has been testing the feasibility and utility of using Wikidata 
embeddings, both for Item Quality and Revert Risk. We have studied different 
implementations and are experimenting with the PyTorch-BigGraph model. We have 
been able to train on medium-size subgraphs. While training on large graphs 
seems to be possible, we are still evaluating the value of such embeddings for 
the proposed tasks.
  - We have tested specific approaches for different types of actions, e.g., one 
language-based model to assess the quality of descriptions and labels, and other 
models for claims containing triples (Q_x P_y Q_z). This is improving the 
performance and quality of our results.
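
  To make the triple notation (Q_x P_y Q_z) concrete, the sketch below parses an 
item-property-item claim written as three whitespace-separated identifiers. The 
`ClaimTriple` type and `parse_triple` helper are hypothetical names introduced 
for illustration, and the sketch assumes only the standard Wikidata identifier 
shapes (Q or P followed by a positive integer); it is not part of our pipeline.

```python
import re
from typing import NamedTuple, Optional


class ClaimTriple(NamedTuple):
    """An item-property-item claim, e.g. Q42 P31 Q5."""
    subject: str  # item ID (Q_x)
    prop: str     # property ID (P_y)
    value: str    # item ID (Q_z)


# Q/P followed by a positive integer with no leading zero.
_TRIPLE_RE = re.compile(r"^(Q[1-9]\d*)\s+(P[1-9]\d*)\s+(Q[1-9]\d*)$")


def parse_triple(text: str) -> Optional[ClaimTriple]:
    """Return a ClaimTriple, or None if the text is not an item-valued triple."""
    match = _TRIPLE_RE.match(text.strip())
    return ClaimTriple(*match.groups()) if match else None
```

  For example, `parse_triple("Q42 P31 Q5")` yields a triple with subject `Q42`, 
property `P31`, and value `Q5`, whereas free text such as a description returns 
`None` and would be handled by a language-based model instead.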



[Wikidata-bugs] [Maniphest] T333892: Develop a new generation of ML models for Wikidata

2023-04-04 Thread Lydia_Pintscher
Lydia_Pintscher added projects: Wikidata, Wikidata data quality and trust.
