[Wiki-research-l] Re: New private, granular pageview dataset

2024-03-18 Thread Hal Triedman
Hello Kai (and everyone else)!

I've updated these datasets (from 2017-present) to include an additional
column with QID wherever possible. Please let me know if there are any
issues or confusion about the datasets — I'm happy to get on calls,
prioritize dataset improvements, or answer questions on this listserv :)

Happy analyses,
Hal
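
With a QID column available, matching the same concept across language editions becomes a group-by over the files rather than millions of API calls. A minimal sketch, assuming hypothetical column names (`project`, `page_title`, `qid`, `views`) and made-up sample rows; the real schema is described in each dataset's README and may differ:

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows. The column names are assumptions, not the real schema.
SAMPLE = """project,page_title,qid,views
en.wikipedia,Douglas_Adams,Q42,500
de.wikipedia,Douglas_Adams,Q42,200
en.wikipedia,Main_Page,,900
"""


def views_by_qid(rows):
    """Sum views per QID across projects, skipping rows without a QID."""
    totals = defaultdict(int)
    for row in rows:
        qid = (row.get("qid") or "").strip()
        if not qid:
            # The QID is only filled in "wherever possible", so tolerate gaps.
            continue
        totals[qid] += int(row["views"])
    return dict(totals)


totals = views_by_qid(csv.DictReader(io.StringIO(SAMPLE)))
```

Rows without a QID are skipped here rather than grouped under an empty key, since an empty value does not identify a single concept.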

On Mon, Mar 4, 2024 at 11:47 AM Hal Triedman wrote:

> Hi Kai!
>
> Thank you for this reminder — when this dataset was published, there
> wasn't a consistently-updated, stable page ID <--> QID table available
> internally. Now there is. I'll see what I can get done on this in the next
> week or two, and send any updates as soon as I can :)
>
> Thanks again,
> Hal
>
> On Mon, Mar 4, 2024 at 10:04 AM Kai Zhu  wrote:
>
>> Hi,
>>
>> I hope this message finds you well. I'm writing to follow up on our
>> previous discussions about enhancing the pageviews data file by adding a
>> QID column. My collaborator and I have identified several use cases where
>> the ability to match concepts across languages at a large scale is
>> crucial.
>> Given the volume of articles we're working with, relying on API calls for
>> millions of them isn't feasible. Incorporating the QID column would
>> significantly benefit not only our project but also a wide range of
>> potential users who may face similar challenges.
>>
>> Thank you for considering this request. We believe this addition could
>> greatly improve the utility and accessibility of the data for various
>> research and analysis purposes.
>>
>> Best regards,
>> Kai Zhu
>> Assistant Professor
>> Bocconi University

[Wiki-research-l] Re: New private, granular pageview dataset

2024-03-04 Thread Hal Triedman
Hi Kai!

Thank you for this reminder — when this dataset was published, there wasn't
a consistently-updated, stable page ID <--> QID table available internally.
Now there is. I'll see what I can get done on this in the next week or two,
and send any updates as soon as I can :)

Thanks again,
Hal

On Mon, Mar 4, 2024 at 10:04 AM Kai Zhu  wrote:

> Hi,
>
> I hope this message finds you well. I'm writing to follow up on our
> previous discussions about enhancing the pageviews data file by adding a
> QID column. My collaborator and I have identified several use cases where
> the ability to match concepts across languages at a large scale is crucial.
> Given the volume of articles we're working with, relying on API calls for
> millions of them isn't feasible. Incorporating the QID column would
> significantly benefit not only our project but also a wide range of
> potential users who may face similar challenges.
>
> Thank you for considering this request. We believe this addition could
> greatly improve the utility and accessibility of the data for various
> research and analysis purposes.
>
> Best regards,
> Kai Zhu
> Assistant Professor
> Bocconi University

[Wiki-research-l] Re: New private, granular pageview dataset

2023-06-26 Thread Hal Triedman
Hi Kai!

Thanks for this suggestion — I'll put it on the list of improvements to
this dataset, and hopefully be able to put it into production in the next
month or two. In the meantime, the example python notebook
<https://public-paws.wmcloud.org/67457802/private_pageview_data_access.ipynb>
I linked above has a subsection entitled "Example of joining page_ids and
titles to wikidata QID" that shows how you can retrieve a set of QIDs
manually for a given page ID or title. Hope this helps get you started!

Thanks again,
Hal
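
For readers who want to look up QIDs themselves in the meantime, one option is the MediaWiki Action API's `pageprops` query. This is a sketch under stated assumptions, not necessarily the method used in the linked notebook: the function names and the User-Agent string are hypothetical, and only the parsing step is exercised without a network connection.

```python
import json
import urllib.parse
import urllib.request


def build_qid_query(project_host, titles):
    """Build an Action API URL asking for each page's wikibase_item prop."""
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "titles": "|".join(titles),
        "format": "json",
        "formatversion": "2",
    }
    return f"https://{project_host}/w/api.php?" + urllib.parse.urlencode(params)


def parse_qids(payload):
    """Map returned page title -> QID, or None when no Wikidata item exists.

    Titles come back normalized (spaces instead of underscores).
    """
    pages = payload.get("query", {}).get("pages", [])
    return {p["title"]: p.get("pageprops", {}).get("wikibase_item") for p in pages}


def fetch_qids(project_host, titles):
    """Network call; regular clients may send at most 50 titles per request."""
    request = urllib.request.Request(
        build_qid_query(project_host, titles),
        headers={"User-Agent": "qid-lookup-sketch/0.1 (hypothetical example)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_qids(json.load(response))
```

A call such as `fetch_qids("en.wikipedia.org", ["Douglas Adams"])` would be batched in groups of 50 titles for larger lists, which is why this route does not scale to millions of pages.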

On Sun, Jun 25, 2023 at 4:30 PM Kai Zhu  wrote:

> Great dataset! This is amazing. I have no doubt that this will enable a lot
> of new research endeavors.
>
> If I may have a suggestion: is it possible to also have wikidata id for
> each row? That way we can more conveniently match the same concepts across
> languages at large scale...
>
> Best,
> Kai Zhu
> Assistant Professor at Bocconi University
>
> On Wed, Jun 21, 2023 at 12:51 PM Hal Triedman wrote:
>
> > Hello world!
> >
> > My name is Hal Triedman, and I’m a senior privacy engineer at WMF. I work
> > to make data that WMF releases about reading, editing, and other on-wiki
> > behavior safer, more granular, and more accessible to the world using
> > differential privacy <https://en.wikipedia.org/wiki/Differential_privacy>.
> >
> > Today I’m reaching out to share that WMF has released almost 8 years (from
> > 1 July 2015 to present) of privatized pageview data
> > <https://diff.wikimedia.org/2023/06/21/new-dataset-uncovers-wikipedia-browsing-habits-while-protecting-users/>,
> > partitioned by country, project, and page. This data is significantly more
> > granular than other datasets we release, and should help researchers to
> > disambiguate both long- and short-term trends within languages on a
> > country-by-country basis: several
> > <https://phabricator.wikimedia.org/T207171> long-standing requests
> > <https://phabricator.wikimedia.org/T267283> from Wikimedia communities.
> >
> > Due to various technical factors, there are three distinct datasets:
> >
> >    - 1 July 2015 – 8 Feb 2017
> >      <https://analytics.wikimedia.org/published/datasets/country_project_page_historical_pre_2017/>
> >      / README
> >      <https://analytics.wikimedia.org/published/datasets/country_project_page_historical_pre_2017/00_README.html>
> >      (publishing threshold [1]: 3,500 pageviews)
> >    - 9 Feb 2017 – 5 Feb 2023
> >      <https://analytics.wikimedia.org/published/datasets/country_project_page_historical/>
> >      / README
> >      <https://analytics.wikimedia.org/published/datasets/country_project_page_historical/00_README.html>
> >      (publishing threshold: 450 pageviews)
> >    - 6 Feb 2023 – present
> >      <https://analytics.wikimedia.org/published/datasets/country_project_page/>
> >      / README
> >      <https://analytics.wikimedia.org/published/datasets/country_project_page/00_README.html>
> >      (publishing threshold: 90 pageviews)
> >
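
Because the three datasets cover disjoint date ranges with different publishing thresholds, code that reads them needs to pick the right directory for a given day. A minimal sketch using only the dates, directories, and thresholds listed above; the function name `dataset_for` is hypothetical, and the layout of files inside each directory is not assumed here:

```python
from datetime import date

BASE = "https://analytics.wikimedia.org/published/datasets/"

# (first day, last day or None for "present", directory, publishing threshold)
DATASETS = [
    (date(2015, 7, 1), date(2017, 2, 8), "country_project_page_historical_pre_2017/", 3500),
    (date(2017, 2, 9), date(2023, 2, 5), "country_project_page_historical/", 450),
    (date(2023, 2, 6), None, "country_project_page/", 90),
]


def dataset_for(day):
    """Return (directory URL, publishing threshold) for the dataset covering `day`."""
    for start, end, directory, threshold in DATASETS:
        if day >= start and (end is None or day <= end):
            return BASE + directory, threshold
    raise ValueError(f"no dataset covers {day}")
```

Keeping the threshold next to the URL is a reminder that a page absent from an older file may simply have fallen below that period's higher cutoff, so counts are not directly comparable across the three periods.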
> > API access to this data should be coming in the next few months. In the
> > interim, I’ve built an example Python notebook
> > <https://public-paws.wmcloud.org/67457802/private_pageview_data_access.ipynb>
> > illustrating how one might access the data in its current CSV format, as
> > well as several different kinds of simple analyses that can be done with it.
> >
> > I also want to invite the research community to join me for a brief demo of
> > this project at the July Research Showcase
> > <https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase>. In the
> > meantime, please feel free to reach out with any questions on the project
> > talk page <https://meta.wikimedia.org/wiki/Talk:Differential_privacy>.
> >
> > For more information about WMF’s work on differential privacy more
> > generally, see the differential privacy homepage on meta
> > <https://meta.wikimedia.org/wiki/Differential_privacy>. And in the future,
> > look for more announcements of privatized datasets on editor behavior,
> > on-wiki search, centralnotice impressions and clicks, and more.
> >
> > Best,
> >
> > Hal
> >
> > [1] “Publishing threshold” is the minimum value of a row in the dataset in
> > order to be published.
> > ___
> > Wiki-research-l mailing list -- wiki-research-l@lists.wikimedia.org
> > To unsubscribe send an email to wiki-research-l-le...@lists.wikimedia.org


[Analytics] Re: Pageviews per country

2022-12-21 Thread Hal Triedman
Hi all,

Ismael, thanks so much for reaching out about this. Unfortunately, I think
Dan is right when he says that this granularity of data carries a big
privacy cost. We're working hard to lower the release threshold from 1,000
to 90 daily unique visitors per country, but it seems likely that these
Wikivoyage itineraries have fewer than 90 daily unique visitors in most
countries. Either way, we're hoping to start releasing daily pageviews by
country in January, so you should check whether your pages are in the
dataset once that release is live.

If you want unfettered access to the data (split up by country), you should
pursue a research partnership. Besides that, you can likely use some
existing tools (like the Pageviews API
<https://wikimedia.org/api/rest_v1/#/Pageviews%20data/get_metrics_pageviews_top__project___access___year___month___day_>
or pageviews.wmcloud.org) to get a sense of the data. I'll be sure to reach
back out once the differentially private data is released so that you can
check on the relevant pages!

Thanks again for reaching out :)

Hal
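
As a rough illustration of the "existing tools" route, the sketch below builds a request URL for the Pageviews API's per-article endpoint and sums the daily counts in a response. Note the assumptions: this is a different endpoint from the "top" endpoint linked above, it returns global counts with no country split, and the article title used in the usage note is a hypothetical placeholder.

```python
import urllib.parse

API = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, start, end,
                    access="all-access", agent="user", granularity="daily"):
    """Build a per-article pageviews URL; start/end are YYYYMMDD strings."""
    # Titles use underscores, and any other special characters are percent-encoded.
    title = urllib.parse.quote(article.replace(" ", "_"), safe="")
    return f"{API}/{project}/{access}/{agent}/{title}/{granularity}/{start}/{end}"


def total_views(payload):
    """Sum the `views` field over all items in a per-article response."""
    return sum(item["views"] for item in payload.get("items", []))
```

For example, `per_article_url("en.wikivoyage", "Example itinerary", "20221101", "20221130")` yields a URL that can be fetched with any HTTP client, and the decoded JSON can be passed to `total_views`.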

On Wed, Dec 21, 2022 at 12:07 PM Dan Andreescu wrote:

>>> The only way is to help with the ongoing (and complex) differential
>>> privacy work <https://phabricator.wikimedia.org/T307245>
>>>
>>
>> I have a systems background, but this is probably outside my skills.
>> How could I help?
>>
>
> Hm, it's some tricky programming work, I'm not 100% sure of the latest
> status or opportunities to get involved, but I'm cc-ing Hal Triedman to see
> if he has thoughts. (Hal see archive
> <https://lists.wikimedia.org/hyperkitty/list/analytics@lists.wikimedia.org/thread/IKL3WOQ2UY7IMMCUTV7EYGT6PFVFLVCA/>
> )
>
>>>> [1] https://meta.wikimedia.org/wiki/Research:Page_view#Resulting_format
>>>>
>>>
>>>  If you are indeed interested in pageviews, the definition you linked to
>>> talks about the data internally available.
>>>
>>
>> Oh!
>>
>>
>>>   Can I ask you to elaborate a bit more on why you need per-country data?
>>>
>>
>> Well, first, I've been looking for the most useful tools and sources
>> available (and found many of them very interesting [1]). Second, in this
>> particular case we are running a pilot project in which some academic
>> project results have been published as Wikivoyage itineraries (3 in EN and
>> 3 in ES). These are the articles we are interested in tracking now.
>>
>> As for the rationale, one of the bigger drivers nowadays is the well-known
>> link between heritage, tourism and sustainability (example: the Sustainable
>> Development Goals), so there is a trend toward analyzing this context more
>> closely in order to study and plan. Tourist destinations usually have very
>> well-defined countries of origin. The better you know the origin, the
>> better you can plan. There should also be another positive impact for
>> Wikimedia: new incentives for institutions to create or translate articles
>> into the relevant languages, always restricted to the heritage domain. Here
>> in Spain tourism is one of the main economic sectors, and anything
>> providing intelligence would help with better planning and conservation.
>>
>> Also, we have identified a new potential activity area: intelligence
>> analysis of trends in heritage (public interest, changes in institutional
>> focus, new relevant practices, etc.), not only for Spanish heritage but
>> worldwide. This is also a scientific institution, and it would find it very
>> useful to collect the most precise traces available (with absolute respect
>> for users' privacy) to look for signals it could use to refocus/prioritize
>> its institutional goals.
>>
>> So, this is it.
>>
>> [1]  https://toolhub.wikimedia.org/lists/277
>>
>
> This is indeed a very interesting use case and a chance for this data to
> be very helpful.  Unfortunately to my naive eyes, this granularity of data
> also carries a big privacy cost.  The only way to get to it would be a
> research collaboration, but there are *lots* of requests for those and not
> enough researchers to help facilitate.  I'm honestly not sure there's an
> easy way around this... but I'll keep thinking about it and I know it'll be
> useful for Hal to see this kind of request and add it to his back burner.
> Thanks for detailing!
>
___
Analytics mailing list -- analytics@lists.wikimedia.org
To unsubscribe send an email to analytics-le...@lists.wikimedia.org


[Wikimedia-l] Re: Most visited articles in 2022

2022-12-08 Thread Hal Triedman
Hi all!

Looks like Isaac and I had the same thought here. I also spent ~45 minutes
hacking together a script that collects the top (up to) 500 pages for a
given country from 1 December 2021 through 30 November 2022 using the WMF
pageviews API. All of the datasets are relatively small and available for
download and free use. Code for generating these lists is available on the
WMF GitLab instance, and runs in ~3.5 hours on a normal MacBook, if anyone
wants to download/fork it and try it on their own.

There are only 135 ISO codes included in this set of files — I removed
codes that WMF doesn't release data about or that have no data reported for
the 365 day period in question. Let me know if you have any questions, and
hope this helps!

Hal
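
The general shape of such a collection script can be sketched as follows. This is not the script on GitLab: the function names are hypothetical, and the response fields (`items`, `articles`, `project`, `article`, `views_ceil`) are an assumption about the Pageviews API's top-per-country endpoint that should be checked against the API documentation. Fetching is left out so the aggregation can be run on any decoded JSON payloads.

```python
from collections import Counter
from datetime import date, timedelta

API = "https://wikimedia.org/api/rest_v1/metrics/pageviews/top-per-country"


def daily_url(country, day, access="all-access"):
    """URL for one country's top pages on one day (assumed endpoint layout)."""
    return f"{API}/{country}/{access}/{day:%Y/%m/%d}"


def days(start, end):
    """Yield each date from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def add_day(totals, payload, exclude=()):
    """Fold one daily response into running (project, article) totals."""
    for item in payload.get("items", []):
        for row in item.get("articles", []):
            if row["article"] in exclude:
                # e.g. Special:Search or pages dominated by bot-driven views
                continue
            totals[(row["project"], row["article"])] += row["views_ceil"]
    return totals


def top_n(totals, n=500):
    """Return the n most-viewed (project, article) pairs with their totals."""
    return totals.most_common(n)
```

A full run would loop over `days(date(2021, 12, 1), date(2022, 11, 30))`, fetch `daily_url(country, day)` for each day, and pass each decoded response to `add_day`. Since a page only appears on days when it makes the daily top list, the yearly totals are lower bounds rather than exact counts.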

On Thu, Dec 8, 2022 at 8:18 AM Isaac Johnson  wrote:

> Romaine,
> Building on Chico's comment, I put together an example notebook of how to
> estimate such a list from the public data in case you're curious (I
> calculated it for January-November for Nigeria in the example). It's not a
> perfect approach in that it makes some assumptions and uses incomplete data
> but probably is close to what the actual list would be (details in the
> link). You'd likely want to use your knowledge of the region/languages to
> filter out pages like Special:Search and bot-driven views that slipped
> through into the data (like Cookie and Cleopatra in the example below).
>
> Notebook:
> https://public.paws.wmcloud.org/User:Isaac_(WMF)/Top_Read_2022_Geo.ipynb#Example-Results-(Nigeria-for-2022)
>
> It makes use of these public Wikimedia resources:
> * PAWS infrastructure: https://wikitech.wikimedia.org/wiki/PAWS
> * Pageviews API:
> https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews
> * Python mwviews library for interacting with the pageviews API:
> https://github.com/mediawiki-utilities/python-mwviews
>
> You can read instructions for how to copy this notebook and run it for
> other countries here:
> https://wikitech.wikimedia.org/wiki/PAWS/Getting_started_with_PAWS#Fork
>
> Best,
> Isaac
>
> Copying the top-100 output for Nigeria below for ease of access:
>
> article views
> 1 https://en.wikipedia.org/wiki/Special:Search 13696500
> 2 https://fr.wikipedia.org/wiki/Cookie_(informatique) 10754500
> 3 https://ig.wikipedia.org/wiki/Special:Search 7579900
> 4 https://en.wikipedia.org/wiki/Main_Page 5502800
> 5 https://ig.wikipedia.org/wiki/Ihü_kárírí:Search 1791900
> 6 https://foundation.wikimedia.org/wiki/Privacy_policy 87
> 7 https://en.wikipedia.org/wiki/Bet9ja 664700
> 8 https://foundation.wikimedia.org/wiki/Terms_of_Use 646900
> 9 https://en.wikipedia.org/wiki/XXX 624200
> 10 https://en.wikipedia.org/wiki/Nigeria 491700
> 11 https://en.wikipedia.org/wiki/Cleopatra 429900
> 12 https://en.wikipedia.org/wiki/Elizabeth_II 328400
> 13 https://en.wikipedia.org/wiki/Bola_Tinubu 320300
> 14 https://en.wikipedia.org/wiki/XXX_(film_series) 234700
> 15 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Africa_2022/en 230600
> 16 https://en.wikipedia.org/wiki/Peter_Obi 229000
> 17 https://fr.wikipedia.org/wiki/Enoch_Adeboye 197300
> 18 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Earth_2022_in_Nigeria 154600
> 19 https://en.wikipedia.org/wiki/XXX:_Return_of_Xander_Cage 143000
> 20 https://en.wikipedia.org/wiki/Vladimir_Putin 131100
> 21 https://en.wikipedia.org/wiki/Russo-Ukrainian_War 122700
> 22 https://en.wikipedia.org/wiki/_(beer) 116800
> 23 https://en.wikipedia.org/wiki/Charles_III 114600
> 24 https://en.wikipedia.org/wiki/Africa_Cup_of_Nations 112300
> 25 https://en.wikipedia.org/wiki/Jeffrey_Dahmer 110300
> 26 https://en.wikipedia.org/wiki/Yusuf_Datti_Baba-Ahmed 108700
> 27 https://en.wikipedia.org/wiki/Cristiano_Ronaldo 106300
> 28 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Earth_2022_in_South_West_Nigeria 99700
> 29 https://en.wikipedia.org/wiki/Atiku_Abubakar 91800
> 30 https://en.wikipedia.org/wiki/2022_FIFA_World_Cup 91300
> 31 https://en.wikipedia.org/wiki/NATO 86300
> 32 https://en.wikipedia.org/wiki/Erling_Haaland 84800
> 33 https://en.wikipedia.org/wiki/Russia–Ukraine_relations 84300
> 34 https://en.wikipedia.org/wiki/2021_Africa_Cup_of_Nations 83300
> 35 https://en.wikipedia.org/wiki/Diana,_Princess_of_Wales 81900
> 36 https://en.wikipedia.org/wiki/Black_Adam_(film) 80600
> 37 https://en.wikipedia.org/wiki/Black_Panther:_Wakanda_Forever 66800
> 38 https://en.wikipedia.org/wiki/Ademola_Adeleke 66600
> 39 https://en.wikipedia.org/wiki/Ukraine 65100
> 40 https://en.wikipedia.org/wiki/Rishi_Sunak 60700
> 41 https://en.wikipedia.org/wiki/Elon_Musk 60200
> 42 https://en.wikipedia.org/wiki/Takeoff_(rapper) 58000
> 43 https://en.wikipedia.org/wiki/House_of_the_Dragon 57500
> 44 https://en.wikipedia.org/wiki/Casemiro 56800
> 45 

[Wikitech-l] Give WMF Feedback on Model Cards

2022-04-04 Thread Hal Triedman
Hi all,

The WMF Privacy and Machine Learning Platform teams are developing model
cards to increase visibility, transparency, and accountability of
algorithmic decision-making on WMF platforms. A model card is a document
about a machine learning
model that seeks to answer basic questions about the model in a clear and
concise manner. The broad goal of this project is for every ML model hosted
by WMF to have a model card for the community and public to understand,
discuss, and govern that model.

We would love for you to give some feedback on the talk page of our
prototype:
https://meta.wikimedia.org/wiki/User:HTriedman_(WMF)/Language_Agnostic_Link-Based_Article_Topic_Model_Card

We're specifically looking to answer the following questions:
- What aspects of the model card are useful, informative, or helpful?
- What aspects of the model card are confusing or unhelpful?
- Are there any features or sections that aren't on the model card that you
would like to see?

Thanks so much!
Hal
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/

[Wiki-research-l] Give WMF feedback on model cards

2022-03-23 Thread Hal Triedman
Hi all,

The WMF Privacy and Machine Learning Platform teams are developing model
cards to increase visibility, transparency, and accountability of
algorithmic decision-making on WMF platforms. The broad goal is for every
ML model hosted by WMF to have a model card for the community and public to
understand, discuss, and govern that model.

We would love for you to give some feedback on the talk page of our
prototype:
https://meta.wikimedia.org/wiki/User:HTriedman_(WMF)/Language_Agnostic_Link-Based_Article_Topic_Model_Card

Thanks so much!
Hal


[Wikimedia-l] Give WMF feedback on model cards

2022-03-17 Thread Hal Triedman
Hi all,

The WMF Privacy and Machine Learning Platform teams are developing model
cards to increase visibility, transparency, and accountability of
algorithmic decision-making on WMF platforms. The broad goal is for every
ML model hosted by WMF to have a model card for the community and public to
understand, discuss, and govern that model.

We would love for you to give some feedback on the talk page of our
prototype:
https://meta.wikimedia.org/wiki/User:HTriedman_(WMF)/Language_Agnostic_Link-Based_Article_Topic_Model_Card

Thanks so much!
Hal
___
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/VAJLYCQYTPELZS3DBFC7HHVPW6MTPRBC/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org