[Wikidata-bugs] [Maniphest] T341409: Use LanguageNameUtils::ALL for monolingual text and lexemes

2023-09-18 Thread ItamarWMDE
ItamarWMDE renamed this task from "[SW] Use LanguageNameUtils::ALL for 
monolingual text and lexemes" to "Use LanguageNameUtils::ALL for monolingual 
text and lexemes".

TASK DETAIL
  https://phabricator.wikimedia.org/T341409

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ItamarWMDE
Cc: Nemo_bis, Michael, ItamarWMDE, Bugreporter, thiemowmde, 
Lucas_Werkmeister_WMDE, jhsoby, Amire80, Lydia_Pintscher, Manuel, 
mrephabricator, Nikki, Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, 
Invadibot, maantietaja, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, 
Mahir256, QZanden, srishakatux, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


2023-09-12 Thread ItamarWMDE
ItamarWMDE moved this task from WikibaseLexeme to [DOT] Prioritized on the 
wmde-wikidata-tech board.
ItamarWMDE added a project: Wikidata Dev Team.

WORKBOARD
  https://phabricator.wikimedia.org/project/board/5864/



2023-09-12 Thread ItamarWMDE
ItamarWMDE added a comment.


  **Task Review Notes**:
  
  - This is probably not a full-blown epic: the particular requirements for 
this task can be achieved relatively simply. However, we should anticipate a few 
follow-up tasks that might come out of it.
  - Specifically, one follow-up could be to consolidate language name sources, 
but any problems arising from this will be quite obvious as they occur.
  
  **Prio Notes**:
  
  - Affects end users / production
  - Does not affect monitoring
  - Does not (really) affect development efforts
  - Affects onboarding efforts (After this change we will not have to onboard 
new hires to the language addition process and how to review it)
  - Affects additional stakeholders (langcom)



2023-09-08 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.


  In T341409#9019197, @Manuel wrote:
  
  > Thx for the ping, Thiemo!
  >
  > I am all for simplifying the current process, as it is inconsistent and 
hard to maintain.
  >
  > @Lydia_Pintscher could there be unintended consequences with going the 
route described in this task?
  
  Yeah I am still kinda attached to the current process but I also must face 
the fact that it's not working. So I'm fine with doing this.



2023-09-07 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.


  In T341409#9148879, @Lucas_Werkmeister_WMDE wrote:
  
  > I assume we always want to request the same language here, rather than make 
this depend on the user / request language; should it be the wiki content 
language (`en` on Wikidata), a hard-coded one (e.g. `en` or `qqq`), or 
something else?
  
  On second thought – it should probably be `en`, since the language names will 
also fall back to `en`, not the wiki content language. If we used the content 
language, then a wiki with a non-`en` content language might have extra 
language codes (e.g. `en-uk` or `az-arab`) with no language names available for 
some request languages, which doesn’t sound great.



2023-09-07 Thread Bugreporter
Bugreporter added a comment.


  Also, the proposed "ultimate" list of language codes may be the language-data 
library; see T190129: Consolidate language metadata into a 'language-data' 
library and use in MediaWiki for core integration, and T281067: merge CLDR 
extension to core for the proposal to fold CLDR into it.
  
  Currently there are languages in Wikidata/CLDR that are not in language-data, 
though.



2023-09-07 Thread Bugreporter
Bugreporter added a comment.


  See also T231755: Local language name should be translatable in translatewiki 
for localizing language names.



2023-09-07 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.


  > - This would make another 230+ languages available, reducing the number of 
languages we have to dump under `mis` (related: T289776)
  
  And if T168799: Integrate IANA language registry with language-data and 
MediaWiki (let MediaWiki "knows" all languages with ISO 639-1/2/3 codes) 
happens, that would take us the rest of the way to T289776: Enable all ISO 
639-3 codes on Wikidata, right?
  
  From a technical side, I don’t see major issues with this proposal. But we 
might want to consolidate language name sources; currently, we have some 
`wikibase-lexeme-language-name-*` messages in WikibaseLexeme (which Wikibase 
itself does not use), and also some language names in the cldr extension 
(`LocalNames/` directory). Maybe we can make Wikibase fall back to the language 
code and also track the missing language name, so we can have a Grafana board 
for the most frequently used language codes without names. But I think that 
doesn’t need to block this task.
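  
  The "fall back to the language code and track the missing name" idea can be 
sketched roughly as follows. This is a toy model in Python, not Wikibase code: 
the `LANGUAGE_NAMES` table, the `language_label` helper, and the in-memory 
counter (standing in for whatever metrics backend would feed a dashboard) are 
all hypothetical names chosen for illustration.

```python
from collections import Counter

# Hypothetical name table; a real wiki would get names from MediaWiki / cldr.
LANGUAGE_NAMES = {"en": "English", "de": "German"}

# In-memory stand-in for a metrics backend: counts lookups of unnamed codes.
missing_name_counts = Counter()


def language_label(code, names=LANGUAGE_NAMES, missing=missing_name_counts):
    """Return the language name, or the bare code if no name is known."""
    name = names.get(code)
    if name is None:
        missing[code] += 1  # record the gap so it can be charted later
        return code
    return name
```

  Calling `missing_name_counts.most_common()` would then list the codes that 
most often lack a name, which is the kind of data such a board would show.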
  
  There is a slight ambiguity in the task description that I didn’t realize 
before. If we take it literally, and only pass `LanguageNameUtils::ALL` as the 
second `getLanguageNames()` argument while leaving the first argument the same 
(`LanguageNameUtils::AUTONYMS`, the default), then we won’t actually see any 
difference:
  
> count(mws()->getLanguageNameUtils()->getLanguageNames(LanguageNameUtils::AUTONYMS))
= 517

> count(mws()->getLanguageNameUtils()->getLanguageNames(LanguageNameUtils::AUTONYMS, LanguageNameUtils::ALL))
= 517
  
  The additional cldr language codes are only added when asking for language 
names in a specific language, and the returned language codes vary slightly 
depending on which language you ask for:
  
> count(mws()->getLanguageNameUtils()->getLanguageNames('en', LanguageNameUtils::ALL))
= 981

> count(mws()->getLanguageNameUtils()->getLanguageNames('de', LanguageNameUtils::ALL))
= 982

> count(mws()->getLanguageNameUtils()->getLanguageNames('pt', LanguageNameUtils::ALL))
= 982

> count(mws()->getLanguageNameUtils()->getLanguageNames('bar', LanguageNameUtils::ALL))
= 982

> count(mws()->getLanguageNameUtils()->getLanguageNames('es', LanguageNameUtils::ALL))
= 981

> count(mws()->getLanguageNameUtils()->getLanguageNames('qqx', LanguageNameUtils::ALL))
= 981

> count(mws()->getLanguageNameUtils()->getLanguageNames('qqq', LanguageNameUtils::ALL))
= 981

> count(mws()->getLanguageNameUtils()->getLanguageNames('invalidlanguagecode', LanguageNameUtils::ALL))
= 981
  
  (`de` and `bar` additionally have `en-uk`, with `bar` presumably inheriting 
it from `de` via language fallback; `pt`’s extra language code is `az-arab`.) I 
assume we always want to request the same language here, rather than make this 
depend on the user / request language; should it be the wiki content language 
(`en` on Wikidata), a hard-coded one (e.g. `en` or `qqq`), or something else?
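  
  To make the pattern above concrete, here is a small toy model in Python. It 
is not MediaWiki code and does not use the real cldr data: `BASE_CODES`, 
`EXTRA_CODES`, `FALLBACKS`, and both helper functions are hypothetical, chosen 
only to mirror the observation that `de` (and `bar`, via fallback) gain 
`en-uk` and `pt` gains `az-arab`.

```python
# Toy data: NOT the real cldr tables, just enough to mirror the pattern above.
BASE_CODES = {"en", "de", "pt", "bar", "es"}   # codes that have English names
EXTRA_CODES = {                                 # codes named only in one language
    "de": {"en-uk"},
    "pt": {"az-arab"},
}
FALLBACKS = {"bar": ["de"]}                     # assumed fallback chain


def codes_for(request_language):
    """Codes that have a name when names are requested in request_language."""
    codes = set(BASE_CODES)
    for lang in [request_language, *FALLBACKS.get(request_language, [])]:
        codes |= EXTRA_CODES.get(lang, set())
    return codes


def supported_codes():
    """Pin the request language so the set does not vary per user."""
    return codes_for("en")
```

  In this model, pinning the request language gives every user the same set of 
codes, whereas calling `codes_for()` with the user's language would make the 
accepted codes differ between, say, a `de` user and an `es` user.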



2023-07-17 Thread Manuel
Manuel added a comment.


  Thx for the ping, Thiemo!
  
  I am all for simplifying the current process, as it is inconsistent and hard 
to maintain.
  
  @Lydia_Pintscher could there be unintended consequences with going the route 
described in this task?



2023-07-17 Thread thiemowmde
thiemowmde added subscribers: Manuel, Lydia_Pintscher, Amire80, jhsoby, 
Lucas_Werkmeister_WMDE, thiemowmde.
thiemowmde added a comment.


  I might get this wrong, but as I understand the proposal, it would make the 
currently established processes of how languages on wikidata.org are managed, 
requested, and confirmed (briefly described in T312845) obsolete. 
https://phabricator.wikimedia.org/project/profile/4981/ contains more details. 
As far as I remember (note this might be outdated as I'm not part of the 
Wikidata team any more) the basic idea is that there is an "official" working 
group that intentionally reviews and accepts new languages one by one, only 
when they are actually needed.
  
  I added people who are most probably interested in this, and suggest 
declining or approving this ticket in a timely manner to reduce confusion.



2023-07-10 Thread ItamarWMDE
ItamarWMDE added a project: wmde-wikidata-tech.



2023-07-08 Thread Nikki
Nikki created this task.
Nikki added projects: Wikidata, Wikidata Lexicographical data, Language codes.

TASK DESCRIPTION
  Currently for monolingual text and lexemes, Wikibase uses the defaults for 
LanguageNameUtils, which only returns "defined" languages (whatever that 
means). If it instead requested all known languages using 
`LanguageNameUtils::ALL`, it would include all the codes known to the CLDR 
extension, including the ones from CldrNamesEn.php.
  
  - This would make another 230+ languages available, reducing the number of 
languages we have to dump under `mis` (related: T289776)
  - There are existing requests for at least 18 of these: T313782, T332265, 
T332256, T214238, T332258, T320984, T316004, T332262, T332259, T314458, 
T317497 (`akk`, `hit`), T332255 (`bum`, `ken`, `sba`), T321957 (`dum`), 
T321979 (`mga`, `sga`)
  - Most if not all of the extra languages for monolingual text and lexemes 
would no longer be necessary (Wikibase does not add language names for its 
extra languages, so they all have to be added to the CLDR extension too).
  - Monolingual text and lexemes would use the same set of languages: T320889
  - If it includes any language codes that we decide we don't want, there is 
already a way to exclude codes for monolingual text (link), and T320887 
requests the same for lexemes.
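  
  Put as set arithmetic, the proposal amounts to something like the sketch 
below. This is an illustrative Python model, not Wikibase code: the function 
name, its parameters, and the sample code lists are hypothetical, and the real 
lists would come from MediaWiki, the CLDR extension, and the existing exclusion 
mechanism.

```python
def available_codes(defined, cldr_known, excluded=frozenset()):
    """Codes offered for monolingual text and lexemes under the proposal.

    defined    -- codes returned today (the "defined" default)
    cldr_known -- additional codes known to the CLDR extension
    excluded   -- codes deliberately left out via the exclusion mechanism
    """
    return (set(defined) | set(cldr_known)) - set(excluded)


# Hypothetical sample data for illustration only.
today = {"en", "de", "pt"}
from_cldr = {"akk", "hit", "dum"}
unwanted = {"dum"}

proposed = available_codes(today, from_cldr, unwanted)
```

  Because both monolingual text and lexemes would call the same function with 
the same inputs, they would end up with the same set of languages.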
