[Wikitech-l]  Fresh 23.05.1 released!

2023-05-08 Thread Krinkle
TLDR: fresh-node now supports a one-off "command" invocation mode.

Learn more or install:
 https://gerrit.wikimedia.org/g/fresh
Changelog:
 https://gerrit.wikimedia.org/g/fresh/+/23.05.1/CHANGELOG.md

Each of the fresh-node scripts now supports a positional "command" argument, to 
run a single command without launching a shell first. For example: fresh-node 
-- npm install. Thanks *Gergő Tisza* and *Kosta Harlan* for their contributions!

fresh-node16 has been upgraded to include Firefox 102.10.0esr and Chromium 112. 
The same container has been in use in WMF CI for npm tests in most repos since 
12 April 2023. The welcome text saw a makeover in this release, featuring a new 
minimalistic look. I hope this will make the environment feel even snappier. By 
condensing this baseline, timely warnings about enabled mount points and 
environment exposure should stand out more. *Before* / *After*



Fresh is a fast way to launch isolated environments from your terminal. These 
can be used to work more responsibly with 'npm' developer tools such as ESLint, 
QUnit, Grunt, Selenium, and more. Example guide: 
https://www.mediawiki.org/wiki/Manual:JavaScript_unit_testing. To report issues 
or browse past and current tasks, check Phabricator at 
https://phabricator.wikimedia.org/tag/fresh/.

--
Timo Tijhof,
Principal Engineer,
Wikimedia Foundation.
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/

[Wikitech-l] Word embeddings / vector search

2023-05-08 Thread Lars Aronsson

A partial outcome of the research in natural language processing
in the last decade is the representation of language as numeric
vectors, called word embeddings. These are used in large language
models such as BERT, ELMo, and (Chat)GPT. A peculiar aspect of
these numeric vectors is that they cluster semantically, so that
words for similar concepts (dog, puppy, pet) group together even
though their spelling is very different. This can be used for
"semantic" search. If a search query (dog) is converted to a vector,
it can be compared against terms found in documents (e.g. wiki articles)
that have similar vectors, finding documents of similar content even
though the text doesn't match.

https://en.wikipedia.org/wiki/Word_embedding
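
To make the idea concrete, here is a minimal sketch of vector search using
cosine similarity. The three-dimensional vectors are made up for
illustration (real embeddings come from a trained model and have hundreds
of dimensions), and the names EMBEDDINGS, cosine and nearest are
hypothetical, not part of any existing search code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, invented for this example only. A real system would
# obtain them from an embedding model.
EMBEDDINGS = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "pet": [0.7, 0.6, 0.2],
    "carburetor": [0.1, 0.0, 0.95],
}


def nearest(query, k=2):
    """Return the k words whose vectors are most similar to the query's."""
    q = EMBEDDINGS[query]
    scored = [
        (cosine(q, vec), word)
        for word, vec in EMBEDDINGS.items()
        if word != query
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]


print(nearest("dog"))
```

With these toy vectors, "puppy" and "pet" rank above "carburetor" for the
query "dog", even though none of the words share spelling.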

Here are just two of very many videos that explain the concept:
https://www.youtube.com/watch?v=xzHhZh7F25I
https://www.youtube.com/watch?v=MUve9LiEAeI

Is there any ongoing work at WMF or around the MediaWiki software
to apply this new technique to search in Wikipedia?


--
  Lars Aronsson (l...@aronsson.se, user:LA2)
  Linköping, Sweden

[Wikitech-l] Gerrit downtime maintenance Wed, 10 May 2023 14:00–15:00 UTC

2023-05-08 Thread Tyler Cipriani
Hi All

We're moving the primary Gerrit host to new hardware on *Wed, 10 May 2023
at 14:00 UTC*[0]

There will be downtime during the move. I expect downtime to be less than
the full hour, but please plan for a full hour.

Apologies for the inconvenience :(

Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation

[0]: 

[Wikitech-l] Re: Python requests broken by urllib3 version 2.x

2023-05-08 Thread Andrew Otto
> For Java, we run an instance of Archiva: https://archiva.wikimedia.org/
> It's not a perfect approach but I think we can and should move in that
> direction with all our other ecosystems

GitLab package registries may help us here!



On Mon, May 8, 2023 at 8:59 AM Andrew Otto  wrote:

> > Tangent: is it worthwhile to establish a consensus for best practices
> > with package pinning and package management for Python projects in the
> > Wikimedia ecosystem?
> Yes! That would be awesome. I have spent a lot of time floundering in this
> area trying to make decisions; it'd be nice if we had a good guideline
> established.
>
> > I'm working on an "Essential Tools for Managing Python Development
> > Environments" tutorial
> Awesome! Did you consider conda envs? FWIW, we rely on conda envs
> in the Data Engineering world to work around the lack of ability to
> securely run docker images in production. We didn't try pyenv, mainly
> because conda gets us more than just python :)
>
> > Poetry is a modern lockfile-based packaging and dependency management
> > tool worth looking into
> Poetry was nice, but I found that it wasn't as comprehensive (yet?) as ye
> old setuptools based stuff.  I can't quite recall what, but I think it was
> around support for datafiles and scripts? But, this was 1.5 years ago so
> maybe things are better now.
>
> Thank you!
>
> On Fri, May 5, 2023 at 11:32 AM Slavina Stefanova <
> sstefan...@wikimedia.org> wrote:
>
>>> Tangent: is it worthwhile to establish a consensus for best practices
>>> with package pinning and package management for Python projects in the
>>> Wikimedia ecosystem? When I last worked on a python project (
>>> https://wikitech.wikimedia.org/wiki/Add_Link) I found it confusing that
>>> we have so many different tools and approaches for doing these things, and
>>> seems like we'd benefit from having a standard, supported way. (Or maybe
>>> that already exists and I haven't found it?)
>>
>>
>> I'm working on an "Essential Tools for Managing Python Development
>> Environments"
>> tutorial that will be published to the wikis when ready. Maybe that could
>> be expanded upon? In my experience though, it can be hard to get people to
>> agree on following a standard, especially when there are so many different
>> options and many folks already have their favorite tools and workflows. But
>> it would be nice to have a set of recommendations to reduce the cognitive
>> load.
>>
>> --
>> Slavina Stefanova (she/her)
>> Software Engineer - Technical Engagement
>>
>> Wikimedia Foundation
>>
>>
>> On Fri, May 5, 2023 at 5:18 PM Kosta Harlan 
>> wrote:
>>
>>> Tangent: is it worthwhile to establish a consensus for best practices
>>> with package pinning and package management for Python projects in the
>>> Wikimedia ecosystem? When I last worked on a python project (
>>> https://wikitech.wikimedia.org/wiki/Add_Link) I found it confusing that
>>> we have so many different tools and approaches for doing these things, and
>>> seems like we'd benefit from having a standard, supported way. (Or maybe
>>> that already exists and I haven't found it?)
>>>
>>> Kosta
>>>
>>> On 5. May 2023, at 13:51, Slavina Stefanova 
>>> wrote:
>>>
>>> Poetry is a modern lockfile-based packaging and dependency management
>>> tool worth looking into. It also supports exporting dependencies into a
>>> requirements.txt file, should you need that (nice if you want to
>>> containerize an app without bloating the image with Poetry, for instance).
>>>
>>> https://python-poetry.org/  
>>>
>>> --
>>> Slavina Stefanova (she/her)
>>> Software Engineer - Technical Engagement
>>>
>>> Wikimedia Foundation
>>>
>>>
>>> On Fri, May 5, 2023 at 1:38 PM Sebastian Berlin <
>>> sebastian.ber...@wikimedia.se> wrote:
>>>
>>>> A word of warning: using `pip freeze` to populate requirements.txt can
>>>> result in a hard to read (very long) file and other issues:
>>>> https://medium.com/@tomagee/pip-freeze-requirements-txt-considered-harmful-f0bce66cf895
>>>> .
>>>>
>>>> *Sebastian Berlin*
>>>> Utvecklare/*Developer*
>>>> Wikimedia Sverige (WMSE)
>>>>
>>>> E-post/*E-Mail*: sebastian.ber...@wikimedia.se
>>>> Telefon/*Phone*: (+46) 0707 - 92 03 84
>>>>
>>>>
>>>> On Fri, 5 May 2023 at 13:17, Amir Sarabadani 
>>>> wrote:
>>>>
>>>>> You can also create an empty virtual env, install all requirements and
>>>>> then do
>>>>> pip freeze > requirements.txt
>>>>>
>>>>> That should take care of pinning
>>>>>
>>>>> On Fri, 5 May 2023 at 13:11, Lucas Werkmeister <
>>>>> lucas.werkmeis...@wikimedia.de> wrote:
>>>>>
>>>>>> For the general case of Python projects, I’d argue that a better

[Wikitech-l] Re: Python requests broken by urllib3 version 2.x

2023-05-08 Thread Andrew Otto
> Tangent: is it worthwhile to establish a consensus for best practices
> with package pinning and package management for Python projects in the
> Wikimedia ecosystem?
Yes! That would be awesome. I have spent a lot of time floundering in this
area trying to make decisions; it'd be nice if we had a good guideline
established.

> I'm working on an "Essential Tools for Managing Python Development
> Environments" tutorial
Awesome! Did you consider conda envs? FWIW, we rely on conda envs
in the Data Engineering world to work around the lack of ability to
securely run docker images in production. We didn't try pyenv, mainly
because conda gets us more than just python :)

> Poetry is a modern lockfile-based packaging and dependency management
> tool worth looking into
Poetry was nice, but I found that it wasn't as comprehensive (yet?) as ye
old setuptools based stuff.  I can't quite recall what, but I think it was
around support for datafiles and scripts? But, this was 1.5 years ago so
maybe things are better now.

Thank you!
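
As a rough illustration of the pinning question in this thread, here is a
minimal sketch that lists which lines of a requirements file are not
pinned to an exact version. It is an assumption-laden helper, not an
agreed Wikimedia practice: the function name unpinned, the regular
expression, and the example package versions are all hypothetical, and it
handles only simple "name==version" lines (no URLs or editable installs).

```python
import re

# Matches "name==version", optionally with extras such as "pkg[extra]==1.0".
# Simplified on purpose; the full requirement syntax is richer than this.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")


def unpinned(lines):
    """Return the requirement lines that are not pinned with '=='."""
    result = []
    for raw in lines:
        # Drop trailing comments and environment markers, then whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if not PINNED.match(line):
            result.append(line)
    return result


# Hypothetical file contents, for illustration only.
example = [
    "requests==2.29.0",
    "urllib3<2",
    "# a comment",
    "idna",
]
print(unpinned(example))
```

Here "urllib3<2" and "idna" are reported, since neither names an exact
version, while the "==" line passes.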

On Fri, May 5, 2023 at 11:32 AM Slavina Stefanova 
wrote:

>> Tangent: is it worthwhile to establish a consensus for best practices with
>> package pinning and package management for Python projects in the Wikimedia
>> ecosystem? When I last worked on a python project (
>> https://wikitech.wikimedia.org/wiki/Add_Link) I found it confusing that
>> we have so many different tools and approaches for doing these things, and
>> seems like we'd benefit from having a standard, supported way. (Or maybe
>> that already exists and I haven't found it?)
>
>
> I'm working on an "Essential Tools for Managing Python Development
> Environments"
> tutorial that will be published to the wikis when ready. Maybe that could
> be expanded upon? In my experience though, it can be hard to get people to
> agree on following a standard, especially when there are so many different
> options and many folks already have their favorite tools and workflows. But
> it would be nice to have a set of recommendations to reduce the cognitive
> load.
>
> --
> Slavina Stefanova (she/her)
> Software Engineer - Technical Engagement
>
> Wikimedia Foundation
>
>
> On Fri, May 5, 2023 at 5:18 PM Kosta Harlan  wrote:
>
>> Tangent: is it worthwhile to establish a consensus for best practices
>> with package pinning and package management for Python projects in the
>> Wikimedia ecosystem? When I last worked on a python project (
>> https://wikitech.wikimedia.org/wiki/Add_Link) I found it confusing that
>> we have so many different tools and approaches for doing these things, and
>> seems like we'd benefit from having a standard, supported way. (Or maybe
>> that already exists and I haven't found it?)
>>
>> Kosta
>>
>> On 5. May 2023, at 13:51, Slavina Stefanova 
>> wrote:
>>
>> Poetry is a modern lockfile-based packaging and dependency management
>> tool worth looking into. It also supports exporting dependencies into a
>> requirements.txt file, should you need that (nice if you want to
>> containerize an app without bloating the image with Poetry, for instance).
>>
>> https://python-poetry.org/  
>>
>> --
>> Slavina Stefanova (she/her)
>> Software Engineer - Technical Engagement
>>
>> Wikimedia Foundation
>>
>>
>> On Fri, May 5, 2023 at 1:38 PM Sebastian Berlin <
>> sebastian.ber...@wikimedia.se> wrote:
>>
>>> A word of warning: using `pip freeze` to populate requirements.txt can
>>> result in a hard to read (very long) file and other issues:
>>> https://medium.com/@tomagee/pip-freeze-requirements-txt-considered-harmful-f0bce66cf895
>>> .
>>>
>>> *Sebastian Berlin*
>>> Utvecklare/*Developer*
>>> Wikimedia Sverige (WMSE)
>>>
>>> E-post/*E-Mail*: sebastian.ber...@wikimedia.se
>>> Telefon/*Phone*: (+46) 0707 - 92 03 84
>>>
>>>
>>> On Fri, 5 May 2023 at 13:17, Amir Sarabadani 
>>> wrote:
>>>
>>>> You can also create an empty virtual env, install all requirements and
>>>> then do
>>>> pip freeze > requirements.txt
>>>>
>>>> That should take care of pinning
>>>>
>>>> On Fri, 5 May 2023 at 13:11, Lucas Werkmeister <
>>>> lucas.werkmeis...@wikimedia.de> wrote:
>>>>
>>>>> For the general case of Python projects, I’d argue that a better
>>>>> solution is to adopt the lockfile pattern (package-lock.json,
>>>>> composer.lock, Cargo.lock, etc.) and pin *all* dependencies, and only
>>>>> update them when the new versions have been tested and are known to work.
>>>>> pip-tools can help with that,
>>>>> for example (requirements.in specifies “loose” dependencies;
>>>>> pip-compile creates a pinned requirements.txt; pip-sync installs it; 
>>>>>