Quick update:
* I talked to Jacob from the infra team and he set up rate limiting at
the HAProxy level for the wiki VM. This seems to have reduced the
number of 503s you get when using the wiki.
* I had a call with one of the moin2 developers today and we discussed
ways forward. moin2 is a complete rewrite from scratch, so there
may be some rough edges to resolve before we can actually use it.
I'll set up a VM and run a test migration on it to check how
far we can get with moin2 in its current state.
* I haven't had time yet to write up our discussion on a wiki page.
Perhaps next weekend.
On 06.03.2025 11:28, Marc-Andre Lemburg wrote:
We can hash out a plan to do a new drive for editors in the coming weeks.
I'll try to put together a wiki page outlining what we've discussed so
far.
I'm also having a call with a lead moin2 developer next week to see
how realistic migrating to moin2 is at this point.
Since I was seeing a few 503s when using the wiki recently, I asked
our infra team for help. They will add another vCPU to the VM to help
address load spikes. Rate limiting should further help against LLM
scrapers causing too much load. Moin surge protection is already in
place (https://moinmo.in/HelpOnConfiguration/SurgeProtection).
On 05.03.2025 20:55, Elena Williams via pydotorg-www wrote:
These later numbers are more consistent with what I have found and
can actually be seen.
I'm happy to help clean up, having already done substantial work on
what this could look like and on strategies (particularly looking at
the discussion from the recent docs meeting), though I'm not sure how
to action this.
---
Elena Williams
On Thu, 6 Mar 2025 at 05:38, Marc-Andre Lemburg <m...@egenix.com> wrote:
Correction for the numbers: We have 3400+ pages and 47k users.
I had looked at a backup which doesn't remove things which were
deleted on the main server - because unfortunately, moin's logic
for deleting pages is to actually delete them on disk, without
any way to get them back.
Instead of deleting a page, it's normally better to either add a
redirect or to put a notice on the page that the content was
cleared. That way, the history remains available. It may actually
be a good idea to disable the delete action (if possible, I'd
have to check).
On 05.03.2025 12:47, Marc-Andre Lemburg wrote:
FYI: I've started looking into the moin2 migration...
https://github.com/moinwiki/moin/discussions/1717#discussioncomment-12399187
In order to get there, we will need to do a test installation to
hash out any problems we may run into and evaluate the state of
moin2.
They just released 2.0.0b2.
I'll see whether I can find some time later this week to get
something going.
*I also checked our current stats:*
We have 32k pages in the wiki and 221k users.
Those numbers are what we have in the backend. Moin itself lists
the number of pages as 3436.
Looking at the page names, we'll be able to clean up a lot of
spam pages which accumulated before we added the editor
signup requirement. Many of those are empty pages, so we should
be able to write a tool to clean those up.
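
A minimal sketch of what such a tool could start from, in Python. It
only reports candidates and deletes nothing. The on-disk layout is an
assumption (each page is a directory holding a "current" file with the
current revision number and a "revisions/" directory with one file per
revision) and would need checking against the actual wiki data
directory. The function name find_empty_pages is hypothetical.

```python
from pathlib import Path


def find_empty_pages(pages_dir):
    """Return names of page directories whose current revision is blank.

    Assumed layout (to be verified against the real data directory):
        <pages_dir>/<PageName>/current          -> text file, e.g. "00000001"
        <pages_dir>/<PageName>/revisions/<rev>  -> page text of that revision
    """
    empty = []
    for page in sorted(Path(pages_dir).iterdir()):
        current = page / "current"
        if not current.is_file():
            continue  # not a page directory in the assumed layout
        revno = current.read_text().strip()
        revision = page / "revisions" / revno
        if not revno or not revision.is_file():
            continue  # current revision is missing: skip, do not guess
        text = revision.read_text(encoding="utf-8", errors="replace")
        if not text.strip():
            empty.append(page.name)  # whitespace-only content counts as empty
    return empty
```

Running this against a copy of the backup first, and reviewing the
resulting list by hand, would be safer than acting on the live data.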
It looks like Moin filters out those empty pages itself, since
the title index does not list them:
https://wiki.python.org/moin/TitleIndex
Scanning through those 3.4k page titles, most of them look
legitimate. And there's a lot of history in there :-)
Similarly, we should be able to go through the user accounts and
clear out all accounts which have not done any edits, in order
to bring the numbers down.
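
The account cleanup could be sketched along these lines, again in
Python and again only reporting candidates. It assumes that edits are
recorded in a tab-separated edit log whose seventh field is the
editor's user id (empty for anonymous edits). That field position is
an assumption to verify against the real log files, and the function
names are hypothetical.

```python
def editor_ids(edit_log_lines):
    """Collect the user ids that appear in an edit log.

    Assumes tab-separated lines with the user id in the seventh field
    (index 6); lines that are too short or have no user id are skipped.
    """
    ids = set()
    for line in edit_log_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 6 and fields[6]:
            ids.add(fields[6])
    return ids


def accounts_without_edits(account_ids, edit_log_lines):
    """Return the account ids that never show up as an editor, sorted."""
    return sorted(set(account_ids) - editor_ids(edit_log_lines))
```

The list of account ids would come from wherever the wiki stores its
user profiles. Accounts with subscriptions or other activity that is
not an edit would need separate handling before anything is removed.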
Thanks,
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Experts (#1, Mar 05 2025)
>>> Python Projects, Coaching and Support ... https://www.egenix.com/
>>> Python Product Development ... https://consulting.egenix.com/
________________________________________________________________________
::: We implement business ideas - efficiently in both time and costs :::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
https://www.egenix.com/company/contact/
https://www.malemburg.com/
_______________________________________________
pydotorg-www mailing list
pydotorg-www@python.org
https://mail.python.org/mailman/listinfo/pydotorg-www
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Experts (#1, Mar 12 2025)
Python Projects, Coaching and Support ... https://www.egenix.com/
Python Product Development ... https://consulting.egenix.com/
________________________________________________________________________
::: We implement business ideas - efficiently in both time and costs :::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
https://www.egenix.com/company/contact/
https://www.malemburg.com/