[Mailman-Developers] Improving the speed of mailman import21

2019-10-06 Thread Stephen J. Turnbull
Abhilash Raj writes:

 > 90% of the time is spent trying to encrypt user passwords, once for
 > each imported member. Well, duh, encryption is an expensive
 > operation and when you do that once per imported member, it is
 > definitely going to be slow.

Why are we storing unencrypted passwords at all?  Passwords are pretty
low-security in any case, but this is asking for trouble.

 > Another interesting fact is that user passwords are kind of
 > useless in Mailman 3. In Mailman 2 you had to set up a password,
 > or one was auto-generated for you per list, and you needed that to
 > log in to the web UI. However, in Mailman 3, the passwords (in
 > Core's database) aren't used for logging in, since the Web Frontend
 > stores the authentication tokens (social auth or passwords). In
 > fact, users who sign up for the first time on Mailman 3 probably
 > never have a password set in Mailman Core's database.

I'll trust you on that.  Although it raises the question: if nobody
has a password, why does it take so much time to encrypt no passwords?

 > So, I commented out the code that actually imports the
 > password (src/mailman/utilities/importer.py#L663-664)

I'm happy with this.  This is a major breaking change *if* anyone is
using Core passwords, which they probably aren't, but it deserves
flashing lights and sirens in the release announcements.

Steve

-- 
Associate Professor  Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information
Email: turnb...@sk.tsukuba.ac.jp   University of Tsukuba
Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN
___
Mailman-Developers mailing list -- mailman-developers@python.org
To unsubscribe send an email to mailman-developers-le...@python.org
https://mail.python.org/mailman3/lists/mailman-developers.python.org/
Mailman FAQ: https://wiki.list.org/x/AgA3

Security Policy: https://wiki.list.org/x/QIA9


[Mailman-Developers] Improving the speed of mailman import21

2019-10-06 Thread Abhilash Raj
Hi All,

This morning, I set out to improve the performance of the "mailman import21"
command. If you have used it in the past, you will know that it is slow. Until
now, I never knew why. Here were my guesses:

- Too many database calls, and sqlite3 being its usual self

  Although I forgot that the import is slow irrespective of the database
  backend. Maybe we are doing way too many queries?

- Too many string comparisons

  We all know string comparisons are slow, but how slow could they be?

- Something wasteful being done over and over again.


Here is a rough estimate of the time it takes to import mailman2.1's config.pck 
for two lists:

 151 members: 58 seconds
1429 members: 9 minutes

This is quite slow; 9 minutes is a lot. So, I set out to do the usual Python
profiling using the standard library `cProfile` module, wrapping it only
around `mailman.utilities.importer._import_roster`. That function is the
slowest part: if you have run the command, you know that most of the time goes
into importing the list of members.
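
For anyone who wants to try the same approach, here is a minimal sketch of
wrapping `cProfile` around a single function. It does not use Mailman's code:
`encrypt` and `import_roster` below are hypothetical stand-ins, and the slow
hash is simulated with PBKDF2 from the standard library `hashlib` rather than
with passlib.

```python
import cProfile
import hashlib
import io
import os
import pstats


def encrypt(password):
    """Hypothetical stand-in for a deliberately slow password hash."""
    salt = os.urandom(16)
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 20_000)


def import_roster(members):
    """Hypothetical stand-in for the importer: one hash per member."""
    return [encrypt("password-for-" + member) for member in members]


def profile_import(members):
    """Profile only import_roster() and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    import_roster(members)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call chain is on top.
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_import(["user%d@example.com" % i for i in range(50)]))
```

Sorting by cumulative time is what makes a per-member cost like hashing stand
out: the wrapper function and the function it calls once per member end up
next to each other at the top of the report.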

Without even looking at the entire output, the problem was apparent, and it
was none of the ones that I had guessed before:


   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.009    0.009   50.692   50.692 /home/maxking/Documents/mm3/core/src/mailman/utilities/importer.py:600(_import_roster)
      151    0.001    0.000   45.691    0.303 /home/maxking/Documents/mm3/core/src/mailman/utilities/passwords.py:35(encrypt)


90% of the time is spent trying to encrypt user passwords, once for each
imported member. Well, duh, encryption is an expensive operation and when you
do that once per imported member, it is definitely going to be slow.
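
The 90% figure can be checked from the two profile rows. This is only
arithmetic on the reported numbers, reading the rows as 50.692 seconds
cumulative for `_import_roster` and 45.691 seconds cumulative over 151 calls
to `encrypt`; the variable names are made up for illustration.

```python
# Values read from the profile rows (seconds, call count).
total_cumtime = 50.692      # _import_roster, cumulative
encrypt_cumtime = 45.691    # encrypt, cumulative
encrypt_calls = 151         # one call per imported member

share = encrypt_cumtime / total_cumtime
per_call = encrypt_cumtime / encrypt_calls
everything_else = total_cumtime - encrypt_cumtime

print(f"share of time in encrypt: {share:.1%}")
print(f"time per encrypt call:    {per_call:.3f} s")
print(f"everything else:          {everything_else:.3f} s")
```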

Mailman 3 uses passlib[1] for password hashing, so I set out to find a hashing
algorithm that can do this much faster, ideally one with a C library wrapper
that we can use to speed things up. I settled on the argon2 hashing scheme,
with argon2_cffi as the supporting library. Then I changed the config and
tried the imports again:


 151 members: 15.884 seconds
1429 members: 2 minutes 29 seconds

   
That was a significant improvement over the previous numbers. 

Another interesting fact is that user passwords are kind of useless in
Mailman 3. In Mailman 2 you had to set up a password, or one was
auto-generated for you per list, and you needed that to log in to the web UI.
However, in Mailman 3, the passwords (in Core's database) aren't used for
logging in, since the Web Frontend stores the authentication tokens (social
auth or passwords). In fact, users who sign up for the first time on Mailman 3
probably never have a password set in Mailman Core's database.

So, I commented out the code that actually imports the password
(src/mailman/utilities/importer.py#L663-664) and the import speed improved
even more, obviously:


 151 members: 4 seconds
1429 members: 57 seconds
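
To compare the three runs on equal footing, the timings can be converted to
seconds per member. This is a small sketch using only the numbers reported in
this mail; the first pair is a rough estimate, and the labels and the
`per_member` helper are made up for illustration.

```python
# Wall-clock timings from this mail, in seconds, keyed by member count.
timings = {
    "original config": {151: 58.0, 1429: 9 * 60.0},
    "argon2": {151: 15.884, 1429: 2 * 60.0 + 29.0},
    "no password import": {151: 4.0, 1429: 57.0},
}


def per_member(seconds, members):
    """Average import time per member, in seconds."""
    return seconds / members


for label, runs in timings.items():
    for members, seconds in runs.items():
        cost = per_member(seconds, members)
        print(f"{label:>20} {members:>5} members: {cost:.3f} s/member")
```

With the original config and with argon2, the per-member cost is nearly the
same for both lists, which fits the picture of a fixed hashing cost paid once
per member.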

I am hoping that I can commit the change with the password import commented
out, unless I am reminded of a use for the passwords in Core's database. In
that case, it might be a bit more work to figure out another way to improve
the speed.

Thanks for reading up!


[1]: https://passlib.readthedocs.io/en/stable/narr/quickstart.html#making-a-decision


-- 
  thanks,
  Abhilash Raj (maxking)