On Fri, May 07, 2021 at 06:02:51PM -0700, Chris Jerdonek wrote:

> To know what compression methods might be effective, I’m wondering if it
> could be useful to see separate histograms of, say, the start column number
> and width over the code base. Or for people that really want to dig in,
> maybe access to the set of all pairs could help. (E.g. maybe a histogram of
> pairs could also reveal something.)

I think this is over-analysing. Do we need to micro-optimize the
compression algorithm? Let's make the choice simple: live with the size
increase, or swap to LZ4 compression as Antoine suggested. Analysis
paralysis is a real risk here.

If there are implementations which cannot support either (MicroPython?)
they should be free to continue doing things the old way. In other
words, "fine grained error messages" should be a quality of
implementation feature rather than a language guarantee. I understand
that the plan is to make this feature optional in any case, to allow
third-party tools to catch up.

If people really want to do that histogram analysis so that they can
optimize the choice of compression algorithm, of course they are free to
do so. But the PEP authors should not feel that they are obliged to do
so, and we should avoid the temptation to bikeshed over compressors.

(For what it's worth, I like this proposed feature, I don't care about a
20-25% increase in pyc file size, but if this leads to adding LZ4
compression to the stdlib, I like it even more :-)

-- 
Steve
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/6H2XSRMARU4SX4WRMIO2M4MI4EQASPBC/
Code of Conduct: http://python.org/psf/codeofconduct/