Re: [sage-devel] Re: Urgent and important: Please vote on disputed PR #36964 (next step of the modularization project)

Matthias Koeppe Wed, 01 May 2024 10:31:46 -0700

Hi Sage developers, 

Since I posted my request to urgently vote on the modularization PRs, the 
big revert (https://github.com/sagemath/sage/pull/37796) was merged into 
Sage 10.4.beta4.
The modularization PRs have now been re-created (thanks, Julian, for your 
help with this).

*I'm now asking you to vote on the new PRs; it's important – participation
matters!*
- https://github.com/sagemath/sage/pull/37900 (*Restructure sage.*.all for
modularization, replace relative by absolute imports*). (As I explained, the
PR is "mostly harmless": There are no user-visible changes; it's just a
bunch of imports that are moved around. It includes no policy change of any
kind; it only executes a design that was previously reviewed and carefully
documented in separate PRs. Nothing permanent or irreversible is done here.
The new files provide the top-level namespaces needed for doctesting
modularized installations of Sage.)
- https://github.com/sagemath/sage/pull/37901 (*Add # sage_setup:
distribution directives to all files, remove remaining # coding: utf-8*).
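
As an illustration only (not code from either PR): the sketch below shows
how a tool could read such a directive from the top of a source file. It
assumes the directive has the shape "# sage_setup: distribution = NAME"; the
helper name read_distribution_directive and the max_lines cutoff are
hypothetical and are not Sage's own implementation. The sample text also
shows an absolute import of the kind the first PR switches to.

```python
import re

# Assumed directive shape: "# sage_setup: distribution = <name>" on its own line.
_DIRECTIVE = re.compile(r"^#\s*sage_setup:\s*distribution\s*=\s*([-\w]*)\s*$")


def read_distribution_directive(source_text, max_lines=10):
    """Return the distribution named by the first directive found, or ''.

    Hypothetical helper: only the first `max_lines` lines are inspected.
    """
    for line in source_text.splitlines()[:max_lines]:
        match = _DIRECTIVE.match(line.strip())
        if match:
            return match.group(1)
    return ""


# Sample file content (a string only; nothing from Sage is imported here).
example = (
    "# sage_setup: distribution = sagemath-categories\n"
    "from sage.categories.category import Category  # absolute import\n"
)
print(read_distribution_directive(example))
```

A file without a directive yields the empty string in this sketch, which a
build tool could treat as "belongs to the default distribution"; that
convention is likewise an assumption here.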

I'm also responding to a few messages that were posted in this sage-devel
thread in the meantime and that I did not have a chance to respond to earlier.

On Thursday, April 25, 2024 at 6:28:48 AM UTC-7 *Dima Pasechnik* wrote:

> On Wednesday, April 24, 2024 at 10:14:09 PM UTC-5 Matthias Koeppe wrote:
> Yes, native Windows would clearly be a very important target.

Essential components of sagelib such as GAP and Singular don't run on native
Windows (on Cygwin, yes [...]) and I don't think anyone is keen on doing
the hard work to port them. This puts native Windows support into the area
of wishful thinking.

Yes, porting software to new platforms is hard (thanks all for the detailed
and entertaining discussion regarding GAP).
But Dima's message ignores the very reason why we are talking about
porting to new platforms in this thread:
*The modularization project enables us to port those parts of Sage to new
platforms for which there is interest to port*, without being held back by
those parts and libraries for which porting is too hard or in which there
is no interest.

On Thursday, April 25, 2024 at 5:00:33 PM UTC-7 *TB* wrote:

On 25/04/2024 15:28, Nathan Dunfield wrote:

In another direction: I have started a port of Sage to pyodide, the
distribution of Python for WebAssembly (WASM), which makes Python runnable
directly in the browser. I can already run and test the modularized
distributions **sagemath-objects**, **sagemath-categories** there.

It would be amazing if a decent portion of Sage could be run in the browser
this way, e.g. to have the occasional HW assignment that needs Sage without
the overhead of using something like CoCalc.

Although SageMathCell <https://sagecell.sagemath.org/> does not run
locally, it does run in the browser. There are examples of Sage exercises
in this book <http://abstract.ups.edu/aata/aata.html> and more on the about
page <https://sagecell.sagemath.org/static/about.html> of SageMathCell.
Having a completely offline version of parts of Sage that can run in the
browser with WASM will be wonderful indeed.

Yes, *pyodide will enable running portions of Sage completely offline, i.e.
in serverless mode.* There is currently a lot of momentum in the scientific
computing community for developing such deployments, see for example the
post https://blog.pyodide.org/posts/marimo/ on the port of the (very
impressive!) *reactive Python notebook* *marimo* to pyodide.

On Thursday, April 25, 2024 at 5:45:33 AM UTC-7 *Nathan Dunfield* wrote:

Another example is large-scale pure math computation on clusters. Because
of Sage's size and the nature of distributed file systems, the time to
start up Sage can be 30 seconds or more, which complicates things if you
want to do 100,000 calculations that are only 10 seconds each. I was
recently at a workshop on computational topology, and several researchers
there regarded using Sage in this context as a non-starter; in one case
they were completely changing their approach to avoid using Sage.

Indeed, starting up Sage (when installed from a shared volume) not only
takes long, but it also incurs a huge load on the cluster's network from
network file system operations for loading all the Python modules. This can
degrade other jobs' performance. (My student's jobs using Sage on our HPC
cluster were occasionally flagged by the admins for this.)
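
To put the figures quoted above in perspective, here is a back-of-the-envelope
sketch. It assumes, purely for illustration, that every one of the 100,000
calculations starts its own Sage process with a 30-second startup and then
computes for 10 seconds; the function name startup_overhead is hypothetical.

```python
def startup_overhead(n_jobs, startup_s, compute_s):
    """Return (total startup hours, total compute hours, startup share).

    Assumes one fresh interpreter startup per job (an illustrative
    assumption, not a measurement).
    """
    total_startup_h = n_jobs * startup_s / 3600
    total_compute_h = n_jobs * compute_s / 3600
    startup_share = startup_s / (startup_s + compute_s)
    return total_startup_h, total_compute_h, startup_share


startup_h, compute_h, share = startup_overhead(100_000, 30, 10)
print(f"startup: {startup_h:.0f} h, compute: {compute_h:.0f} h, "
      f"startup share: {share:.0%}")
```

Under these assumptions, roughly 833 of about 1,111 CPU hours (75%) would go
to startup alone, before counting the network file system load.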

On Thursday, April 25, 2024 at 12:17:31 AM UTC-7 *Martin R* wrote:

On Thursday 25 April 2024 at 05:13:37 UTC+2 Matthias Koeppe wrote:

On Wednesday, April 24, 2024 at 1:07:44 AM UTC-7 Martin R wrote:

You mentioned several times that discoverability is an important aspect.
Do you have any evidence to support that?

I mentioned "discoverability" in the context of how I have *named* the
distributions.

Sorry that my question was not clear enough. Do you have evidence that
this naming enhances discoverability, and that this enhanced
discoverability would be worthwhile, since it comes with a cost (as
outlined above)?

You are asking about the enhanced discoverability compared to what?
I'm not aware of an alternative proposal for how to name the distributions.

Wouldn't people in the python world who need a serious amount of math know
of sage anyway,

What is a "serious" amount of math?

You know it when you see it.

What I mean is, roughly, that it certainly does not make sense to use sage
as a package if you need a few graph algorithms (like shortest paths or
some such), because then you'd be better served with using a specialised
library

There's a false dichotomy regarding size happening here.

There's a huge "dynamic range" between the size of "a few algorithms" and
the size of the whole Sage library (in particular, when considering the
dependencies). And everything that's interesting is happening between these
two extremes. *I have designed the modularized distributions to provide a
meaningful and practical compromise regarding the granularity: A relatively
small number of distributions, each with few dependencies, each not too
big, and each with a clear and easy to explain scope and purpose.*

or perhaps copy the code and adapt it.

Well, *copying parts of the source code is certainly some kind of
reusability*, and it is allowed by our license. But it's the lowest grade
of reusability and is well-known to be problematic for maintainability etc.

In the modularization project, I am aiming for a higher grade of
reusability; one that places Sage within best practices for reusable
software in the Python "ecosystem".

You might want to use sage as a package, if you want to do serious
enumeration of arbitrary combinatorial objects. But then you will need
gap, symmetrica, nauty and maxima or fricas anyway.

Yes, there exist use cases that use more libraries/dependencies than other
use cases. That's fine (you just install some more of the modularized
distributions), but *the existence of examples with many dependencies
should not hold back the benefits of the modularization for the use cases
that need few dependencies.* (Even the example that you gave only depends
on a rather small fraction of the whole Sage library and its dependencies.)

and then, if they cannot rely on all of sage because that is too large,
use, for example, `citation.get_systems` to see whether they can do without
some dependencies?

What do you mean by, "whether they can do without some dependencies"?

That's exactly the point of the modularization:
- To enable people to use parts of Sage without some [actually, most!] of
the dependencies.

The only point I am critical of is the splitting of the math library into
arbitrary pseudo-mathematical parts (i.e., sage-combinat ...
sage-symbolics, as listed above), which has nothing to do with dependencies.

Martin, first of all, it's not "arbitrary" and it's certainly not
"pseudo-"anything.

The splitting is the result of my careful and detailed work over the years,
which included going through the entirety of the Sage library line by line.

You asked before about the dependencies; and I explained them in detail
in https://groups.google.com/g/sage-devel/c/mqgtkLr2gXY/m/xzBiXoHbAAAJ
and https://groups.google.com/g/sage-devel/c/mqgtkLr2gXY/m/bVzb0CYaBQAJ
(which you responded to by thanking me for the list).
I'm not sure what your declaration that the mathematical parts have
"nothing to do with dependencies" could possibly be doing in this
discussion after this.

[...] I admit, however, that I cannot really think of any serious use of
any of the functionality of sage outside of math, where people would know
that what they need is, say combinatorics or graph theory.

What do you mean by "outside of math"?

I agree that my terminology is not good. I tried to make a distinction
between research involving math and the - to me unknown - rest. I find it
hard to imagine that any mathematician would bother installing anything
else but all of sage.

Ever since the symbolic calculus facilities were added to SAGE in 2005(?),
the project has not been one that is purely focused on mathematical
research.

Just look at the mission statement of Sage, "Creating a Viable Open Source
Alternative to Magma, Maple, Mathematica, and MATLAB". Let's ignore for a
moment whether and how it should be updated. It certainly does not say,
"Creating a Viable Open Source Alternative to mathematicians' use of Magma,
Maple, Mathematica, and MATLAB".

But I am glad that you brought up this viewpoint because there's indeed
something substantial to discuss. For example, one of the weakest points of
Sage, the maintenance status of our symbolic calculus system, is arguably
like this because of the current *self-isolation of the Sage project* in a
community of developers focused on research mathematics unrelated to
symbolic calculus. Similar effects of self-isolation can be noted in many
other parts of our codebase.

So, maybe I should have written: "I cannot think of .... sage in an
environment, where dependencies are a show-stopper, ...".

That's OK, as nothing in the modularization project will force anyone to
change how they use Sage.

And with the confidence of the developer who has done the bulk of the
modularization work in the Sage library, I can also say that the cost for
Sage developers to work with the modularized Sage library is very low and
is far outweighed by the opportunities that the modularization provides for
our community.

