Dear numpy community

I would like to propose a mew feature in np.linalg to compute weighted gram matrix (sandwich product), see https://github.com/numpy/numpy/issues/29559.


     Proposed new feature or change:

The solvers for many important statistical models like Generalized Linear Models often need to compute |X.T @ W @ X| where |X| is a 2-dimensional array of features (features in columns, observations in rows) and |W| is very often a diagonal array of (current) weights. This computation is usually the main computational bottleneck.

It would be great if numpy could provide an efficient implementation of it, e.g. |np.linalg.sandwicht_product(X, weight=w)|.

Why numpy? It seems even more unrealistic to me, to get it into BLAS implementations. Numpy has support for SIMD via Highways.


       Computational alternatives

Drawbacks of |(X.T * diag(W)) @ X|:

 * Additional memory allocation to compute |X.T @ diag(W)| , same size
   as (usually large) |X|.
 * The result is symmetric, but this fact is not used. So at least a
   factor of 2 is possible.
   Note that without weights, |X.T @ X| uses BLAS syrk. But the weights
   are crucial.

Drawback of |Z = np.sqrt(diag(W))[:, None] * X| and then |Z @ Z|:

 * Additional memory allocation for |Z|, same size as (usually large) |X|.
 * Taking square roots for possibly large |diag(W)|


     Additional information

https://github.com/Quantco/tabmat has an implementation of it with XSIMD.


Best
Christian Lorentzen
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: arch...@mail-archive.com

Reply via email to