In general, it is rare that solving the dual of a problem takes less time, so I 
would not recommend an approach that tests if the dual is more numerically 
stable.  The L1 regression problem is a rare exception in which directly 
solving the dual makes sense.



The chief issue with the original formulation of the L1 regression is that 
there are 2 dense columns in the "A" matrix.  The computational burden of the 
interior point methods is the Cholesky factorization of a matrix that looks 
like (in terms of density) AA(t) = A times the transpose of A.  This 
factorization occurs once per iteration.  Even a single dense column in A 
creates a completely dense AA(t), and that is a lot of computations and could 
lead to numerical difficulties.  For 1000 points in the L1 regression, taking 
the Cholesky factorization of a dense 1000 x 1000 matrix is a huge computation 
(something like 2 billion floating point operations).



The dual formulation of this particular problem results in AA(t) being a small 
2 x 2 matrix so there is a reduction in computations by several orders of 
magnitude.



The real issue is how to handle dense columns.  Some interior point 
implementations separate out dense columns by splitting the column, and this 
seems to generally work well.  Experiments with using the Sherman-Morrison 
formula (Wikipedia calls it the Woodbury matrix identity) that algebraically 
can separate out dense columns generally don't work very well numerically in 
practice, although in your case it would have worked magic and would have been 
better than the column splitting.



Note that the GLPK documentation does point out the issue of dense columns and 
numerical instability.  It says:



"The routine glp_interior implements an easy version of the primal-dual 
interior-point method based on Mehrotra's technique.



Note that currently the GLPK interior-point solver does not include many 
important features, in particular:

- it is not able to process dense columns. Thus, if the constraint matrix of 
the LP problem has dense columns, the solving process may be inefficient;

- it has no features against numerical instability. For some LP problems 
premature termination may happen if the matrix ADAT becomes singular or 
ill-conditioned;

- it is not able to identify the optimal basis, which corresponds to the 
interior-point solution found."





-----Original Message-----
From: 
[email protected]<mailto:[email protected]>
 [mailto:[email protected]] On Behalf Of 
Reginald Beardsley
Sent: Wednesday, December 19, 2012 10:28 AM
To: glpk
Subject: Re: [Help-glpk] Interior point failure





Solving the dual as Marc suggested works much better. There were no failures 
even w/ 10,000 points whereas the original formulation consistently fails w/ 
1000 points.



This may be hopelessly naive of me, but would it make sense to generate the 
dual and restart if an IPM problem fails? Or, alternatively, is there a 
criterion that could be applied at startup that would allow determining when 
solving the dual would be more appropriate?



BTW I'm using the scripts for this as an example of Unix batch execution for 
the wikibook.  Unfortunately, a system hang at my end resulted in losing an 
hour's work just before I was going to save it.  I've started redoing it but 
have to rewrite the explanation and add the model that Marc contributed.



Have Fun!

Reg



_______________________________________________

Help-glpk mailing list

[email protected]<mailto:[email protected]>

https://lists.gnu.org/mailman/listinfo/help-glpk

________________________________
This e-mail and any attachments may be confidential or legally privileged. If 
you received this message in error or are not the intended recipient, you 
should destroy the e-mail message and any attachments or copies, and you are 
prohibited from retaining, distributing, disclosing or using any information 
contained herein. Please inform us of the erroneous delivery by return e-mail. 
Thank you for your cooperation.
_______________________________________________
Help-glpk mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/help-glpk

Reply via email to