-------- Forwarded Message --------
From: Matteo Fischetti DEI <[email protected]>
Reply-to: [email protected]
To: Andrew Makhorin <[email protected]>
Subject: randomization in MIPs
Date: Mon, 17 Dec 2012 11:28:11 +0100
Hi Andrew. I have a paper on the role of randomization at
http://www.dei.unipd.it/~fisch/papers/exploiting_erraticism_in_search.pdf
See Table 1, where a multi-thread experiment is simulated by running K copies of the same MIP solver in parallel, with absolutely no communication, just by randomizing the runs. I recently gave a provocative talk about this (well-known) behavior; see Mike Trick's blog at http://mat.tepper.cmu.edu/blog/?p=1695
Below I copied a recent mail of mine to the cbc list, where the same instance is solved by cbc in 79 or 1816 seconds, depending on the random seed used.

In my view, a quick-and-dirty way to obtain a multithread GLPK version is:

1. introduce a random seed that can be set at the beginning of the run, and let some decisions in the MIP solver be affected by a random value (see below);

2. run K independent copies of GLPK in parallel, with different seeds, and abort the execution of all threads as soon as one thread finishes.

In case one thread runs into numerical trouble, it can automatically replicate itself by starting another thread that solves the same MIP from scratch, but with a different seed, and then abort. More elaborate strategies can of course be implemented, e.g., collecting the best incumbent among all threads (so the best heuristic solution is always available to the user), and possibly sharing it among threads.

As to point 1 above, randomization can be introduced:

a. by reshuffling the internal order of the variables (i.e., as reported in the MIPLIB 2010 paper, just randomly permuting the columns can act as a significant perturbation);

b. by randomizing the choices in case of (almost) ties, e.g., in the heuristics and when selecting the branching variable;

c. when the LP is perturbed because of degeneracy;

d. right after the first LP relaxation at the root node, by making a small number of random pivots on zero-reduced-cost columns, just to sample a different LP solution on the optimal face (see my paper above for details). Note that randomizing the very first optimal LP solution is a good idea, as it affects all the subsequent choices (cuts, heuristics, branching).

If you want to make a very quick shot and see how randomization can speed up GLPK in a K-thread environment (K = 1, 2, 5, 10, 20), you can run GLPK K times on the same set of (hard) instances after having randomly permuted the input columns according to option a. above, and for each instance collect the smallest computing time out of the K runs. I recently convinced Hans Mittelmann to do this experiment with Cplex 12.5, see http://plato.asu.edu/ftp/path.html
Two rough sketches along these lines are appended right below, before the forwarded cbc mail.
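As a rough illustration of option a. (GLPK currently has no seed option, so permuting the input columns is the easiest externally available randomization), here is a minimal sketch that uses only the public GLPK API to write a column-permuted copy of a model. The program name, the file arguments, the optional seed argument and the choice of the CPLEX LP format are just placeholders for the example, and error checking is omitted.

/* permute.c -- write a copy of a model with randomly permuted columns
 * (option a. above).  Minimal sketch using the public GLPK API only. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <glpk.h>

int main(int argc, char *argv[])
{  /* usage: ./permute in.lp out.lp [seed] */
   if (argc < 3)
   {  fprintf(stderr, "usage: %s in.lp out.lp [seed]\n", argv[0]);
      return 1;
   }
   glp_prob *P = glp_create_prob(), *Q = glp_create_prob();
   glp_read_lp(P, NULL, argv[1]);
   srand(argc > 3 ? (unsigned) atoi(argv[3]) : (unsigned) time(NULL));
   int m = glp_get_num_rows(P), n = glp_get_num_cols(P);
   /* draw a random permutation perm[1..n] (Fisher-Yates) */
   int *perm = malloc((n + 1) * sizeof(int));
   for (int j = 1; j <= n; j++) perm[j] = j;
   for (int j = n; j > 1; j--)
   {  int r = 1 + rand() % j, t = perm[j];
      perm[j] = perm[r], perm[r] = t;
   }
   /* copy the rows as they are */
   glp_set_obj_dir(Q, glp_get_obj_dir(P));
   glp_add_rows(Q, m);
   for (int i = 1; i <= m; i++)
   {  if (glp_get_row_name(P, i) != NULL)
         glp_set_row_name(Q, i, glp_get_row_name(P, i));
      glp_set_row_bnds(Q, i, glp_get_row_type(P, i),
         glp_get_row_lb(P, i), glp_get_row_ub(P, i));
   }
   /* copy the columns in permuted order */
   glp_add_cols(Q, n);
   int *ind = malloc((m + 1) * sizeof(int));
   double *val = malloc((m + 1) * sizeof(double));
   for (int j = 1; j <= n; j++)
   {  int src = perm[j];
      if (glp_get_col_name(P, src) != NULL)
         glp_set_col_name(Q, j, glp_get_col_name(P, src));
      glp_set_col_bnds(Q, j, glp_get_col_type(P, src),
         glp_get_col_lb(P, src), glp_get_col_ub(P, src));
      glp_set_col_kind(Q, j, glp_get_col_kind(P, src));
      glp_set_obj_coef(Q, j, glp_get_obj_coef(P, src));
      int len = glp_get_mat_col(P, src, ind, val);
      glp_set_mat_col(Q, j, len, ind, val);
   }
   glp_set_obj_coef(Q, 0, glp_get_obj_coef(P, 0)); /* objective constant */
   glp_write_lp(Q, NULL, argv[2]);
   glp_delete_prob(P);
   glp_delete_prob(Q);
   free(perm), free(ind), free(val);
   return 0;
}

Running it K times with different seeds on the same instance gives the K randomized copies needed for the experiment described above.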
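And here is a rough POSIX sketch of point 2: it races K glpsol processes, one per randomized copy of the instance, and aborts the others as soon as the first one finishes. The helper itself and the file-naming scheme are assumptions of the example and not anything that ships with GLPK; the glpsol options used (--lp, -o) are the standard ones.

/* race.c -- point 2 above: run K independent copies of the solver in
 * parallel and abort all of them as soon as the first one finishes.
 * Each copy copy_k.lp is assumed to be a randomized (e.g. column-
 * permuted) version of the same instance. */
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{  /* usage: ./race copy_1.lp copy_2.lp ... copy_K.lp */
   int K = argc - 1;
   if (K < 1)
   {  fprintf(stderr, "usage: %s copy_1.lp ... copy_K.lp\n", argv[0]);
      return 1;
   }
   pid_t pid[K];
   for (int k = 0; k < K; k++)
   {  pid[k] = fork();
      if (pid[k] == 0)
      {  /* child k: solve the k-th copy, writing its solution to sol_k.txt */
         char sol[64];
         snprintf(sol, sizeof sol, "sol_%d.txt", k + 1);
         execlp("glpsol", "glpsol", "--lp", argv[k + 1], "-o", sol,
            (char *) NULL);
         _exit(127); /* exec failed */
      }
   }
   /* parent: wait for the first child to terminate ... */
   int status;
   pid_t first = wait(&status);
   printf("first finisher: pid %ld\n", (long) first);
   /* ... and abort all the remaining runs */
   for (int k = 0; k < K; k++)
      if (pid[k] != first) kill(pid[k], SIGTERM);
   while (wait(NULL) > 0) ; /* reap the aborted children */
   return 0;
}

The per-run solution files also make it easy to implement the "collect the best incumbent among all threads" idea mentioned above: just keep the best objective found among the sol_k.txt files.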
Keep in touch
--M--

-------- Original Message --------
Subject: Re: [Cbc] multi-thread compile for cbc-2.7.8
Date: Fri, 30 Nov 2012 17:01:59 +0100
From: Matteo Fischetti DEI <[email protected]>
Reply-to: [email protected]
To: Noli Sicad <[email protected]>
CC: cbc <[email protected]>

Hi Noli. Let me summarize the situation.

1. Proximity search is only available in CBC trunk; in interactive mode, it is activated by -proximity on.

2. On stable CBC 2.7.8 the proximity flag is accepted but not used (yet).

3. In interactive mode, you can check the LOG to see the effect of the proximity heuristic, e.g.

   > cbc ..\seymour.lp -proximity on -solve
   Welcome to the CBC MILP Solver
   Version: Trunk (unstable)
   Build Date: Nov 29 2012
   ...
   Cbc0012I Integer solution of 428 found by feasibility pump after 0 iterations and 0 nodes (16.67 seconds)
   Cbc0012I Integer solution of 427 found by RINS after 0 iterations and 0 nodes (19.44 seconds)
   ...
   Cbc0012I Integer solution of 426 found by Proximity Search after 0 iterations and 0 nodes (37.95 seconds)
   ...

4. Proximity search is a refinement heuristic, hence it is NOT applied when no solution is found at the root.

5. Now I come to your important question below:

>>> However, I could not understand why 2.7.8 runs faster than the trunk. The trunk solved the same MIP double the time (i.e. 2x longer) than using 2.7.8 with or without the proximity search. It seems that proximity search does not matter if you run threads (i.e. threads 8).

My explanation is that you are just facing "erraticism" here, meaning that small changes in the initial conditions (including using a different architecture/compiler/number of threads) can randomly change LP solutions --> branching --> search path, sometimes allowing you to converge much earlier. I recently gave a provocative talk about this (well-known) behavior; see Mike Trick's blog at http://mat.tepper.cmu.edu/blog/?p=1695

Coming to your specific instance (TimberHarvest_0010p.lp), tonight I ran cbc trunk (1 thread) 99 times with the command

   > cbc ..\TimberHarvest_0010p.lp -randomCbcSeed 0 -randomSeed 0 -solve

where -randomCbcSeed and -randomSeed allow you to change the internal CLP and CBC random seeds (0 uses the clock as seed). I posted the LOGs at www.dei.unipd.it/~fisch/rand.txt; here is an extract:

   run    CPU sec.s       nodes
   --------------------------------
   #001      422.80     688,438
   #002      120.42     541,641
   #003      289.89     980,125
   #004      442.28     472,552
   #005      599.50     694,749
   #006      638.95   1,192,398
   #007      220.62     464,118
   #008      446.78     581,108
   #009      231.38     627,515
   #010      275.19     714,783
   ...
   #069       79.25     262,105
   ...
   #099     1816.64   1,998,075

So, depending on the random seeds, your instance can be solved in 79 or 1816 seconds (!!).

6. Of course, not all instances behave the same, but you had better know that sometimes erraticism can play a relevant role, e.g., when you do parameter tuning--or when you try to explain why a certain idea (out of the 100 you tried) is working magically well...

7. In my view, erraticism is an opportunity rather than an issue--e.g., by running the same solver multiple times with different random seeds (possibly in parallel), you can hopefully find better heuristic solutions at the root node. Here is an example: running 5 times

   > cbc ..\seymour.lp -randomCbcSeed 0 -randomSeed 0 -solve

produces the following root-node heuristic solutions:

   run #1: Objective value: 429.00000000
   run #2: Objective value: 428.00000000
   run #3: Objective value: 426.00000000
   run #4: Objective value: 427.00000000
   run #5: Objective value: 427.00000000

Analogously, running

   > cbc ..\seymour.lp -randomCbcSeed 0 -randomSeed 0 -proximity on -passF 0 -solve

produced (still at the root):

   run #1: Objective value: 426.00000000
   run #2: Objective value: 425.00000000
   run #3: Objective value: 425.00000000
   run #4: Objective value: 424.00000000
   run #5: Objective value: 423.00000000  <- optimal value

8. Any comment on the subject is welcome.

Best
--Matteo--

--
Prof. Matteo Fischetti
DEI, University of Padova
via Gradenigo 6/A
I-35131 Padova (Italy)
e-mail: [email protected]
web: www.dei.unipd.it/~fisch
reports: www.dei.unipd.it/~fisch/papers

_______________________________________________
Help-glpk mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/help-glpk
