Hi, I have what I think is some kind of linear programming question. Basically, what I want to figure out is if I have a vector of numbers,

> x <- rnorm(10)
> x
[1] -0.44305959 -0.26707077 0.07121266 0.44123714 -1.10323616 -0.19712807 0.20679494 -0.98629992 0.97191659 -0.77561593
> mean(x)
[1] -0.2081249

Using each number only once, I want to find the set of five pairs where the magnitude of the differences between the mean(x) and each pair's sum is least.

> y <- outer(x, x, "+") - (2 * mean(x))
> y
             [,1]        [,2]        [,3]        [,4]       [,5]        [,6]       [,7]       [,8]      [,9]       [,10]
 [1,] -0.46986936 -0.29388054  0.04440289  0.41442737 -1.1300459 -0.22393784  0.1799852 -1.0131097 0.9451068 -0.80242569
 [2,] -0.29388054 -0.11789173  0.22039171  0.59041619 -0.9540571 -0.04794902  0.3559740 -0.8371209 1.1210956 -0.62643688
 [3,]  0.04440289  0.22039171  0.55867514  0.92869962 -0.6157737  0.29033441  0.6942574 -0.4988374 1.4593791 -0.28815345
 [4,]  0.41442737  0.59041619  0.92869962  1.29872410 -0.2457492  0.66035889  1.0642819 -0.1288130 1.8294035  0.08187104
 [5,] -1.13004593 -0.95405711 -0.61577368 -0.24574920 -1.7902225 -0.88411441 -0.4801914 -1.6732863 0.2849302 -1.46260226
 [6,] -0.22393784 -0.04794902  0.29033441  0.66035889 -0.8841144  0.02199368  0.4259167 -0.7671782 1.1910383 -0.55649417
 [7,]  0.17998518  0.35597399  0.69425742  1.06428191 -0.4801914  0.42591670  0.8298397 -0.3632552 1.5949614 -0.15257116
 [8,] -1.01310969 -0.83712087 -0.49883744 -0.12881296 -1.6732863 -0.76717817 -0.3632552 -1.5563500 0.4018665 -1.34566603
 [9,]  0.94510682  1.12109563  1.45937907  1.82940355  0.2849302  1.19103834  1.5949614  0.4018665 2.3600830  0.61255048
[10,] -0.80242569 -0.62643688 -0.28815345  0.08187104 -1.4626023 -0.55649417 -0.1525712 -1.3456660 0.6125505 -1.13498203

With this matrix, if I put together a combination of pairs which uses each number only once, the sum of the corresponding numbers is 0.
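
The zero-sum property can be checked directly. A pairing that uses each index once adds every x[i] exactly once, so the five selected entries of y total sum(x) - 5 * 2 * mean(x), which is 0. Below is a minimal sketch that re-enters the printed x values (rounded to 8 decimals) and draws one arbitrary pairing. The seed and the name p are illustrative assumptions, not part of the original session.

```r
# x as printed in the session above (rounded to 8 decimals)
x <- c(-0.44305959, -0.26707077,  0.07121266,  0.44123714, -1.10323616,
       -0.19712807,  0.20679494, -0.98629992,  0.97191659, -0.77561593)

# Deviation of every pair sum from twice the mean
y <- outer(x, x, "+") - 2 * mean(x)

set.seed(42)                       # arbitrary seed, only for reproducibility
p <- matrix(sample(10), ncol = 2)  # 5 x 2 index matrix: each index used once

# Indexing y with a two-column matrix picks y[row, col] for each row of p
sum(y[p])                          # zero up to floating-point round-off
```

Since the total is always 0, the mean of the five values is 0 for every pairing, so comparing SDs amounts to comparing sums of squared deviations.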

For example, compare the SD between this set of 5 pairs

> y[10,1] + y[9,2] + y[8,3] + y[7,4] + y[6,5]
[1] 0
> sum(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
[1] 5.551115e-17 # basically 0, I assume this is round-off error
> mean(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
[1] 1.111307e-17 # basically 0, I assume this is round-off error
> sd(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
[1] 1.007960

versus this hand-selected, possibly lowest SD combination of pairs

> sum(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
[1] -1.665335e-16 # basically 0, I assume this is round-off error
> mean(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
[1] -3.330669e-17 # basically 0, I assume this is round-off error
> sd(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
[1] 0.2367030

I believe that if I could test all the various five pair combinations, the combination with the lowest SD of values from the table would give me my answer.

I believe I have 3 questions regarding my problem.

1) How can I find all the 5 pair combinations of my 10 numbers so that I can perform a brute force test of each set of combinations? I believe there are 45 different pairs (i.e. choose(10,2)). I found combinations from the {Combinations} package but I can't figure out how to get it to provide pairs.

2) Will my brute force strategy of testing the SD of each of these 5 pair combinations actually give me the answer I'm searching for?

3) Is there a better way of doing this? Probably something to do with real linear programming, rather than this method I've concocted.

Thanks for any help you can provide regarding my question.

Best regards,

James
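
For reference, the brute force described in question 1 can be sketched in base R without an extra package. The 45 pairs combine into 9 * 7 * 5 * 3 * 1 = 945 distinct ways of splitting 10 numbers into five pairs, which is small enough to enumerate. This is only a sketch under stated assumptions: the helper all_pairings and the variable names are hypothetical, and x is re-entered from the printed (rounded) values.

```r
# x as printed in the session above (rounded to 8 decimals)
x <- c(-0.44305959, -0.26707077,  0.07121266,  0.44123714, -1.10323616,
       -0.19712807,  0.20679494, -0.98629992,  0.97191659, -0.77561593)
y <- outer(x, x, "+") - 2 * mean(x)

# Hypothetical helper: list every way to split the indices in idx into
# unordered pairs. Each result is a two-column matrix, one pair per row.
all_pairings <- function(idx) {
  if (length(idx) == 0) return(list(matrix(integer(0), ncol = 2)))
  first <- idx[1]
  rest  <- idx[-1]
  out <- list()
  for (k in seq_along(rest)) {
    # Pair the first index with rest[k], then pair up whatever remains
    for (sub in all_pairings(rest[-k])) {
      out[[length(out) + 1]] <- rbind(c(first, rest[k]), sub)
    }
  }
  out
}

pairings <- all_pairings(seq_along(x))
length(pairings)                        # 945 pairings

# SD of the five deviations for every pairing, then the smallest one
sds  <- vapply(pairings, function(p) sd(y[p]), numeric(1))
best <- pairings[[which.min(sds)]]
best
min(sds)

# Candidate shortcut: sort x and pair smallest with largest
o <- order(x)
sorted_pairs <- cbind(o[1:5], o[10:6])
sd(y[sorted_pairs])
```

Because every pairing has mean 0, the lowest SD is the lowest sum of squared deviations. If "least magnitude" instead means the smallest sum of absolute deviations, swap sd(y[p]) for sum(abs(y[p])). The last three lines compare the brute-force minimum with the sort-and-pair-extremes shortcut, which an exchange argument suggests is optimal for the squared criterion.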