Not very scientific but ... If I question how one (I) would try to solve the problem in the real world. My attempt would be to initially sort the items (largest to smallest) and then distribute forward and reverse among the boxes until I had only the 'least weighty' remaining. Those smaller remaining items would be used to top-up the needy boxes.
If the items are large in number and random this actually gets one quite close .... but I suppose that is cheating! David On Thu, 2011-04-14 at 13:16 -0500, Skip Cave wrote: > On 4/12/2011 5:22 AM, Aai wrote: > > Giving bucket sums: > > > > 12 8 # (,>:)<. bm > Aai, > > Nice approach to finding the set of integer bucket sums that would come > the nearest to the bucket mean, bm. I knew I needed to be able to find > that set of integer sums, but I hadn't come up with a programmatic way > to do that. The key point here, is that any other set of sums other than > the one you found, will have a larger sum of differences between > buckets. It would make sense that our first attempt at the problem would > be to see if the masses could be partitioned to the > minimal-bucket-difference configuration you found. > > Of course, there is no guarantee that m can be partitioned in this > specific manner, which would produce the theoretical minimal mass > difference between buckets. So in the long run, our search may have to > identify more partition sets which produce somewhat larger differences, > Ideally, we could find these additional partition sets, and then sort > them by their distance magnitude, so the smaller differences could be > examined first. This definitely looks like a way to cut down on > computations. > > However for now, the challenge is to see if we can partition the m > masses to result in the minimal-difference sum sets you defined. We need > an efficient algorithm that will find partitions of m that sum to either > <. bm or >: <. bm which in our large case, is 5273 or 5274. as you > discovered, we need 12 buckets summing to 5273, & 8 buckets summing to 5274. > > +/ 12 8 # (,>:)<. bm > 105468 > +/ m > 105468 > > Though we can reduce the search space significantly by considering only > partitions that sum to the two integers nearest bm, we still have a huge > search to perform. Even if we assume that the number of masses in each > bucket would all be equal, we still have a big search. > > In our large example, if the number of masses in each bucket were equal, > we would have (#m)%nb masses in each bucket, which in our large case, is > 10 masses in each bucket. If each bucket had exactly 10 masses in it, > and we use a brute force approach, we would have 10 comb 200 possible > partitions of m to examine, to see which 10-sets summed to 5273 or 5274. > > Using the combinations formula, the number of combinations of 10 comb 200 is > (!#m)%(!10)*!(#m)-10 > > Using normal floating point arithmetic, this formula generates values > larger than my machine's floating point range, which equates to infinity. > > (!#m)%(!10)*!(#m)-10 > |NaN error > | (!#m) %(!10)*!(#m)-10 NB. Oops! infinity % infinity is a NaN. > > !#m NB. The factorial of #m is infinity in regular floating-point > arithmetic > _ > > > so we must resort to extended precision: > > fm =. */ >: x: i.#m NB. factorial #m > f10 =. */ >: x: i.10 NB. factorial 10 (min number of masses in each > bucket if all the same) > fd =. */ >: x: i.(#m)-10 NB. factorial of difference (#m - 10) > fm % F10 * fd > 22,451,004,309,013,280 NB. Commas added for clarity > > or 2.2451e16 NB. This is the number of possible partitions > (combinations) of m, taken 10 at a time, in floating point notation. > > 10 comb 200 is a lot of partitions of m to examine. The 10 comb 200 > function, which would generate all those possible combinations, creates > an array that is bigger than my machine can handle: > > c =. 10 comb 200 > |out of memory: comb > | z=.k ,.&.>,&.>/\.>:&.>z > > I have a 32-bit machine, so perhaps a 64 bit machine could deal with > this array of all 10-partitions of m. If I could get all the 10 > combinations in an array, it would be simple to see which ones sum up to > 5273 or 5274. For that matter, I could find all of the partition sums > within a few mass units of bm. > > Roger Hui has provided an interesting Jwiki article on the efficient > computation of partition sums, at: > http://www.jsoftware.com/jwiki/Essays/Combination%20Sums > I haven't gotten my head around all of Roger's math as yet, but his > approach looks promising as an efficient way to find all the sums of all > the partitions of a certain size. > > However, in the general case, any brute-force, monolithic array approach > will always eventually run out of memory as the number of masses and > buckets grow. The brute-force monolithic array approach is probably not > an option for generality's sake. An iterative approach seems more > appropriate here. I would at least like the general solution to be > time-constrained, rather than space constrained. I seem to have more of > the former than the latter, at least as far as this problem is concerned. > > In any case, this initial plan only gets us the 10-partitions of m. > There is no guarantee that m will partition out with 10 masses in each > bucket. We will likely need to look at a range of partition sizes to > solve our problem. > > For now, we can still set about finding more limits to the problem, in > order to continue to reduce the computations needed. Given any specific > m, we should be able to discover the minimum and maximum partition sizes > that would sum to any specific integer. By sorting m and looking at > running sums, it seems that we should be able to find the smallest > number of m's masses that will sum to (or near) a specific integer, as > well as the largest number of m's masses that will sum to that same > integer. This will give us boundaries on the smallest and largest > partitions we need to examine for our solution. > > Thinking about the sorted m, I sense a glimmer of an idea about using > sorted m to pick out partitions that sum close to bm. I will have to > think about this more. > > I have some paying work to get to now, so I will need to postpone > working on the partition-size limits computation and the sorted-m issue > until tomorrow. This is definitely giving me a workout in polishing my J > skills, though. > > Skip > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
