Re: [EM] Redistricting, now with racial demographics

Kristofer Munsterhjelm Sun, 19 Jul 2009 14:37:48 -0700

Brian Olson wrote:

I've updated my redistricting site ( http://bolson.org/dist/ ) to include the racial breakdown of all current congressional districts (sometimes interesting by itself) and that of the compactness based districts I have come up with. If you want you can jump directly to http://bolson.org/dist/ XX where XX is any US state abbreviation.

As I've mentioned before, I've been doing something like this, but out of no other purpose than abstract interest, and applied to the world at large.

Thus, I think I can give some ideas as to how to improve it. The first and most immediate one is to use K-means++ clustering. For my own simulator, this immediately gave an order-of-magnitude improvement in standard deviation.

See http://en.wikipedia.org/wiki/K-means%2B%2B for that. Basically, K-means++ picks the first center randomly. Then it picks each subsequent center with a probability for each point proportional to the distance from the closest existing center, raised to a power (originally 2, i.e. the squared distance, but my program uses 0.25). Set the power to whatever gives the best results. The simplest way of picking a new center is to just record each point's distance to its closest center (you have to do that anyway to draw the map), then use roulette selection for the probabilities.
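The seeding step described above could be sketched roughly as follows. This is only an illustration, not the code of either program discussed here: the name seed_centers, the (x, y) tuple representation and the use of plain Euclidean distance are assumptions, and it assumes k is no larger than the number of distinct points.

```python
import math
import random

def seed_centers(points, k, power=2.0, rng=None):
    """Pick k initial centers in the K-means++ style (hypothetical helper).

    power=2.0 is the original K-means++ weighting; the post reports that
    0.25 worked better for its author's program.
    """
    rng = rng or random.Random()
    # First center: uniformly at random.
    centers = [rng.choice(points)]
    # nearest[i] = distance from points[i] to its closest chosen center.
    nearest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # Roulette selection: weight is distance ** power, so points that
        # are already centers (distance 0) can never be drawn again.
        weights = [d ** power for d in nearest]
        new = rng.choices(points, weights=weights, k=1)[0]
        centers.append(new)
        # Only the recorded distances need updating, not a full recount.
        nearest = [min(d, math.dist(p, new)) for p, d in zip(points, nearest)]
    return centers, nearest
```

Calling seed_centers(points, k, power=0.25) would mimic the setting mentioned above; the returned nearest list is the same distance record that is needed to draw the map.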

Second, by recording distances in such a manner, further optimization is possible. When adding a new district, you can simply grow outwards from its centroid (center point) without having to recalculate all the other districts. This kind of heap optimization enables my "global redistricter" to work with distances based on elevation deltas (a rough approximation to travel time). Of course, if you're using plain Euclidean distance, you can just use Fortune's plane sweep algorithm to calculate the Voronoi diagrams.
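One way to picture the "grow outwards" idea is a Dijkstra-style expansion from the new center that stops wherever an existing center is already at least as close. The sketch below assumes a rectangular grid of elevations, 4-neighbor moves, and a made-up step cost of one unit plus the elevation change; add_center and step_cost are hypothetical names, not taken from the program described here.

```python
import heapq

def step_cost(elev, a, b):
    # Assumed cost model: one unit per step plus the elevation delta.
    return 1.0 + abs(elev[a[0]][a[1]] - elev[b[0]][b[1]])

def add_center(elev, best, owner, center, label):
    """Grow a new district outwards from `center`.

    best[r][c]  = recorded distance from cell (r, c) to its closest center
    owner[r][c] = label of that closest center
    Only cells the new center wins are touched; the rest are left alone.
    """
    rows, cols = len(elev), len(elev[0])
    heap = [(0.0, center)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d >= best[r][c]:
            continue  # some center (or an earlier pop) is at least as close
        best[r][c] = d
        owner[r][c] = label
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + step_cost(elev, (r, c), (nr, nc))
                if nd < best[nr][nc]:
                    heapq.heappush(heap, (nd, (nr, nc)))
```

Starting from best filled with float("inf") and owner filled with -1, each call adds one district; the work done is proportional to the size of the new district rather than to the whole map.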

After all the district centers have been picked, you can't use K-means++ any more. Revert to ordinary K-means clustering at that point (or whatever other clustering you may want). For my program, this means that the K-means++ initial phase can be quicker than the subsequent move-centroid-to-population-center K-means phase, so you might also try running the initial phase for various different random seeds and then only picking the most promising to go on to the next phase.
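A rough sketch of that two-phase idea follows. It is an assumption that "most promising" means the lowest standard deviation of district populations; the function names, the (x, y) points with one population figure each, and the fixed iteration count are all made up for illustration.

```python
import math
import random
import statistics

def seed(points, k, power, rng):
    # Compact K-means++-style seeding (weight = distance ** power).
    centers = [rng.choice(points)]
    while len(centers) < k:
        w = [min(math.dist(p, c) for c in centers) ** power for p in points]
        centers.append(rng.choices(points, weights=w, k=1)[0])
    return centers

def assign(points, centers):
    # Index of the closest center for every point.
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def population_spread(points, pops, centers):
    # Assumed score: standard deviation of district populations.
    totals = [0.0] * len(centers)
    for j, w in zip(assign(points, centers), pops):
        totals[j] += w
    return statistics.pstdev(totals)

def lloyd_step(points, pops, centers):
    # Move each center to the population-weighted mean of its points.
    sums = [[0.0, 0.0, 0.0] for _ in centers]
    for (x, y), w, j in zip(points, pops, assign(points, centers)):
        sums[j][0] += w * x
        sums[j][1] += w * y
        sums[j][2] += w
    return [(sx / sw, sy / sw) if sw > 0 else c
            for (sx, sy, sw), c in zip(sums, centers)]

def best_of_seeds(points, pops, k, seeds, power=0.25, iters=10):
    # Cheap phase: try several seedings and keep the best-scoring one.
    candidates = [seed(points, k, power, random.Random(s)) for s in seeds]
    centers = min(candidates, key=lambda c: population_spread(points, pops, c))
    # Expensive phase: ordinary K-means on the chosen seeding only.
    for _ in range(iters):
        centers = lloyd_step(points, pops, centers)
    return centers
```

For example, best_of_seeds(points, pops, k, seeds=range(20)) would try twenty seedings but run the slower K-means phase only once.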

Another refinement I've been considering, but haven't implemented, is a way to split districts internally. Usually, most districts ("countries" in my case) are about even, but some have much lower population and some have much greater. Remove one of the lower population districts and use the spare point to split a high population district. One way of doing that would be to turn a point (x, y) (centroid of a populous district) into (x+r, y+r) and (x-r, y-r) for some small value r, then letting k-means clustering draw the centroids apart in the right direction.

The split might affect other districts and the direction of the r component might be wrong, but it might also turn out well (and my manual emulation of this does, simply because the large districts are so large that the split distorts the others less than it gains).
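Since this refinement is described as unimplemented, the following is only a guess at what the swap might look like: split_largest is a hypothetical name, and picking the single least and single most populous districts is an assumption.

```python
def split_largest(centers, district_pops, r):
    """Drop the least populous center and split the most populous in two.

    centers       = list of (x, y) centroids
    district_pops = population of each district, same order as centers
    r             = small offset; later K-means passes pull the pair apart
    """
    low = min(range(len(centers)), key=lambda j: district_pops[j])
    high = max(range(len(centers)), key=lambda j: district_pops[j])
    if low == high:
        return list(centers)  # nothing sensible to swap
    x, y = centers[high]
    kept = [c for j, c in enumerate(centers) if j not in (low, high)]
    # The number of centers is unchanged: one removed, one turned into two.
    return kept + [(x + r, y + r), (x - r, y - r)]
```

The result would then be fed back into ordinary K-means iterations, which is where the two new points get drawn apart.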

----
Election-Methods mailing list - see http://electorama.com/em for list info
