On Sep 3, 2008, at 5:32 AM, Raph Frank wrote:

On 9/3/08, Brian Olson <[EMAIL PROTECTED]> wrote:
Anyway, what my pseudo-voronoi solver does is this:

Initialization:
       • Assign district centers to random points (block coordinates).
       • Assign census blocks to nearest district center.

Repeat until done:
       • Move each district center towards the center of population of
the blocks actually assigned to it.

Is this guaranteed to improve the district quality?

In fact, do you only make an update if that improves quality by your measure?

I think it's not guaranteed to be an improvement because the simultaneous update of all the other district centers and weights could wind up creating overshoots or other destructive interference. I always make the update. It may actually not be terribly well thought out beyond 'seems like an intuitively good idea' and 'seems to work pretty well'.
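
To make the first two stages concrete, here's a minimal Python sketch of the init/assign/recenter part (weights are left out; they come up below). Names like "blocks" and "recenter_loop" are illustrative, not the actual redistricter code:

import random

def recenter_loop(blocks, n_districts, cycles=100):
    # blocks: list of (x, y, population) tuples.
    # Returns a district index per block after the given number of cycles.
    # Initialization: put each district center on a random block.
    centers = [list(random.choice(blocks)[:2]) for _ in range(n_districts)]

    for _ in range(cycles):
        # Assign each census block to the nearest district center
        # (squared distance gives the same ordering, so no sqrt).
        assign = [min(range(n_districts),
                      key=lambda d: (x - centers[d][0]) ** 2 +
                                    (y - centers[d][1]) ** 2)
                  for x, y, _ in blocks]

        # Move each center to the population centroid of the blocks it
        # was actually assigned. All centers move simultaneously, which
        # is why a single step isn't guaranteed to be an improvement.
        for d in range(n_districts):
            mine = [b for b, a in zip(blocks, assign) if a == d]
            pop = sum(p for _, _, p in mine)
            if pop > 0:
                centers[d][0] = sum(x * p for x, _, p in mine) / pop
                centers[d][1] = sum(y * p for _, y, p in mine) / pop

    return assign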


       • Adjust district weight, nudging the weight up or down if there
are too few or too many people assigned to it.

Do you do this until it completely converges, or just 1 nudge?

One nudge per cycle through all the "Repeat until done" stages. 10,000 to 20,000 cycles are needed to settle, depending on the state.

       • (optional) Nudge district centers away from underpopulated
district centers and towards overpopulated district centers.
       • For each census block, assign the census block to the district
with the lowest ((distance to block from district center) * weight_d).

You would get straight edges if you used

lowest( (distance^2) - weight )

This also saves you having to do a square-root, though I guess your
method could be implemented as

lowest( (distance^2)*(weight)^2 )

Either way, the calculation is faster because there is no need to find
the square root of the distance, and the second form gives the same
assignments as your (distance * weight) rule, since squaring is
order-preserving.
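
(To see why the edges come out straight: the boundary between districts i and j is where the two keys tie,

  |x - c_i|^2 - w_i = |x - c_j|^2 - w_j

and expanding both sides cancels the |x|^2 term, leaving

  2 x . (c_j - c_i) = |c_j|^2 - |c_i|^2 + w_i - w_j

which is linear in x, i.e. a straight line; this is the standard "power diagram" construction. With a multiplicative weight, the tie fixes the ratio of the two distances instead, so the boundary is a circular arc.)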

I checked my code and I'm not doing the expensive square root. It's not quite the second form, though; it's actually:
((dx*dx + dy*dy) * weight)

The weight gets nudged by multiplying by 1.01 or 0.99. Whether the weight is squared, and how it's nudged, is probably just an efficiency issue and not a correctness issue; it should still converge to the weight it needs, just maybe along some suboptimal curve.

I think a multiplicative weight makes more sense: there's no chance of underflow, and it might scale better between districts with different densities.
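
In sketch form, one assign-and-nudge cycle looks something like this (illustrative only, not the real code; "target" would be the total population divided by the number of districts):

def weighted_cycle(blocks, centers, weights, target):
    # blocks: (x, y, population) tuples; weights: one float per district.
    # The comparison key is (dx*dx + dy*dy) * weight -- no square root.
    assign = [min(range(len(centers)),
                  key=lambda d: ((x - centers[d][0]) ** 2 +
                                 (y - centers[d][1]) ** 2) * weights[d])
              for x, y, _ in blocks]

    for d in range(len(centers)):
        pop = sum(p for (_, _, p), a in zip(blocks, assign) if a == d)
        # One nudge per cycle: a higher weight makes the district look
        # farther away to every block, shrinking it. Multiplying by
        # 1.01 or 0.99 keeps the weight strictly positive, so it can't
        # cross zero the way an additive step could.
        weights[d] *= 1.01 if pop > target else 0.99
    return assign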


       • Fix up district contiguity (because straight-line nearest-center
assignment can jump across water and other gaps in odd ways).

I think water counts as part of a district, so technically that isn't
a breach of contiguity.

What could be an issue is if a State has a concave boundary.

Btw, do you parse the block geography file to work out which blocks
border each other?

Yes, the block shape file determines adjacency. They handily label every line segment with what block is on either side of it, so I can pretty quickly filter that to keep a list of all pairings I've seen. Of course I do also wind up going back and rendering that geometry to an actual map.
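
In sketch form (field names invented here; the real shapefile records differ):

def block_adjacency(segments):
    # segments: line-segment records carrying the block ID on each
    # side, e.g. dicts like {'left': '0600001', 'right': '0600002'}.
    pairs = set()
    for seg in segments:
        a, b = seg['left'], seg['right']
        # One empty side means an outer boundary; equal sides mean the
        # segment is interior to a single block.
        if a and b and a != b:
            pairs.add((min(a, b), max(a, b)))  # keep each pairing once
    return pairs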

(That's now enshrined here:
http://code.google.com/p/redistricter/wiki/NearestNeighborSolver
)

Oh, and when it's done (an hour or two for most states), start it over again with a different random starting state because it may have gotten
stuck in a local optimum.

My splitline software was able to do all the States in one evening.
That is the benefit of a non-search algorithm.

I can get decent mappings for 90% of the states in a couple of days. CA and TX are tough: lots of data and lots of districts to balance. There are some tunable parameters in the search process that I should try adjusting specifically for solving those states.


Brian Olson
http://bolson.org/


----
Election-Methods mailing list - see http://electorama.com/em for list info
