On Sep 3, 2008, at 5:32 AM, Raph Frank wrote:

> On 9/3/08, Brian Olson <[EMAIL PROTECTED]> wrote:
>> Anyway, what my pseudo-voronoi solver does is this:
>>
>> Initialization:
>> • Assign district centers to random points (block coordinates).
>> • Assign census blocks to nearest district center.
>>
>> Repeat until done:
>> • Move district centers towards the center of population they were
>> actually assigned.
> Is this guaranteed to improve the district quality?
>
> In fact, do you only make an update if that improves quality by your
> measure?
I think it's not guaranteed to be an improvement because the
simultaneous update of all the other district centers and weights
could wind up creating overshoots or other destructive interference. I
always make the update. It may actually not be terribly well thought
out beyond 'seems like an intuitively good idea' and 'seems to work
pretty well'.
>> • Adjust district weight, nudging weight up or down if there are too
>> few or too many people assigned to it.
> Do you do this until it completely converges, or just 1 nudge?
1 nudge per cycle of all the "Repeat until done" stages. 10000 to
20000 cycles are needed to settle depending on the state.
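To make the cycle concrete, here is a minimal sketch of the whole loop
in Python. This is my paraphrase, not the real solver: the block-data
layout, the fixed cycle count, and the function name are assumptions,
the contiguity fixup is omitted, and the 1.01/0.99 nudge factors are
taken from the discussion further down.

import random

def solve(blocks, num_districts, cycles=20000):
    """Pseudo-voronoi solver sketch.

    blocks: list of (x, y, population) tuples.
    Returns a list mapping block index -> district index.
    """
    centers = [random.choice(blocks)[:2] for _ in range(num_districts)]
    weights = [1.0] * num_districts
    target = sum(b[2] for b in blocks) / num_districts

    assignment = None
    for _ in range(cycles):
        # Assign each block to the district with the lowest weighted
        # squared distance (no square root needed).
        assignment = []
        for (x, y, pop) in blocks:
            best = min(range(num_districts),
                       key=lambda d: ((x - centers[d][0]) ** 2 +
                                      (y - centers[d][1]) ** 2) * weights[d])
            assignment.append(best)

        # Move each center to the population centroid of the blocks it
        # was actually assigned.
        for d in range(num_districts):
            mine = [b for b, a in zip(blocks, assignment) if a == d]
            pop = sum(b[2] for b in mine)
            if pop > 0:
                centers[d] = (sum(b[0] * b[2] for b in mine) / pop,
                              sum(b[1] * b[2] for b in mine) / pop)

        # One weight nudge per cycle: an overpopulated district gets a
        # higher weight (pushing blocks away), an underpopulated one a
        # lower weight (pulling blocks in).
        for d in range(num_districts):
            pop = sum(b[2] for b, a in zip(blocks, assignment) if a == d)
            weights[d] *= 1.01 if pop > target else 0.99

    return assignment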
>> • (optional) Nudge district centers away from underpopulated district
>> centers and towards overpopulated district centers.
>> • For each census block, assign census block to the district with the
>> lowest ((distance to block from district center) * weight_d)
> You would get straight edges if you used
>
> lowest( (distance^2) - weight )
>
> This also saves you having to do a square root, though I guess your
> method could be implemented as
>
> lowest( (distance^2) * (weight^2) )
>
> This would speed up the calculations, as there would be no need to
> find the square root of the distance, while still giving the same
> result.
I checked my code, and I'm not doing the expensive square root. It's
not quite the second form, though; it's actually:

((dx*dx + dy*dy) * weight)

The weight gets nudged by multiplying by 1.01 or 0.99. Squaring the
weight or not, and how it's nudged, is probably just an efficiency
issue and not a correctness issue; it should still get to the weight it
needs, just maybe along some suboptimal curve.

I think multiplicative weight makes more sense. No chance of underflow,
and it might scale better between districts with different densities.
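For reference, the two rules side by side, as a sketch (the function
names are mine):

# Given squared distance d2 from a block to a district center, and that
# district's weight w:

def additive_cost(d2, w):
    # Power-diagram rule (Raph's suggestion): on the boundary where
    # d2_i - w_i == d2_j - w_j, the quadratic terms cancel, so district
    # edges are straight lines.
    return d2 - w

def multiplicative_cost(d2, w):
    # Rule the solver uses: the boundary w_i * d2_i == w_j * d2_j is a
    # constant distance ratio sqrt(w_j / w_i), i.e. an Apollonius
    # circle, so edges can curve.
    return d2 * w

Squaring the weight only reparameterizes the multiplicative rule, since
w and w^2 are both just positive per-district constants the nudging can
steer, which matches the point above that it's an efficiency question
rather than a correctness one.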
>> • Fixup district contiguity (because straight-line-nearest can jump
>> across water and other gaps in odd ways)
> I think water counts as part of a district, so technically that isn't
> a breach of contiguity.
>
> What could be an issue is if a State has a concave boundary.
>
> Btw, do you parse the block geography file to work out which blocks
> border each other?
Yes, the block shape file determines adjacency. They handily label
every line segment with what block is on either side of it, so I can
pretty quickly filter that to keep a list of all pairings I've seen.
Of course I do also wind up going back and rendering that geometry to
an actual map.
(That's now enshrined here:
http://code.google.com/p/redistricter/wiki/NearestNeighborSolver
)
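That filtering pass might look something like this sketch; reducing
each segment record to a (left block, right block) pair is my
assumption, and the real shapefile fields are richer:

def block_adjacency(segments):
    """Collect unique neighbor pairs from line-segment records.

    segments: iterable of (left_block_id, right_block_id); a side may
    be None on the outer boundary of the state.
    """
    pairs = set()
    for left, right in segments:
        if left is not None and right is not None and left != right:
            pairs.add((min(left, right), max(left, right)))
    return pairs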
>> Oh, and when it's done (an hour or two for most states), start it
>> over again with a different random starting state because it may have
>> gotten stuck in a local optimum.
> My splitline software was able to do all the States in one evening.
> That is the benefit of a non-search algorithm.
I can get decent mappings for 90% of the states in a couple of days. CA
and TX are tough: lots of data and lots of districts to balance. There
are some tunable parameters in the search process that I should try
adjusting specifically for solving those states.
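The restart step is plain multi-start local search. A sketch, reusing
solve() from the earlier sketch and a placeholder quality() measure
that this thread doesn't spell out:

def best_of(runs, blocks, num_districts, quality):
    # Run the solver from several random starting states and keep the
    # highest-scoring map, to dodge local optima.
    best_map, best_score = None, float("-inf")
    for _ in range(runs):
        candidate = solve(blocks, num_districts)  # fresh random start
        score = quality(candidate)
        if score > best_score:
            best_map, best_score = candidate, score
    return best_map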
Brian Olson
http://bolson.org/
----
Election-Methods mailing list - see http://electorama.com/em for list info