Recently I've been working with a dataset which has many large polygons with many holes (e.g. 100K+ vertices, 1000+ holes). This revealed a performance issue with the Shapefile reading code that I'm using, which is pretty much the same as the code in JUMP (and is descended from an old GeoTools version). The issue is that the point-in-polygon (PIP) test is fairly slow when run against large shells and many holes.
A few fixes are required:

1. If only one shell is present, do not run the PIP code; simply assign the holes to that shell. (This fix has been made in the current GeoTools PolygonHandler code, and could be copied from there.)
2. This can be extended to checking for only one *candidate* shell (a candidate shell is one whose envelope contains the hole envelope). This code is not yet developed.
3. The code uses ArrayList.indexOf. This is inefficient, since it uses equals(), which does a full equality comparison. A simple loop comparing with == should be used instead (code below).

The performance improvement from just #1 was dramatic: reading a 1M feature shapefile went from over 1000 s to 32 s.

----------------------------

PolygonHandler change:

    ((ArrayList) holesForShells.get(findIndex(shells, minShell))).add(testHole);

    /**
     * Finds an object in a list by reference identity (==).
     * Should be much faster than indexOf, which calls equals().
     *
     * @param list the list to search
     * @param o the object to find
     * @return the index of the object, or -1 if it is not in the list
     */
    private static int findIndex(ArrayList list, Object o) {
      int n = list.size();
      for (int i = 0; i < n; i++) {
        if (list.get(i) == o)
          return i;
      }
      return -1;
    }
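Since the code for fix #2 does not exist yet, here is a rough sketch of how the candidate-shell check might look. It is not the PolygonHandler code: the Box class is a hypothetical stand-in for an envelope (in real code this would presumably be the JTS Envelope of each ring), and the names CandidateShells and soleCandidate are made up for illustration. It also assumes every hole really does lie inside some shell, so that a single envelope match can be trusted without running PIP.

```java
import java.util.List;

public class CandidateShells {

    /** Hypothetical minimal bounding box, standing in for a ring's envelope. */
    static final class Box {
        final double minX, minY, maxX, maxY;

        Box(double minX, double minY, double maxX, double maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        /** True if the other box lies entirely within this one (edges may touch). */
        boolean contains(Box other) {
            return other.minX >= minX && other.maxX <= maxX
                && other.minY >= minY && other.maxY <= maxY;
        }
    }

    /**
     * Returns the index of the only shell whose box contains the hole's box,
     * or -1 if there are zero or several such shells. On -1 the caller
     * would fall back to the full point-in-polygon test.
     */
    static int soleCandidate(List<Box> shellBoxes, Box holeBox) {
        int found = -1;
        for (int i = 0; i < shellBoxes.size(); i++) {
            if (shellBoxes.get(i).contains(holeBox)) {
                if (found >= 0) {
                    return -1; // more than one candidate: envelopes cannot decide
                }
                found = i;
            }
        }
        return found;
    }
}
```

The envelope comparison is a handful of double comparisons per shell, so it should be cheap next to a PIP test against a 100K-vertex ring. Whether it helps much in practice depends on how often shell envelopes overlap in the data, which I have not measured.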