On 25 Dec., 19:15, Tom Sharpless <[email protected]> wrote:
> I too think it is time for us to take a hard look at the whole problem
> of aligning source images.
>
> Editing control points is one important aspect of that problem.
> Finding CPs in the first place is a more fundamental one. Your
> suggestions about focusing on local areas where images actually
> overlap is, I think, the key to both.

Maybe even thinking in terms of control points is a limiting bias. What
is really needed is to make parts of the warped images overlap.

One has to consider where the whole methodology of the current CPGs
like autopano-sift-c and panomatic actually comes from: if I'm not
mistaken, SIFT and SURF originate from pattern recognition in AI. They
were made to find common points of reference in images which were
decidedly not taken from the same spot with good knowledge of the
camera position (or a reasonable guess) - more for autonomous robots
trying to navigate a real environment. This is why they were designed
to allow for invariance to affine transformations - which are precisely
the changes you want to be immune to when moving the camera (hoping
parallax won't ruin it for you). Using them for CP generation is more
of a spinoff:

- lens distortions locally look very similar to affine transformations
- many users shoot handheld, introducing corresponding errors, some of
  which SIFT and SURF cope well with
- different lenses in a project make use of the scale invariance

So SURF and SIFT work for us, but, especially in a very properly set up
take with pano head and well-calibrated lens, they may be overkill.

On the other hand, there are certain drawbacks to SURF and SIFT when
using them in a photographic context, especially with fisheye lenses.
(I'm a bit out of my depth here, so please correct me if I'm mistaken.)
The problem is that any convolution on a pixel matrix yields results
that are determined by the geometry of the pixel matrix.
The assumption here is of course that the pixel matrix is a set of
equidistant sample points, which, in the case of a fisheye image, and
towards its edges, is true for the sensor, but not for the
corresponding points in the captured scene. To make convolution results
comparable, you either have to do mathematical trickery (like making
your detector invariant to affine transforms), or you have to reproject
the image locally so the convolution kernel will see the same image
geometry.

If you know all relevant parameters and have just one shared point of
reference, you can reproject two images so that each has its projection
axis going through the shared reference point. If you match these
reprojected images, you should be able to get away with less involved
mathematics, since you don't have to be transform- or scale-invariant,
just rotation-invariant. I have the feeling that a very fast and
efficient semi-manual detection could be set up for well-parametrized
situations where the user would only have to supply one CP, and the
software could take it from there.

I have actually experimented a bit here, reprojecting images from my
Walimex 8mm stereographic fisheye by shifting the projection axis so
that the axes go through corresponding image points and running the
matching process on these reprojected images. My results are
inconclusive: the quality of the matching even without reprojection is
extremely good with autopano-sift-c at full scale, so I didn't look
much further in that direction, being happy with what is there already.
At lower resolutions and with other CPGs, I felt the CP detection was
better (particularly, more evenly distributed) after such a
reprojection. Maybe further investigation and play in that direction
would be rewarding.

Another drawback of relying on control points is their strictly local,
point-like nature.
Of course, in the case of a match between two SIFT/SURF feature points,
the neighbourhood of the points is represented in the feature vectors,
but everything further away may be totally unaligned. So to get a good
result, more CPs are generated. But these won't be evenly distributed,
and if the projection and camera position are known, an analysis of the
assumed overlapping region should be possible, which would make a
clearer statement about the quality of the overlap than a few feature
point correspondences.

> It is amazing to me that Hugin (and the other PT based stitchers)
> still treat the alignment problem as if the source images were taken
> in random directions, when in fact all serious panographers take a lot
> of trouble to orient their views systematically.

Do keep in mind, though, that a portion of the users (I'm not sure how
big a portion...) are either

- casual
- non-technical
- inexperienced
- have accidentally got it wrong
- or unable to be extremely precise due to circumstances

I, for example, often take handheld shots because I'm out in the
wilderness trekking and can't drag a tripod around with me because I'd
rather have something to eat in my pack. These users mustn't be left
behind just because there is also a fair portion of 'serious
panographers'. In fact, the tool should (and, currently, does) adapt to
different qualities of input, but if it deals with the serious
panographer's input, it could maybe be put into a mode with fewer
degrees of freedom to exploit the comfortable data situation.

> ...
> The general principle here is to let the user specify rough local
> alignments, and have the SW refine them into a precise global
> alignment. And to take advantage of pre-calibrated lenses and
> shooting patterns.

Yes. I agree. And have proper pattern matching as a fallback option if
there is a lack of known constraints.
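
To put rough numbers on the fisheye sampling argument and on the idea of
shifting the projection axis through a shared reference point, here is a
minimal Python sketch. It assumes an ideal stereographic mapping
r = 2 f tan(theta / 2) with no further lens distortion, and image
coordinates measured from the optical centre in the same units as f.
All function names are made up for illustration; none of them belong to
Hugin, panotools or any CPG.

```python
import math


def stereographic_radius(theta, f):
    """Image radius of a ray at angle theta (radians) off the optical axis.

    Assumption: ideal stereographic lens, r = 2 f tan(theta / 2).
    """
    return 2.0 * f * math.tan(theta / 2.0)


def radial_scale(theta, f):
    """dr/dtheta: image distance covered per radian of scene angle."""
    return f / math.cos(theta / 2.0) ** 2


def pixel_to_ray(x, y, f):
    """Unit viewing ray for the image point (x, y), optical axis = +z."""
    r = math.hypot(x, y)
    if r == 0.0:
        return (0.0, 0.0, 1.0)
    theta = 2.0 * math.atan(r / (2.0 * f))  # inverse of the mapping above
    s = math.sin(theta) / r
    return (x * s, y * s, math.cos(theta))


def rotation_onto_axis(ray):
    """3x3 rotation that turns the unit vector `ray` onto (0, 0, 1).

    Rodrigues' formula for the shortest rotation between two unit vectors;
    undefined when the ray points exactly backwards.
    """
    x, y, z = ray
    if z <= -1.0 + 1e-12:
        raise ValueError("ray points backwards; rotation axis is undefined")
    k = 1.0 / (1.0 + z)
    return [
        [1.0 - k * x * x, -k * x * y, -x],
        [-k * x * y, 1.0 - k * y * y, -y],
        [x, y, 1.0 - k * (x * x + y * y)],
    ]


def apply(matrix, v):
    """Multiply a 3x3 matrix (list of rows) with a 3-vector."""
    return tuple(sum(m * c for m, c in zip(row, v)) for row in matrix)


if __name__ == "__main__":
    f = 1.0  # normalised focal length, hypothetical
    for deg in (0, 30, 60, 90):
        theta = math.radians(deg)
        ratio = radial_scale(theta, f) / radial_scale(0.0, f)
        print(f"{deg:3d} deg off-axis: scale relative to centre = {ratio:.3f}")

    # A shared reference point, given in image coordinates (hypothetical).
    ray = pixel_to_ray(0.6, -0.3, f)
    print("re-centred ray:", apply(rotation_onto_axis(ray), ray))
```

For this ideal mapping the scale factor at 90 degrees off-axis is twice
that at the centre, so a fixed-size kernel near the edge covers about
half the scene angle it covers in the middle. Because stereographic
projection is conformal, the tangential scale r / sin(theta) is the same
factor, so the local change is a pure magnification. Applying the
rotation to every viewing ray before projecting again would put the
reference point on the new projection axis. A real lens only
approximates this mapping, so calibrated parameters would be needed in
practice.
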
Kay
