On 25 Dec., 19:15, Tom Sharpless <[email protected]> wrote:

> I too think it is time for us to take a hard look at the whole problem
> of aligning source images.
>
> Editing control points is one important aspect of that problem.
> Finding CPs in the first place is a more fundamental one.  Your
> suggestions about focusing on local areas where images actually
> overlap is, I think, the key to both.

Maybe even thinking in terms of control points is a limiting bias.
What is really needed is making parts of warped images overlap. One
has to consider where the whole methodology of the current CPGs like
autopano-sift-c and panomatic actually comes from: If I'm not
mistaken, SIFT and SURF originate from pattern recognition in AI. They
were made to find common points of reference in images which were
decidedly not taken from the same spot with good knowledge of the
camera position (or a reasonable guess) - more for autonomous robots
trying to navigate a real environment. This is why they were designed
to be invariant to scale and rotation and robust against affine
transformations - which are precisely the changes you want to be
immune to when moving the camera (hoping parallax won't ruin it for
you). Using them for CP generation is more of a spinoff:

- lens distortions locally look very similar to affine transformations
- many users shoot handheld, introducing corresponding errors some of
which SIFT and SURF cope well with
- different lenses in a project make use of the scale invariance

So, SURF and SIFT work for us, but, especially for a carefully set
up shoot with pano head and well-calibrated lens, they may be overkill.
On the other hand, there are certain drawbacks to SURF and SIFT when
using them in a photographic context, especially with fisheye lenses.
(I'm a bit out of my depth here, so please correct me if I'm
mistaken). The problem is that any convolution on a pixel matrix
yields results that are determined by the geometry of the pixel
matrix. The assumption here is of course that the pixel matrix is a
set of equidistant sample points, which, in the case of a fisheye
image, and especially towards its edges, is true for the sensor, but
not for the corresponding points in the captured scene. To make
convolution results comparable, you either have to do mathematical
trickery (like
making your detector invariant to affine transforms), or you have to
reproject the image locally so the convolution kernel will see the
same image geometry. If you know all relevant parameters and have just
one shared point of reference, you can reproject two images to each
have their projection axis going to the shared reference point. If you
match these reprojected images, you should be able to get away with
less involved mathematics since you don't have to be affine- or
scale-invariant, just rotation-invariant. I have the feeling that a
very fast and efficient semi-manual detection could be set up for
well-parametrized situations where the user would only have to supply
one CP, and the software could take it from there.

I have actually experimented a bit here, reprojecting images from my
Walimex 8mm stereographic fisheye by shifting the projection axis so
that the axes go through corresponding image points and running the
matching process on these reprojected images. My results are
inconclusive: the quality of the matching even without reprojection
is extremely good with autopano-sift-c at full scale, so I didn't look
much further in that direction, being happy with what is there
already. At lower resolutions and with other CPGs, I felt the CP
detection was better (particularly, more evenly distributed) after
such a reprojection. Maybe further investigation and play in that
direction would be rewarding.
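
The reprojection described above can be sketched roughly as follows.
This is only a minimal illustration, not the code used in the
experiment: it assumes an ideal stereographic fisheye
(r = 2 f tan(theta/2)) with known focal length f in pixels and known
optical centre (cx, cy), uses nearest-neighbour sampling, and all
function names are made up for this example. It rotates the viewing
sphere so that the ray through a chosen pixel becomes the new
projection axis. The roll around that axis is left arbitrary, which
is why the matcher would still have to be rotation-invariant.

```python
import numpy as np

def pixel_to_ray(x, y, f, cx, cy):
    """Unit viewing ray of a pixel, assuming r = 2*f*tan(theta/2)."""
    dx, dy = np.asarray(x, float) - cx, np.asarray(y, float) - cy
    theta = 2.0 * np.arctan(np.hypot(dx, dy) / (2.0 * f))
    phi = np.arctan2(dy, dx)
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)

def ray_to_pixel(d, f, cx, cy):
    """Inverse of pixel_to_ray: project unit rays back to pixel coordinates."""
    theta = np.arccos(np.clip(d[..., 2], -1.0, 1.0))
    phi = np.arctan2(d[..., 1], d[..., 0])
    r = 2.0 * f * np.tan(theta / 2.0)
    return cx + r * np.cos(phi), cy + r * np.sin(phi)

def rotation_to_axis(v):
    """Smallest rotation R with R @ v = (0, 0, 1), via Rodrigues' formula."""
    v = v / np.linalg.norm(v)
    z = np.array([0.0, 0.0, 1.0])
    k = np.cross(v, z)
    s, c = np.linalg.norm(k), float(v @ z)
    if s < 1e-12:
        return np.eye(3)  # already on the axis (antipodal case not handled)
    k = k / s
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)

def recenter(img, px, py, f, cx, cy, size):
    """Resample a size x size patch whose projection axis passes through (px, py)."""
    R = rotation_to_axis(pixel_to_ray(px, py, f, cx, cy))
    c = size // 2                      # centre pixel of the output patch
    v, u = np.mgrid[0:size, 0:size]
    d_new = pixel_to_ray(u, v, f, c, c)
    d_old = d_new @ R                  # row-vector form of R.T @ d
    xs, ys = ray_to_pixel(d_old, f, cx, cy)
    h, w = img.shape[:2]
    valid = (xs > -0.5) & (xs < w - 0.5) & (ys > -0.5) & (ys < h - 0.5)
    xi = np.clip(np.rint(xs), 0, w - 1).astype(int)
    yi = np.clip(np.rint(ys), 0, h - 1).astype(int)
    out = np.zeros((size, size) + img.shape[2:], dtype=img.dtype)
    out[valid] = img[yi[valid], xi[valid]]
    return out
```

Calling recenter() on both images with the two pixels of the single
user-supplied CP would give two patches that share their projection
axis, so a plain matcher could be run on them.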

Another drawback of relying on control points is their point-like,
purely local nature. Of course, when two SIFT/SURF feature points are
matched, the immediate neighbourhood of the points is captured in the
feature vectors, but everything further away may be totally
unaligned. So, to get a good result, more CPs are generated. But these
won't be evenly distributed, and if the projection and camera position
are known, it should be possible to analyse the assumed overlap region
directly, which would give a clearer statement about the quality of
the overlap than a few feature point correspondences.
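
One possible way to put a number on such an overlap analysis is
normalized cross-correlation (NCC) over the overlap mask, computed
once for the whole region and once per tile to show where the
alignment is poor. NCC is my choice for illustration here, not
something any of the tools mentioned is known to do; the sketch
assumes both images are already warped into the same output
projection as grayscale arrays, and the function names are
hypothetical.

```python
import numpy as np

def overlap_ncc(a, b, mask):
    """NCC of two warped grayscale images over the pixels where mask is True."""
    va = a[mask].astype(float)
    vb = b[mask].astype(float)
    if va.size < 2:
        return float("nan")            # too few pixels to say anything
    va = va - va.mean()
    vb = vb - vb.mean()
    denom = np.sqrt((va ** 2).sum() * (vb ** 2).sum())
    if denom == 0.0:
        return float("nan")            # flat region, correlation undefined
    return float((va * vb).sum() / denom)

def tile_scores(a, b, mask, tile=16):
    """Per-tile NCC grid; low or NaN tiles hint at misaligned or empty areas."""
    h, w = mask.shape
    rows, cols = -(-h // tile), -(-w // tile)   # ceiling division
    out = np.full((rows, cols), np.nan)
    for i in range(rows):
        for j in range(cols):
            sl = (slice(i * tile, (i + 1) * tile),
                  slice(j * tile, (j + 1) * tile))
            out[i, j] = overlap_ncc(a[sl], b[sl], mask[sl])
    return out
```

A score near 1 means the two images agree up to brightness and
contrast; unlike a handful of CPs, the tile grid covers the whole
overlap evenly.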

> It is amazing to me that Hugin (and the other PT based stitchers)
> still treat the alignment problem as if the source images were taken
> in random directions, when in fact all serious panographers take a lot
> of trouble to orient their views systematically.

Do keep in mind, though, that a portion of the users (I'm not sure
how big a portion...) are

- casual
- non-technical
- inexperienced
- people who have accidentally got it wrong
- or unable to be extremely precise due to circumstances

I, for example, often take handheld shots because I'm out in the
wilderness trekking and can't drag a tripod around with me because I'd
rather have something to eat in my pack.

These users mustn't be left behind just because there is also a fair
portion of 'serious panographers'. In fact, the tool should (and,
currently, does) adapt to different qualities of input, but if it
deals with the serious panographer's input, it could maybe be put into
a mode with fewer degrees of freedom to exploit the comfortable data
situation.

> ...
> The general principle here is to let the user specify rough local
> alignments, and have the SW refine them into a precise global
> alignment.  And to take advantage of pre-calibrated lenses and
> shooting patterns.

Yes. I agree. And have proper pattern matching as a fallback option if
there is a lack of known constraints.

Kay
