Dear All,
- In recent days I saw this thread: http://lists.osgeo.org/pipermail/gdal-dev/2012-August/033897.html about a feature requested in GDAL. I am also looking at the initial requirements here: http://trac.osgeo.org/gdal/wiki/Correlator

I walked through the thread and found some issues and concerns. You may find some of my comments and observations interesting:

a) Looking through the proposed code, it seems that only two images can be co-registered at once (?), and the implementation uses the SURF keypoint detector. SURF and SIFT descriptors are patented algorithms; even OpenCV decided to move such code into a sub-folder called "non-free", see here: http://code.opencv.org/projects/opencv/repository/revisions/master/show/modules/nonfree/src These, and any other implementation of the original SIFT or SURF papers, are considered "non-free". However, there are much better options than those two.

b) There are free alternatives such as BRISK, FREAK and ORB (in order of their quality) that operate in Hamming space instead of Euclidean space (which makes very fast and dense matching possible), and they can also be told the exact number of features to extract (which SIFT, for example, cannot), so extraction can be capped at the X best-quality features.

c) I can't see a FLANN matcher anywhere in the code. FLANN can provide a huge speed-up for both Euclidean- and Hamming-space descriptors. Also, a little post-filtering and validation with a small homography calculation using RANSAC would give a big boost to the later gradient-descent steps; it adds less than 1 millisecond to find and eliminate outliers as a post-process after keypoint matching, and it can also decide whether a pair is matchable at all. This way the gradient descent is not overloaded with too many bad outliers. (A rough sketch of such a detect/match/RANSAC pipeline follows below this list.)

d) Regarding multi-band rasters, a good approach for RGB rasters is the weighted sum 0.299*R + 0.587*G + 0.114*B (the standard ITU-R BT.601 luma weights, which is what e.g. OpenCV uses for its RGB-to-gray conversion), but in my experience a plain (R+G+B)/3 average gives practically the same matching results and works with any N-band raster. In the worst case of too many raster bands, the user could select which bands to take into consideration in the matching step.

e) I don't get the point of the "segmentation". Perhaps the idea was born because doing SURF or SIFT detection over hi-res >50k*50k rasters is painfully slow, but that is not the case at all with BRISK or other such descriptors. The approach of chopping down rasters is not bad in itself, but take into account that it will significantly increase the amount of matching required across pairs, since the number of rasters increases.
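To make point c) concrete, here is a minimal sketch of what such a pipeline could look like, assuming the OpenCV and GDAL Python bindings: collapse the bands to a single 8-bit image as in d), detect a fixed budget of binary features (ORB here; cv2.BRISK_create() would be a drop-in swap), match with FLANN's LSH index in Hamming space, then reject outliers with a RANSAC homography. File names and parameter values are purely illustrative, not taken from the proposed GDAL code:

    import cv2
    import numpy as np
    from osgeo import gdal

    def to_gray8(path):
        # Plain (R+G+...)/N band average, rescaled to 8 bit for the detector.
        ds = gdal.Open(path)
        bands = [ds.GetRasterBand(i + 1).ReadAsArray().astype(np.float32)
                 for i in range(ds.RasterCount)]
        g = sum(bands) / len(bands)
        g = 255.0 * (g - g.min()) / max(float(g.max() - g.min()), 1e-6)
        return g.astype(np.uint8)

    img1, img2 = to_gray8("a.tif"), to_gray8("b.tif")   # hypothetical inputs

    det = cv2.ORB_create(nfeatures=10000)               # fixed feature budget
    kp1, des1 = det.detectAndCompute(img1, None)
    kp2, des2 = det.detectAndCompute(img2, None)

    # FLANN with the LSH index works directly on binary (Hamming) descriptors.
    flann = cv2.FlannBasedMatcher(
        dict(algorithm=6, table_number=6, key_size=12, multi_probe_level=1),
        dict(checks=50))
    matches = flann.knnMatch(des1, des2, k=2)
    good = [m for m, n in (p for p in matches if len(p) == 2)
            if m.distance < 0.8 * n.distance]            # ratio test

    # RANSAC homography as cheap outlier rejection / "is this pair matchable".
    if len(good) >= 4:
        src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        print(len(good), "tentative matches,", int(mask.sum()), "inliers")

The RANSAC inlier count (or a failure to estimate H at all) is exactly the kind of cheap "matchable / not matchable" decision mentioned above.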
-------

Performance boosts can be achieved, and n-to-n raster collections can be addressed:

1) To co-register a massive collection of datasets (not just single pairs at a time) without any prior knowledge of their relationships, one has to brute-force and test all images against all neighbouring images, which requires a minimum of n*(n-1)/2 pair computations; for each pair-match the best speed could be ~10 ms per pair (e.g. using BRISK in its best configuration with FLANN).

2) If prior knowledge exists (e.g. the rasters have a declared coordinate frame), then it is much easier: only overlapping rasters must be checked and registered, perhaps in a much denser and deeper way with a larger number of extracted features per image. (A small pair-selection sketch follows after this list.)

3) Regarding smooth fusion of pixels and gradient-descent error minimization, OpenCV holds very nice routines for that (using state-of-the-art Levenberg-Marquardt minimization) and also offers options to select a projection mode. With some additional serialization tricks (working on smaller blocks of the rasters at once in memory, which GDAL routines can address with ease) and by avoiding the memory-greedy OpenCV internal raster routines, it can scale up to a massive, practically unlimited number of pixels for any seamless fusion process. (See the block-reading sketch below.)

4) To store computed feature points and descriptors as temporary metadata attached to each raster, a very good and elegant way I found is an attached .HDF5 file per raster. HDF5 can store multidimensional data in so-called "compound" types, so points and their descriptors can be enumerated in a nice and very fast way. Much more data, such as homography matrices or other constraints, can also be attached to the HDF5 file in a highly organized way, and there is even room for small raster bands (which any OSGEO suite can read natively) holding debug information like the relationship matrix of the whole project and other useful data for more advanced users. I stopped liking sqlite or XML metadata after discovering what the HDF formats can do. (A possible sidecar layout is sketched below.)

5) There may be cases where so-called "singleton groups" appear during n-to-n collection matching, i.e. the raster images fall into clusters with no relations between them; in this case a segmentation of the cross-matching matrix can reveal and identify such groups, and an OGR layer can be generated to reflect these areas. (See the connected-components sketch below.)
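For points 1) and 2), a rough sketch of the pair-selection step, again assuming the GDAL Python bindings; footprints are derived from the geotransform (rotation terms ignored) and the file list is a placeholder:

    from itertools import combinations
    from osgeo import gdal

    def footprint(path):
        # Bounding box of the raster from its geotransform.
        ds = gdal.Open(path)
        gt = ds.GetGeoTransform()
        xs = (gt[0], gt[0] + gt[1] * ds.RasterXSize)
        ys = (gt[3], gt[3] + gt[5] * ds.RasterYSize)
        return min(xs), min(ys), max(xs), max(ys)

    def overlaps(a, b):
        return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

    rasters = ["r0.tif", "r1.tif", "r2.tif"]        # placeholder file list
    boxes = {r: footprint(r) for r in rasters}

    # Brute force means every combination, n*(n-1)/2 pairs; with a declared
    # coordinate frame we keep only the pairs whose footprints overlap.
    pairs = [(a, b) for a, b in combinations(rasters, 2)
             if overlaps(boxes[a], boxes[b])]
    print(len(pairs), "candidate pairs out of",
          len(rasters) * (len(rasters) - 1) // 2)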
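For the "smaller blocks at once" idea in point 3), a minimal sketch of streaming a large raster through GDAL window reads, so the warping/blending step never needs the whole image in memory; block size and file name are arbitrary:

    from osgeo import gdal

    def iter_blocks(path, bx=2048, by=2048):
        # Yield (xoff, yoff, array) windows of the first band, one block at a time.
        ds = gdal.Open(path)
        band = ds.GetRasterBand(1)
        for yoff in range(0, ds.RasterYSize, by):
            for xoff in range(0, ds.RasterXSize, bx):
                w = min(bx, ds.RasterXSize - xoff)
                h = min(by, ds.RasterYSize - yoff)
                yield xoff, yoff, band.ReadAsArray(xoff, yoff, w, h)

    for xoff, yoff, block in iter_blocks("huge_mosaic.tif"):   # hypothetical file
        pass  # hand the block to the warping / blending step here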
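For point 4), one possible layout of such a per-raster HDF5 sidecar, assuming h5py; the field names, the 32-byte descriptor length (ORB/BRISK-style) and the file naming are my own assumptions:

    import h5py
    import numpy as np

    # One record per keypoint: position, size, angle plus the raw binary descriptor.
    kp_dtype = np.dtype([("x", np.float32), ("y", np.float32),
                         ("size", np.float32), ("angle", np.float32),
                         ("desc", np.uint8, (32,))])

    def save_features(path, keypoints, descriptors):
        rec = np.zeros(len(keypoints), dtype=kp_dtype)
        for i, (kp, d) in enumerate(zip(keypoints, descriptors)):
            rec[i] = (kp.pt[0], kp.pt[1], kp.size, kp.angle, d)
        with h5py.File(path, "w") as f:
            f.create_dataset("keypoints", data=rec, compression="gzip")
            f["keypoints"].attrs["detector"] = "ORB"    # bookkeeping only

    # e.g. save_features("scene_a.tif.h5", kp1, des1) right after detection.

Homographies or other constraints could be stored the same way, as extra datasets or attributes in the same file.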
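And for point 5), a sketch of finding such groups by treating the pairwise matching results as a graph and extracting connected components; scipy is assumed to be available, and good_pairs stands in for the hypothetical output of the matching stage:

    import numpy as np
    from scipy.sparse import coo_matrix
    from scipy.sparse.csgraph import connected_components

    n_rasters = 6
    good_pairs = [(0, 1), (1, 2), (3, 4)]          # placeholder matching result

    rows, cols = zip(*good_pairs)
    adj = coo_matrix((np.ones(len(good_pairs)), (rows, cols)),
                     shape=(n_rasters, n_rasters))
    n_groups, labels = connected_components(adj, directed=False)
    print(n_groups, "groups:", labels)             # raster 5 ends up alone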
------

- I am currently doing such work for a public 3D SfM project that derives 3D point clouds from plain raster collections: http://code.google.com/p/featsfm. The key-point detection and matching parts are well implemented so far; if someone wants to evaluate it, let me know, the final code will be released around March anyway. I also try to improve things constantly, e.g. by using recent sparse descriptors like http://cvlab.epfl.ch/research/detect/dbrief, extremely fast sparse ones that could do an a-priori correlation job over massive datasets (>10k rasters) in a first step, and then, in a second step once the relations between pairs are determined, use e.g. BRISK for a very deep co-matching of the known pairs. I am also experimenting with a fairly recent MIH matcher http://www.cs.toronto.edu/~norouzi/research/mih instead of FLANN; it can match 10k keypoints against 4M keypoints in under 4 s (and it does an "exact" match, not an approximate one like FLANN), but the drawback is that all keypoints must be present in memory at once, so some kind of "block" logic is needed to match one image against many at a time while taking care not to unload and reload the same parts back and forth in memory too often (it is trickier and needs clear logistics to avoid duplicate matches).

- If one wants an "insane" level of post co-registration (an effective per-pixel warp, not an affine or projective one), there are much deeper ways to go, with a special class of descriptors like DAISY combined with some statistics (Bayes/Markov; I have tested some papers and MATLAB implementations) that can co-register so finely that accurate DEMs can be extracted with ease. It is even possible to co-register rasters taken at very different times, e.g. winter 2004 against summer 2012 (I tried such an experiment before, with excellent results), which is considered very challenging; in that case a difference analysis becomes possible.

- To give a basic idea, my own matcher implementation does something like 10k vs. 10k BRISK features using FLANN in 25 ms on a decent 2.5 GHz core, and the deepest 100k vs. 100k features take ~3 s, which are very good timings. A simple calculation with 25 ms average matching time and 1000 rasters gives: 1000 * 999 / 2 * 25 ms = ~3.47 hours. The suite I designed is able to derive 3D clouds out of thousands of hi-res UAV images with quite some ease. Sparse DBRIEF can go well under 1 ms per match; going "sparse" is a new field to explore.

- If there is interest, we can improve the proposed GDAL tool in some of the ways described above, taking easy steps, so let me know. I am willing to contribute, but we need to settle exactly how you folks would like to do it, down to the details. Honestly, there are lots of ideas about where we could go.

With Respect,

--
Balint Cristian
e-mail: [email protected]
