On Apr 11, 2014, at 12:59 PM, Chris McCann <[email protected]> wrote:
> I'm looking for a solution to a search problem and want to survey the
> community to see if anyone else has dealt with this type of search.
>
> The application I'm building supports an image processing system. We have a
> mathematical way of uniquely representing any particular image as a vector of
> 16 values, each ranging between 0 and 255.
>
> I need to implement a search mechanism that finds the closest matches to a
> given image, also represented as a 16 element vector. This is usually called
> a "vector space model" search, and it's implemented for full text search in
> Postgres as well as Lucene, and probably many other full text search systems.
>
> The problem I'm wrestling with is I'm not searching on text, I'm searching on
> integers. I basically need to search for the closest match like this:
>
> Say my search image has a vector with elements q(1) to q(16), [q(1) = 122,
> q(2) = 7, q(3) = 89, ..., q(16) = 224].
>
> To compare that vector against the image vectors in the database I need to
> calculate the "distance" between the query vector (q) and each of the
> database vectors (d):
>
> distance = square_root( (q(1) - d(1))^2 + (q(2) - d(2))^2 + ... + (q(16) -
> d(16))^2 )
>
> The lower the distance the closer the match, with dist == 0 being an exact
> match.
>
> My research hasn't led me to a direct implementation of this in Postgres or
> Lucene since they are designed for text searching, though the underlying
> principles are the exact same. Anyone ever tackle this type of search with
> numerical values?

This isn't my area of expertise, but I believe Postgres is the go-to database for spatial work because of its advanced indexing options. I'm 75% sure you can build an index across your 16 values that will let you do a fast nearest-neighbor or k-nearest-neighbor search.
Stack Overflow appears to agree:

<http://stackoverflow.com/questions/16676644/postgresql-k-nearest-neighbor-knn-on-multidimensional-cube>

Regards,

Guyren G Howe
Relevant Logic LLC

guyren-at-relevantlogic.com ~ http://relevantlogic.com ~ +1 512 784 3178

Ruby/Rails, Xojo, PHP programming
PostgreSQL, MySQL database design and consulting
Technical writing and training

Read my book, Real OOP with REALbasic: <http://relevantlogic.com/oop-book/about-the-oop-book.php>
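
P.S. Whatever index you end up with, it may help to have a brute-force baseline to check its results against. Here is a minimal sketch of the distance calculation from the quoted message, written in Python purely for illustration. The function names (`euclidean_distance`, `nearest_k`) and the sample image ids are my own inventions, not anything from your application, and this scans every vector, so it is only a reference for small data sets, not a substitute for an indexed search.

```python
import heapq
import math


def euclidean_distance(q, d):
    """Distance between a query vector q and a database vector d."""
    if len(q) != len(d):
        raise ValueError("vectors must have the same number of elements")
    return math.sqrt(sum((qi - di) ** 2 for qi, di in zip(q, d)))


def nearest_k(query, vectors, k=5):
    """Return the k closest (distance, image_id) pairs, closest first.

    `vectors` is a hypothetical dict mapping an image id to its vector.
    Every vector is compared against the query (a full scan).
    """
    scored = ((euclidean_distance(query, vec), image_id)
              for image_id, vec in vectors.items())
    return heapq.nsmallest(k, scored)


# Made-up sample data: three 16-element vectors with values in 0..255.
images = {
    "a": [0] * 16,
    "b": [3, 4] + [0] * 14,
    "c": [255] * 16,
}
query = [0] * 16

print(nearest_k(query, images, k=2))  # [(0.0, 'a'), (5.0, 'b')]
```

A distance of 0.0 is the exact match, as in the original description. Comparing the output of an indexed query against this scan on a small sample is one way to confirm the index returns the same neighbors.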
