Raul:

Can you expand on why trees are bad in J? Coming from a C and Lisp
mentality, tree search wins a lot of problems. kd-trees break down in higher
dimensions (aka, for things like image databases, where I guess locality
hashing will beat a brute force search), but for modest dimensionality
(~10), they seem to do well, with cache misses and everything. 
Your technique works, if I understand it correctly, and I'm guessing it
doesn't keep extra copies of the data around to fill up the memory, making
it effectively a sort of tree.

It is going to take me a while to digest the code you and Marshall wrote,
thank you both very kindly for taking the time!

-Scott


Raul Miller-4 wrote:
> 
> Tree structures -- especially binary tree structures -- are frequently
> a bad choice in J.
> 
> Personally, if I were working with spatial data in J, I'd probably
> start with something like this:
> 
> NB. example arbitrary set of points:
> points=: ?.20 3$0
> prepad=: $&__@,~@{:@$, ], $&_@,~@{:@$
> searchable=: /:~@:((|."0 1) ,:^:(2 - #@$))"0 2~ i.@{:@$
> SP=: searchable prepad points
> 
> In other words: make n copies of the list of points (where n is the
> number of dimensions) and sort each list so that it can be binary
> searched along one of those dimensions.
> 
> Of course, it's not that simple.  To find the nearest point, you need
> to search a volume which includes your candidate points.  So, in the
> worst case (most points equidistant from the point you are checking)
> you wind up searching the entire data structure.
> 
> I've assumed, here, that once we've done a binary search to find a
> close match to the point we are looking for that it's worth searching
> the n nearest points in each direction from that match (where n is the
> number of dimensions).  So this means an initial search of n*(1+n)
> points (n lists of candidates each containing 1+n points).  We need to
> extend our search if any of the 2n points bounding each of the n lists
> of candidates has a coordinate distance from the point we are looking
> for which is closer than our closest candidate.
> 
> This might sound significantly less efficient than a kd-tree, but we
> need to keep in mind that the kd-tree (or r* tree) has significant
> costs from cache misses and from pipeline busting.
> 
> That said, there are a few other tricks that might be needed.  For
> example, if you are incrementally building the structure, adding to
> sorted lists is inefficient.  You can work around this issue by
> maintaining multiple instances of the logical structure and searching
> both (or all) of them and then picking the best result (for example).
> Then occasionally you find some time to merge them together.
> 
> That said, I have not had time to draft up a real sample.  (The
> project I am currently working on releases this weekend, and in my
> "spare" time I'm taking a class, which leaves me without much time for
> fun stuff like this.)
> 
> I hope this helps,
> 
> -- 
> Raul
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Spatial-trees-tp34422178s24193p34429335.html
Sent from the J Programming mailing list archive at Nabble.com.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to