On Dec 14, 2008, at 10:06 AM, Dan Woolley wrote:
> I'm researching CouchDB for a project dealing with real estate
> listing data. I'm very interested in CouchDB because the schema-less
> nature, RESTful interface, and potential off-line usage with
> syncing fit my problem very well. I've been able to do some
> prototyping and search on ranges for a single field very
> successfully. I'm having trouble wrapping my mind around views for
> a popular use case in real estate, which is a query like:
>
> Price = 350000-400000
> Beds = 4-5
> Baths = 2-3
>
> Any single range above is trivial, but what is the best model for
> handling this AND scenario with views? The only thing I've been
> able to come up with is three views returning doc ids - which
> should be very fast - with an array intersection calculation on the
> client side. Although I haven't tried it yet, that client side
> calculation worries me with a potential document with 1M records -
> the client would potentially be dealing with calculating the
> intersection of multiple 100K element arrays. Is that a realistic
> calculation?
>
> Please tell me there is a better model for dealing with this type of
> scenario - or that this use case is not well suited for CouchDB at
> this time and I should move along.
using ruby or js i can compute the intersection of two 100k element
arrays in a couple of tenths of a sec, for example with this code:
a = Array.new(100_000) { rand }
b = Array.new(100_000) { rand }
start_time = Time.now.to_f
intersection = b & a # & is set intersection; | would be the union
end_time = Time.now.to_f
puts(end_time - start_time) #=> 0.197230815887451
and that's on my laptop which isn't too quick using ruby which also
isn't too quick.
i guess to me it seems like keeping an index of each attribute to
search by and doing refinements is going to be plenty fast, offload cpu
cycles to the client, and keep the code orthogonal and easy to
understand - you have one index per field, period.
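a minimal sketch of the one-index-per-field idea, assuming each
single-field view query has already come back as an array of doc ids.
the arrays below are made-up stand-ins rather than real couchdb output,
and intersect_ids is a hypothetical helper name:

```ruby
# hypothetical doc ids returned by three single-field range queries
price_ids = %w[doc1 doc2 doc3 doc5 doc8]
beds_ids  = %w[doc2 doc3 doc4 doc8 doc9]
baths_ids = %w[doc0 doc2 doc8 doc9]

# intersect any number of id arrays, starting with the smallest list
# so the working set stays as small as possible
def intersect_ids(*id_lists)
  return [] if id_lists.empty?
  id_lists.sort_by(&:size).reduce { |acc, ids| acc & ids }
end

matches = intersect_ids(price_ids, beds_ids, baths_ids)
puts matches.inspect #=> ["doc2", "doc8"]
```

adding a fourth search field would just mean one more view and one more
array passed to the helper.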
in addition it seems like you are always going to have a natural first
criterion and that you might be able to use startkey_docid/endkey_docid
to limit the result set of the second and third queries to smaller and
smaller ranges of ids (in the normal case).
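a rough sketch of that refinement, done entirely on the client and
assuming the first (most selective) query has returned the ids below.
it works out the smallest and largest id seen so far, which is the sort
of bound one might try passing as startkey_docid/endkey_docid on the
later queries; whether couchdb narrows a given view that way is
something to verify, and id_bounds / refine are hypothetical names:

```ruby
require 'set'

# smallest and largest doc id in a result set (plain string comparison)
def id_bounds(ids)
  ids.minmax
end

# keep only candidates that fall inside the known id range and are
# actually in the known set; the range check is a cheap pre-filter
def refine(known_ids, candidate_ids)
  return [] if known_ids.empty?
  low, high = id_bounds(known_ids)
  known = known_ids.to_set
  candidate_ids.select { |id| id >= low && id <= high && known.include?(id) }
end

first  = %w[doc3 doc5 doc7 doc9]      # made-up ids from the first query
second = %w[doc1 doc3 doc4 doc7 doc9] # made-up ids from the second query
third  = %w[doc0 doc3 doc9]           # made-up ids from the third query

narrowed = refine(refine(first, second), third)
puts id_bounds(first).inspect #=> ["doc3", "doc9"]
puts narrowed.inspect         #=> ["doc3", "doc9"]
```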
cheers.
a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama