On Oct 24, 2010, at 12:09 PM, John Logsdon <[email protected]> wrote:

Hi Keith

Very helpful. I hadn't seen/realised that I could do multi part indexes with startkey endkey.

Just to confirm:

1. startkey=myp, endkey=myp\u9999 is the CouchDB equivalent of an SQL "LIKE 'myp%'"

Sort of. Just don't think of it as an SQL equivalent. You want the entire range of characters that can occur after "myp", from nothing at all ("myp") up to a very high character ("myp\u9999"). Strictly speaking, \u9999 is not the last possible Unicode character, so names containing characters that sort after it would be missed; \ufff0 is a common choice for a higher sentinel. In this case, the technique behaves very similarly to %, but it also lets you chunk your query alphabetically: myp0 to myp9 covers 0-9, mypa to mypa\u9999 covers all of the A's, etc.

The \u9999 trick applies to strings only.
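
As a rough sketch of the above (not anything from CouchDB itself), a small helper can build the startkey/endkey pair for a prefix and encode it for a view URL. The function names prefixRange and toQuery are made up for illustration, and the sentinel is assumed to sort after every character that appears in your real keys.

```javascript
// Sentinel assumed to sort after any character used in the real keys.
const HIGH = "\u9999";

// Build the startkey/endkey pair for a "starts with" query on a string key.
function prefixRange(prefix) {
  return { startkey: prefix, endkey: prefix + HIGH };
}

// Turn the range into a view query string. View keys must be JSON-encoded,
// so a string key is sent with its surrounding double quotes.
function toQuery(range) {
  return Object.entries(range)
    .map(([name, value]) => `${name}=${encodeURIComponent(JSON.stringify(value))}`)
    .join("&");
}

const range = prefixRange("myp");
console.log(toQuery(range));
```

The resulting string would be appended to the view URL after a "?".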

2. startkey = 0, endkey = {} is the CouchDB equiv of an SQL "LIKE '%'"

In an array, 0 sorts before any string (numbers collate ahead of strings), and a hash/object sorts after everything else. If you want every row whose key has "abc" as its first item, you need to say ["abc", 0] to ["abc", {}].

Think of 0 and {} as the low and high ends of a range within an array.
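
To see why 0 and {} bracket every string, here is a simplified stand-in for the view collation order (null, false, true, numbers, strings, arrays, objects). It is only a sketch: the names typeRank, collate and inRange are invented, strings are compared by code unit here whereas CouchDB uses ICU collation, and objects are compared by type only.

```javascript
// Rank of each JSON type in CouchDB's view collation order:
// null < false < true < numbers < strings < arrays < objects.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // plain object
}

// Simplified comparator (assumption: code-unit string order, not ICU).
function collate(a, b) {
  const ra = typeRank(a);
  const rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (ra === 3) return a - b;
  if (ra === 4) return a < b ? -1 : a > b ? 1 : 0;
  if (ra === 5) {
    const n = Math.min(a.length, b.length);
    for (let i = 0; i < n; i++) {
      const c = collate(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length; // a shorter array sorts first
  }
  return 0; // same type, not compared further in this sketch
}

// True when key falls inside the inclusive startkey..endkey range.
function inRange(key, startkey, endkey) {
  return collate(startkey, key) <= 0 && collate(key, endkey) <= 0;
}

console.log(inRange(["abc", "myplan"], ["abc", 0], ["abc", {}])); // true
console.log(inRange(["abd", "myplan"], ["abc", 0], ["abc", {}])); // false
```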


My only other issue is that my indexes even for simple indexes e.g. emit(doc.name, 1) have produced index sizes far larger than the database file. Do you have any idea why that might be?

It's just how it goes. For every index, CouchDB makes a new file, which is in a different format than the database to allow lookup.


Regards

John

On 24 Oct 2010, at 17:46, Keith Gable wrote:

You'd use multiple indexes:

On Sun, 2010-10-24 at 11:53 +0100, John Logsdon wrote:
Hi

I have an index that has three 'groups' to represent an Account Name, a Contained entity name and a contained entity type, e.g. {"account":"johnl", "name":"myplan", "type":"plan"}

I'm after the equivalent of a startkey endkey but for a composite index so I could do the following types of queries:

1) Search across all Accounts for any Entity type starting 'myp' (This supports ajax search as the user starts typing in the search box)

e.g. Account = *, Type = *, Name starts with myp

by_name:

emit(name, 1);

startkey=myp
endkey=myp\u9999


2) Search a list of Accounts for any Entity type starting 'myp'

e.g. Account in johnl, mycompany, myreseller, global, Type = *, Name starts with myp

by_account_and_name:

emit([account, name], 1);

Query once for every account and then merge the results in your
application:

startkey=["johnl", "myp"]
endkey=["johnl", "myp\u9999"]
(etc.)
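
A minimal sketch of the query-then-merge step, assuming keys shaped [account, name] as in by_account_and_name. The helper names and the sample rows are made up; the rows only imitate the shape of the "rows" array in a view response, and sorting by name is just one possible way to order the merged list.

```javascript
// One startkey/endkey pair per account, for keys shaped [account, name].
function accountPrefixRanges(accounts, prefix) {
  return accounts.map((account) => ({
    startkey: [account, prefix],
    endkey: [account, prefix + "\u9999"],
  }));
}

// Merge the row arrays returned by the individual queries into one list,
// ordered by name first and by account second.
function mergeRows(resultSets) {
  return resultSets.flat().sort((a, b) => {
    const [accountA, nameA] = a.key;
    const [accountB, nameB] = b.key;
    if (nameA !== nameB) return nameA < nameB ? -1 : 1;
    return accountA < accountB ? -1 : accountA > accountB ? 1 : 0;
  });
}

const ranges = accountPrefixRanges(["johnl", "mycompany"], "myp");
console.log(JSON.stringify(ranges[0].startkey));

// Hypothetical results of two separate queries, one per account.
const merged = mergeRows([
  [{ id: "d2", key: ["johnl", "myplan"], value: 1 }],
  [{ id: "d7", key: ["mycompany", "mypage"], value: 1 }],
]);
console.log(merged.map((row) => row.key.join("/")));
```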



3) Search for all plans named "myplan" in all accounts

e.g. Account = *, Type = Plan, Name = myplan

by_type_and_name:

emit([type, name], 1);

startkey=["Plan", "myplan"]
endkey=["Plan", "myplan"]

(or you can probably use key=["Plan", "myplan"])



4) Search a list of Accounts for all plans

e.g. Account = *, Type = Plan, Name = *


Use by_type_and_name:

startkey=["Plan", 0]
endkey=["Plan", {}]


5) Search a List of Accounts for all contained entities

e.g. Account in johnl, mycompany, myreseller, global, Type = *, Name = *

Use by_account_and_name, or perhaps make a new view called by_account
and then just query for accounts:

startkey=["johnl", 0]
endkey=["johnl", {}]

and merge it with the results from the other accounts.
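
For reference, the three views used in the answers above could be written out as full map functions and collected into one design document, roughly like this. The design document name "_design/entities", the guards on missing fields, and the local emit stand-in are assumptions added for illustration; inside CouchDB, emit is provided by the view server.

```javascript
// Local stand-in for CouchDB's emit, so the map functions can be tried here.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

const views = {
  by_name: function (doc) {
    if (doc.name) emit(doc.name, 1);
  },
  by_account_and_name: function (doc) {
    if (doc.account && doc.name) emit([doc.account, doc.name], 1);
  },
  by_type_and_name: function (doc) {
    if (doc.type && doc.name) emit([doc.type, doc.name], 1);
  },
};

// Design documents store each map function as a string.
const designDoc = {
  _id: "_design/entities", // hypothetical name
  language: "javascript",
  views: Object.fromEntries(
    Object.entries(views).map(([name, fn]) => [name, { map: fn.toString() }])
  ),
};

// Run one map function locally against the sample document from the thread.
views.by_account_and_name({ account: "johnl", name: "myplan", type: "plan" });
console.log(rows);
console.log(Object.keys(designDoc.views));
```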




If you need something significantly more complex, like type = x, name = x, y, or z, account is in unpaid status, and not a corporate account, or something else that can't really be mapped to a key-value methodology,
then you'll probably need to check out CouchDB-Lucene.


P.S. I emit 1 as the value so that I have an easy way to count the
results. If you have something better (dollars? hits?), then you should emit that instead. I don't see the point in emitting the ID because you
can use include_docs=true to get the documents, and IIRC, the ID is
passed anyways.
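
To illustrate the counting idea: if every row carries the value 1, summing the values gives the number of matches. CouchDB ships a built-in "_sum" reduce that can be named in the view definition; the fragment and the local sumValues helper below are only a sketch, and the sample rows are invented.

```javascript
// View definition fragment: "_sum" names one of CouchDB's built-in reduces.
const byTypeAndName = {
  map: "function (doc) { emit([doc.type, doc.name], 1); }",
  reduce: "_sum",
};

// Local stand-in for what the reduce computes over a set of rows.
function sumValues(viewRows) {
  return viewRows.reduce((total, row) => total + row.value, 0);
}

// Hypothetical rows, shaped like the "rows" array of a view response.
const sampleRows = [
  { id: "d1", key: ["Plan", "myplan"], value: 1 },
  { id: "d2", key: ["Plan", "myplan"], value: 1 },
  { id: "d3", key: ["Plan", "otherplan"], value: 1 },
];

console.log(sumValues(sampleRows)); // 3
```

If you emitted something other than 1 (dollars, hits), the same sum would give a total rather than a count.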

