Nicolas,
I am not sure I fully understand your use case (though it does sound
intriguing and unusual).
A couple of things stick out in your commentary:
"The data is only weakly relational."
"DB updates are relatively few"
I assume that you are getting data out of your legacy MySQL system using
complex joins?
Have you considered totally denormalising your data and loading it into
CouchDB based on the output of your MySQL reports?
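To make the denormalising idea concrete, here is a minimal sketch. It assumes the report query returns flat rows, one per A/B/E combination; the column names (a_id, a_name, b_id, b_label, e_id, e_value) and the "report-" ID prefix are invented for illustration and are not taken from your schema.

```python
def denormalise(rows):
    """Fold flat join rows into one self-contained document per A record."""
    docs = {}
    for row in rows:
        a_id = row["a_id"]
        if a_id not in docs:
            # First row for this A: create the document with its A and B parts.
            docs[a_id] = {
                "_id": f"report-{a_id}",
                "type": "report",
                "a": {"id": a_id, "name": row["a_name"]},
                "b": {"id": row["b_id"], "label": row["b_label"]},
                "e": [],
            }
        # A LEFT JOIN may produce rows with no E attached; skip those.
        if row["e_id"] is not None:
            docs[a_id]["e"].append({"id": row["e_id"], "value": row["e_value"]})
    return list(docs.values())


rows = [
    {"a_id": 1, "a_name": "alpha", "b_id": 10, "b_label": "bee",
     "e_id": 100, "e_value": 5},
    {"a_id": 1, "a_name": "alpha", "b_id": 10, "b_label": "bee",
     "e_id": 101, "e_value": 7},
]
docs = denormalise(rows)
```

The resulting list could then be sent to CouchDB in one request as the "docs" member of a _bulk_docs body, and an ordinary view over these documents would have A, B, and E data available in a single map call.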
Perhaps couchdb-lucene (or my current fav of the moment, elasticsearch,
which is also based on Lucene) would be useful?
If neither of the two suggestions is of any use, could you post a more
detailed description (with a data sample if possible) of:
"The hiccup is reporting. Some of it involves the full set of documents. Let's
say I have 5 categories of documents involved in a report, A to E. A links to B,
B links to C, etc. The report needs data from A, B, and E. As far as I can
think, there's no way to do a view collation, because A and B share an ID but E
doesn't. I can't pull a million documents from the DB to process elsewhere
either, so that nixes simple indexing and the '_id' object values."
Very best regards
Cliff
On 17/11/10 16:13, Nicolas Jessus wrote:
All right; no one should like what they're going to read.
I have a medium-sized MySQL system, which translates to a Couch with about a
million documents of about 20 types. The system would really benefit from a
schema-free design. The data is only weakly relational. Couch would fit really
well, enough that I don't mind twisting its arm in a few places if need be; the
tradeoff would be worth it.
The hiccup is reporting. Some of it involves the full set of documents. Let's
say I have 5 categories of documents involved in a report, A to E. A links to B,
B links to C, etc. The report needs data from A, B, and E. As far as I can
think, there's no way to do a view collation, because A and B share an ID but E
doesn't. I can't pull a million documents from the DB to process elsewhere
either, so that nixes simple indexing and the '_id' object values.
I could however write a special view_server that will emit keys after checking
the linked ID through an HTTP call (that's where you scream). Indexing
performance is totally unimportant to me, DB updates are relatively few, and I
can live with the dirty side-effects (again, the system as a whole would still
be much cleaner than the MySQL one).
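A rough sketch of what such a map step might look like, written as plain Python rather than against the real view server protocol. Everything here is an assumption for illustration: the "link" and "name" fields, the chain length (D to E is inferred from "A links to B, B links to C, etc."), and the fetch callable, which stands in for the HTTP GET of a linked document.

```python
def follow(doc, fetch, hops):
    """Walk `hops` links forward from `doc`, fetching each linked document."""
    for _ in range(hops):
        doc = fetch(doc["link"])
    return doc


def map_a(doc, fetch):
    """Map step for type-A docs; yields (key, value) pairs, like emit()."""
    if doc.get("type") != "A":
        return
    b = follow(doc, fetch, 1)  # A -> B
    e = follow(b, fetch, 3)    # B -> C -> D -> E
    yield [b["name"], e["name"]], doc["_id"]


# In-memory stand-in for the database; a real `fetch` would do an HTTP GET.
store = {
    "a1": {"_id": "a1", "type": "A", "link": "b1"},
    "b1": {"_id": "b1", "type": "B", "name": "bee", "link": "c1"},
    "c1": {"_id": "c1", "type": "C", "link": "d1"},
    "d1": {"_id": "d1", "type": "D", "link": "e1"},
    "e1": {"_id": "e1", "type": "E", "name": "eee"},
}
pairs = list(map_a(store["a1"], store.__getitem__))
```

The emitted key depends on documents other than the one being mapped, which is exactly why the index for an A goes stale when its B or E changes.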
With that solution I can have a map function that just handles docs of type A.
But I still need to reindex the relevant As when B or E changes. I could simply
listen to the change stream and force a reindex, but that doesn't work well with
legitimate updates when the _rev number goes up at random even though the doc
hasn't changed, and there's no auto-merge. So I'm pretty stuck.
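For reference, the "listen to the change stream" step could be sketched as below. The rows follow the shape of CouchDB's _changes feed (each has an "id"); the dependents mapping, from a B or E document ID to the A documents whose index entries depend on it, is a hypothetical structure that would have to be maintained separately.

```python
def stale_a_ids(changes, dependents):
    """Return the IDs of A docs whose view rows depend on a changed doc."""
    stale = set()
    for row in changes:
        # Docs nothing depends on contribute no IDs.
        stale.update(dependents.get(row["id"], ()))
    return sorted(stale)


changes = [
    {"seq": 12, "id": "b1", "changes": [{"rev": "2-abc"}]},
    {"seq": 13, "id": "e1", "changes": [{"rev": "5-def"}]},
    {"seq": 14, "id": "x9", "changes": [{"rev": "1-123"}]},
]
dependents = {"b1": {"a1", "a2"}, "e1": {"a2", "a3"}}
to_reindex = stale_a_ids(changes, dependents)
```

Without a force-reindex call, the only way to act on this list is to re-save each A document, which bumps its _rev even though its content is unchanged: the problem described above.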
I'm not asking that this type of functionality be encouraged. It's clearly
subverting the point of Couch. On the other hand, it doesn't seem like having a
force-reindex function would dirty the concept, and if it's easy to code, then
it's a shame it doesn't exist.