Andrea,
As Robert and Garren said, CouchDB is not designed to work this way, but
you could build an architecture on top of it.
You could, for instance, create document versions that point to their
ancestors. This way you would create a tree of versions (a tree rather than
a flat list is necessary if you want to preserve the eventual consistency
capabilities, since two nodes can each create a version from the same parent).
The document layout would look like this:
type DocumentVersion = {
  // CouchDB's own ID system
  _id: string;
  // Your document history ID - the same across all versions of a document
  historyId: string;
  // The _id of the ancestor document (undefined if first version)
  parentId?: string;
  // Indicates if this version is archived or not (absent on a new version)
  archived?: boolean;
  // Creation timestamp, stored as an ISO 8601 string. Might be useful in
  // future implementations of reconciliation algorithms, or the
  // construction of a changelog feature
  createdAt: string;
};
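To group all versions of one logical document, as Robert suggested, a view
keyed on historyId should be enough. A rough sketch, assuming the field names
from the layout above; the design document name "versions" and the view name
"by_history" are made up for illustration, and runMap is only a local stand-in
for CouchDB's view server so the map function can be tried out:

```typescript
// Source of the map function as it would be stored in the design document.
// CouchDB supplies `emit` as a global when it runs this code.
const mapSource = `function (doc) {
  if (doc.historyId && doc.createdAt) {
    emit([doc.historyId, doc.createdAt], null);
  }
}`;

// Hypothetical design document; the names are illustrative only.
const designDoc = {
  _id: "_design/versions",
  views: {
    by_history: { map: mapSource },
  },
};

type ViewRow = { key: [string, string]; value: null };

// Local stand-in for the view server: runs the map source over some docs
// and collects whatever it emits. Not part of CouchDB.
const runMap = (docs: object[]): ViewRow[] => {
  const rows: ViewRow[] = [];
  const emit = (key: [string, string], value: null): void => {
    rows.push({ key, value });
  };
  const map = new Function("emit", `return ${mapSource};`)(emit) as (
    doc: object
  ) => void;
  docs.forEach((doc) => map(doc));
  return rows;
};
```

Querying the view with startkey=["<historyId>"] and endkey=["<historyId>", {}]
should then return every version of that document, ordered by creation time.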
You would have to handle the versioning manually - something like this
(loadDocument, saveDocument, archiveDocument and generateHistoryId are
helpers you would write around your CouchDB client):

const saveDocumentVersion = async ({ doc }) => {
  // A new version must never start out archived. Copy the remaining
  // fields instead of mutating the caller's object. `doc` is expected to
  // hold the new content only, not the _id/_rev of an existing version.
  const { archived, ...fields } = doc;
  const newDoc = {
    ...fields,
    createdAt: new Date().toISOString(),
  };
  let parentDoc;
  if (doc.parentId) {
    parentDoc = await loadDocument(doc.parentId);
  }
  if (parentDoc) {
    // Rely on parentDoc to get historyId and parentId
    newDoc.historyId = parentDoc.historyId;
    newDoc.parentId = parentDoc._id;
  } else {
    // First version of a document: start a new history
    newDoc.historyId = await generateHistoryId();
  }
  const savedDoc = await saveDocument(newDoc);
  // Archives the parent document only after new version is saved
  // (sends `parentDoc.archived = true` to CouchDB)
  if (parentDoc) {
    await archiveDocument(parentDoc);
  }
  return savedDoc;
};
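The parentId links also give you a changelog almost for free. A minimal
sketch, assuming all versions of one history have already been fetched into an
array; the `lineage` function is a name I made up, not something from CouchDB
or PouchDB:

```typescript
type DocumentVersion = {
  _id: string;
  historyId: string;
  parentId?: string;
  archived?: boolean;
  createdAt: string;
};

// Walks the parentId links from `startId` back to the first version and
// returns the chain newest first. Stops early if a parent is missing or
// if the links form a cycle (which should not happen, but is cheap to guard).
const lineage = (
  versions: DocumentVersion[],
  startId: string
): DocumentVersion[] => {
  const byId = new Map<string, DocumentVersion>();
  versions.forEach((v) => byId.set(v._id, v));

  const chain: DocumentVersion[] = [];
  const seen = new Set<string>();
  let current: DocumentVersion | undefined = byId.get(startId);
  while (current && !seen.has(current._id)) {
    seen.add(current._id);
    chain.push(current);
    current = current.parentId ? byId.get(current.parentId) : undefined;
  }
  return chain;
};
```

Calling lineage(versions, latestId) gives the history of one branch; mapping
it to createdAt values would be the start of a changelog.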
This scheme could serve you for simpler use cases, but not for complex ones
(you would need a stronger reconciliation algorithm to merge concurrent
updates made on different nodes).
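To make the reconciliation problem concrete: when two nodes each save a
version with the same parent, the history ends up with two "heads". The sketch
below only detects that situation and picks a winner by latest createdAt,
which is a naive last-write-wins rule and not a real reconciliation algorithm
(clock skew between nodes can make it pick the wrong one). findHeads and
pickWinner are hypothetical names of my own:

```typescript
type VersionRef = {
  _id: string;
  historyId: string;
  parentId?: string;
  createdAt: string; // ISO 8601, as produced by Date.prototype.toISOString()
};

// A head is a version that no other version names as its parent.
// More than one head in the same history means concurrent updates.
const findHeads = (versions: VersionRef[]): VersionRef[] => {
  const parentIds = new Set(
    versions
      .map((v) => v.parentId)
      .filter((id): id is string => id !== undefined)
  );
  return versions.filter((v) => !parentIds.has(v._id));
};

// Naive last-write-wins: latest createdAt, with _id as a deterministic
// tie-breaker so every node picks the same winner.
const pickWinner = (heads: VersionRef[]): VersionRef | undefined =>
  heads.reduce<VersionRef | undefined>((best, v) => {
    if (!best) return v;
    if (v.createdAt !== best.createdAt) {
      return v.createdAt > best.createdAt ? v : best;
    }
    return v._id > best._id ? v : best;
  }, undefined);
```

Anything beyond that (merging fields, asking the user) would have to be built
on top, depending on what your documents look like.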
I made a Gist with the saveDocumentVersion code, so you can read it more
comfortably:
https://gist.github.com/joelwallis/67bfbdcf0aa42418eed4c0246f5fe2a4
I hope it helps you in some way. Stay safe!
On Tue, Apr 28, 2020 at 7:46 AM Garren Smith <[email protected]> wrote:
> I think it would be better to create a daily or hourly snapshot of your
> database instead of relying on a database that doesn't run compaction.
> Depending on the versioning history of a CouchDB database is a really bad
> idea.
> As Bob said, rather create new docs than one document with lots of
> revisions. PouchDB is slow to replicate documents with lots of revisions
> versus lots of new documents.
>
> Cheers
> Garren
>
>
>
> On Tue, Apr 28, 2020 at 9:06 AM Andrea Brancatelli
> <[email protected]> wrote:
>
> > Hello Robert,
> >
> > I see your point and mostly understand it. The plan was not to "use"
> > this secondary database as an active one, but as a passively replicated
> > database from a main instance, so performance of the secondary database
> > wasn't a big priority - the idea is to keep the whole "journal" of the
> > main database.
> >
> > We thought of having multiple copies of the documents as well, but the
> > "client" is a React/Pouch application and that would become a pita.
> >
> > My plan was to have a main database with a very aggressive compaction
> > rule, so that pouch replication would be as fast as possible and the
> > local storage be as little as possible (also because pouch isn't blazing
> > fast with local views and indexes when you have a lot of documents) and
> > a secondary replicated database with a more relaxed compaction rule (as
> > I was saying maybe disabled altogether) to run backups on or to do
> > post-mortem analysis of any problem that may arise in business logic.
> >
> > ---
> >
> > Andrea Brancatelli
> >
> > On 2020-04-27 20:34, Robert Samuel Newson wrote:
> >
> > > Hi,
> > >
> > > This is the most common mistake made with CouchDB, that it provides (or
> > > could provide) a full history of document changes.
> > >
> > > Compaction is essential, it's the only time that the b+tree's are
> > > rebalanced and obsolete version of b+tree
> > > nodes are removed from disk.
> > >
> > > If the old revisions of your documents really matter, make new documents
> > > instead of updating them, and use some scheme of your choice to group them
> > > (you could use a view on some property common to all revisions of
> > > the same logical document).
> > >
> > > B.
> > >
> > >> On 27 Apr 2020, at 17:10, Andrea Brancatelli <[email protected]> wrote:
> > >>
> > >> Let's say I'd like to keep the whole revision history for documents in a
> > >> specific database (but maybe drop old views, if it's possible).
> > >>
> > >> What compaction setting would do that overriding the more-reasonable
> > >> default we usually have?
> > >>
> > >> --
> > >>
> > >> Andrea Brancatelli
>
--
Joel Jucá
joelwallis.com