Hi Wendall,
this is something I was looking for. Also, thanks for your feedback on
performance, you have saved me a lot of time.
Stephan
On 13-03-14 08:22 PM, Wendall Cada wrote:
Doing a write per read to update the doc with a timestamp would
perform very, very poorly in CouchDB.
The best approach is to create a separate stats database. Every time a
doc in the database you are tracking is accessed, create a doc
describing the request in the stats database. Creating new docs in
CouchDB is very inexpensive, so you won't see any performance issues
with this versus updating docs per request.
Create a new doc in the stats db like this:
{
"db": "name_of_tracked_db",
"id": "_id_of_doc_being_tracked",
"timestamp": timestamp
}
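A minimal sketch of writing such a doc from application code. The helper names buildAccessDoc and recordAccess are hypothetical, and it assumes Node 18+ (global fetch), a CouchDB at http://localhost:5984 with an existing "stats" database that accepts unauthenticated writes, and timestamps stored as ISO 8601 strings:

```javascript
// Sketch only: buildAccessDoc and recordAccess are hypothetical helper names.

// Build the stats doc described above. Storing the timestamp as an
// ISO 8601 string is an assumption; such strings sort chronologically.
function buildAccessDoc(trackedDb, docId, when = new Date()) {
  return {
    db: trackedDb,
    id: docId,
    timestamp: when.toISOString(),
  };
}

// POST the doc to the stats database; CouchDB assigns the _id.
async function recordAccess(trackedDb, docId, baseUrl = "http://localhost:5984") {
  const res = await fetch(`${baseUrl}/stats`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildAccessDoc(trackedDb, docId)),
  });
  if (!res.ok) throw new Error(`stats write failed: ${res.status}`);
  return res.json();
}
```

You would call recordAccess("name_of_tracked_db", doc._id) wherever your application serves the tracked doc.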
Then create a view in the stats database that maps these
values. You can create several view indexes to separate the data for
whatever your needs are.
The view:
"doc_access": {
"map": "function(doc) { emit([doc.db, doc.id, doc.timestamp], 1); }",
"reduce": "_sum"
}
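For reference, a sketch of the full design doc this view would live in. The design doc name "data" and the view name "doc_access" are taken from the query URL in this mail; building the doc from a real function, as done here, is just one convenient way to get valid JSON:

```javascript
// Sketch only: everything except the names "data" and "doc_access"
// is illustrative.

// Written as a real function so it can be checked locally; CouchDB
// supplies emit() when it runs the map function on the server.
const mapFn = function (doc) {
  emit([doc.db, doc.id, doc.timestamp], 1);
};

// CouchDB stores the map function as a string inside the design doc.
const designDoc = {
  _id: "_design/data",
  views: {
    doc_access: {
      map: mapFn.toString(),
      reduce: "_sum",
    },
  },
};

// JSON.stringify escapes the newlines in the function source, so the
// result is valid JSON that can be PUT to /stats/_design/data.
const designDocJson = JSON.stringify(designDoc, null, 2);
```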
A mock query for this to see the number of times a doc was accessed
over the entire date range would be:
http://localhost:5984/stats/_design/data/_view/doc_access?startkey=["name_of_tracked_db","_id_of_doc_being_tracked",""]&endkey=["name_of_tracked_db","_id_of_doc_being_tracked",{}]&group_level=2
You'd get back a result like this:
{"rows": [
{"key":["name_of_tracked_db","_id_of_doc_being_tracked"], "value": 42}
]}
If you want to get results for a specific range of dates, simply put
the start and end timestamps in the third component of startkey and
endkey.
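A rough sketch of building these query URLs in code, so the JSON keys are URL-encoded properly. buildAccessQuery is a hypothetical helper; it assumes timestamps are stored as ISO 8601 strings (so "" sorts before and {} after every timestamp), and it uses group_level=2, which is what produces one row per doc with a two-element key like the result shown:

```javascript
// Sketch only: buildAccessQuery is a hypothetical helper name.
// from/to default to "" and {}, which cover the entire date range
// when timestamps are strings (see View_collation).
function buildAccessQuery(trackedDb, docId, from = "", to = {}) {
  const params = new URLSearchParams({
    startkey: JSON.stringify([trackedDb, docId, from]),
    endkey: JSON.stringify([trackedDb, docId, to]),
    group_level: "2",
  });
  return `http://localhost:5984/stats/_design/data/_view/doc_access?${params}`;
}

// Entire date range for one doc.
const allTimeUrl = buildAccessQuery("name_of_tracked_db", "_id_of_doc_being_tracked");

// Only accesses in March 2013 (the dates are example values).
const marchUrl = buildAccessQuery(
  "name_of_tracked_db",
  "_id_of_doc_being_tracked",
  "2013-03-01",
  "2013-04-01"
);
```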
This method gives you the ability to get stats for the access counts
for an entire db, a range of docs, or a single doc for any given
period of time.
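To illustrate how one view can answer per-db and per-doc questions, here is a small local imitation of what a _sum reduce returns at different group_level values. This is only a sketch: CouchDB computes this on the server from the view index, sumByGroupLevel is a hypothetical helper, and the rows are made-up sample data:

```javascript
// Sketch only: groups emitted rows by the first `level` key components
// and sums their values, mimicking _sum with group_level=level.
function sumByGroupLevel(rows, level) {
  const totals = new Map();
  for (const { key, value } of rows) {
    const prefix = JSON.stringify(key.slice(0, level));
    totals.set(prefix, (totals.get(prefix) || 0) + value);
  }
  return [...totals].map(([k, value]) => ({ key: JSON.parse(k), value }));
}

// Made-up sample of what the map function would emit.
const emitted = [
  { key: ["blog", "post-1", "2013-03-14T19:02:00Z"], value: 1 },
  { key: ["blog", "post-1", "2013-03-14T20:22:00Z"], value: 1 },
  { key: ["blog", "post-2", "2013-03-15T08:57:00Z"], value: 1 },
];

const perDb = sumByGroupLevel(emitted, 1);  // one row per tracked db
const perDoc = sumByGroupLevel(emitted, 2); // one row per doc
```

Narrowing startkey and endkey then restricts which rows are summed, which is how the date range comes in.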
The advantages of this approach:
1. it's fast
2. it's extremely flexible
The disadvantage is that it takes up a ton of disk space if you never
purge old items from the db. I've been tracking every single page
request to our servers in this way with quite a bit of metadata in the
docs since Dec. 2010. That database is currently 5GB compacted for
~50k page requests per day over this period of time. I never had the
need to delete a single doc from this db.
I don't have any benchmarks for a comparison between the two methods,
but I'd strongly discourage a write per read model for your accessed
docs.
For an understanding about how the ordering for views works, see
http://wiki.apache.org/couchdb/View_collation
HTH,
Wendall
On 03/14/2013 07:16 PM, Stephan Bardubitzki wrote:
Hi Thomas,
No, I only need to track reads, and I need the timestamp for some charts.
Stephan
On 13-03-14 07:02 PM, Thomas Hommers wrote:
Hi Stephan,
By 'accessed', do you mean read and write? If you just want
to track write access, I believe you could use the _rev attribute.
Regards
Thomas
----- Reply message -----
From: "Stephan Bardubitzki" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Tracking doc access
Date: Fri, Mar 15, 2013 08:57
Hi there,
I have a task where I need to track how often a doc is accessed. The
two possible ways I can think of are:
1. add an array to the doc and add the timestamp when it is accessed
2. create a new document and add the doc._id and the timestamp
Which one would you prefer? Or is there a better solution?
Thanks,
Stephan
--------------------------------
Spam/Virus scanning by CanIt Pro
For more information see
http://www.kgbinternet.com/SpamFilter.htm
To control your spam filter, log in at
http://filter.kgbinternet.com
--------------------------------