nickva opened a new pull request, #5713:
URL: https://github.com/apache/couchdb/pull/5713
Use a map to store and look up FDI records in `couch_db_updater`.
The map helps in two ways:
* In `apply_purge_requests` we avoid removing and then re-adding the active
FDI record to the list. Looking up the record in a map is faster, especially
with large batches, and a single update instead of a remove plus replace
generates less garbage.
* In `purge_docs` we avoid nested `lists:keyfind/3` lookups when building
the FDI pairs list. For instance, with 1000 docs we replace 1000 calls to
keyfind, each an O(n) operation, with 1000 map lookups, each an O(log n) operation.
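The difference between the two lookup strategies can be sketched in Python (the language of the benchmark script). This is only an analogy, not the Erlang code from the PR: the `(doc_id, rev_count)` tuples and all function names are hypothetical stand-ins for FDI records, and a Python dict lookup is O(1) on average rather than the O(log n) of an Erlang map.

```python
# Hypothetical stand-ins for FDI records: (doc_id, rev_count) tuples.

def find_in_list(records, doc_id):
    """Linear scan, analogous to lists:keyfind/3: O(n) per lookup."""
    for rec in records:
        if rec[0] == doc_id:
            return rec
    return None


def build_pairs_with_list(records, doc_ids):
    """One linear scan per requested doc: O(n * m) overall."""
    return [(find_in_list(records, d), d) for d in doc_ids]


def build_pairs_with_map(records, doc_ids):
    """Index the records once, then do one keyed lookup per requested doc."""
    by_id = {rec[0]: rec for rec in records}
    return [(by_id.get(d), d) for d in doc_ids]


def update_in_list(records, new_rec):
    """Remove then re-add: allocates a whole new list for one changed record."""
    return [r for r in records if r[0] != new_rec[0]] + [new_rec]


def update_in_map(by_id, new_rec):
    """Replace the value stored under the existing key."""
    by_id[new_rec[0]] = new_rec
    return by_id


records = [(f"doc{i}", i) for i in range(1000)]
doc_ids = [f"doc{i}" for i in range(999, -1, -1)]  # worst-case order for the scan

pairs_list = build_pairs_with_list(records, doc_ids)
pairs_map = build_pairs_with_map(records, doc_ids)
```

Both builders return the same pairs; only the cost of each lookup differs.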
Benchmark:
```
./conflicts.py -a adm:pass -q 1 -n 100000 -x 1.0 -z -c 0
docs: 100k,
purge batch size: 1000
q:1
all deleted
no conflicts
```
Results (7 calls with main and 10 calls with the PR)
Unoptimized (main)
*** purging 100000 docs 15 sec, rate = 6466/sec
*** purging 100000 docs 15 sec, rate = 6880/sec
*** purging 100000 docs 14 sec, rate = 7019/sec
*** purging 100000 docs 17 sec, rate = 6056/sec
*** purging 100000 docs 14 sec, rate = 7239/sec
*** purging 100000 docs 14 sec, rate = 7124/sec
*** purging 100000 docs 14 sec, rate = 7121/sec
Average: 6844
Optimized (PR)
*** purging 100000 docs 11 sec, rate = 9003/sec
*** purging 100000 docs 12 sec, rate = 8591/sec
*** purging 100000 docs 10 sec, rate = 9692/sec
*** purging 100000 docs 11 sec, rate = 9442/sec
*** purging 100000 docs 12 sec, rate = 8364/sec
*** purging 100000 docs 11 sec, rate = 8784/sec
*** purging 100000 docs 11 sec, rate = 9103/sec
Average: 8926
Speedup: 30%
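As a quick check of the arithmetic, the sketch below recomputes the main-branch average from the seven listed rates and derives the speedup from the two reported averages. The PR average is taken as reported rather than recomputed, on the assumption that it covers all 10 PR runs while only seven are listed here. Variable names are illustrative.

```python
# Rates (docs/sec) copied from the "Unoptimized (main)" runs above.
main_rates = [6466, 6880, 7019, 6056, 7239, 7124, 7121]
main_avg = sum(main_rates) / len(main_rates)

# Averages as reported in the description.
reported_main_avg = 6844
reported_pr_avg = 8926

# Relative throughput gain of the PR over main.
speedup = reported_pr_avg / reported_main_avg - 1

print(round(main_avg))    # 6844
print(f"{speedup:.1%}")   # 30.4%
```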