[GitHub] couchdb-snappy issue #9: Fix memory bug in SnappyNifSink::Append
Github user nickva commented on the issue: https://github.com/apache/couchdb-snappy/pull/9 To check for memory leaks, ran the snappy compression/decompression test in a loop:

```
rebar shell
==> couchdb-snappy (shell)
Erlang/OTP 20 [erts-9.3.3.1] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V9.3.3.1  (abort with ^G)
1> cd("test").
2> c(snappy_tests).
3> [begin snappy_tests:test(), timer:sleep(10) end || _ <- lists:seq(1,1)].
...
```

Memory stayed at 40 +/- 2 MB ---
[GitHub] couchdb-ibrowse issue #1: ibrowse: enable inet6 from upstream
Github user nickva commented on the issue: https://github.com/apache/couchdb-ibrowse/pull/1 Noticed there was another ibrowse commit after that which fixed an issue related to ipv6 being the new default: https://github.com/cmullaparthi/ibrowse/commit/e6a0c366fc8fc982d7981087c1e9d1e28e2a235a Wonder if we'd need that as well? ---
[GitHub] couchdb-snappy issue #7: Build with Erlang 21
Github user nickva commented on the issue: https://github.com/apache/couchdb-snappy/pull/7 Yep, merged it via ASF. Still not sure why the build failed. I had tried compiling with rebar3 locally and that worked as well. A different C compiler or libc version... who knows... ---
[GitHub] couchdb-snappy pull request #8: Test commit
Github user nickva closed the pull request at: https://github.com/apache/couchdb-snappy/pull/8 ---
[GitHub] couchdb-snappy issue #7: Build with Erlang 21
Github user nickva commented on the issue: https://github.com/apache/couchdb-snappy/pull/7 Yep, apparently all travis builds fail now. Here is a failing build for a change to the README file: https://travis-ci.org/apache/couchdb-snappy/builds/395127559?utm_source=github_status_medium=notification ---
[GitHub] couchdb-snappy pull request #8: Test commit
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-snappy/pull/8 Test commit TEST DON'T MERGE You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-snappy test-branch Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-snappy/pull/8.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8 commit 1a3cc6c10e73358ae9d4aa976203172d8621f636 Author: Nick Vatamaniuc Date: 2018-06-21T17:19:34Z Test commit ---
[GitHub] couchdb-snappy issue #7: Build with Erlang 21
Github user nickva commented on the issue: https://github.com/apache/couchdb-snappy/pull/7 Maybe it's trying to use rebar3 and can't build the NIF properly? ---
[GitHub] couchdb-snappy issue #7: Build with Erlang 21
Github user nickva commented on the issue: https://github.com/apache/couchdb-snappy/pull/7 Not sure why travis doesn't work but local tests do pass with 21:

```
make check
rebar compile
==> snappy (compile)
rebar eunit
==> snappy (eunit)
Compiled src/snappy.erl
Compiled test/snappy_tests.erl
EUnit
module 'snappy'
  module 'snappy_tests'
    snappy_tests: compression_test_...ok
    snappy_tests: decompression_test_...ok
    [done in 0.496 s]
  [done in 0.496 s]
===
2 tests passed.
```
---
[GitHub] couchdb-snappy pull request #7: Build with Erlang 21
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-snappy/pull/7 Build with Erlang 21 Issue #1396 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-snappy allow-erlang-21 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-snappy/pull/7.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #7 commit 57b5e25574944ab5bf387f630dccfb37f4c6b51c Author: Nick Vatamaniuc Date: 2018-06-20T21:15:14Z Build with Erlang 21 Issue #1396 ---
[GitHub] couchdb-config pull request #18: Use callback directive for config_listener ...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-config/pull/18 Use callback directive for config_listener behaviour This knocks out a few dialyzer errors such as: `Callback info about the config_listener behaviour is not available` It is also more descriptive as it specifies types and argument names for each callback. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-config use-callbacks-for-behavior Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-config/pull/18.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18 commit 454a854520dc146ff11dd170cdd9c9288e141e56 Author: Nick Vatamaniuc <vatamane@...> Date: 2018-02-15T06:30:44Z Use callback directive for config_listener behaviour This knocks out a few dialyzer errors such as: `Callback info about the config_listener behaviour is not available` It is also more descriptive as it specifies types and argument names for each callback. ---
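For context, this is roughly what a `-callback` declaration for such a behaviour looks like. The callback names and types below follow common config_listener usage but are assumptions for illustration, not a quote of the PR:

```erlang
-module(example_listener_behaviour).

%% Sketch of declaring a behaviour contract with -callback attributes.
%% The compiler auto-generates behaviour_info/1 from these, which is
%% what gives dialyzer the "callback info" it was complaining about.
-callback handle_config_change(Section :: string(), Key :: string(),
                               Value :: string() | boolean() | deleted,
                               Persist :: boolean(), State :: term()) ->
    {ok, term()} | remove_handler.

-callback handle_config_terminate(Subscriber :: pid(), Reason :: term(),
                                  State :: term()) -> ok.
```

Modules implementing the behaviour then add `-behaviour(example_listener_behaviour).` and the compiler warns about missing or mis-specified callbacks.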
[GitHub] couchdb-khash pull request #9: Handle deprecated random module
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-khash/pull/9 Handle deprecated random module Use a compile time check for platform versions, then a macro conditional in a separate rand module. Removed redundant beam file compile rule from Makefile as it prevented erl_opts rebar options from taking effect. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/couchdb-khash rand-compat Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-khash/pull/9.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9 commit 7cacece3668fcbbccdf5533ef27190a00fecb3cc Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-10-03T06:03:19Z Handle deprecated random module Use a compile time check for platform versions, then a macro conditional in a separate rand module. Removed redundant beam file compile rule from Makefile as it prevented erl_opts rebar options from taking effect. ---
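A minimal sketch of the compile-time-check approach described above. The macro name, module name, and the `platform_define` regex are assumptions for illustration, not the actual khash code:

```erlang
%% In rebar.config, something along the lines of (regex illustrative):
%%   {erl_opts, [{platform_define, "^(R|17)", 'NO_RAND_MODULE'}]}.
-module(rand_compat_example).
-export([uniform/1]).

-ifdef(NO_RAND_MODULE).
%% Pre-18.0 releases: only the legacy random module exists.
uniform(N) -> random:uniform(N).
-else.
%% 18.0 and later: use the rand module.
uniform(N) -> rand:uniform(N).
-endif.
```

The macro is set per-platform at compile time, so only one branch ever reaches the beam file and no runtime version check is needed.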
[GitHub] couchdb-khash pull request #8: Fix iterator expiry test
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-khash/pull/8 Fix iterator expiry test This test started failing with the Erlang 20.0 release. The reason is that opaque NIF resources stopped being identified as empty binaries in Erlang, so what previously matched stopped matching once refs were used. Fixes #855 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/couchdb-khash fix-iterator-test Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-khash/pull/8.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8 commit 8048aaf74994b1f21a6259adcbe69ace3af4bbb3 Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-10-02T20:37:56Z Fix iterator expiry test This test started failing with the Erlang 20.0 release. The reason is that opaque NIF resources stopped being identified as empty binaries in Erlang, so what previously matched stopped matching once refs were used. Fixes #855 ---
[GitHub] couchdb-khash issue #7: Replace deprecated random module
Github user nickva commented on the issue: https://github.com/apache/couchdb-khash/pull/7 Will do! Good idea. I just noticed it was wonky ---
[GitHub] couchdb-khash pull request #7: Replace deprecated random module
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-khash/pull/7 Replace deprecated random module Replaced with crypto:rand_uniform functions. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/couchdb-khash replace-deprecated-random-module Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-khash/pull/7.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #7 commit c777cee7717f75199b298a6888bbea77a7ea599d Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-10-02T18:51:32Z Replace deprecated random module Replaced with crypto:rand_uniform functions. ---
[GitHub] couchdb-config pull request #16: Add longer timeouts for operations which co...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-config/pull/16 Add longer timeouts for operations which could write to disk It turns out that 5 seconds is often not enough in a severely throttled test environment, and simple operations like config:set and config:delete raise timeout errors. Increase default 5 second timeout to half a minute. This should hopefully handle even heavily throttled IO environments. Fixed #703 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-config issue-703 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-config/pull/16.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16 commit ac2a33e240669f24af5b8e86499d9b88f8df61b2 Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-07-22T03:13:01Z Add longer timeouts for operations which could write to disk It turns out that 5 seconds is often not enough in a severely throttled test environment, and simple operations like config:set and config:delete raise timeout errors. Increase default 5 second timeout to half a minute. This should hopefully handle even heavily throttled IO environments. Fixed #703 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] couchdb-ets-lru issue #5: Fix flaky tests
Github user nickva commented on the issue: https://github.com/apache/couchdb-ets-lru/pull/5 +1 ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 Closing, moved to monorepo PR https://github.com/apache/couchdb/pull/539 ---
[GitHub] couchdb-couch pull request #236: Close idle dbs
Github user nickva closed the pull request at: https://github.com/apache/couchdb-couch/pull/236 ---
[GitHub] couchdb-couch-replicator pull request #64: 63012 scheduler
Github user nickva closed the pull request at: https://github.com/apache/couchdb-couch-replicator/pull/64 ---
[GitHub] couchdb-couch pull request #238: Add _replication_start_time to the doc fiel...
Github user nickva closed the pull request at: https://github.com/apache/couchdb-couch/pull/238 ---
[GitHub] couchdb-couch-mrview issue #73: Fix unused variables warning
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch-mrview/pull/73 +1 ---
[GitHub] couchdb-couch issue #239: An ETS based couch_lru
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/239 I ended up not using `update_counter` because update_counter/4 is not available in R16B, which CouchDB still supports. Tried simply replacing update_element with update_counter/3, but that didn't seem to provide any visible speed improvement. @eiri thank you for the test module. I updated it to work with the ETS table, but kept the same structure. ---
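To illustrate the two ETS calls being compared (the table and key names here are made up for the example): `ets:update_element/3` sets a tuple position to an explicit value, while `ets:update_counter/3` atomically increments it and returns the new value.

```erlang
-module(counter_example).
-export([demo/0]).

demo() ->
    T = ets:new(demo_tab, [set, public]),
    true = ets:insert(T, {key, 0}),
    %% update_element/3: set position 2 to an explicit value
    true = ets:update_element(T, key, {2, 5}),
    %% update_counter/3: atomically add 1 to position 2, returning the result
    6 = ets:update_counter(T, key, {2, 1}),
    [{key, 6}] = ets:lookup(T, key),
    ok.
```

update_counter/4, the variant with a default for missing keys, is the one that only arrived after R16B.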
[GitHub] couchdb-couch pull request #239: An ETS based couch_lru
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-couch/pull/239#discussion_r106270691

```
--- Diff: src/couch_lru.erl ---
@@ -16,48 +16,57 @@
 -include_lib("couch/include/couch_db.hrl").

 new() ->
-    {gb_trees:empty(), dict:new()}.
-
-insert(DbName, {Tree0, Dict0}) ->
-    Lru = erlang:now(),
-    {gb_trees:insert(Lru, DbName, Tree0), dict:store(DbName, Lru, Dict0)}.
-
-update(DbName, {Tree0, Dict0}) ->
-    case dict:find(DbName, Dict0) of
-    {ok, Old} ->
-        New = erlang:now(),
-        Tree = gb_trees:insert(New, DbName, gb_trees:delete(Old, Tree0)),
-        Dict = dict:store(DbName, New, Dict0),
-        {Tree, Dict};
-    error ->
-        % We closed this database before processing the update. Ignore
-        {Tree0, Dict0}
+    Updates = ets:new(couch_lru_updates, [ordered_set]),
+    Dbs = ets:new(couch_lru_dbs, [set]),
+    {0, Updates, Dbs}.
+
+insert(DbName, {Count, Updates, Dbs}) ->
+    update(DbName, {Count, Updates, Dbs}).
+
+update(DbName, {Count, Updates, Dbs}) ->
--- End diff --
```

Though thinking about it, that would change the logic. Notice that now `Count` is a globally incrementing counter; the suggested change would make the count increment per db shard. ---
[GitHub] couchdb-couch pull request #239: An ETS based couch_lru
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-couch/pull/239#discussion_r106262038

```
--- Diff: src/couch_lru.erl ---
@@ -16,48 +16,57 @@
 -include_lib("couch/include/couch_db.hrl").

 new() ->
-    {gb_trees:empty(), dict:new()}.
-
-insert(DbName, {Tree0, Dict0}) ->
-    Lru = erlang:now(),
-    {gb_trees:insert(Lru, DbName, Tree0), dict:store(DbName, Lru, Dict0)}.
-
-update(DbName, {Tree0, Dict0}) ->
-    case dict:find(DbName, Dict0) of
-    {ok, Old} ->
-        New = erlang:now(),
-        Tree = gb_trees:insert(New, DbName, gb_trees:delete(Old, Tree0)),
-        Dict = dict:store(DbName, New, Dict0),
-        {Tree, Dict};
-    error ->
-        % We closed this database before processing the update. Ignore
-        {Tree0, Dict0}
+    Updates = ets:new(couch_lru_updates, [ordered_set]),
+    Dbs = ets:new(couch_lru_dbs, [set]),
+    {0, Updates, Dbs}.
+
+insert(DbName, {Count, Updates, Dbs}) ->
+    update(DbName, {Count, Updates, Dbs}).
+
+update(DbName, {Count, Updates, Dbs}) ->
+    case ets:lookup(Dbs, DbName) of
+    [] ->
+        true = ets:insert(Updates, {{Count, DbName}}),
--- End diff --
```

Ha! Caught my trick. It's to allow easier iteration without doing an extra lookup in close_int ---
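Taken together, the diff hunks above amount to this design: a global monotonically increasing counter acts as a logical clock, an `ordered_set` table keyed on `{Count, DbName}` preserves eviction order, and a `set` table maps each db name to its latest count so the stale ordered entry can be deleted on update. A small self-contained sketch of the idea (module and function names here are assumptions, not the exact PR code):

```erlang
-module(ets_lru_sketch).
-export([new/0, update/2, oldest/1]).

new() ->
    Updates = ets:new(lru_updates, [ordered_set, public]),
    Dbs = ets:new(lru_dbs, [set, public]),
    {0, Updates, Dbs}.

%% Bump DbName to the most recent position; Count is a global logical clock.
update(DbName, {Count, Updates, Dbs}) ->
    case ets:lookup(Dbs, DbName) of
        [{DbName, Old}] -> ets:delete(Updates, {Old, DbName});
        [] -> ok
    end,
    true = ets:insert(Updates, {{Count, DbName}}),
    true = ets:insert(Dbs, {DbName, Count}),
    {Count + 1, Updates, Dbs}.

%% The least recently used entry is the first key of the ordered_set.
oldest({_Count, Updates, _Dbs}) ->
    case ets:first(Updates) of
        '$end_of_table' -> undefined;
        {_, DbName} -> DbName
    end.
```

Because the table is an `ordered_set`, `ets:first/1` yields the least recently used db without scanning, and storing the whole `{Count, DbName}` pair as the key is what lets close_int walk candidates without an extra lookup.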
[GitHub] couchdb-couch issue #239: An ETS based couch_lru
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/239 @davisp @eiri I tried a 5 minute eprof run. At default 500 max_dbs_open. Cluster had 20 continuous 1-to-n replications.

```
master + ETS lru (this PR)

eprof:start_profiling([couch_server], {couch_lru, '_', '_'}),
timer:sleep(30), eprof:stop_profiling(), eprof:analyze(), eprof:stop().

****** Process <4641.251.0>    -- 100.00 % of profiled time ***
FUNCTION               CALLS       %     TIME  [uS / CALLS]
--------               -----     ---     ----  [----------]
couch_lru:insert/2       564    1.46    11437  [     20.28]
couch_lru:close_int/3   1800    5.36    42068  [     23.37]
couch_lru:close/1        450    6.29    49341  [    109.65]
couch_lru:update/2     26542   86.90   681987  [     25.69]
---------------------  -----  ------   ------  [----------]
Total:                 29356 100.00%   784833  [     26.74]

master + monotonic counter (https://github.com/cloudant/couchdb-couch/commit/4b49e71c14837c0f1b63e5d0ba2fea34c5fd997e)

eprof:start_profiling([couch_server], {couch_lru, '_', '_'}),
timer:sleep(30), eprof:stop_profiling(), eprof:analyze(), eprof:stop().

****** Process <4641.251.0>    -- 100.00 % of profiled time ***
FUNCTION               CALLS       %     TIME  [uS / CALLS]
--------               -----     ---     ----  [----------]
couch_lru:close_int/2    323    1.50    13817  [     42.78]
couch_lru:insert/2       621    2.36    21723  [     34.98]
couch_lru:close/1        320    3.90    35962  [    112.38]
couch_lru:update/2     25953   92.24   849899  [     32.75]
---------------------  -----  ------   ------  [----------]
Total:                 27217 100.00%   921401  [     33.85]

current master

eprof:start_profiling([couch_server], {couch_lru, '_', '_'}),
timer:sleep(30), eprof:stop_profiling(), eprof:analyze(), eprof:stop().

****** Process <4641.251.0>    -- 100.00 % of profiled time ***
FUNCTION               CALLS       %     TIME  [uS / CALLS]
--------               -----     ---     ----  [----------]
couch_lru:close_int/2    513    1.60    22893  [     44.63]
couch_lru:insert/2       727    2.06    29407  [     40.45]
couch_lru:close/1        507    3.90    55623  [    109.71]
couch_lru:update/2     35114   92.44  1318808  [     37.56]
---------------------  -----  ------  -------  [----------]
Total:                 36861 100.00% 1426731   [     38.71]
```
---
[GitHub] couchdb-couch pull request #239: An ETS based couch_lru
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/239 An ETS based couch_lru The interface is the same as previous couch_lru. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch ets-based-lru Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch/pull/239.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #239 ---
[GitHub] couchdb-couch issue #237: Add sys_dbs to the LRU
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/237 Tried it locally. Looks good. +1 after tests are fixed. We'd want to performance-test this at some point. ---
[GitHub] couchdb-chttpd pull request #158: 63012 scheduler
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-chttpd/pull/158 63012 scheduler This is part of a set of PRs to merge the new scheduling replicator Main PR is this: apache/couchdb-couch-replicator#64 Top level PR to gather and help test all the dependencies: apache/couchdb#454 COUCHDB-3324 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/couchdb-chttpd 63012-scheduler Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-chttpd/pull/158.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #158 commit b8c218b2e586adeb84d8d496105ede4cc6357331 Author: Benjamin Bastian <benjamin.bast...@gmail.com> Date: 2017-03-10T21:06:04Z Add `_scheduler/{jobs,docs}` API endpoints The `_scheduler/docs` endpoint provides a view of all replicator docs which have been seen by the scheduler. This endpoint includes useful information such as the state of the replication and the coordinator node. The `_scheduler/jobs` endpoint provides a view of all replications managed by the scheduler. This endpoint includes more information on the replication than the `_scheduler/docs` endpoint, including the history of state transitions of the replication. commit d493c9573f88520c5d24ef0f7ad9fc2e19a398b2 Author: Benjamin Bastian <benjamin.bast...@gmail.com> Date: 2017-03-14T18:15:52Z Merge pull request #3 from cloudant/63012-scheduler-tasks-api Add `_scheduler/{jobs,docs}` API endpoints ---
[GitHub] couchdb-couch pull request #238: Add _replication_start_time to the doc fiel...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/238 Add _replication_start_time to the doc field validation section. This is part of a set of PRs to merge the new scheduling replicator Main PR is this: https://github.com/apache/couchdb-couch-replicator/pull/64 Top level PR to gather and help test all the dependencies: https://github.com/apache/couchdb/pull/454 COUCHDB-3324 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/couchdb-couch 63012-scheduler Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch/pull/238.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #238 commit f92b68f66bfd286b8702527a2322cfc4c0c2a2bc Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2016-11-04T21:27:13Z Add _replication_start_time to the doc field validation section. ---
[GitHub] couchdb-couch-replicator pull request #64: 63012 scheduler
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/64 63012 scheduler Pull request to merge scheduling replicator work to ASF master. This repository has most of the changes. The feature overall consists of updates to 3 repositories: this one (replicator), couch and chttpd. To test all there is a separate top level PR to couchdb: https://github.com/apache/couchdb/pull/454 COUCHDB-3324 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/couchdb-couch-replicator 63012-scheduler Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch-replicator/pull/64.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #64 ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 And it would be non-portable (on Windows at least) ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 @davisp Python has a way to check file descriptor limits; I am not sure Erlang has one:

```
In [9]: resource.getrlimit(resource.RLIMIT_NOFILE)
Out[9]: (65536, 65536)

In [10]:
```

But we could perhaps shell out:

```
os:cmd("ulimit -Hn").
"65536\n"
```

It would be interesting to be able to detect the limit and then set ours a bit lower to allow for sockets. Or, at least if it is set too low, warn the user with a log message or something similar. ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 Updated to use the hibernate `wakeup` trick and to drop the silly timer as well ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 @eiri the eviction policy would be used when we are bumping against the limit, which could happen within a minute. The idle timeout, by contrast, could be for example 1 hour (maybe it should be?). Then if we try to open a db we'd want to evict some dbs right away, not wait an hour for the idle mechanism to remove them. So it is an immediate cleanup thing vs. a slowly-evict-unused-stuff thing ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 @eiri the lru would be used to remove dbs and make room immediately when we are bumping against the limit; specifically, it would remove non-sys db shards first. Idle-based removal is to clean up memory and processes which are not used anymore. I think maybe 1 min is too aggressive and we want a longer time scale, or 2 values: one for sys dbs and one for non-sys dbs. If we scan a shard and never look at it again, it remains in memory forever (the assumption is we'd use it again, but that is not always true). ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 The hibernate thing can be simplified then. I'll give that a try. On _users I see it with a few replications running: the replication sources and the _users shard appear in the ets table, then disappear and pop right back up. I am guessing those are opened for reading but couch_db_updater times out because there are no updates done to that shard? Is that a plausible explanation? Is it worth trying to factor the last open time into is_idle? ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 Ah, makes sense; we don't do that on delete, so I was wondering about that. You're right, will update the code. Also, what do you think of not using hibernate and instead just doing a gc, to avoid managing the extra timer reference? Another issue: is there anything we want to do to make sys dbs behave differently than non-sys dbs? Maybe longer timeouts for some? Then one more thing is reads: idleness doesn't take into account whether a db was recently read, so some dbs like _users end up read and opened, then close after a minute, then open again, and so on. Is it worth having an is_recently_idle function which checks the lru? ---
[GitHub] couchdb-couch pull request #236: Close idle dbs
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-couch/pull/236#discussion_r105918717

```
--- Diff: src/couch_db_updater.erl ---
@@ -1454,3 +1468,44 @@ default_security_object(_DbName) ->
     "everyone" -> []
     end.
+
+% These functions rely on using the process dictionary. This is usually
+% frowned upon, however in this case it is done to avoid changing to a
+% different server state record. Once PSE (Pluggable Storage Engine) code
+% lands this should be moved to the #db{} record.
+
+schedule_timeout() ->
+    case get(idle_timer_ref) of
+    TimerRef when is_reference(TimerRef) ->
+        erlang:cancel_timer(TimerRef);
+    undefined ->
+        ok
+    end,
+    case idle_limit() of
+    infinity ->
+        ok;
+    Timeout when is_integer(Timeout) ->
+        TRef = erlang:send_after(Timeout, self(), timeout),
--- End diff --
```

It was to preserve hibernate semantics. If we didn't have hibernate there for update_docs, we could have just returned a timeout in the gen_server replies. But because the reply is `{reply, Result, State, Timeout | hibernate}`, we could not both hibernate and trigger a timeout. The timeouts in the gen_server replies are used to detect idleness. I wonder if we just want to do a gc call instead of a hibernate and get rid of the extra timer management. ---
[GitHub] couchdb-couch issue #235: Allow limiting maximum document body size
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/235 @sagelywizard should be a bit better now. But I ended up with another PR for fabric as well. ---
[GitHub] couchdb-couch issue #235: Allow limiting maximum document body size
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/235

Related PRs:

* https://github.com/apache/couchdb-fabric/pull/91
* https://github.com/apache/couchdb-chttpd/pull/157
---
[GitHub] couchdb-fabric pull request #91: Allow limiting maximum document body size
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-fabric/pull/91

Allow limiting maximum document body size

Update doc function to check and validate document body sizes.

Main implementation is in PR: https://github.com/apache/couchdb-couch/pull/235

COUCHDB-2992

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-fabric couchdb-2992

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-fabric/pull/91.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #91

commit 3b15107df83a16a26dbc6c06a1a080437cb558b8
Author: Nick Vatamaniuc <vatam...@apache.org>
Date:   2017-03-14T06:59:31Z

    Allow limiting maximum document body size

    Update doc function to check and validate document body sizes.

    Main implementation is in PR: https://github.com/apache/couchdb-couch/pull/235

    COUCHDB-2992
---
[GitHub] couchdb-fabric issue #91: Allow limiting maximum document body size
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/91 Related to: https://github.com/apache/couchdb-couch/pull/235 ---
[GitHub] couchdb-couch issue #235: Allow limiting maximum document body size
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/235 @sagelywizard I added a validating version of the function. Other code, like cleanup_index_files, can still call the old function and not crash on a size limit change. ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 I updated the code to both bump down the open dbs count on close as well as remove entries from Lru. The last bit was shamelessly stolen from Eric. ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 @eiri Nice find. Thank you. Yap, I see: when we delete a non-sys_db we forget to handle the lru properly. ---
[GitHub] couchdb-couch pull request #236: Close idle dbs
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-couch/pull/236#discussion_r105822026

--- Diff: src/couch_db_updater.erl ---
@@ -1454,3 +1468,28 @@ default_security_object(_DbName) ->
         "everyone" -> []
     end.
+
+% These functions rely on using the process dictionary. This is usually frowned upon,
+% however in this case it is done to avoid changing to a different server state record
+
+schedule_timeout() ->
+    case get(idle_timer_ref) of
+        TimerRef when is_reference(TimerRef) ->
+            erlang:cancel_timer(TimerRef);
+        undefined ->
+            ok
+    end,
+    put(idle_timer_ref, erlang:send_after(idle_limit(), self(), timeout)).
+
+update_idle_limit_from_config() ->
+    Default = integer_to_list(?IDLE_LIMIT_DEFAULT),
+    IdleLimit = case config:get("couchdb", "idle_check_timeout", Default) of
+        "infinity" ->
+            infinity;
--- End diff --

Using `hibernate` here would force a hibernation, but then it would force it more often (on every callback) than before. We did that only after update_docs, and the rest was effectively `infinity`. I'll try to restore the old behavior, so on an infinity idle_check_timeout we hibernate after update_docs only. That way we have a way to restore the old mode if need be in production. Also, one downside of this simplistic approach with config is that setting infinity means the timeout will not fire, so the config value will not be re-read. Fixing that might involve a config change listener or sticking an update_idle_limit_from_config() call in other places that are called periodically. Is it worth doing that? ---
[GitHub] couchdb-couch pull request #235: Allow limiting maximum document body size
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-couch/pull/235#discussion_r105816177

--- Diff: src/couch_doc.erl ---
@@ -125,7 +125,14 @@ doc_to_json_obj(#doc{id=Id,deleted=Del,body=Body,revs={Start, RevIds},
 }.

 from_json_obj({Props}) ->
--- End diff --

Hmm, good point. It seems like the cleanest place code-wise, as we do other validations in that function (`throw({doc_validation,...`). Wonder if it makes sense to have a validating and a non-validating version of the function. ---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236

Since the first (dirty) is_idle check was moved to `couch_db_updater`, I wonder if it is even worth doing this additional read in couch_server:

```
close_db_if_idle(DbName) ->
    case ets:lookup(couch_dbs, DbName) of
        [#db{} = Db] ->
            gen_server:cast(couch_server, {close_db_if_idle, DbName});
        _ ->
            ok
    end.
```
---
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236

@davisp @sagelywizard Used the process dictionary to fix two issues:

1) Lack of configuration. Configuration is now read from `couchdb.idle_check_timeout` with a default of 6 msec. Every timeout period it is refreshed from the config module and put into the process dictionary.

2) Potentially having more than one timeout message in flight. This could happen if, shortly after a hibernate event, a gen_server message is handled which sets a new timeout, so there would then be two of them. To fix that, the timer ref from send_after is stuck in the process dictionary and cancelled before sending another. There is still a chance that a previous timeout is already in the message queue, but there is no chance of an overlapping timeout message cascade forming. ---
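The cancel-before-reschedule pattern described in 2) could also drain a timeout that has already landed in the mailbox; a sketch (the function name is illustrative, not the PR's exact code):

```erlang
% Sketch: cancel the old idle timer, flush a `timeout` message that may
% already have been delivered, then arm a fresh timer, so at most one
% idle timeout is ever in flight.
reschedule_idle(Timeout) ->
    case get(idle_timer_ref) of
        Ref when is_reference(Ref) ->
            erlang:cancel_timer(Ref),
            receive timeout -> ok after 0 -> ok end;  % drain stale message
        undefined ->
            ok
    end,
    put(idle_timer_ref, erlang:send_after(Timeout, self(), timeout)).
```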
[GitHub] couchdb-couch issue #236: Close idle dbs
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/236 @sagelywizard we are doing a check first to see if the shard is idle, to hopefully avoid backing up couch_server with messages. Also, because this will close idle shards, couch_server should have a bit of an easier time, since there will be fewer entries in the lru and in the couch_dbs ets table. Since the PSE merge would modify this code, I wonder if we should wait till that merge happens and then make it nicer and configurable. ---
[GitHub] couchdb-couch pull request #236: Close idle dbs
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/236

Close idle dbs

Previously, idle dbs, especially sys_dbs like _replicator shards, once opened for scanning would stay open forever. In a large cluster with many _replicator shards that can add up to a significant overhead, mostly in terms of the number of active processes. Add a mechanism to close dbs which have an idle db updater.

Before, hibernation was used to limit memory pressure; however, that is often not enough in practice.

Ideally the timeout value would be a configuration option; however, we don't want to add an ets call for every couch_db_updater callback, and modifying the db record is prohibitive for this patch. PSE does this work, though, and once it lands we can read the idle configuration when the process starts.

(Original idea for this belongs to Paul Davis)

COUCHDB-3323

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch close-idle-dbs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch/pull/236.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #236

commit 85a9f13632a6828420e01cc7d39cdaa7c19c2170
Author: Nick Vatamaniuc <vatam...@apache.org>
Date:   2017-03-10T19:24:01Z

    Close idle dbs

    Previously, idle dbs, especially sys_dbs like _replicator shards, once opened for scanning would stay open forever. In a large cluster with many _replicator shards that can add up to a significant overhead, mostly in terms of the number of active processes. Add a mechanism to close dbs which have an idle db updater. Before, hibernation was used to limit memory pressure; however, that is often not enough in practice.

    Ideally the timeout value would be a configuration option; however, we don't want to add an ets call for every couch_db_updater callback, and modifying the db record is prohibitive for this patch. PSE does this work, though, and once it lands we can read the idle configuration when the process starts.

    (Original idea for this belongs to Paul Davis)

    COUCHDB-3323
---
[GitHub] couchdb-chttpd issue #157: Allow limiting maximum document body size
Github user nickva commented on the issue: https://github.com/apache/couchdb-chttpd/pull/157 Tests are failing because it needs changes from https://github.com/apache/couchdb-couch/pull/235 ---
[GitHub] couchdb-chttpd pull request #157: Allow limiting maximum document body size
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-chttpd/pull/157

Allow limiting maximum document body size

This is the HTTP layer and some tests. The actual checking is done in the couch application's from_json_obj/1 function. If a document is too large it will return a 413 response code. The error reason will be the document ID. The intent is to help users identify the document if they used the _bulk_docs endpoint. It will also help the replicator skip over documents which are too large.

COUCHDB-2992

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-chttpd couchdb-2992

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-chttpd/pull/157.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #157

commit 95ecd629f77444dd5fa2820fbc18cccaba350f6d
Author: Nick Vatamaniuc <vatam...@apache.org>
Date:   2017-03-13T06:22:19Z

    Allow limiting maximum document body size

    This is the HTTP layer and some tests. The actual checking is done in the couch application's from_json_obj/1 function. If a document is too large it will return a 413 response code. The error reason will be the document ID. The intent is to help users identify the document if they used the _bulk_docs endpoint. It will also help the replicator skip over documents which are too large.

    COUCHDB-2992
---
[GitHub] couchdb-couch pull request #235: Allow limiting maximum document body size
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/235

Allow limiting maximum document body size

Configuration is via `couchdb.max_document_size`. In the past that was implemented as a maximum HTTP request body size; this finally implements it by actually checking a document's body size.

COUCHDB-2992

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch couchdb-2992

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch/pull/235.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #235

commit c51c3f2bdc8e2f2a135c8363096762607ed33f2c
Author: Nick Vatamaniuc <vatam...@apache.org>
Date:   2017-03-13T06:33:57Z

    Allow limiting maximum document body size

    Configuration is via `couchdb.max_document_size`. In the past that was implemented as a maximum HTTP request body size; this finally implements it by actually checking a document's body size.

    COUCHDB-2992
---
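A minimal sketch of what such a body-size check might look like. The real validation lives in couch_doc's from_json_obj/1; the function name here and the use of the JSON-encoded byte size as the metric are assumptions for illustration:

```erlang
% Illustrative only: reject a doc whose encoded body exceeds MaxSize.
% The thrown term carries the doc id, matching the 413 error reason
% described for the HTTP layer. jiffy encoding is an assumed size metric.
validate_doc_size(Id, Body, MaxSize) when is_integer(MaxSize) ->
    case byte_size(jiffy:encode(Body)) =< MaxSize of
        true  -> ok;
        false -> throw({request_entity_too_large, Id})
    end;
validate_doc_size(_Id, _Body, infinity) ->
    ok.  % no limit configured
```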
[GitHub] couchdb-couch-replicator pull request #63: Prevent change feeds from being s...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/63

Prevent change feeds from being stuck

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-replicator prevent-change-feeds-from-being-stuck

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-replicator/pull/63.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #63
---
[GitHub] couchdb-couch-replicator issue #62: Don't scan empty replicator databases
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch-replicator/pull/62

+1 Tested, it seems to work well. The log shows:

```
ignoring empty shards/8000-9fff/blah/_replicator.1489084820
```
---
[GitHub] couchdb-documentation issue #95: Grammar and an inaccuracy.
Github user nickva commented on the issue: https://github.com/apache/couchdb-documentation/pull/95 +1 Thank you for your contribution! I'll rebase on latest master, fix the tests, and merge it. ---
[GitHub] couchdb-documentation issue #101: Add documentation for the `$allMatch` sele...
Github user nickva commented on the issue: https://github.com/apache/couchdb-documentation/pull/101 +1 Thank you @satabin ---
[GitHub] couchdb-chttpd pull request #156: Introduce max_http_request_size to replace...
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-chttpd/pull/156#discussion_r104803991

--- Diff: src/chttpd.erl ---
@@ -630,7 +630,7 @@ body(#httpd{mochi_req=MochiReq, req_body=ReqBody}) ->
         undefined ->
             % Maximum size of document PUT request body (4GB)
             MaxSize = list_to_integer(
-                config:get("couchdb", "max_document_size", "4294967296")),
+                config:get("httpd", "max_http_request_size", "4294967296")),
--- End diff --

@iilyak updated to use get_integer ---
[GitHub] couchdb-chttpd pull request #156: Introduce max_http_request_size to replace...
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-chttpd/pull/156#discussion_r104803095

--- Diff: src/chttpd.erl ---
@@ -630,7 +630,7 @@ body(#httpd{mochi_req=MochiReq, req_body=ReqBody}) ->
         undefined ->
             % Maximum size of document PUT request body (4GB)
             MaxSize = list_to_integer(
-                config:get("couchdb", "max_document_size", "4294967296")),
+                config:get("httpd", "max_http_request_size", "4294967296")),
--- End diff --

Good idea ---
[GitHub] couchdb-couch issue #233: Rename max_document_size to max_http_request_size
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/233

Related PRs:

* https://github.com/apache/couchdb-couch-replicator/pull/61
* https://github.com/apache/couchdb-chttpd/pull/156
---
[GitHub] couchdb-couch-replicator pull request #61: Fix unit test after renaming max_...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/61

Fix unit test after renaming max_document_size config parameter

`couchdb.max_document_size` was renamed to `httpd.max_http_request_size`. The unit test was testing how the replicator behaves when faced with a reduced request size configuration on the target.

COUCHDB-2992

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-replicator 64229-add-new-request-parameter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-replicator/pull/61.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #61

commit d43aa56c8814dbfd94c8856dc017bfe45047
Author: Nick Vatamaniuc <vatam...@apache.org>
Date:   2017-03-07T22:24:00Z

    Fix unit test after renaming max_document_size config parameter

    `couchdb.max_document_size` was renamed to `httpd.max_http_request_size`. The unit test was testing how the replicator behaves when faced with a reduced request size configuration on the target.

    COUCHDB-2992
---
[GitHub] couchdb-chttpd issue #156: Introduce max_http_request_size to replace max_do...
Github user nickva commented on the issue: https://github.com/apache/couchdb-chttpd/pull/156 Related PR for couch: https://github.com/apache/couchdb-couch/pull/233 ---
[GitHub] couchdb-couch pull request #233: Rename max_document_size to max_http_reques...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/233

Rename max_document_size to max_http_request_size

`max_document_size` is implemented as `max_http_request_size`. There was no real check for document size. In some cases the implementation was close enough of a proxy (PUT-ing and GET-ing single docs), but in some edge cases, like _bulk_docs requests, the discrepancy between request size and document size could be rather large.

The section was changed accordingly from `couchdb` to `httpd`. `httpd` was chosen as it applies to both the clustered and the local interface.

There is a parallel effort to implement an actual max_document_size check. The set of commits should be merged close enough together to allow for a backwards compatible transition.

COUCHDB-2992

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch 64229-add-new-request-parameter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch/pull/233.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #233

commit d3055f191797f8b637e3b64f6a50e522d2c8bbde
Author: Nick Vatamaniuc <vatam...@apache.org>
Date:   2017-03-07T21:47:11Z

    Rename max_document_size to max_http_request_size

    `max_document_size` is implemented as `max_http_request_size`. There was no real check for document size. In some cases the implementation was close enough of a proxy (PUT-ing and GET-ing single docs), but in some edge cases, like _bulk_docs requests, the discrepancy between request size and document size could be rather large.

    The section was changed accordingly from `couchdb` to `httpd`. `httpd` was chosen as it applies to both the clustered and the local interface.

    There is a parallel effort to implement an actual max_document_size check. The set of commits should be merged close enough together to allow for a backwards compatible transition.

    COUCHDB-2992
---
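For the transition window described above, the two settings could be bridged with a fallback read along these lines; this is a sketch, not the compatibility code that actually shipped, though `config:get/3` is the same call used in the diffs quoted elsewhere in this thread:

```erlang
% Sketch: prefer the new httpd key, fall back to the legacy couchdb key
% if an operator still has it set, defaulting to 4 GiB as before.
max_http_request_size() ->
    Legacy = config:get("couchdb", "max_document_size", "4294967296"),
    list_to_integer(config:get("httpd", "max_http_request_size", Legacy)).
```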
[GitHub] couchdb-chttpd issue #156: Introduce max_http_request_size to replace max_do...
Github user nickva commented on the issue: https://github.com/apache/couchdb-chttpd/pull/156 I think for simplicity I'll try `httpd`, since the setting is used by both the clustered and un-clustered interface. And have 2 more PRs, for couch and replicator (it's used in a test there). ---
[GitHub] couchdb-chttpd issue #156: Introduce max_http_request_size to replace max_do...
Github user nickva commented on the issue: https://github.com/apache/couchdb-chttpd/pull/156 Wonder if we should use the httpd config section, since `max_http_request_size` relates more to the HTTP layer than to db core. ---
[GitHub] couchdb-couch-replicator pull request #60: Remove unused mp_parse_doc functi...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/60

Remove unused mp_parse_doc function from replicator

It was left accidentally when merging Cloudant's dbcore work.

COUCHDB-2992

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-2992-remove-dead-code

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-replicator/pull/60.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #60
---
[GitHub] couchdb-chttpd issue #156: Introduce max_http_request_size to replace max_do...
Github user nickva commented on the issue: https://github.com/apache/couchdb-chttpd/pull/156 Hmm, I don't see where we are using this additional mp_parse_doc function from replicator: https://github.com/apache/couchdb-couch-replicator/blob/master/src/couch_replicator_utils.erl#L452 I think it is dead code. ---
[GitHub] couchdb-couch pull request #231: Fix `badarith` error in couch_db:get_db_inf...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/231

Fix `badarith` error in couch_db:get_db_info call

When folding we account for a previous `null`, `undefined`, or a number. However, btree:size/1 returns 0, `nil`, or a number. Switched `undefined` to `nil`. Couldn't find where btree:size would ever return `undefined`; it seems we meant to use `nil` instead.

COUCHDB-3316

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch couchdb-3316

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch/pull/231.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #231

commit 63ef337475a266379c7208f52f827980e69d1d1b
Author: Nick Vatamaniuc <vatam...@apache.org>
Date:   2017-03-03T02:24:52Z

    Fix `badarith` error in couch_db:get_db_info call

    When folding we account for a previous `null`, `undefined`, or a number. However, btree:size/1 returns 0, `nil`, or a number. Switched `undefined` to `nil`. Couldn't find where btree:size would ever return `undefined`; it seems we meant to use `nil` instead.

    COUCHDB-3316
---
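The fix described above amounts to accumulating btree sizes with `nil` treated as 0; a rough sketch (the helper name is illustrative, not necessarily the PR's exact code):

```erlang
% btree:size/1 returns 0, nil, or an integer, so accumulate with nil
% mapped to 0. Matching on `undefined`, which never occurs, previously
% let nil reach the arithmetic and raise badarith.
sum_tree_sizes(Acc, []) ->
    Acc;
sum_tree_sizes(Acc, [nil | Rest]) ->
    sum_tree_sizes(Acc, Rest);
sum_tree_sizes(Acc, [Size | Rest]) when is_integer(Size) ->
    sum_tree_sizes(Acc + Size, Rest).
```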
[GitHub] couchdb-fabric issue #89: Prevent attachment upload from timing out during u...
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/89

Can now PUT approx a 1Gb attachment. Could never do that before:

```
./attach_large.py --size=10 --mintime=120
...
 > sent data in 0.000 sec res: None
 > Sent data: 100774 bytes, receiving
 > Received: 393 bytes
 > HTTP/1.1 201 Created
Cache-Control: must-revalidate
Content-Length: 67
Content-Type: application/json
Date: Tue, 21 Feb 2017 23:28:28 GMT
ETag: "1-2df9eed63e6f4df24c6a7b593adfc195"
Location: http://127.0.0.1:15984/db/doc2
Server: CouchDB/2.0.0 (Erlang OTP/18)
X-Couch-Request-ID: 01ba1602c1
X-CouchDB-Body-Time: 0

{"ok":true,"id":"doc2","rev":"1-2df9eed63e6f4df24c6a7b593adfc195"}
```

Link to attachment uploader test code: https://issues.apache.org/jira/secure/attachment/12853603/attach_large.py ---
[GitHub] couchdb-fabric pull request #89: Prevent attachment upload from timing out d...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-fabric/pull/89 Prevent attachment upload from timing out during update_docs fabric call Currently if an attachment was large enough or the connection was slow enough such that it took more than fabric.request_timeout = 6 milliseconds, the fabric request would time out during attachment data transfer from the coordinator node to the other nodes and the whole request would fail. This was most evident when replicating a database with large attachments. The fix is to periodically send `attachment_chunk_received` to the coordinator to prevent the timeout. COUCHDB-3302 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-fabric couchdb-3302 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-fabric/pull/89.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #89 commit 6e9074bc8778e00471d96191319ac67d6c78c05a Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-02-21T22:46:57Z Prevent attachment upload from timing out during update_docs fabric call Currently if an attachment was large enough or the connection was slow enough such that it took more than fabric.request_timeout = 6 milliseconds, the fabric request would time out during attachment data transfer from the coordinator node to the other nodes and the whole request would fail. This was most evident when replicating a database with large attachments. The fix is to periodically send `attachment_chunk_received` to the coordinator to prevent the timeout. COUCHDB-3302 ---
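The keep-alive idea in the fix above can be sketched roughly as follows. This is a hypothetical Python illustration, not the actual fabric code: `send_chunk`, `notify_coordinator`, and `notify_every` are invented names standing in for the Erlang message flow.

```python
def stream_attachment(chunks, send_chunk, notify_coordinator,
                      notify_every=4 * 1024 * 1024):
    """Relay attachment chunks, periodically pinging the coordinator.

    While attachment data flows from the coordinator to the other
    nodes, emit a keep-alive message ("attachment_chunk_received" in
    the PR) at least every `notify_every` bytes so the fabric request
    timeout is not hit mid-transfer on a large or slow upload.
    """
    since_notify = 0
    total = 0
    for chunk in chunks:
        send_chunk(chunk)
        total += len(chunk)
        since_notify += len(chunk)
        if since_notify >= notify_every:
            notify_coordinator("attachment_chunk_received")
            since_notify = 0
    return total
```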
[GitHub] couchdb-fabric issue #88: Use RealReplyCount to distinguish worker replies a...
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/88 _bulk_get and open_revs=all work +1 ---
[GitHub] couchdb-chttpd issue #155: Mock config module in tests
Github user nickva commented on the issue: https://github.com/apache/couchdb-chttpd/pull/155 Note: I couldn't reproduce this error locally. However, I have seen a similar failure when working on a couch PR, resulting from config being in the path of basic doc operations. The fix was similar to the one here, and tests still pass with this PR. ---
[GitHub] couchdb-chttpd issue #155: Mock config module in tests
Github user nickva commented on the issue: https://github.com/apache/couchdb-chttpd/pull/155 +1 ``` chttpd_error_info_tests: error_info_test (module 'chttpd_error_info_tests')...[0.010 s] ok === All 136 tests passed. ==> rel (eunit) ==> db-chttpd (eunit) ``` ---
[GitHub] couchdb-documentation issue #100: Some documentation improvements
Github user nickva commented on the issue: https://github.com/apache/couchdb-documentation/pull/100 +1 Thank you! I'll merge the commit and update master ---
[GitHub] couchdb-couch-replicator pull request #56: Use string formatting to shorten ...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/56 Use string formatting to shorten document ID during logging. Previously an explicit lists:sublist call was used, but the value was never used anywhere besides the log message. COUCHDB-3291 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3291-better-formatting Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch-replicator/pull/56.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #56 commit c306fab27dbd88d8ecc8f60fb5ec04e7911fd786 Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-02-08T17:02:34Z Use string formatting to shorten document ID during logging. Previously an explicit lists:sublist call was used, but the value was never used anywhere besides the log message. COUCHDB-3291 ---
[GitHub] couchdb-couch issue #226: Allow limiting length of document ID
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/226 @rnewson @kxepal pr to use infinity for replicator as well https://github.com/apache/couchdb-couch-replicator/pull/55 ---
[GitHub] couchdb-couch-replicator pull request #55: Switch replicator max_document_id...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/55 Switch replicator max_document_id_length config to use infinity Default value switched to be `infinity` instead of 0 COUCHDB-3291 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3291-use-infinity Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch-replicator/pull/55.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #55 commit 46f70c73427e618774872a388287ba682c1376f1 Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-02-08T16:46:13Z Switch replicator max_document_id_length config to use infinity Default value switched to be `infinity` instead of 0 COUCHDB-3291 ---
[GitHub] couchdb-couch issue #226: Allow limiting length of document ID
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/226 Will do. Maybe we should even have a config function get_integer_or_infinity? But that's for another PR... ---
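A hypothetical `get_integer_or_infinity` along the lines suggested in the comment above might look like this in Python. This is a sketch only: the name comes from the comment, but the signature and return convention are invented, and the real helper would live in the Erlang config module and return the atom `infinity` rather than a float.

```python
import math

def get_integer_or_infinity(value, default):
    """Parse a config value as an integer, with "infinity" meaning
    unlimited (modeled here as math.inf). `value` is the raw config
    string, or None when the key is unset, in which case `default`
    (also a string) is used instead."""
    if value is None:
        value = default
    if value == "infinity":
        return math.inf
    return int(value)
```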
[GitHub] couchdb-couch issue #226: Allow limiting length of document ID
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/226 Before setting the limit (python code): ``` d1 = {"_id":"2"*21} d.update([d1]) >>> [(True, u'2', u'1-967a00dff5e02add41819138abb3284d')] ``` after `> config:set("couchdb", "max_document_id_length", "20").` ``` d2 = {"_id":"x"*21} d.update([d2]) ServerError: (400, (u'illegal_docid', u'Document id is too long')) ``` ---
[GitHub] couchdb-couch pull request #226: Allow limiting length of document ID
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/226 Allow limiting length of document ID Previously it was not possible to define a maximum document ID size. That meant large document IDs would hit various limitations and corner cases. For example, large document IDs could be inserted via the _bulk_docs endpoint, but then trying to insert the same document via a single HTTP method like PUT would fail because of a limitation in Mochiweb's HTTP parser. Let operators specify a maximum document ID length via the: ``` couchdb.max_document_id_length = 0 ``` configuration. The default value of 0 means the current behavior, where document ID length is not checked. COUCHDB-3293 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch couchdb-3293 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch/pull/226.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #226 commit 8e1443a8165d07311ab1c0b1a5c0a33fb48de08b Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-02-07T04:53:51Z Allow limiting length of document ID Previously it was not possible to define a maximum document ID size. That meant large document IDs would hit various limitations and corner cases. For example, large document IDs could be inserted via the _bulk_docs endpoint, but then trying to insert the same document via a single HTTP method like PUT would fail because of a limitation in Mochiweb's HTTP parser. Let operators specify a maximum document ID length via the: ``` couchdb.max_document_id_length = 0 ``` configuration. The default value of 0 means the current behavior, where document ID length is not checked. COUCHDB-3293 ---
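The check described in the PR above can be sketched as follows. This is a minimal Python illustration assuming only the documented semantics (0 disables the check, longer IDs are rejected with "Document id is too long"), not the actual couch_doc validation code.

```python
def validate_doc_id(doc_id, max_length=0):
    """Reject document IDs longer than max_length bytes.

    max_length of 0 preserves the old behavior of not checking at
    all; any positive value rejects longer IDs, mirroring the 400
    illegal_docid error shown in the test comment above.
    """
    if max_length > 0 and len(doc_id.encode("utf-8")) > max_length:
        raise ValueError("Document id is too long")
    return doc_id
```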
[GitHub] couchdb-couch-replicator issue #54: Allow configuring maximum document ID le...
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch-replicator/pull/54 Thanks @iilyak ! ---
[GitHub] couchdb-fabric issue #86: Return Error When Workers Crash
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/86 @tonysun83 Yap it will still end up retrying in open_revs. It is not ideal but for now we can keep it like that ---
[GitHub] couchdb-couch-replicator pull request #54: Allow configuring maximum documen...
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-couch-replicator/pull/54#discussion_r99454566 --- Diff: src/couch_replicator_changes_reader.erl --- @@ -89,9 +89,20 @@ process_change(#doc_info{id = <<>>} = DocInfo, {_, Db, _, _, _}) -> "source database `~s` (_changes sequence ~p)", [couch_replicator_api_wrap:db_uri(Db), DocInfo#doc_info.high_seq]); -process_change(#doc_info{} = DocInfo, {_, _, ChangesQueue, _, _}) -> -ok = couch_work_queue:queue(ChangesQueue, DocInfo), -put(last_seq, DocInfo#doc_info.high_seq); +process_change(#doc_info{id = Id} = DocInfo, {Parent, Db, ChangesQueue, _, _}) -> +case is_doc_id_too_long(byte_size(Id)) of +true -> +ShortId = lists:sublist(binary_to_list(Id), 64), +SourceDb = couch_replicator_api_wrap:db_uri(Db), +couch_log:error("Replicator: document id `~s...` from source db " +" `~s` is too long, ignoring.", [ShortId, SourceDb]), +Stats = couch_replicator_stats:new([{doc_write_failures, 1}]), +ok = gen_server:call(Parent, {add_stats, Stats}, infinity), +ok; --- End diff -- You're right. That was silly of me. ---
[GitHub] couchdb-couch-replicator pull request #54: Allow configuring maximum documen...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/54 Allow configuring maximum document ID length during replication Currently, due to a bug in the http parser and the lack of document ID length enforcement, large document IDs will break replication jobs. Large IDs will pass through the _changes feed and revs diffs, but then fail during the open_revs get request. The open_revs request will keep retrying until it gives up after a long enough time; then the replication task crashes and restarts again with the same pattern. The current effective limit is around 8k or so. (The default buffer size is 8192, and if the first line of the request is larger than that, the request will fail.) (See http://erlang.org/pipermail/erlang-questions/2011-June/059567.html for more information about the possible failure mechanism). Bypassing the parser bug by increasing the recbuf size will allow replication to finish; however, that means simply spreading the abnormal document through the rest of the system, which might not always be desirable. Also, once long document IDs have been inserted in the source DB, simply deleting them doesn't work, as they'd still appear in the changes feed. They'd have to be purged or somehow skipped during the replication step. This commit helps do the latter. Operators can configure the maximum length via this setting: ``` replicator.max_document_id_length=0 ``` The default value is 0, which means there is no maximum enforced, which is the backwards-compatible behavior. During replication, if the maximum is hit by a document, that document is skipped, an error is written to the log: ``` Replicator: document id `a...` from source db `http://.../cdyno-001/` is too long, ignoring. ``` and the `"doc_write_failures"` statistic is bumped. COUCHDB-3291 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3291-limit-doc-id-size-in-replicator Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch-replicator/pull/54.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #54 commit 3ff2d83893481afd68025a52a6d859a2efaf0bcf Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-02-03T23:00:37Z Allow configuring maximum document ID length during replication Currently, due to a bug in the http parser and the lack of document ID length enforcement, large document IDs will break replication jobs. Large IDs will pass through the _changes feed and revs diffs, but then fail during the open_revs get request. The open_revs request will keep retrying until it gives up after a long enough time; then the replication task crashes and restarts again with the same pattern. The current effective limit is around 8k or so. (The default buffer size is 8192, and if the first line of the request is larger than that, the request will fail.) (See http://erlang.org/pipermail/erlang-questions/2011-June/059567.html for more information about the possible failure mechanism). Bypassing the parser bug by increasing the recbuf size will allow replication to finish; however, that means simply spreading the abnormal document through the rest of the system, which might not always be desirable. Also, once long document IDs have been inserted in the source DB, simply deleting them doesn't work, as they'd still appear in the changes feed. They'd have to be purged or somehow skipped during the replication step. This commit helps do the latter. Operators can configure the maximum length via this setting: ``` replicator.max_document_id_length=0 ``` The default value is 0, which means there is no maximum enforced, which is the backwards-compatible behavior. During replication, if the maximum is hit by a document, that document is skipped, an error is written to the log: ``` Replicator: document id `a...` from source db `http://.../cdyno-001/` is too long, ignoring. ``` and the `"doc_write_failures"` statistic is bumped. COUCHDB-3291 ---
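The skip-and-count behavior described above can be sketched in Python. The shape loosely follows the diff quoted elsewhere in this thread, but the arguments are illustrative stand-ins (a plain list for the Erlang work queue, a dict for the replicator stats, a callable for couch_log), not the actual couch_replicator code.

```python
def process_change(doc_id, max_length, queue, stats, log):
    """Queue a changed document for replication, unless its ID
    exceeds the configured maximum (0 means no limit, matching the
    backwards-compatible default). Over-long IDs are skipped: a
    truncated form is logged and doc_write_failures is bumped."""
    if 0 < max_length < len(doc_id):
        log("Replicator: document id `%s...` is too long, ignoring."
            % doc_id[:64])
        stats["doc_write_failures"] = stats.get("doc_write_failures", 0) + 1
        return False
    queue.append(doc_id)
    return True
```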
[GitHub] couchdb-couch pull request #225: Remove dead code from couch_file
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/225 Remove dead code from couch_file This code was left over after removing 8kB read-ahead https://github.com/apache/couchdb-couch/pull/223/commits/d52a5335d930d11ade4953c8576d22f55872ff6f COUCHDB-3284 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch couchdb-3284-remove-dead-code Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch/pull/225.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #225 commit bf12a7a3336bd438493e7db1d090160f1c1a09f3 Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-02-01T19:42:59Z Remove dead code from couch_file This code was left over after removing 8kB read-ahead https://github.com/apache/couchdb-couch/pull/223/commits/d52a5335d930d11ade4953c8576d22f55872ff6f COUCHDB-3284 ---
[GitHub] couchdb-fabric issue #86: Return Error When Workers Crash
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/86 A 5xx error seems more appropriate, as this implies something has gone wrong on the server. The other question could be what we should do in the replicator. Should we keep trying in the open_revs request or exit and crash the whole job? Most errors at that point will result in retries with backoff until it crashes with a `kaboom` exit. Then it will restart the whole replication job (which starts a new change feed etc.) and it repeats again. If the 3rd node was down temporarily and comes back, then retries will work ok. But what if it doesn't? Does the change feed crash or just skip the document... I am not entirely sure. ---
[GitHub] couchdb-couch issue #224: Don't crash on unexpected validation's error type
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/224 +1 as well Was curious how it worked. ``` In [33]: db = srv.create('db') In [34]: db['_design/des'] = {'validate_doc_update':'''function(nd,od,ctx) { throw({zig:"blah"}); }'''} In [35]: db['x'] = {} ServerError: (500, (u'unknown_error', u'function_clause')) ``` with fix ``` ServerError: (500, (u'unknown_error', u'blah')) ``` ---
[GitHub] couchdb-couch issue #223: Remove 8kB read-ahead from couch_file
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch/pull/223 Out of curiosity, created a branch which allows configuring file:open's read_ahead parameter: https://github.com/cloudant/couchdb-couch/commit/3e35ae40f71fa84a3453b178caf774147300e87d ---
[GitHub] couchdb-fabric issue #86: Return Error When Workers Crash
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/86 Created a 5-node cluster with `./dev/run -n5`. That did not go smoothly and I had to tweak it by hand some. Somehow ended up with a 'couchdb@127.0.0.1' node in mem3:nodes() instead of node4 and node5. Then created a `<<"db">>` and a `<<"doc1">>`; they ended up on nodes 2, 3 and 4 ``` rp(mem3:shards(<<"db">>, <<"doc1">>)). [{shard,<<"shards/c000-dfff/db.1485801065">>, 'node2@127.0.0.1',<<"db">>, [3221225472,3758096383], undefined}, {shard,<<"shards/c000-dfff/db.1485801065">>, 'node3@127.0.0.1',<<"db">>, [3221225472,3758096383], undefined}, {shard,<<"shards/c000-dfff/db.1485801065">>, 'node4@127.0.0.1',<<"db">>, [3221225472,3758096383], undefined}] ``` Put all 3 nodes in maintenance and when querying via http got a badmatch error ``` $ http 'http://adm:pass@localhost:15984/db/doc1?open_revs=["1-967a00dff5e02add41819138abb3284d"]=true=true' HTTP/1.1 500 Internal Server Error { "error": "badmatch", "reason": "{error,all_workers_died}", "ref": 281556634 } ``` Just to compare, *without* the fix I get: ``` http 'http://adm:pass@localhost:15984/db/doc1?open_revs=["1-967a00dff5e02add41819138abb3284d"]=true=true' 'Accept:application/json' HTTP/1.1 200 OK [] ``` Also notice, we apparently do handle the `{ok, []}` result and return a missing 404, but only in the open_revs=all case: ``` {ok, Results} = fabric:open_revs(Db, DocId, Revs, Options), case Results of [] when Revs == all -> chttpd:send_error(Req, {not_found, missing}); ``` In https://github.com/cloudant/couchdb-chttpd/blob/master/src/chttpd_db.erl#L675 ---
[GitHub] couchdb-couch pull request #223: Remove 8kB read-ahead from couch_file
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch/pull/223 Remove 8kB read-ahead from couch_file In production it showed an increased input Erlang IO and increased binary memory usage. See attached file in Jira ticket: COUCHDB-3284 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch couchdb-3284 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch/pull/223.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #223 commit d52a5335d930d11ade4953c8576d22f55872ff6f Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-01-27T17:04:01Z Remove 8kB read-ahead from couch_file In production it showed an increased input Erlang IO and increased binary memory usage. See attached file in Jira ticket: COUCHDB-3284 ---
[GitHub] couchdb-couch-replicator pull request #53: Fix shards db name typo from prev...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/53 Fix shards db name typo from previous commit Previous commit which switched to using mem3 for replicator shard discovery introduced a typo. `config:get("mem3", "shard_db", "dbs")` should be: `config:get("mem3", "shards_db", "_dbs")` COUCHDB-3277 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3277-typo Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch-replicator/pull/53.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #53 commit be0060f3fffc308b7532e6b99355f0e0cdede88e Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-01-25T04:17:26Z Fix shards db name typo from previous commit Previous commit which switched to using mem3 for replicator shard discovery introduced a typo. `config:get("mem3", "shard_db", "dbs")` should be: `config:get("mem3", "shards_db", "_dbs")` COUCHDB-3277 ---
[GitHub] couchdb-fabric issue #85: Add admin ctx on open ddoc in group_info
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/85 +1 ``` (node1@127.0.0.1)4> fabric:get_view_group_info(<<"_users">>, <<"_design/_auth">>). {ok,[{updates_pending,{[{minimum,2}, {preferred,0}, {total,2}]}}, {waiting_commit,false}, {waiting_clients,0}, {updater_running,false}, {update_seq,0}, {sizes,{[{file,408},{external,0},{active,0}]}}, {signature,<<"3e823c2a4383ac0c18d4e574135a5b08">>}, {purge_seq,0}, {language,<<"javascript">>}, {disk_size,408}, {data_size,0}, {compact_running,false}]} ``` ---
[GitHub] couchdb-couch-replicator issue #52: Use mem3 to discover all _replicator sha...
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch-replicator/pull/52 @kxepal Oh good point. And Hey! It's been a while :-) ---
[GitHub] couchdb-couch-replicator issue #52: Use mem3 to discover all _replicator sha...
Github user nickva commented on the issue: https://github.com/apache/couchdb-couch-replicator/pull/52 Did an unscientific performance benchmark. Created 280 `_replicator` dbs. Then logged how long it took scan_all_dbs to scan all the dbs, from the dbs db and from the file system: ``` scan_all_dbs done in 0.330834 sec``` vs ``` xxx scan all_dbs done in 2.277361 sec``` with a total of 2249 directories and 2540 files ---
[GitHub] couchdb-couch-replicator pull request #52: Use mem3 to discover all _replica...
Github user nickva commented on a diff in the pull request: https://github.com/apache/couchdb-couch-replicator/pull/52#discussion_r97558473 --- Diff: src/couch_replicator_manager.erl --- @@ -980,3 +990,34 @@ get_json_value(Key, Props, Default) when is_binary(Key) -> Else -> Else end. + + --- End diff -- @iilyak good catch, I messed up a merge ---
[GitHub] couchdb-couch-replicator pull request #52: Use mem3 to discover all _replica...
GitHub user nickva opened a pull request: https://github.com/apache/couchdb-couch-replicator/pull/52 Use mem3 to discover all _replicator shards in replicator manager Previously this was done via recursive db directory traversal, looking for shards names ending in `_replicator`. However, if there are orphanned shard files (not associated with a clustered db), replicator manager crashes. It restarts eventually, but as long as the orphanned shard file without an entry in dbs db is present on the file system, replicator manager will keep crashing and never reach some replication documents in shards which would be traversed after the problematic shard. The user-visible effect of this is some replication documents are never triggered. To fix, use mem3 to traverse and discover `_replicator` shards. This was used Cloudant's production code for many years it is battle-tested and it doesn't suffer from file system vs mem3 inconsistency. Local `_replicator` db is a special case. Since it is not clustered it will not appear in the clustered db list. However it is already handled as a special case in `init(_)` so that behavior is not affected by this change. COUCHDB-3277 You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3277 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/couchdb-couch-replicator/pull/52.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #52 commit 8205420d4249cea98ec5568344c43ccf11bbc9b1 Author: Nick Vatamaniuc <vatam...@apache.org> Date: 2017-01-24T05:35:32Z Use mem3 to discover all _replicator shards in replicator manager Previously this was done via recursive db directory traversal, looking for shards names ending in `_replicator`. 
However, if there are orphaned shard files (not associated with a clustered db), replicator manager crashes. It restarts eventually, but as long as the orphaned shard file without an entry in the dbs db is present on the file system, replicator manager will keep crashing and never reach some replication documents in shards which would be traversed after the problematic shard. The user-visible effect of this is that some replication documents are never triggered. To fix, use mem3 to traverse and discover `_replicator` shards. This approach was used in Cloudant's production code for many years; it is battle-tested and does not suffer from file system vs mem3 inconsistency. The local `_replicator` db is a special case. Since it is not clustered, it will not appear in the clustered db list. However, it is already handled as a special case in `init(_)`, so that behavior is not affected by this change. COUCHDB-3277 ---
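For context, a minimal sketch of the mem3-based discovery described above. The helper names and module structure here are illustrative assumptions, not the actual `couch_replicator_manager` code; the exact mem3 calls used in the PR may differ.

```erlang
%% Illustrative sketch only: discover `_replicator` shards through mem3
%% instead of recursively walking the database directory on disk, so
%% orphaned shard files (present on disk but absent from the dbs db)
%% are never visited. The mem3:shards/1 and mem3:name/1 calls follow
%% CouchDB's mem3 app; DbNames is assumed to be the list of clustered
%% db names obtained elsewhere (e.g. from the dbs db).
-module(replicator_shards_sketch).
-export([replicator_shard_names/1]).

replicator_shard_names(DbNames) ->
    [mem3:name(Shard) || DbName <- DbNames,
                         is_replicator_db(DbName),
                         Shard <- mem3:shards(DbName)].

%% True when the clustered db name ends in `_replicator`.
is_replicator_db(DbName) when is_binary(DbName) ->
    Suffix = <<"_replicator">>,
    SLen = byte_size(Suffix),
    byte_size(DbName) >= SLen andalso
        binary:part(DbName, byte_size(DbName) - SLen, SLen) =:= Suffix.
```

Because the shard list comes from mem3's view of the cluster rather than the file system, a stray `.couch` file on disk can no longer crash the traversal or hide documents in later shards.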
[GitHub] couchdb-fabric pull request #84: Fix open revs quorum when error
Github user nickva closed the pull request at: https://github.com/apache/couchdb-fabric/pull/84 ---
[GitHub] couchdb-fabric issue #84: Fix open revs quorum when error
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/84 Merged (rebased on master so this wasn't auto-closed). ---
[GitHub] couchdb-fabric issue #84: Fix open revs quorum when error
Github user nickva commented on the issue: https://github.com/apache/couchdb-fabric/pull/84 Apache Jira came back online, so I created a ticket there and updated the commits with the ticket reference. ---