[jira] [Created] (COUCHDB-3175) When PUT-ing a multipart/related doc with attachment get a 500 error on md5 mismatch

2016-10-03 Thread Nick Vatamaniuc (JIRA)
Nick Vatamaniuc created COUCHDB-3175:


 Summary: When PUT-ing a multipart/related doc with attachment get 
a 500 error on md5 mismatch
 Key: COUCHDB-3175
 URL: https://issues.apache.org/jira/browse/COUCHDB-3175
 Project: CouchDB
  Issue Type: Bug
Reporter: Nick Vatamaniuc


fabric_doc_updater handle_message crashes with a function_clause error, which 
brings down the whole request.

Instead, perhaps it should handle a clause like:

{code}
handle_message({md5_mismatch, Blah}, _Worker, _Acc0) -> ...
{code}

and return a 4xx code instead of a 500...
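
A minimal sketch of such a clause (the error term and how chttpd would map it to a status code are assumptions, not the actual contract of the fabric doc-update coordinator):

{code}
%% Sketch only: match the md5_mismatch message instead of letting it hit a
%% function_clause; the exact error tuple returned here is an assumption.
handle_message({md5_mismatch, _Details}, _Worker, _Acc0) ->
    {error, {bad_request, <<"md5 mismatch on uploaded attachment">>}}.
{code}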



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (COUCHDB-3174) max_document_size setting can be bypassed by issuing multipart/related requests

2016-10-03 Thread Nick Vatamaniuc (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15544234#comment-15544234
 ] 

Nick Vatamaniuc commented on COUCHDB-3174:
--

The problem seems to be here:

https://github.com/apache/couchdb-chttpd/blob/master/src/chttpd_db.erl#L763-L776


We need to call the json_body bit in order to get the max document size which 
is passed to `MochiReq:recv_body(MaxSize)`.

Presumably we could retrieve the Content-Length ourselves before multipart 
parsing and raise a 413, but I haven't thought about it too much yet...
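
A rough sketch of that idea (the function name is hypothetical; `config:get_integer/3` and `MochiReq:get_header_value/1` are existing calls, but the error term thrown and the default value here are assumptions):

{code}
%% Hypothetical pre-check before multipart parsing; not the actual chttpd code.
%% Throwing here assumes chttpd would translate the error into a 413.
check_content_length(MochiReq) ->
    MaxSize = config:get_integer("couchdb", "max_document_size", 4294967296),
    case MochiReq:get_header_value("Content-Length") of
        undefined ->
            ok;
        LenStr ->
            case list_to_integer(LenStr) > MaxSize of
                true  -> throw({request_entity_too_large, <<"Document too large">>});
                false -> ok
            end
    end.
{code}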

> max_document_size setting can be bypassed by issuing multipart/related 
> requests
> ---
>
> Key: COUCHDB-3174
> URL: https://issues.apache.org/jira/browse/COUCHDB-3174
> Project: CouchDB
>  Issue Type: Bug
>Reporter: Nick Vatamaniuc
>
> While testing how the replicator handled small values of the max_document_size 
> parameter, I discovered that if a user issues PUT requests which are 
> multipart/related, the max_document_size setting is bypassed.
> Below is a Wireshark capture of a PUT-with-attachments request coming from the 
> replicator in an EUnit test I wrote.  max_document_size was set to 1, yet a 70k 
> byte document with a 70k byte attachment was created.
> {code}
> PUT /eunit-test-db-147555017168185/doc0?new_edits=false HTTP/1.1
> Content-Type: multipart/related; boundary="e5d21d5fd988dc1c6c6e8911030213b3"
> Content-Length: 140515
> Accept: application/json
> --e5d21d5fd988dc1c6c6e8911030213b3
> Content-Type: application/json
> {"_id":"doc0","_rev":"1-40a6a02761aba1474c4a1ad9081a4c2e","x":"
> ...","_revisions":{"start":1,"ids":["40a6a02761aba1474c4a1ad9081a4c2e"]},"_attachments":{"att1":{"content_type":"app/binary","revpos":1,"digest":"md5-u+COd6RLUd6BGz0wJyuZFg==","length":7,"follows":true}}}
> --e5d21d5fd988dc1c6c6e8911030213b3
> Content-Disposition: attachment; filename="att1"
> Content-Type: app/binary
> Content-Length: 7
> xx
> --e5d21d5fd988dc1c6c6e8911030213b3--
> HTTP/1.1 201 Created
> {code}
> Here is a regular request which works as expected:
> {code}
> PUT /dbl/dl2 HTTP/1.1
> Content-Length: 100026
> Content-Type: application/json
> Accept: application/json
> {"_id": "dl2", "size": "...xxx"}
> HTTP/1.1 413 Request Entity Too Large
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (COUCHDB-3174) max_document_size setting can be bypassed by issuing multipart/related requests

2016-10-03 Thread Nick Vatamaniuc (JIRA)
Nick Vatamaniuc created COUCHDB-3174:


 Summary: max_document_size setting can be bypassed by issuing 
multipart/related requests
 Key: COUCHDB-3174
 URL: https://issues.apache.org/jira/browse/COUCHDB-3174
 Project: CouchDB
  Issue Type: Bug
Reporter: Nick Vatamaniuc


While testing how the replicator handled small values of the max_document_size 
parameter, I discovered that if a user issues PUT requests which are 
multipart/related, the max_document_size setting is bypassed.

Below is a Wireshark capture of a PUT-with-attachments request coming from the 
replicator in an EUnit test I wrote.  max_document_size was set to 1, yet a 70k 
byte document with a 70k byte attachment was created.

{code}
PUT /eunit-test-db-147555017168185/doc0?new_edits=false HTTP/1.1
Content-Type: multipart/related; boundary="e5d21d5fd988dc1c6c6e8911030213b3"
Content-Length: 140515
Accept: application/json

--e5d21d5fd988dc1c6c6e8911030213b3
Content-Type: application/json

{"_id":"doc0","_rev":"1-40a6a02761aba1474c4a1ad9081a4c2e","x":"
...","_revisions":{"start":1,"ids":["40a6a02761aba1474c4a1ad9081a4c2e"]},"_attachments":{"att1":{"content_type":"app/binary","revpos":1,"digest":"md5-u+COd6RLUd6BGz0wJyuZFg==","length":7,"follows":true}}}
--e5d21d5fd988dc1c6c6e8911030213b3
Content-Disposition: attachment; filename="att1"
Content-Type: app/binary
Content-Length: 7

xx
--e5d21d5fd988dc1c6c6e8911030213b3--

HTTP/1.1 201 Created
{code}


Here is a regular request which works as expected:

{code}
PUT /dbl/dl2 HTTP/1.1
Content-Length: 100026
Content-Type: application/json
Accept: application/json
{"_id": "dl2", "size": "...xxx"}

HTTP/1.1 413 Request Entity Too Large

{code}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] couchdb-couch-mrview issue #56: Make Query Limit Results Configurable

2016-10-03 Thread rnewson
Github user rnewson commented on the issue:

https://github.com/apache/couchdb-couch-mrview/pull/56
  
I still think the default behaviour of _all_docs, _changes and _view should 
be to return everything that the parameters dictate (that is, we should fix the 
default; the longstanding large-but-finite default is also a bug).

If we all agree to change that, then it needs to be abundantly clear to the 
user that they did not get all results. One suggestion is, instead of changing 
the default, we introduce an optional _maximum_ value for limit and then, if 
set, reject all requests without an explicit limit parameter. So that's a 400 
Bad Request if limit is too high _or_ if it's missing.
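
A hypothetical sketch of that validation (the config section/key, argument shape, and error terms are assumptions, not existing couch_mrview API):

```erlang
%% Sketch only: reject the query when a maximum is configured and the request
%% either omits limit or asks for more than the maximum.
validate_limit(ReqLimit) ->
    case config:get_integer("query_server_config", "max_limit", 0) of
        0 ->
            ok;  % no maximum configured, keep current behaviour
        _Max when ReqLimit =:= undefined ->
            throw({query_parse_error, <<"an explicit limit parameter is required">>});
        Max when ReqLimit > Max ->
            throw({query_parse_error, <<"limit exceeds the configured maximum">>});
        _ ->
            ok
    end.
```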




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch pull request #185: 3061 adaptive header search

2016-10-03 Thread jaydoane
Github user jaydoane commented on a diff in the pull request:

https://github.com/apache/couchdb-couch/pull/185#discussion_r81645651
  
--- Diff: src/couch_file.erl ---
@@ -531,27 +537,96 @@ find_header(Fd, Block) ->
 {ok, Bin} ->
 {ok, Bin};
 _Error ->
-find_header(Fd, Block -1)
+find_header_pread2(Fd, Block -1)
 end.
 
 load_header(Fd, Block) ->
 {ok, <<1, HeaderLen:32/integer, RestBlock/binary>>} =
 file:pread(Fd, Block * ?SIZE_BLOCK, ?SIZE_BLOCK),
-TotalBytes = calculate_total_read_len(5, HeaderLen),
-case TotalBytes > byte_size(RestBlock) of
+TotalSize = total_read_size(5, HeaderLen),
+case TotalSize > byte_size(RestBlock) of
 false ->
-<<RawBin:TotalBytes/binary, _/binary>> = RestBlock;
+<<RawBin:TotalSize/binary, _/binary>> = RestBlock;
 true ->
 {ok, Missing} = file:pread(
 Fd, (Block * ?SIZE_BLOCK) + 5 + byte_size(RestBlock),
-TotalBytes - byte_size(RestBlock)),
+TotalSize - byte_size(RestBlock)),
 RawBin = <<RestBlock/binary, Missing/binary>>
 end,
 <<Md5Sig:16/binary, HeaderBin/binary>> =
 iolist_to_binary(remove_block_prefixes(5, RawBin)),
 Md5Sig = couch_crypto:hash(md5, HeaderBin),
 {ok, HeaderBin}.
 
+
+%% Read a configurable number of block locations at a time using
+%% file:pread/2.  At each block location, read ?PREFIX_SIZE (5) bytes;
+%% if the first byte is <<1>>, assume it's a header block, and the
+%% next 4 bytes hold the size of the header.
+-spec find_header_pread2(file:fd(), block_id()) ->
+{ok, binary()} | no_valid_header.
+find_header_pread2(Fd, Block) ->
+find_header_pread2(Fd, Block, read_count()).
+
+-spec find_header_pread2(file:fd(), block_id(), non_neg_integer()) ->
+{ok, binary()} | no_valid_header.
+find_header_pread2(_Fd, Block, _ReadCount) when Block < 0 ->
+no_valid_header;
+find_header_pread2(Fd, Block, ReadCount) ->
+Locations = block_locations(Block, ReadCount),
+{ok, DataL} = file:pread(Fd, [{L, ?PREFIX_SIZE} || L <- Locations]),
+case locate_header(Fd, header_locations(Locations, DataL)) of
+{ok, _Location, HeaderBin} ->
+io:format("header at block ~p~n", [_Location div 
?SIZE_BLOCK]), % TEMP
+{ok, HeaderBin};
+_ ->
+ok = file:advise(
+Fd, hd(Locations), ReadCount * ?SIZE_BLOCK, dont_need),
+NextBlock = hd(Locations) div ?SIZE_BLOCK - 1,
+find_header_pread2(Fd, NextBlock, ReadCount)
+end.
+
+-spec read_count() -> non_neg_integer().
+read_count() ->
+config:get_integer("couchdb", "find_header_read_count", 
?DEFAULT_READ_COUNT).
+
+-spec block_locations(block_id(), non_neg_integer()) -> [location()].
+block_locations(Block, ReadCount) ->
+First = max(0, Block - ReadCount + 1),
+[?SIZE_BLOCK*B || B <- lists:seq(First, Block)].
+
+-spec header_locations([location()], [data()]) -> [{location(), 
header_size()}].
+header_locations(Locations, DataL) ->
+lists:foldl(fun
+({Loc, <<1, HeaderSize:32/integer>>}, Acc) ->
+[{Loc, HeaderSize} | Acc];
+(_, Acc) ->
+Acc
+end, [], lists:zip(Locations, DataL)).
+
+-spec locate_header(file:fd(), [{location(), header_size()}]) ->
+{ok, binary()} | not_found.
+locate_header(_Fd, []) ->
+not_found;
+locate_header(Fd, [{Location, Size}|LocationSizes]) ->
+case (catch load_header(Fd, Location, Size)) of
+{ok, HeaderBin} ->
+{ok, Location, HeaderBin};
+_Error ->
+locate_header(Fd, LocationSizes)
+end.
+
+-spec load_header(file:fd(), location(), header_size()) ->
+{ok, binary()} | no_return().
+load_header(Fd, Location, Size) ->
--- End diff --

@davisp, thank you for your patience. Your solution seems fine, even though 
it somewhat obscures the simplicity of the vectored algorithm. But I agree it's 
more idiomatic and easier to maintain your way, so I made the change pretty 
much verbatim.

Finally, I noted that the `find_header/2` recursion termination clause is 
no longer being used, and removed it.

Do you see any other changes that need to be made?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch-replicator issue #49: Fix replicator handling of max_document_...

2016-10-03 Thread rnewson
Github user rnewson commented on the issue:

https://github.com/apache/couchdb-couch-replicator/pull/49
  
+1 , v nice!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (COUCHDB-3168) Replicator doesn't handle well writing documents to a target db which has a small max_document_size

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543211#comment-15543211
 ] 

ASF GitHub Bot commented on COUCHDB-3168:
-

GitHub user nickva opened a pull request:

https://github.com/apache/couchdb-couch-replicator/pull/49

Fix replicator handling of max_document_size when posting to _bulk_docs

Currently the `max_document_size` setting is a misnomer; it actually configures
the maximum request body size. For single document requests it is a good enough
approximation. However, _bulk_docs updates could fail the total request size
check even if individual documents stay below the maximum limit.

Before this fix, during replication a `_bulk_docs` request would crash, which
eventually leads to an infinite cycle of crashes and restarts (with a
potentially large state being dumped to the logs), without the replication job
making progress.

The fix is to do a binary split on the batch size until either all documents
fit under the max_document_size limit, or some documents fail to replicate.

If documents fail to replicate, they bump the `doc_write_failures` count.
Effectively, `max_document_size` acts as an implicit replication filter in this
case.

Jira: COUCHDB-3168

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3168

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/couchdb-couch-replicator/pull/49.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #49


commit a9cd0b191524428ece0ebd0a1e18c88bb2afcbaa
Author: Nick Vatamaniuc 
Date:   2016-10-03T19:30:23Z

Fix replicator handling of max_document_size when posting to _bulk_docs

Currently the `max_document_size` setting is a misnomer; it actually configures
the maximum request body size. For single document requests it is a good enough
approximation. However, _bulk_docs updates could fail the total request size
check even if individual documents stay below the maximum limit.

Before this fix, during replication a `_bulk_docs` request would crash, which
eventually leads to an infinite cycle of crashes and restarts (with a
potentially large state being dumped to the logs), without the replication job
making progress.

The fix is to do a binary split on the batch size until either all documents
fit under the max_document_size limit, or some documents fail to replicate.

If documents fail to replicate, they bump the `doc_write_failures` count.
Effectively, `max_document_size` acts as an implicit replication filter in this
case.

Jira: COUCHDB-3168




> Replicator doesn't handle well writing documents to a target db which has a 
> small max_document_size
> ---
>
> Key: COUCHDB-3168
> URL: https://issues.apache.org/jira/browse/COUCHDB-3168
> Project: CouchDB
>  Issue Type: Bug
>Reporter: Nick Vatamaniuc
>
> If a target db has a smaller max document size set, replication crashes.
> It might make sense for the replication to not crash and instead treat 
> document size as an implicit replication filter, then display doc write 
> failures in the stats / task info / completion record of normal replications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] couchdb-couch-replicator pull request #49: Fix replicator handling of max_do...

2016-10-03 Thread nickva
GitHub user nickva opened a pull request:

https://github.com/apache/couchdb-couch-replicator/pull/49

Fix replicator handling of max_document_size when posting to _bulk_docs

Currently the `max_document_size` setting is a misnomer; it actually configures
the maximum request body size. For single document requests it is a good enough
approximation. However, _bulk_docs updates could fail the total request size
check even if individual documents stay below the maximum limit.

Before this fix, during replication a `_bulk_docs` request would crash, which
eventually leads to an infinite cycle of crashes and restarts (with a
potentially large state being dumped to the logs), without the replication job
making progress.

The fix is to do a binary split on the batch size until either all documents
fit under the max_document_size limit, or some documents fail to replicate.

If documents fail to replicate, they bump the `doc_write_failures` count.
Effectively, `max_document_size` acts as an implicit replication filter in this
case.

Jira: COUCHDB-3168
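
A rough sketch of the bisection described above (the function names and error term are assumptions, not the actual couch_replicator code):

```erlang
%% Sketch only: recursively split the batch in half whenever the target
%% rejects the request for being too large; a single doc that is still too
%% large becomes a doc_write_failure instead of crashing the job.
bulk_docs_split(Db, Docs) ->
    case send_bulk_docs(Db, Docs) of            % send_bulk_docs/2 is hypothetical
        ok ->
            [];
        {error, request_body_too_large} when length(Docs) > 1 ->
            {Left, Right} = lists:split(length(Docs) div 2, Docs),
            bulk_docs_split(Db, Left) ++ bulk_docs_split(Db, Right);
        {error, request_body_too_large} ->
            Docs                                 % counted as doc_write_failures
    end.
```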

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3168

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/couchdb-couch-replicator/pull/49.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #49


commit a9cd0b191524428ece0ebd0a1e18c88bb2afcbaa
Author: Nick Vatamaniuc 
Date:   2016-10-03T19:30:23Z

Fix replicator handling of max_document_size when posting to _bulk_docs

Currently the `max_document_size` setting is a misnomer; it actually configures
the maximum request body size. For single document requests it is a good enough
approximation. However, _bulk_docs updates could fail the total request size
check even if individual documents stay below the maximum limit.

Before this fix, during replication a `_bulk_docs` request would crash, which
eventually leads to an infinite cycle of crashes and restarts (with a
potentially large state being dumped to the logs), without the replication job
making progress.

The fix is to do a binary split on the batch size until either all documents
fit under the max_document_size limit, or some documents fail to replicate.

If documents fail to replicate, they bump the `doc_write_failures` count.
Effectively, `max_document_size` acts as an implicit replication filter in this
case.

Jira: COUCHDB-3168




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch-replicator issue #48: Port HTTP 429 Commits

2016-10-03 Thread tonysun83
Github user tonysun83 commented on the issue:

https://github.com/apache/couchdb-couch-replicator/pull/48
  
@kxepal: made every macro configurable
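
For reference, a hypothetical example of that pattern (the config section, key, and macro are illustrative, not the actual values from the PR):

```erlang
%% Instead of using ?MAX_BACKOFF_WAIT directly, read it from config with the
%% old macro value as the default.
max_backoff_wait() ->
    config:get_integer("replicator", "max_backoff_wait", ?MAX_BACKOFF_WAIT).
```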


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch-log issue #16: a systemd-journald compatible log output on std...

2016-10-03 Thread gdamjan
Github user gdamjan commented on the issue:

https://github.com/apache/couchdb-couch-log/pull/16
  
@kxepal I guess so, let me see what the systemd community thinks too (maybe 
there's previous precedent).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch-log issue #16: a systemd-journald compatible log output on std...

2016-10-03 Thread kxepal
Github user kxepal commented on the issue:

https://github.com/apache/couchdb-couch-log/pull/16
  
LGTM, but I'm not a systemd user.

@gdamjan maybe name it `journald` to directly match the service name, without 
leaving room for confusion?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch-mrview issue #56: Make Query Limit Results Configurable

2016-10-03 Thread kxepal
Github user kxepal commented on the issue:

https://github.com/apache/couchdb-couch-mrview/pull/56
  
@tonysun83 
> I still think we should make all our index limits configurable, starting 
with views. 

That's a good position as long as requesting indexes is not cheap. Say, there 
are some O(N) operations which can cause a DoS by requesting a small slice of 
big data. Otherwise, or if we can stream results chunk by chunk à la the 
changes feed without any harm to the server, there is no need for such 
limitations, even configurable ones, as they only complicate things while it's 
the client who causes the DoS on itself by requesting more than it can process.

I would like to look at this feature from that angle. Would the proposed limit 
save the server from some kind of resource drain, or is it about protecting 
the clients? Can we make all our indexes stream-like and encourage people to 
use them instead? Will that help everyone solve the problem of requesting 
unlimited data?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch-log issue #16: a systemd-journald compatible log output on std...

2016-10-03 Thread gdamjan
Github user gdamjan commented on the issue:

https://github.com/apache/couchdb-couch-log/pull/16
  
This setup works fine when configured in `local.ini`:
```
[log]
writer = journal
```

But perhaps it would be better if the writer could be forced from the systemd 
service file (via an environment variable or command-line option).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch-log pull request #16: a systemd-journald compatible log output...

2016-10-03 Thread gdamjan
GitHub user gdamjan opened a pull request:

https://github.com/apache/couchdb-couch-log/pull/16

a systemd-journald compatible log output on stderr

based on the stderr logger but changed:
- doesn't output the timestamp, since the journal already has a timestamp
- outputs the log level as a `<num>` prefix, where num is defined as in `sd-daemon(3)` (see the sketch below)

https://www.freedesktop.org/software/systemd/man/sd-daemon.html
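
For illustration, a sketch of the sd-daemon(3) priority prefix described above (the function and level names are assumptions, not the actual couch_log writer code):

```erlang
%% Map couch_log-style levels to sd-daemon(3) priorities and prefix the line;
%% journald picks the prefix up from stderr.
format(Level, Msg) ->
    ["<", level_to_prio(Level), ">", Msg, "\n"].

level_to_prio(debug)     -> "7";
level_to_prio(info)      -> "6";
level_to_prio(notice)    -> "5";
level_to_prio(warning)   -> "4";
level_to_prio(error)     -> "3";
level_to_prio(critical)  -> "2";
level_to_prio(alert)     -> "1";
level_to_prio(emergency) -> "0".
```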

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gdamjan/couchdb-couch-log master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/couchdb-couch-log/pull/16.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16


commit 406dccdc296e5638d7a52818c27180cfe168b0fd
Author: Damjan Georgievski 
Date:   2016-10-03T16:17:59Z

a systemd-journald compatible log output on stderr

based on the stderr logger but changed:
- doesn't output the timestamp, since the journal already has a timestamp
- outputs the log level as a `<num>` prefix, where num is defined as in `sd-daemon(3)`

https://www.freedesktop.org/software/systemd/man/sd-daemon.html




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-couch pull request #185: 3061 adaptive header search

2016-10-03 Thread davisp
Github user davisp commented on a diff in the pull request:

https://github.com/apache/couchdb-couch/pull/185#discussion_r81586171
  
--- Diff: src/couch_file.erl ---
@@ -531,27 +537,96 @@ find_header(Fd, Block) ->
 {ok, Bin} ->
 {ok, Bin};
 _Error ->
-find_header(Fd, Block -1)
+find_header_pread2(Fd, Block -1)
 end.
 
 load_header(Fd, Block) ->
 {ok, <<1, HeaderLen:32/integer, RestBlock/binary>>} =
 file:pread(Fd, Block * ?SIZE_BLOCK, ?SIZE_BLOCK),
-TotalBytes = calculate_total_read_len(5, HeaderLen),
-case TotalBytes > byte_size(RestBlock) of
+TotalSize = total_read_size(5, HeaderLen),
+case TotalSize > byte_size(RestBlock) of
 false ->
-<<RawBin:TotalBytes/binary, _/binary>> = RestBlock;
+<<RawBin:TotalSize/binary, _/binary>> = RestBlock;
 true ->
 {ok, Missing} = file:pread(
 Fd, (Block * ?SIZE_BLOCK) + 5 + byte_size(RestBlock),
-TotalBytes - byte_size(RestBlock)),
+TotalSize - byte_size(RestBlock)),
 RawBin = <<RestBlock/binary, Missing/binary>>
 end,
 <<Md5Sig:16/binary, HeaderBin/binary>> =
 iolist_to_binary(remove_block_prefixes(5, RawBin)),
 Md5Sig = couch_crypto:hash(md5, HeaderBin),
 {ok, HeaderBin}.
 
+
+%% Read a configurable number of block locations at a time using
+%% file:pread/2.  At each block location, read ?PREFIX_SIZE (5) bytes;
+%% if the first byte is <<1>>, assume it's a header block, and the
+%% next 4 bytes hold the size of the header.
+-spec find_header_pread2(file:fd(), block_id()) ->
+{ok, binary()} | no_valid_header.
+find_header_pread2(Fd, Block) ->
+find_header_pread2(Fd, Block, read_count()).
+
+-spec find_header_pread2(file:fd(), block_id(), non_neg_integer()) ->
+{ok, binary()} | no_valid_header.
+find_header_pread2(_Fd, Block, _ReadCount) when Block < 0 ->
+no_valid_header;
+find_header_pread2(Fd, Block, ReadCount) ->
+Locations = block_locations(Block, ReadCount),
+{ok, DataL} = file:pread(Fd, [{L, ?PREFIX_SIZE} || L <- Locations]),
+case locate_header(Fd, header_locations(Locations, DataL)) of
+{ok, _Location, HeaderBin} ->
+io:format("header at block ~p~n", [_Location div 
?SIZE_BLOCK]), % TEMP
+{ok, HeaderBin};
+_ ->
+ok = file:advise(
+Fd, hd(Locations), ReadCount * ?SIZE_BLOCK, dont_need),
+NextBlock = hd(Locations) div ?SIZE_BLOCK - 1,
+find_header_pread2(Fd, NextBlock, ReadCount)
+end.
+
+-spec read_count() -> non_neg_integer().
+read_count() ->
+config:get_integer("couchdb", "find_header_read_count", 
?DEFAULT_READ_COUNT).
+
+-spec block_locations(block_id(), non_neg_integer()) -> [location()].
+block_locations(Block, ReadCount) ->
+First = max(0, Block - ReadCount + 1),
+[?SIZE_BLOCK*B || B <- lists:seq(First, Block)].
+
+-spec header_locations([location()], [data()]) -> [{location(), 
header_size()}].
+header_locations(Locations, DataL) ->
+lists:foldl(fun
+({Loc, <<1, HeaderSize:32/integer>>}, Acc) ->
+[{Loc, HeaderSize} | Acc];
+(_, Acc) ->
+Acc
+end, [], lists:zip(Locations, DataL)).
+
+-spec locate_header(file:fd(), [{location(), header_size()}]) ->
+{ok, binary()} | not_found.
+locate_header(_Fd, []) ->
+not_found;
+locate_header(Fd, [{Location, Size}|LocationSizes]) ->
+case (catch load_header(Fd, Location, Size)) of
+{ok, HeaderBin} ->
+{ok, Location, HeaderBin};
+_Error ->
+locate_header(Fd, LocationSizes)
+end.
+
+-spec load_header(file:fd(), location(), header_size()) ->
+{ok, binary()} | no_return().
+load_header(Fd, Location, Size) ->
--- End diff --

How about something like this:

```erlang
load_header(Fd, Block) ->
    {ok, <<1, HeaderLen:32/integer, RestBlock/binary>>} =
        file:pread(Fd, Block * ?SIZE_BLOCK, ?SIZE_BLOCK),
    load_header(Fd, Block * ?SIZE_BLOCK, HeaderLen, RestBlock).


load_header(Fd, Pos, HeaderLen) ->
    load_header(Fd, Pos, HeaderLen, <<>>).


load_header(Fd, Pos, HeaderLen, RestBlock) ->
    TotalBytes = calculate_total_read_len(?PREFIX_SIZE, HeaderLen),
    RawBin = case TotalBytes =< byte_size(RestBlock) of
        true ->
            <<RawBin0:TotalBytes/binary, _/binary>> = RestBlock,
            RawBin0;
        false ->
            ReadStart = Pos + ?PREFIX_SIZE + byte_size(RestBlock),
            ReadLen = TotalBytes - byte_size(RestBlock),
            {ok, Missing} = file:pread(Fd, ReadStart, ReadLen),
            <<RestBlock/binary, Missing/binary>>
    end,
    <<Md5Sig:16/binary, HeaderBin/binary>> =
        iolist_to_binary(remove_block_prefixes(?PREFIX_SIZE, RawBin)),
    Md5Sig = couch_crypto:hash(md5, HeaderBin),
    {ok, HeaderBin}.
```

[jira] [Commented] (COUCHDB-3007) Navigating to 'Main config' fails after visiting 'CORS'

2016-10-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542126#comment-15542126
 ] 

ASF GitHub Bot commented on COUCHDB-3007:
-

Github user samk1 closed the pull request at:

https://github.com/apache/couchdb-fauxton/pull/750


> Navigating to 'Main config' fails after visiting 'CORS'
> ---
>
> Key: COUCHDB-3007
> URL: https://issues.apache.org/jira/browse/COUCHDB-3007
> Project: CouchDB
>  Issue Type: Bug
>  Components: Fauxton
>Reporter: Mathis Wiehl
>
> If the user visits the 'CORS' section in the Configuration tab and clicks on 
> 'Main config', the content of the main content pane doesn't change.
> The interesting thing, though, is that the '+ Add Option' button in the 
> header bar reappears when the 'Main config' button is clicked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] couchdb-fauxton pull request #750: Config section reactjs rewrite

2016-10-03 Thread samk1
Github user samk1 closed the pull request at:

https://github.com/apache/couchdb-fauxton/pull/750


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-fauxton issue #750: Config section reactjs rewrite

2016-10-03 Thread samk1
Github user samk1 commented on the issue:

https://github.com/apache/couchdb-fauxton/pull/750
  
Done, thank you.

On 3 October 2016 at 19:05, garren smith  wrote:

> @samk1  could you close this PR. We can't close
> it from our side. And the work is now all merged.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (COUCHDB-3173) Views return corrupt data for text fields containing non-BMP characters

2016-10-03 Thread Loke (JIRA)
Loke created COUCHDB-3173:
-

 Summary: Views return corrupt data for text fields containing 
non-BMP characters
 Key: COUCHDB-3173
 URL: https://issues.apache.org/jira/browse/COUCHDB-3173
 Project: CouchDB
  Issue Type: Bug
  Components: View Server Support
Reporter: Loke


When inserting non-BMP characters (i.e. characters with a Unicode codepoint 
above {{U+FFFF}}), the content gets corrupted after reading it from a view. 
Every instance of these characters is returned with an appended {{U+FFFD 
REPLACEMENT CHARACTER}}.

To reproduce, use the following commands.

Create the document containing a field with the character {{U+1F604 SMILING 
FACE WITH OPEN MOUTH AND SMILING EYES}}:

{noformat}
$ curl -X PUT -d '{"type":"foo","value":"😄"}' http://localhost:5984/foo/foo2
{"ok":true,"id":"foo2","rev":"1-d7da3cd352ef74f6391cc13601081214"}
{noformat}

Get the document to ensure that it was saved properly:

{noformat}
$ curl -X GET http://localhost:5984/foo/foo2
{"_id":"foo2","_rev":"1-d7da3cd352ef74f6391cc13601081214","type":"foo","value":"😄"}
{noformat}

Create a view that will return that document:

{noformat}
$ curl --user user:password -X PUT -d 
'{"language":"javascript","views":{"v":{"map":"function(doc){if(doc.type===\"foo\")emit(doc._id,doc);}"}}}'
 http://localhost:5984/foo/_design/bugdemo
{"ok":true,"id":"_design/bugdemo","rev":"1-817af2dafecb4cf8213aa7063551daac"}
{noformat}

Get the document from the view:

{noformat}
$ curl -X GET  http://localhost:5984/foo/_design/bugdemo/_view/v
{"total_rows":1,"offset":0,"rows":[
{"id":"foo2","key":"foo2","value":{"_id":"foo2","_rev":"1-d7da3cd352ef74f6391cc13601081214","type":"foo","value":"�"}}
]}
{noformat}

Now we can see that the field {{value}} contains two characters: the 
original character as well as {{U+FFFD}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] couchdb-fauxton issue #750: Config section reactjs rewrite

2016-10-03 Thread garrensmith
Github user garrensmith commented on the issue:

https://github.com/apache/couchdb-fauxton/pull/750
  
@samk1 could you close this PR. We can't close it from our side. And the 
work is now all merged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] couchdb-fauxton issue #781: Make jump to database async

2016-10-03 Thread garrensmith
Github user garrensmith commented on the issue:

https://github.com/apache/couchdb-fauxton/pull/781
  
This is merged


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---