Hi Robert,
1. I admit it was a bad example, because the W3C specification defines
no size limit for the HTTP POST method. Nevertheless, there are two
cases where limits can still apply in practice:
a) some ISPs limit the size of HTTP POST requests (I have heard of some
doing that, even if I cannot point to any);
b) cURL's default behavior, which can be suppressed by its options for
reading large POST data from a file (see the first sketch after this list).
2. Well, in my case, I was transferring the data from an Erlang list to
construct the cURL command. Multiple lists of the same kind of data were
held in memory at the same time, so I wouldn't say I hit the RAM limit
(when I tested the cURL command, I reduced the number of lists to be
sure I didn't hit the RAM limit). But I admit that at the time I didn't
think of (or discarded, because of my lack of knowledge) a cURL
limitation; I had the impression it was related to the command-line
length (I had never hit that limit before then, which was strange enough
for me, but I had no time to dig deeper into the problem - I think I was
just happy I had found a solution that worked).
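For reference, reading the POST body from a file keeps the shell command
line short no matter how big the payload is; a minimal sketch (the host
and file name are just placeholders):

# --data-binary posts the file verbatim; plain -d @file would strip
# newlines from it.
$ curl -H "Content-Type: application/json" -X POST \
       --data-binary @bulk.json \
       http://localhost:5984/dbmerkez/_bulk_docs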
And, yes, 255 kdocs can reach the RAM limit, as you said. In any case, I
would recommend uploading in chunks (defined by multipart, or simply by
dividing the documents among multiple independent cURL instances, as in
the sketch below).
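A minimal sketch of the second approach, assuming docs.txt holds one
JSON document per line (the file names and the 500-doc batch size are
arbitrary):

# Split into batches of 500 lines, wrap each batch as {"docs":[...]},
# and POST each batch separately.
$ split -l 500 docs.txt batch_
$ for f in batch_*; do
    { printf '{"docs":['; paste -sd, "$f"; printf ']}'; } > "$f.json"
    curl -H "Content-Type: application/json" -X POST \
         --data-binary @"$f.json" \
         http://localhost:5984/dbmerkez/_bulk_docs
  done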
CGS
On 01/10/2012 01:23 PM, Robert Newson wrote:
1) That refers to the length of the URL (to _bulk_docs, in this case),
not the body.
2) That refers to the length of the command line, not the lengths of
files referenced on the command line.
B.
On 10 January 2012 11:58, CGS<[email protected]> wrote:
There is a limit for sure, but there are two factors you have to consider:
1. the HTTP request limit in the number of characters (for example, read this:
http://stackoverflow.com/questions/2659952/maximum-length-of-http-get-request);
2. the shell command line under Linux/Cygwin has a maximum number of
characters (it depends on the Linux flavor; see below for a way to check it).
Under CentOS 6, I was able to send 800 documents per instance (document =
a few simple key-value pairs, including _id), but not 1000. At 1 kdocs I got
a shell error. Nevertheless, this test is not conclusive, because I used
CentOS 6 for both the client and the CouchDB server, and I don't know the
exact length of the command.
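If you want to know the limit on your own machine, POSIX getconf reports it:

# Maximum combined length of command-line arguments plus environment,
# in bytes (kernel-dependent).
$ getconf ARG_MAX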
CGS
On 01/10/2012 12:20 PM, Zekeriya KOÇ wrote:
Thanks for all the replies.
The problem was, first, the BOM character. After that, I split my files into
chunks smaller than 30 MB, and it started to work.
There is a request size limit, isn't there?
Again, thanks for all the replies.
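For anyone hitting the same BOM problem, a one-liner to strip it before
posting (this assumes GNU sed; test.txt is the file from the commands
further down the thread):

# Delete a leading UTF-8 BOM (bytes EF BB BF) in place.
$ sed -i '1s/^\xEF\xBB\xBF//' test.txt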
2012/1/10 CGS<[email protected]>
Oh, I forgot to write the solution, in case it's not obvious. Just divide
the docs among multiple instances of cURL and it will work. Don't
worry, you still keep the power of the bulk operation (I had an insertion
rate of about 5-6 kdocs/s on a not-that-great server, even though I had
to send more requests at the same time; a sketch follows).
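A sketch of running several instances at once, assuming the chunk files
already exist (the file pattern, the concurrency of 4, and the URL are
placeholders; -P needs GNU xargs):

# Upload up to four chunk files in parallel.
$ printf '%s\n' batch_*.json | xargs -P 4 -I{} \
    curl -H "Content-Type: application/json" -X POST \
         --data-binary @{} \
         http://localhost:5984/dbmerkez/_bulk_docs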
CGS
On 01/10/2012 11:45 AM, CGS wrote:
Hi,
With 255,000 documents in one session, you go over the number of
characters allowed either for a shell command or for an HTTP request (if
not for both). The command gets truncated, so your JSON is incomplete.
That gave me the same response in the past.
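You can reproduce the effect yourself: cut a valid payload short and it
stops being valid JSON (the file names are examples; python -m json.tool
ships with any standard Python):

# Chop the payload mid-document, then check it.
$ head -c 100000 bulk.json > truncated.json
$ python -m json.tool < truncated.json > /dev/null || echo "invalid JSON"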
CGS
On 01/10/2012 10:11 AM, Zekeriya KOÇ wrote:
Sorry for the subjectless message!!!
Hello,
my problem: I am trying to insert approximately 255,000 documents into a
CouchDB instance with the bulk docs API. I always get an invalid JSON
error.
So I am trying to test the problem with just one document, because the
error is raised whether with a large file or with a file containing just
one document.
my system:
couchdb: on an Ubuntu Server 10.04
client: Windows 7 with Cygwin cURL
$ curl -X GET http://admin:ad...@10.81.2.100:5984
{"couchdb":"Welcome","version"**:"1.1.0","vendor":
{"version":"1.2.0","name":"**Couchbase","url":"http://
www.couchbase.com/<http://www.**google.com/url?sa=D&q=www.**
couchbase.com/&usg=**AFQjCNGuaH0E_Cygc_yqQqgX0s-**cmb5BuQ<http://www.google.com/url?sa=D&q=www.couchbase.com/&usg=AFQjCNGuaH0E_Cygc_yqQqgX0s-cmb5BuQ>>
"}}
$ curl -d @test.txt -H "Content-Type:application/json" -X POST
http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
{"error":"bad_request","reason":"invalid UTF-8 JSON: <<\"\ufeff{\\\"docs\\\":[{\\\"adi\\\": \\\"zeko\\\"}]}\">>"}
Now I copy the content of test.txt and paste it into my command line:
$ curl -d '{"docs":[{"adi": "zeko"}]}' -H "Content-Type:application/
json" -X POST http://admin:ad...<https://**groups.google.com/groups/**
unlock?hl=tr&_done=/group/**couchbase/browse_thread/**
thread/7f908b186f025047%3Fhl%**3Dtr&msg=25cba4108fd1a8e8<https://groups.google.com/groups/unlock?hl=tr&_done=/group/couchbase/browse_thread/thread/7f908b186f025047%3Fhl%3Dtr&msg=25cba4108fd1a8e8>
@10.81.2.100:5984/dbmerkez/_**bulk_docs<http://10.81.2.100:5984/dbmerkez/_bulk_docs>
[{"id":"**74a5d37e71215e2095d00f90a00007**ac","rev":"1-**111c10804ee9f2b8384ab95e
f66268e0"}]
As you can see, the same content gives an invalid JSON error when sent
from a file, but from the direct command line it inserts fine.
My text file is encoded in UTF-8.
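A quick way to see whether the file starts with a hidden BOM (the \ufeff
in the error above is exactly that):

# If the first bytes read "ef bb bf", the file carries a UTF-8 BOM.
$ od -t x1 test.txt | head -n 1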
I am so close to giving up. I have been fighting with this for hours. If
I cannot insert the initial data into my instance, I cannot test the
replication cases.
Please help!!