We have an application written in C, running on both Windows and Linux, which uses libcurl's easy interface to talk to an Apache server via an HTTP API over the internet. Normally the API transactions are short and bandwidth is inconsequential, but recently we have had to start working with a transaction which sends large volumes of data to the API. In our case, "large" means up to ~10MB per HTTP POST, though a POST can be as small as a few KB. We'll be making many of these transactions, sending anywhere from ~100MB to 100GB during program execution.
The problem we've encountered is that our total bandwidth (total amount of data sent / (time taken to execute all upload transactions - server-side processing time)) is quite bad. If we take the raw data and FTP it to the same webserver which hosts the Apache instance, we get ~5x more bandwidth than with the current setup. We understand that sending the data in chunks will eat into performance to some degree, but a 5x slowdown seems excessive (maybe it's not, that's why we're asking!).

More precisely: if we take the data being sent as a single text file, compress it using deflate and FTP the resulting 125MB file with FileZilla, it takes about 2 minutes (8Mb/s). Our application POSTs the data one chunk at a time, each chunk individually compressed, so it sends 175MB of data, but takes 14.5 minutes (1.5Mb/s).

The distribution of POST body sizes is as follows:

  Range:   3.7 KB - 9.6 MB
  Median:  55 KB
  Average: 1.6 MB

Less than 10% of POST bodies are less than 1500 bytes.

We're looking for suggestions to improve performance, or an argument as to why it can't be improved under our constraints. We must continue to use HTTP POSTs to send the data one chunk at a time. Ideally there'd be some way to configure libcurl to perform better with our non-standard load (we've read about TCP_CORK, but weren't able to find any documentation on how to use it with libcurl).
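For reference, here is a minimal sketch of how the bandwidth figures above are computed. It follows the formula given earlier (data sent divided by elapsed time minus server-side processing time). The function names `effective_mbps` and `report_mbps` are made up for illustration, and decimal units (1 MB = 1,000,000 bytes, 1 Mb = 1,000,000 bits) are an assumption on our part.

```c
#include <stdio.h>

/* Assumption: decimal units, 1 MB = 1,000,000 bytes, 1 Mb = 1,000,000 bits. */
#define BITS_PER_BYTE    8.0
#define BITS_PER_MEGABIT 1e6

/*
 * Effective upload bandwidth in Mb/s:
 *   bytes sent / (total elapsed time - server-side processing time)
 * Returns -1.0 if the net transfer time is not positive.
 */
double effective_mbps(double bytes_sent, double elapsed_s, double server_s)
{
    double net_s = elapsed_s - server_s;
    if (net_s <= 0.0)
        return -1.0;
    return bytes_sent * BITS_PER_BYTE / BITS_PER_MEGABIT / net_s;
}

/* Prints one labelled result line (hypothetical helper, for illustration). */
void report_mbps(const char *label, double bytes_sent,
                 double elapsed_s, double server_s)
{
    printf("%-5s %.1f Mb/s\n", label,
           effective_mbps(bytes_sent, elapsed_s, server_s));
}
```

Calling `report_mbps("FTP", 125e6, 120.0, 0.0)` and `report_mbps("POST", 175e6, 870.0, 0.0)`, with server-side time taken as zero purely for the sake of the example, gives roughly 8.3 and 1.6 Mb/s, close to the rounded figures quoted above.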
Roughly, our code currently looks like this (error checking removed for terseness):

    CURL *cur = curl_easy_init();
    struct curl_slist *headers = NULL;

    headers = curl_slist_append(headers, "Connection: Keep-Alive");
    headers = curl_slist_append(headers, "Keep-Alive: 60");
    headers = curl_slist_append(headers, "Content-Type: text/xml");
    headers = curl_slist_append(headers, xff); // xff declared elsewhere

    curl_easy_setopt(cur, CURLOPT_PROXY, proxyurl); // proxyurl declared elsewhere
    // proxyauth declared elsewhere
    curl_easy_setopt(cur, CURLOPT_PROXYUSERPWD, proxyauth);
    curl_easy_setopt(cur, CURLOPT_URL, urlbuf); // urlbuf declared elsewhere
    // user_agent declared elsewhere
    curl_easy_setopt(cur, CURLOPT_USERAGENT, user_agent);

    // callback makes a copy of a tiny (<100 byte) response which is processed later.
    curl_easy_setopt(cur, CURLOPT_WRITEFUNCTION, callback);
    curl_easy_setopt(cur, CURLOPT_WRITEDATA, ptr); // ptr declared elsewhere

    // postBody is the up to ~10MB data buffer being sent
    curl_easy_setopt(cur, CURLOPT_POSTFIELDS, postBody);
    curl_easy_setopt(cur, CURLOPT_POSTFIELDSIZE, postBodySize); // postBody size
    curl_easy_setopt(cur, CURLOPT_HTTPHEADER, headers);

    // libcurl expects long arguments for numeric options
    curl_easy_setopt(cur, CURLOPT_BUFFERSIZE, (long)CURL_MAX_WRITE_SIZE);
    curl_easy_setopt(cur, CURLOPT_NOSIGNAL, 1L);
    curl_easy_setopt(cur, CURLOPT_SSL_VERIFYPEER, 0L);
    // connectTimeout declared elsewhere
    curl_easy_setopt(cur, CURLOPT_CONNECTTIMEOUT, connectTimeout);
    curl_easy_setopt(cur, CURLOPT_TIMEOUT, timeout); // timeout declared elsewhere

    /* Turn on TCP keep-alive probes */
    curl_easy_setopt(cur, CURLOPT_TCP_KEEPALIVE, 1L);

    curl_easy_perform(cur);

Any and all thoughts/suggestions are very much appreciated.
