Re: [PATCH] http.c: don't rewrite the user:passwd string multiple times
On Tue, 18 Jun 2013, Jeff King wrote:

>> But, I don't know if there is any multi-processing happening within the curl library.
>
> I don't think curl does any threading; when we are not inside curl_*_perform, there is no curl code running at all (Daniel can correct me if I'm wrong on that).

Correct, that's true. The default setup of libcurl never uses any threading at all; everything is done using non-blocking calls and state machines.

There is one minor exception, so let me describe that case just to be perfectly clear:

When you've built libcurl with the threaded resolver backend, libcurl fires up a new thread to resolve host names during the name resolving phase of a transfer, and that thread can then actually continue to run when curl_multi_perform() returns.

That is, however, very isolated, strictly for name resolving only, and there should be no way for an application to mess that up. Nothing of what you've discussed in this thread would affect or harm that thread.

The biggest impact it tends to have on applications (those that aren't following the API properly or assume a little too much) is that it slightly changes the nature of what file descriptors to wait for during the name resolve phase.

Some Linux distros ship their default libcurl builds using the threaded resolver.

--
 / daniel.haxx.se
--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] http.c: don't rewrite the user:passwd string multiple times
On Tue, 18 Jun 2013, Jeff King wrote:

TL;DR: I'm just confirming what's said here! =)

> My understanding of curl's pointer requirements is:
>
>  1. Older versions of curl (and I do not recall which version off-hand, but it is not important) stored just the pointer. Calling code was required to manage the string lifetime itself.
>
>  2. Newer versions of curl will strdup the string in curl_easy_setopt.

That's correct. The new behavior in (2) was introduced in libcurl 7.17.0, released in September 2007, so older versions should be fairly rare by now.

I mention this primarily because I think it should be noted that there will probably be very little testing by users with such old libcurl versions. That may increase the time between a committed change and people noticing breakages caused by it.

Even Debian old-stable has a much newer version.

> For older versions, if we were to grow the strbuf, we might free() the pointer provided to an earlier call to curl_easy_setopt. But since we are about to call curl_easy_setopt with the new value, I would assume that curl will never actually look at the old one (i.e., when replacing an old pointer, it would not dereference it, but simply overwrite it with the new value).

Another accurate description.

--
 / daniel.haxx.se
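The lifetime difference confirmed above can be illustrated without linking against libcurl at all. The following is only a minimal sketch: `mock_handle`, `mock_setopt_borrow()`, `mock_setopt_copy()` and `mock_cleanup()` are hypothetical stand-ins invented for illustration, not libcurl APIs. They model the two behaviours as described in this thread (pointer stored before 7.17.0, string copied from 7.17.0 on).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a curl easy handle; NOT libcurl's real struct. */
struct mock_handle {
	const char *userpwd; /* what the handle would read at perform time */
	char *owned;         /* non-NULL only when the handle made its own copy */
};

/* Models the pre-7.17.0 behaviour described above: keep only the pointer. */
static void mock_setopt_borrow(struct mock_handle *h, const char *s)
{
	free(h->owned);
	h->owned = NULL;
	h->userpwd = s; /* caller must keep s alive and unchanged */
}

/* Models the 7.17.0+ behaviour described above: copy the string. */
static int mock_setopt_copy(struct mock_handle *h, const char *s)
{
	char *dup = malloc(strlen(s) + 1);

	if (!dup)
		return -1;
	strcpy(dup, s);
	free(h->owned); /* the old copy is replaced, never dereferenced */
	h->owned = dup;
	h->userpwd = dup;
	return 0;
}

/* Release whatever the mock handle owns. */
static void mock_cleanup(struct mock_handle *h)
{
	free(h->owned);
	h->owned = NULL;
	h->userpwd = NULL;
}
```

With a borrowing handle, any later write to the caller's buffer is visible to the handle; with a copying handle it is not. That is why the analysis differs between the two sets of versions.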
Re: [PATCH] http.c: don't rewrite the user:passwd string multiple times
Daniel Stenberg <dan...@haxx.se> writes:

> On Tue, 18 Jun 2013, Jeff King wrote:
>
> TL;DR: I'm just confirming what's said here! =)

Thanks. We are very fortunate to have you as the cURL guru who gives prompt responses and sanity checks to us.
Re: [PATCH] http.c: don't rewrite the user:passwd string multiple times
On Mon, Jun 17, 2013 at 10:19 PM, Jeff King <p...@peff.net> wrote:
> On Mon, Jun 17, 2013 at 07:00:40PM -0700, Brandon Casey wrote:
>
>> Curl requires that we manage any strings that we pass to it as pointers. So, we should not be overwriting this strbuf after we've passed it to curl.
>
> My understanding of curl's pointer requirements is:
>
>  1. Older versions of curl (and I do not recall which version off-hand, but it is not important) stored just the pointer. Calling code was required to manage the string lifetime itself.

Daniel mentions that the change happened in libcurl 7.17. RHEL 4.X (yes, ancient, dead, I realize) provides 7.12 and RHEL 5.X (yes, ancient, but still widely in use) provides 7.15. Just pointing it out.

>  2. Newer versions of curl will strdup the string in curl_easy_setopt.
>
> So we do not have to worry about newer versions, as they do not care about our pointer after curl_easy_setopt returns.

I was probably reading the docs on one of these older platforms when I wrote this. I've actually had this patch sitting around for a while.

> For older versions, if we were to grow the strbuf, we might free() the pointer provided to an earlier call to curl_easy_setopt. But since we are about to call curl_easy_setopt with the new value, I would assume that curl will never actually look at the old one (i.e., when replacing an old pointer, it would not dereference it, but simply overwrite it with the new value).
>
> So for a single curl handle, I don't think it is a problem. It could be a problem when we have multiple handles in play simultaneously (we invalidate the pointer that another simultaneous handle is using, but do not immediately reset its pointer).

Don't we have multiple handles in play at the same time? What's going on in get_active_slot() when USE_CURL_MULTI is defined? It appears to be maintaining a list of slots, each with its own curl handle initialized either by curl_easy_duphandle() or get_curl_handle().

So, yeah, this is what I was referring to when I mentioned "potentially dangerous". Since the current code does not change the size of the string, the pointer will never change, so we won't ever invalidate a pointer that another handle is using.

The other thing I thought was potentially dangerous was just truncating the string. Again, if there are multiple curl handles in use (which I thought was a possibility), then merely truncating the string that contains the username/password could potentially cause a problem for another handle that could be in the middle of authenticating using the string. But, I don't know if there is any multi-processing happening within the curl library.

<snip>

Snip the remaining comments about allowing the user to specify multiple passwords since I'm not sure they're relevant if we are indeed using multiple curl handles.

If we _don't_ ever use multiple curl handles, and/or if there is no threading going on in the background within libcurl, then I don't think there is really any danger in what the current code does. It would just be an issue of needlessly rewriting the same string over and over again, which is probably not a big deal depending on how often that happens.

-Brandon
Re: [PATCH] http.c: don't rewrite the user:passwd string multiple times
On Tue, Jun 18, 2013 at 12:29:03PM -0700, Brandon Casey wrote:

>>  1. Older versions of curl (and I do not recall which version off-hand, but it is not important) stored just the pointer. Calling code was required to manage the string lifetime itself.
>
> Daniel mentions that the change happened in libcurl 7.17. RHEL 4.X (yes, ancient, dead, I realize) provides 7.12 and RHEL 5.X (yes, ancient, but still widely in use) provides 7.15. Just pointing it out.

Yeah, I didn't mean to imply we don't care about these versions, only that our analysis is different between the two sets. We have #ifdefs for curl going back to 7.7.4. That's probably excessive, but AFAIK, we would still work with such old versions.

>> It could be a problem when we have multiple handles in play simultaneously (we invalidate the pointer that another simultaneous handle is using, but do not immediately reset its pointer).
>
> Don't we have multiple handles in play at the same time? What's going on in get_active_slot() when USE_CURL_MULTI is defined? It appears to be maintaining a list of slots, each with its own curl handle initialized either by curl_easy_duphandle() or get_curl_handle().

Yes, we do; the dumb http walker will pipeline loose pack and object requests (which makes a big difference when fetching small files). The smart http code may use the curl-multi interface under the hood, but it should only have a single handle, and the use of the multi interface is just for sharing code with the dumb fetch.

> So, yeah, this is what I was referring to when I mentioned "potentially dangerous". Since the current code does not change the size of the string, the pointer will never change, so we won't ever invalidate a pointer that another handle is using.

Agreed. I did not so much mean to dispute your "potentially dangerous" claim as clarify exactly what the potential is. :)

> The other thing I thought was potentially dangerous was just truncating the string. Again, if there are multiple curl handles in use (which I thought was a possibility), then merely truncating the string that contains the username/password could potentially cause a problem for another handle that could be in the middle of authenticating using the string. But, I don't know if there is any multi-processing happening within the curl library.

I don't think curl does any threading; when we are not inside curl_*_perform, there is no curl code running at all (Daniel can correct me if I'm wrong on that). So I think from curl's perspective a truncation and exact rewrite is atomic, and it sees only the final content.

I don't know what would happen if you truncated and put in _different_ contents. For example, if curl would have written out half of the username/password, blocked and returned from curl_multi_perform, then you update the buffer, then it resumes writing.

IOW, I believe the current code is safe (though in a very subtle way), but if you were to allow password update, I'm not sure if it would be or not (and if not, you would need a per-handle buffer to make it safe). I'm fine with making the safety less subtle (e.g., your patch, with a comment added).

> If we _don't_ ever use multiple curl handles, and/or if there is no threading going on in the background within libcurl, then I don't think there is really any danger in what the current code does. It would just be an issue of needlessly rewriting the same string over and over again, which is probably not a big deal depending on how often that happens.

It should be once per http request. But copying a dozen bytes is probably nothing compared to the actual request.
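The "per-handle buffer" idea mentioned above could look roughly like the sketch below. This is an assumption-laden illustration, not git's code: `struct auth_slot`, `slot_set_credentials()` and `slot_release()` are hypothetical names, and the real slot structure in http.c is not shown in this thread. The point is only that each slot owns its own "user:password" string, so updating one slot never touches the bytes another in-flight handle may be reading.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-handle state; git's real slot struct is not modelled here. */
struct auth_slot {
	char *userpwd; /* "user:password", owned by this slot alone */
};

/*
 * Build a fresh string for this slot only. With a pointer-storing libcurl,
 * the caller would hand slot->userpwd to curl_easy_setopt() right after
 * this returns, so the freed old string is never looked at again.
 */
static int slot_set_credentials(struct auth_slot *slot,
				const char *user, const char *pass)
{
	int len = snprintf(NULL, 0, "%s:%s", user, pass);
	char *fresh;

	if (len < 0)
		return -1;
	fresh = malloc((size_t)len + 1);
	if (!fresh)
		return -1;
	snprintf(fresh, (size_t)len + 1, "%s:%s", user, pass);

	free(slot->userpwd); /* only this slot ever pointed at it */
	slot->userpwd = fresh;
	return 0;
}

/* Free the slot's string when the slot is torn down. */
static void slot_release(struct auth_slot *slot)
{
	free(slot->userpwd);
	slot->userpwd = NULL;
}
```

The cost is one small allocation per slot instead of a single shared static buffer, which seems minor next to an HTTP request.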
[PATCH] http.c: don't rewrite the user:passwd string multiple times
From: Brandon Casey <draf...@gmail.com>

Curl requires that we manage any strings that we pass to it as pointers. So, we should not be overwriting this strbuf after we've passed it to curl.

Additionally, it is unnecessary since we only prompt for the user name and password once, so we end up overwriting the strbuf with the same sequence of characters each time. This is why in practice it has not caused any problems for git's use of curl; the internal strbuf char pointer does not change, and get's overwritten with the same string each time.

But it's unnecessary and potentially dangerous, so let's avoid it.

Signed-off-by: Brandon Casey <draf...@gmail.com>
---
 http.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index 92aba59..6828269 100644
--- a/http.c
+++ b/http.c
@@ -228,8 +228,8 @@ static void init_curl_http_auth(CURL *result)
 #else
 	{
 		static struct strbuf up = STRBUF_INIT;
-		strbuf_reset(&up);
-		strbuf_addf(&up, "%s:%s",
+		if (!up.len)
+			strbuf_addf(&up, "%s:%s",
 			    http_auth.username, http_auth.password);
 		curl_easy_setopt(result, CURLOPT_USERPWD, up.buf);
 	}
--
1.8.3.1.440.gc2bf105
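The effect of the `if (!up.len)` guard can be sketched in isolation. The function below is a hypothetical stand-in (`userpwd_once()` is not a git function, and it uses plain malloc rather than git's strbuf): it formats the string on the first call and afterwards returns the same pointer with the same bytes, whatever arguments it is given.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical illustration of "format once, then reuse".
 * Returns the cached "user:password" string, or NULL on failure.
 */
static const char *userpwd_once(const char *user, const char *pass)
{
	static char *up; /* NULL until the first successful call */

	if (!up) {
		int len = snprintf(NULL, 0, "%s:%s", user, pass);

		if (len < 0)
			return NULL;
		up = malloc((size_t)len + 1);
		if (!up)
			return NULL;
		snprintf(up, (size_t)len + 1, "%s:%s", user, pass);
	}
	/* Later calls ignore user/pass: pointer and contents stay stable. */
	return up;
}
```

The pointer stability is what makes this safe for a pointer-storing libcurl. The flip side is that credentials supplied on a later call are silently ignored, so this pattern only fits a program run that uses a single set of credentials.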
Re: [PATCH] http.c: don't rewrite the user:passwd string multiple times
On Mon, Jun 17, 2013 at 10:00 PM, Brandon Casey <bca...@nvidia.com> wrote:
> From: Brandon Casey <draf...@gmail.com>
>
> Curl requires that we manage any strings that we pass to it as pointers. So, we should not be overwriting this strbuf after we've passed it to curl.
>
> Additionally, it is unnecessary since we only prompt for the user name and password once, so we end up overwriting the strbuf with the same sequence of characters each time. This is why in practice it has not caused any problems for git's use of curl; the internal strbuf char pointer does not change, and get's overwritten with the same string

s/get's/gets/

> each time.
>
> But it's unnecessary and potentially dangerous, so let's avoid it.
>
> Signed-off-by: Brandon Casey <draf...@gmail.com>
Re: [PATCH] http.c: don't rewrite the user:passwd string multiple times
On Mon, Jun 17, 2013 at 07:00:40PM -0700, Brandon Casey wrote:

> Curl requires that we manage any strings that we pass to it as pointers. So, we should not be overwriting this strbuf after we've passed it to curl.

My understanding of curl's pointer requirements is:

  1. Older versions of curl (and I do not recall which version off-hand, but it is not important) stored just the pointer. Calling code was required to manage the string lifetime itself.

  2. Newer versions of curl will strdup the string in curl_easy_setopt.

So we do not have to worry about newer versions, as they do not care about our pointer after curl_easy_setopt returns.

For older versions, if we were to grow the strbuf, we might free() the pointer provided to an earlier call to curl_easy_setopt. But since we are about to call curl_easy_setopt with the new value, I would assume that curl will never actually look at the old one (i.e., when replacing an old pointer, it would not dereference it, but simply overwrite it with the new value).

So for a single curl handle, I don't think it is a problem. It could be a problem when we have multiple handles in play simultaneously (we invalidate the pointer that another simultaneous handle is using, but do not immediately reset its pointer).

> Additionally, it is unnecessary since we only prompt for the user name and password once, so we end up overwriting the strbuf with the same sequence of characters each time. This is why in practice it has not caused any problems for git's use of curl; the internal strbuf char pointer does not change, and get's overwritten with the same string each time.

In the current code, yes, we only do this once (and if we have a username/password from the URL, we do not re-prompt if that fails).

> diff --git a/http.c b/http.c
> index 92aba59..6828269 100644
> --- a/http.c
> +++ b/http.c
> @@ -228,8 +228,8 @@ static void init_curl_http_auth(CURL *result)
>  #else
>  	{
>  		static struct strbuf up = STRBUF_INIT;
> -		strbuf_reset(&up);
> -		strbuf_addf(&up, "%s:%s",
> +		if (!up.len)
> +			strbuf_addf(&up, "%s:%s",
>  			    http_auth.username, http_auth.password);
>  		curl_easy_setopt(result, CURLOPT_USERPWD, up.buf);

This is correct for the current code because of the reasoning above. I'm slightly negative on this only because it feels like we are setting a trap for somebody who later wants to do:

  for (sanity = 0; sanity < 5; sanity++) {
          int ret = http_request(...);
          if (ret != HTTP_REAUTH)
                  return ret;
  }

to give the user a few chances to input. We would continue to update the credential struct but never actually give the new value to curl.

Another option would be to just use a static fixed-size buffer. That removes all memory management issues.

I dunno. Maybe I am being too picky, as I do not have plans to do anything like the above (since we don't do significant work before the http contact, there is no reason not to just die() and let the user re-run the shell command).

I'd also be OK with just putting a comment above the code in question to say something like:

  Note that we assume we only ever have a single set of credentials in a given program run, so we do not have to worry about updating this buffer, only setting its initial value.

Then the trap at least has a warning sign. :)

What do you think?

-Peff
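The "static fixed-size buffer" option mentioned above might be sketched as follows. Everything here is illustrative: `format_userpwd()` and the `USERPWD_MAX` limit are hypothetical and not taken from git. Because the buffer is a static array, its address never changes, so a pointer previously handed to a pointer-storing libcurl stays valid even when new credentials are written.

```c
#include <stdio.h>

#define USERPWD_MAX 256 /* arbitrary limit chosen for this sketch */

/*
 * Hypothetical helper: rewrite "user:password" into a static array.
 * Returns the (always identical) buffer address, or NULL if the
 * credentials do not fit; the buffer then holds a truncated string.
 */
static const char *format_userpwd(const char *user, const char *pass)
{
	static char up[USERPWD_MAX];
	int n = snprintf(up, sizeof(up), "%s:%s", user, pass);

	if (n < 0 || (size_t)n >= sizeof(up))
		return NULL;
	return up;
}
```

This keeps a retry loop working, since each attempt overwrites the buffer with the latest credentials. It does not settle the separate question raised elsewhere in the thread of what happens if the contents change while another handle is in the middle of a transfer, and it introduces a length limit that a strbuf does not have.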