On Thu, 14 Jan 2010, Nick Gerner wrote:

so the time spent in timestamp_remove and clean_with_criterium goes up toward the end of the run (by a lot).

So that might be because the hash didn't get pruned properly. It'll be interesting to see if you get better results with current CVS.

keep in mind that we might hit a million (or 10s of millions) of hosts. Is the dns cache a hash table or a tree? What about the connection cache?

It's a hash table, so yeah, it will get slower when the number of entries grows very large (we could perhaps consider improving it to scale better). But with a timeout of 0 it really shouldn't.
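
To illustrate why such a table slows down as it fills up, here is a minimal sketch in C. It is not libcurl's internal hash code: the names (struct table, table_add, table_find_steps, worst_case_steps) and the bucket count of 64 are made up for the example, and it assumes a chained table whose number of buckets is fixed and never resized. With that assumption, the chain in each bucket grows linearly with the number of entries, and so does the lookup cost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NBUCKETS 64  /* assumption: fixed at init time, never resized */

struct entry {
  unsigned long key;
  struct entry *next;
};

struct table {
  struct entry *bucket[NBUCKETS];
};

/* Insert a key at the head of its bucket's chain. Returns 0 on success. */
static int table_add(struct table *t, unsigned long key)
{
  struct entry *e = malloc(sizeof(*e));
  if(!e)
    return -1;
  e->key = key;
  e->next = t->bucket[key % NBUCKETS];
  t->bucket[key % NBUCKETS] = e;
  return 0;
}

/* Return how many chain nodes were visited to find the key, 0 if missing. */
static size_t table_find_steps(const struct table *t, unsigned long key)
{
  size_t steps = 0;
  const struct entry *e;
  for(e = t->bucket[key % NBUCKETS]; e; e = e->next) {
    steps++;
    if(e->key == key)
      return steps;
  }
  return 0;
}

/* Free every entry in the table. */
static void table_clear(struct table *t)
{
  size_t i;
  for(i = 0; i < NBUCKETS; i++) {
    while(t->bucket[i]) {
      struct entry *next = t->bucket[i]->next;
      free(t->bucket[i]);
      t->bucket[i] = next;
    }
  }
}

/* Add keys 0..n-1, then count the steps needed to find key 0, which
   ends up at the tail of its chain. Returns 0 if allocation fails. */
static size_t worst_case_steps(unsigned long n)
{
  struct table t = { { NULL } };
  unsigned long k;
  size_t steps;
  assert(n > 0);
  for(k = 0; k < n; k++) {
    if(table_add(&t, k)) {
      table_clear(&t);
      return 0;
    }
  }
  steps = table_find_steps(&t, 0);
  table_clear(&t);
  return steps;
}
```

In this sketch worst_case_steps(64) visits a single node, while worst_case_steps(64000) has to walk a chain of 1000 nodes. A table that is pruned regularly keeps its chains short, which is why entries that are never removed would hurt so much here.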

we are using the multi_socket API, sorry if I got that confused.

Ok, then the socket hash is used a lot too, since it needs to map each passed-in socket descriptor to the internal connection handle it belongs to.

while(more_work_to_do())
{
 populate_easy_handles_and_add_to_multi_handle();
 nfds = get_all_the_waiting_sockets(&fds);
 ret = poll(fds, nfds, timeout_from_multi_timeout);
 for(int fdIndex = 0; fdIndex < nfds; fdIndex++)
 {
   while(handle_socket_via_curl_multi_socket_action(fds[fdIndex]) == CURLM_CALL_MULTI_PERFORM);
 }
 handle_all_completed_handles();
}

Are you really doing it with a loop like that?

It's a bit unusual, since the multi_socket API supports, and really encourages, an event-based approach in the app. You will also get significantly better performance by using an event-based layer like libev or libevent and letting it call libcurl for each socket event, rather than having poll() check them all, all the time. At least if the number of sockets is fairly high, like a hundred or more.

That said, I don't think this is the reason for the hash table to get used more than ordinary anyway.
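
As a rough illustration of the event-based pattern recommended above, here is a small sketch. It makes several assumptions: it uses Linux epoll directly instead of libev or libevent, plain pipes stand in for the sockets libcurl would hand out, and the function name count_ready_events and the constant NPIPES are invented for the example. The point it shows is that the descriptors are registered once, and the kernel then reports only those with activity, so the app does not have to pass the full set to poll() and scan it on every iteration.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

#define NPIPES 100  /* stand-in for "a hundred or more" sockets */

/* Register the read ends of NPIPES pipes with epoll once, make exactly
   one of them readable and return the number of events reported by
   epoll_wait(). Returns -1 on error. */
static int count_ready_events(int which)
{
  int fds[NPIPES][2];
  struct epoll_event ev;
  struct epoll_event ready[NPIPES];
  int i;
  int created = 0;
  int n = -1;
  int ep;

  assert(which >= 0 && which < NPIPES);

  ep = epoll_create1(0);
  if(ep < 0)
    return -1;

  for(i = 0; i < NPIPES; i++) {
    if(pipe(fds[i]) < 0)
      goto cleanup;
    created++;
    ev.events = EPOLLIN;
    ev.data.fd = fds[i][0];
    /* one-time registration, not repeated per loop iteration */
    if(epoll_ctl(ep, EPOLL_CTL_ADD, fds[i][0], &ev) < 0)
      goto cleanup;
  }

  /* simulate network activity on a single descriptor */
  if(write(fds[which][1], "x", 1) != 1)
    goto cleanup;

  /* only descriptors with pending events come back; a real app would
     now call curl_multi_socket_action() for each reported descriptor */
  n = epoll_wait(ep, ready, NPIPES, 0);
  if(n == 1 && ready[0].data.fd != fds[which][0])
    n = -1;

cleanup:
  for(i = 0; i < created; i++) {
    close(fds[i][0]);
    close(fds[i][1]);
  }
  close(ep);
  return n;
}
```

With a hundred registered descriptors and activity on one of them, count_ready_events() gets back a single event, so the work per wakeup follows the number of active sockets rather than the total number of sockets.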

--

 / daniel.haxx.se
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html
