Re: [AsyncHttpClient] data collection instrumentation
With the additional patch for the start times and connect times and the TimeMonitor, I feel this is in good enough shape to be committed, so I checked it in. I also wrote some additional tests for this. I did find a small anomaly with my tests, however. I was looking for a case that would trigger the CONNECTION_CLOSED event, and discovered there are no places in the code where that event is raised. It appears, in general, that close events are handled by the server closing the connection and these come through as CONNECTION_CLOSED_BY_SERVER events. Ok, perhaps the CONNECTION_CLOSED event is not needed. However, when writing some tests for the redirection cases, it appears that when we receive a redirection response back from the server, we're not getting a CONNECTION_CLOSED_BY_SERVER event. If the connection is not cached, we appear to ended up with a dangling live connection. Should something to be done to explicitly close this connection in that situation? Rick Sangjin Lee wrote: On Jan 23, 2008 6:50 AM, Rick McGuire [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Only if it can be done without having to maintain the same sort of request-to-start time map that you don't wish to do with the listener. The process of adding data collection should cause memory bloat there either, particularly if monitoring is not being used (the more likely case). It seems more reasonable that this type of processing should be pushed into the monitoring implementation rather than have the async client try to keep track of everything. This way, the overhead is only introduced while metrics are being gathered. A very simple and relatively low cost means might be to add a couple of time stamp fields to the request object, but only for the most significant events. Perhaps request start and connection start, but nothing else. Another possible approach would be to have a mechanism that would allow the monitor to attach an annotation object to the request that could be used to implement a lifecycle memory if needed. The cost of doing this is relatively minor when this type of information is not needed, but it's flexible enough to be tailored to any type of data collection. I agree, the only things that make sense to be present are request start time and connect start time... I'm going to upload a revised patch that shows this along with a proof-of-concept monitor that uses this to compute the average response time. Thanks, Sangjin
Re: [AsyncHttpClient] data collection instrumentation
Thanks much! I'm going over the changes, and will let you know if there is anything I'd like to add/change. On Jan 24, 2008 10:32 AM, Rick McGuire [EMAIL PROTECTED] wrote: With the additional patch for the start times and connect times and the TimeMonitor, I feel this is in good enough shape to be committed, so I checked it in. I also wrote some additional tests for this. I did find a small anomaly with my tests, however. I was looking for a case that would trigger the CONNECTION_CLOSED event, and discovered there are no places in the code where that event is raised. It appears, in general, that close events are handled by the server closing the connection and these come through as CONNECTION_CLOSED_BY_SERVER events. Ok, perhaps the CONNECTION_CLOSED event is not needed. Yeah, I suspect CONNECTION_CLOSED_BY_SERVER would be enough. BTW, I noticed that CONNECTION_CLOSED_BY_SERVER fires in HttpIoHandler.exceptionCaught() as well. I think it's redundant and will cause double counting. The exceptionCaught() method calls IoSession.close() at the end, and HttpIoHandler.sessionClosed() will be invoked as a result, which also fires the same event. So I do not think we should fire the event in exceptionCaught(). However, when writing some tests for the redirection cases, it appears that when we receive a redirection response back from the server, we're not getting a CONNECTION_CLOSED_BY_SERVER event. If the connection is not cached, we appear to ended up with a dangling live connection. Should something to be done to explicitly close this connection in that situation? Hmm, that's a good point. The same behavior applies for other non-redirect cases as well. If a session cache is provided, we cache it. But if not, we do nothing. Perhaps it is safer to close the session explicitly in all 3 cases in HttpIoHandler.messageReceived(). This hasn't been a problem basically because if session cache is not enabled we explicitly add Connection: close to the request headers, so servers will close them anyway. But I think it might be safer to close on our side still... What do you think? Rick Sangjin Lee wrote: On Jan 23, 2008 6:50 AM, Rick McGuire [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Only if it can be done without having to maintain the same sort of request-to-start time map that you don't wish to do with the listener. The process of adding data collection should cause memory bloat there either, particularly if monitoring is not being used (the more likely case). It seems more reasonable that this type of processing should be pushed into the monitoring implementation rather than have the async client try to keep track of everything. This way, the overhead is only introduced while metrics are being gathered. A very simple and relatively low cost means might be to add a couple of time stamp fields to the request object, but only for the most significant events. Perhaps request start and connection start, but nothing else. Another possible approach would be to have a mechanism that would allow the monitor to attach an annotation object to the request that could be used to implement a lifecycle memory if needed. The cost of doing this is relatively minor when this type of information is not needed, but it's flexible enough to be tailored to any type of data collection. I agree, the only things that make sense to be present are request start time and connect start time... I'm going to upload a revised patch that shows this along with a proof-of-concept monitor that uses this to compute the average response time. Thanks, Sangjin
Re: [AsyncHttpClient] data collection instrumentation
Sangjin Lee wrote: The modified patch is there on JIRA. Some follow-up discussions... I think the current implementation works well, but one thing that's difficult to do is to collecting timing data. For example, some of the most important instrumentation data are things like average response time (from request start to request complete) and average connect time (from connect start to connect complete). Currently the context object that's available to monitoring listeners is the request object, along with the timestamp of the event itself. To be able to compute a response time for a given request, one would need to take the timestamp from the request start event, associate it with the request, and store it on the listener. When the request complete event fires, then one would need to look up the stored data using the request object as a key to retrieve the timestamp for the request start event, compute the delta, and store the delta. While all this is possible, it has a number of issues, not the least of which is that one would need to maintain a map of request to start time (as well as request to connect time). This would bloat memory as well as other implications. A substantially easier solution would be to provide the request start time and connect start time as part of the information that's passed to the monitoring listener. Then listeners could simply compute the diff to get the elapsed time very easily with no need to maintain maps of any kind. This could be either part of the request object itself, or if desirable, one could consider a separate context or event object that contains this information. What do you think? Thanks, Sangjin On Jan 22, 2008 1:33 PM, Sangjin Lee [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I took a look at the patch on GERONIMO-3761, and it looks great. Thanks. I have modified your patch for several things, though, and I'm nearly ready to add it to the JIRA report. Comments about the changes... - I rewrote the EventQueue class to use an Executor. Since the Executor implementation provided by the JDK is basically a thread pool associated with a task queue, it provides an identical functionality to what was in EventQueue. I think that it is good to use the constructs from java.util.concurrent.* whenever it makes sense, and I believe this is one of them. - This change also enables us to remove synchronized from notifyMonitoringListener(). The notify method will be called very often and concurrently, and reducing the lock contention will be important. Using an Executor makes it possible to eliminate synchronization, at least at that level. - I associated a shared thread pool (Executor) for all dispatchers. I think it is desirable for dispatchers to share this thread pool rather than each instance of dispatchers creating and maintaining its own thread. - Renamed EventQueue to EventDispatcher. - I also moved the monitoring listener list to EventDispatcher. I also used CopyOnWriteArrayList as the implementation for the list. CopyOnWriteArrayList is an ideal choice for this as it is thread safe and lock-free. Also, our use case is heavy read-access but very infrequent write-access, which CopyOnWriteArrayList is suitable for. - I moved the connection_failed notification to before the getSession() call. The getSession() call here always throws an exception (by design), and thus notification needs to be done before calling getSession(). - I rewrote the CountingMonitor to use AtomicIntegers. This should be slightly safer. - I changed the timestamp calls from System.currentTimeMillis() to System.nanoTime()/100. The nanoTime() call is more high-res, as currentTimeMillis() may be tens of milliseconds accurate on some platforms, and thus not suitable for these measurements. I also have some more follow-up questions, which I'll post soon. Thanks, Sangjin On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I like your idea of using the event listener as the main way of doing this. Basically no or multiple listeners would be invoked on a different thread when events occur. The event listener APIs would define those key methods which would contain all the necessary information about the events in an immutable fashion. We could provide a simple adapter that is no op so people can override necessary methods easily. Also, we could provide one implementation which is a counting listener that does the basic metrics collection. What do you think? Only if it can be done without having to maintain the same sort of request-to-start time map that you don't wish to do with the listener. The process of adding data collection should
Re: [AsyncHttpClient] data collection instrumentation
The changes you made look really nice. I think my only concern is whether the Executor used by the EventDispatcher needs to be cleaned up at some point. If it gets finalized appropriately and doesn't leave dangling threads, then I guess I'm ok with it. Rick Sangjin Lee wrote: I took a look at the patch on GERONIMO-3761, and it looks great. Thanks. I have modified your patch for several things, though, and I'm nearly ready to add it to the JIRA report. Comments about the changes... - I rewrote the EventQueue class to use an Executor. Since the Executor implementation provided by the JDK is basically a thread pool associated with a task queue, it provides an identical functionality to what was in EventQueue. I think that it is good to use the constructs from java.util.concurrent.* whenever it makes sense, and I believe this is one of them. - This change also enables us to remove synchronized from notifyMonitoringListener(). The notify method will be called very often and concurrently, and reducing the lock contention will be important. Using an Executor makes it possible to eliminate synchronization, at least at that level. - I associated a shared thread pool (Executor) for all dispatchers. I think it is desirable for dispatchers to share this thread pool rather than each instance of dispatchers creating and maintaining its own thread. - Renamed EventQueue to EventDispatcher. - I also moved the monitoring listener list to EventDispatcher. I also used CopyOnWriteArrayList as the implementation for the list. CopyOnWriteArrayList is an ideal choice for this as it is thread safe and lock-free. Also, our use case is heavy read-access but very infrequent write-access, which CopyOnWriteArrayList is suitable for. - I moved the connection_failed notification to before the getSession() call. The getSession() call here always throws an exception (by design), and thus notification needs to be done before calling getSession(). - I rewrote the CountingMonitor to use AtomicIntegers. This should be slightly safer. - I changed the timestamp calls from System.currentTimeMillis() to System.nanoTime()/100. The nanoTime() call is more high-res, as currentTimeMillis() may be tens of milliseconds accurate on some platforms, and thus not suitable for these measurements. I also have some more follow-up questions, which I'll post soon. Thanks, Sangjin On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I like your idea of using the event listener as the main way of doing this. Basically no or multiple listeners would be invoked on a different thread when events occur. The event listener APIs would define those key methods which would contain all the necessary information about the events in an immutable fashion. We could provide a simple adapter that is no op so people can override necessary methods easily. Also, we could provide one implementation which is a counting listener that does the basic metrics collection. What do you think? Thanks, Sangjin On Jan 17, 2008 2:58 AM, Rick McGuire [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Thunderbird is playing very strange games with me this morning, somehow deleting the original post. Anyway, here are my comments on this. I'd like to propose changes to enable some basic stat collection and/or instrumentation to have visibility into performance of AHC. For a given *AsyncHttpClient*, one might want to know metrics like - total request count - total success count - total exception count - total timeout count - connection attempt count - connection failure count - connect time average - connection close count - average response time (as measured from the invocation time to having the response ready) - and others? Collection of metric information would, I think, be a good thing. However, I think we should separate the consolidation of the information from the collection. That is, the client should just have different types of events for data collection, and the event listener would be responsible for presenting the information appropriately. For example, to create the list above, I'd see the following set of events needed: - request made - request completed - request failed - request timeout - connection attempt started - connection failed - connection closed All events would be timestamped, which would allow metrics like average request time to be calculated. This set of events would mean the client would not need to maintain any metric
Re: [AsyncHttpClient] data collection instrumentation
The executor created in the EventDispatcher is a daemon thread pool (that's why I added DaemonThreadFactory), so it will go away cleanly without cleanup. Thanks, Sangjin On Jan 23, 2008 6:54 AM, Rick McGuire [EMAIL PROTECTED] wrote: The changes you made look really nice. I think my only concern is whether the Executor used by the EventDispatcher needs to be cleaned up at some point. If it gets finalized appropriately and doesn't leave dangling threads, then I guess I'm ok with it. Rick Sangjin Lee wrote: I took a look at the patch on GERONIMO-3761, and it looks great. Thanks. I have modified your patch for several things, though, and I'm nearly ready to add it to the JIRA report. Comments about the changes... - I rewrote the EventQueue class to use an Executor. Since the Executor implementation provided by the JDK is basically a thread pool associated with a task queue, it provides an identical functionality to what was in EventQueue. I think that it is good to use the constructs from java.util.concurrent.* whenever it makes sense, and I believe this is one of them. - This change also enables us to remove synchronized from notifyMonitoringListener(). The notify method will be called very often and concurrently, and reducing the lock contention will be important. Using an Executor makes it possible to eliminate synchronization, at least at that level. - I associated a shared thread pool (Executor) for all dispatchers. I think it is desirable for dispatchers to share this thread pool rather than each instance of dispatchers creating and maintaining its own thread. - Renamed EventQueue to EventDispatcher. - I also moved the monitoring listener list to EventDispatcher. I also used CopyOnWriteArrayList as the implementation for the list. CopyOnWriteArrayList is an ideal choice for this as it is thread safe and lock-free. Also, our use case is heavy read-access but very infrequent write-access, which CopyOnWriteArrayList is suitable for. - I moved the connection_failed notification to before the getSession() call. The getSession() call here always throws an exception (by design), and thus notification needs to be done before calling getSession(). - I rewrote the CountingMonitor to use AtomicIntegers. This should be slightly safer. - I changed the timestamp calls from System.currentTimeMillis() to System.nanoTime()/100. The nanoTime() call is more high-res, as currentTimeMillis() may be tens of milliseconds accurate on some platforms, and thus not suitable for these measurements. I also have some more follow-up questions, which I'll post soon. Thanks, Sangjin On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I like your idea of using the event listener as the main way of doing this. Basically no or multiple listeners would be invoked on a different thread when events occur. The event listener APIs would define those key methods which would contain all the necessary information about the events in an immutable fashion. We could provide a simple adapter that is no op so people can override necessary methods easily. Also, we could provide one implementation which is a counting listener that does the basic metrics collection. What do you think? Thanks, Sangjin On Jan 17, 2008 2:58 AM, Rick McGuire [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Thunderbird is playing very strange games with me this morning, somehow deleting the original post. Anyway, here are my comments on this. I'd like to propose changes to enable some basic stat collection and/or instrumentation to have visibility into performance of AHC. For a given *AsyncHttpClient*, one might want to know metrics like - total request count - total success count - total exception count - total timeout count - connection attempt count - connection failure count - connect time average - connection close count - average response time (as measured from the invocation time to having the response ready) - and others? Collection of metric information would, I think, be a good thing. However, I think we should separate the consolidation of the information from the collection. That is, the client should just have different types of events for data collection, and the event listener would be responsible for presenting the information appropriately. For example, to create the list above, I'd see the following set of events needed: - request
Re: [AsyncHttpClient] data collection instrumentation
On Jan 23, 2008 6:50 AM, Rick McGuire [EMAIL PROTECTED] wrote: Only if it can be done without having to maintain the same sort of request-to-start time map that you don't wish to do with the listener. The process of adding data collection should cause memory bloat there either, particularly if monitoring is not being used (the more likely case). It seems more reasonable that this type of processing should be pushed into the monitoring implementation rather than have the async client try to keep track of everything. This way, the overhead is only introduced while metrics are being gathered. A very simple and relatively low cost means might be to add a couple of time stamp fields to the request object, but only for the most significant events. Perhaps request start and connection start, but nothing else. Another possible approach would be to have a mechanism that would allow the monitor to attach an annotation object to the request that could be used to implement a lifecycle memory if needed. The cost of doing this is relatively minor when this type of information is not needed, but it's flexible enough to be tailored to any type of data collection. I agree, the only things that make sense to be present are request start time and connect start time... I'm going to upload a revised patch that shows this along with a proof-of-concept monitor that uses this to compute the average response time. Thanks, Sangjin
Re: [AsyncHttpClient] data collection instrumentation
I took a look at the patch on GERONIMO-3761, and it looks great. Thanks. I have modified your patch for several things, though, and I'm nearly ready to add it to the JIRA report. Comments about the changes... - I rewrote the EventQueue class to use an Executor. Since the Executor implementation provided by the JDK is basically a thread pool associated with a task queue, it provides an identical functionality to what was in EventQueue. I think that it is good to use the constructs from java.util.concurrent.* whenever it makes sense, and I believe this is one of them. - This change also enables us to remove synchronized from notifyMonitoringListener(). The notify method will be called very often and concurrently, and reducing the lock contention will be important. Using an Executor makes it possible to eliminate synchronization, at least at that level. - I associated a shared thread pool (Executor) for all dispatchers. I think it is desirable for dispatchers to share this thread pool rather than each instance of dispatchers creating and maintaining its own thread. - Renamed EventQueue to EventDispatcher. - I also moved the monitoring listener list to EventDispatcher. I also used CopyOnWriteArrayList as the implementation for the list. CopyOnWriteArrayList is an ideal choice for this as it is thread safe and lock-free. Also, our use case is heavy read-access but very infrequent write-access, which CopyOnWriteArrayList is suitable for. - I moved the connection_failed notification to before the getSession() call. The getSession() call here always throws an exception (by design), and thus notification needs to be done before calling getSession(). - I rewrote the CountingMonitor to use AtomicIntegers. This should be slightly safer. - I changed the timestamp calls from System.currentTimeMillis() to System.nanoTime()/100. The nanoTime() call is more high-res, as currentTimeMillis() may be tens of milliseconds accurate on some platforms, and thus not suitable for these measurements. I also have some more follow-up questions, which I'll post soon. Thanks, Sangjin On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] wrote: I like your idea of using the event listener as the main way of doing this. Basically no or multiple listeners would be invoked on a different thread when events occur. The event listener APIs would define those key methods which would contain all the necessary information about the events in an immutable fashion. We could provide a simple adapter that is no op so people can override necessary methods easily. Also, we could provide one implementation which is a counting listener that does the basic metrics collection. What do you think? Thanks, Sangjin On Jan 17, 2008 2:58 AM, Rick McGuire [EMAIL PROTECTED] wrote: Thunderbird is playing very strange games with me this morning, somehow deleting the original post. Anyway, here are my comments on this. I'd like to propose changes to enable some basic stat collection and/or instrumentation to have visibility into performance of AHC. For a given *AsyncHttpClient*, one might want to know metrics like - total request count - total success count - total exception count - total timeout count - connection attempt count - connection failure count - connect time average - connection close count - average response time (as measured from the invocation time to having the response ready) - and others? Collection of metric information would, I think, be a good thing. However, I think we should separate the consolidation of the information from the collection. That is, the client should just have different types of events for data collection, and the event listener would be responsible for presenting the information appropriately. For example, to create the list above, I'd see the following set of events needed: - request made - request completed - request failed - request timeout - connection attempt started - connection failed - connection closed All events would be timestamped, which would allow metrics like average request time to be calculated. This set of events would mean the client would not need to maintain any metric accumulators, and if the event information is done correctly, would even allow more fine grained monitoring (e.g., average connection time for requests to domain foo.bar.com). Collecting these metrics should have little effect on the overall performance. There would be an API to access these stats. I was initially thinking of an IoFilter to consolidate these hooks, but I realize some of these metrics are not readily available to an IoFilter (e.g. connect-related numbers). It might be unavoidable to spread the instrumentation in a couple of places (IoHandler, ConnectFutureListener, etc.). Taking this one step further, one might think of callbacks or
Re: [AsyncHttpClient] data collection instrumentation
The modified patch is there on JIRA. Some follow-up discussions... I think the current implementation works well, but one thing that's difficult to do is to collecting timing data. For example, some of the most important instrumentation data are things like average response time (from request start to request complete) and average connect time (from connect start to connect complete). Currently the context object that's available to monitoring listeners is the request object, along with the timestamp of the event itself. To be able to compute a response time for a given request, one would need to take the timestamp from the request start event, associate it with the request, and store it on the listener. When the request complete event fires, then one would need to look up the stored data using the request object as a key to retrieve the timestamp for the request start event, compute the delta, and store the delta. While all this is possible, it has a number of issues, not the least of which is that one would need to maintain a map of request to start time (as well as request to connect time). This would bloat memory as well as other implications. A substantially easier solution would be to provide the request start time and connect start time as part of the information that's passed to the monitoring listener. Then listeners could simply compute the diff to get the elapsed time very easily with no need to maintain maps of any kind. This could be either part of the request object itself, or if desirable, one could consider a separate context or event object that contains this information. What do you think? Thanks, Sangjin On Jan 22, 2008 1:33 PM, Sangjin Lee [EMAIL PROTECTED] wrote: I took a look at the patch on GERONIMO-3761, and it looks great. Thanks. I have modified your patch for several things, though, and I'm nearly ready to add it to the JIRA report. Comments about the changes... - I rewrote the EventQueue class to use an Executor. Since the Executor implementation provided by the JDK is basically a thread pool associated with a task queue, it provides an identical functionality to what was in EventQueue. I think that it is good to use the constructs from java.util.concurrent.* whenever it makes sense, and I believe this is one of them. - This change also enables us to remove synchronized from notifyMonitoringListener(). The notify method will be called very often and concurrently, and reducing the lock contention will be important. Using an Executor makes it possible to eliminate synchronization, at least at that level. - I associated a shared thread pool (Executor) for all dispatchers. I think it is desirable for dispatchers to share this thread pool rather than each instance of dispatchers creating and maintaining its own thread. - Renamed EventQueue to EventDispatcher. - I also moved the monitoring listener list to EventDispatcher. I also used CopyOnWriteArrayList as the implementation for the list. CopyOnWriteArrayList is an ideal choice for this as it is thread safe and lock-free. Also, our use case is heavy read-access but very infrequent write-access, which CopyOnWriteArrayList is suitable for. - I moved the connection_failed notification to before the getSession() call. The getSession() call here always throws an exception (by design), and thus notification needs to be done before calling getSession(). - I rewrote the CountingMonitor to use AtomicIntegers. This should be slightly safer. - I changed the timestamp calls from System.currentTimeMillis() to System.nanoTime()/100. The nanoTime() call is more high-res, as currentTimeMillis() may be tens of milliseconds accurate on some platforms, and thus not suitable for these measurements. I also have some more follow-up questions, which I'll post soon. Thanks, Sangjin On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] wrote: I like your idea of using the event listener as the main way of doing this. Basically no or multiple listeners would be invoked on a different thread when events occur. The event listener APIs would define those key methods which would contain all the necessary information about the events in an immutable fashion. We could provide a simple adapter that is no op so people can override necessary methods easily. Also, we could provide one implementation which is a counting listener that does the basic metrics collection. What do you think? Thanks, Sangjin On Jan 17, 2008 2:58 AM, Rick McGuire [EMAIL PROTECTED] wrote: Thunderbird is playing very strange games with me this morning, somehow deleting the original post. Anyway, here are my comments on this. I'd like to propose changes to enable some basic stat collection and/or instrumentation to have visibility into performance of AHC. For a given *AsyncHttpClient*, one might want to know metrics like - total request
Re: [AsyncHttpClient] data collection instrumentation
Thunderbird is playing very strange games with me this morning, somehow deleting the original post. Anyway, here are my comments on this. I'd like to propose changes to enable some basic stat collection and/or instrumentation to have visibility into performance of AHC. For a given *AsyncHttpClient*, one might want to know metrics like - total request count - total success count - total exception count - total timeout count - connection attempt count - connection failure count - connect time average - connection close count - average response time (as measured from the invocation time to having the response ready) - and others? Collection of metric information would, I think, be a good thing. However, I think we should separate the consolidation of the information from the collection. That is, the client should just have different types of events for data collection, and the event listener would be responsible for presenting the information appropriately. For example, to create the list above, I'd see the following set of events needed: - request made - request completed - request failed - request timeout - connection attempt started - connection failed - connection closed All events would be timestamped, which would allow metrics like average request time to be calculated. This set of events would mean the client would not need to maintain any metric accumulators, and if the event information is done correctly, would even allow more fine grained monitoring (e.g., average connection time for requests to domain foo.bar.com). Collecting these metrics should have little effect on the overall performance. There would be an API to access these stats. I was initially thinking of an IoFilter to consolidate these hooks, but I realize some of these metrics are not readily available to an IoFilter (e.g. connect-related numbers). It might be unavoidable to spread the instrumentation in a couple of places (IoHandler, ConnectFutureListener, etc.). Taking this one step further, one might think of callbacks or listeners for various key events such as connect complete, request sent, etc., so callers can provide instrumenting/logging code via event notification. However, I think this should be used judiciously as such injected code may cause havoc. I think listeners would be the way to go. This would allow multiple monitoring types to be attached to the pipe to gather data as needed. Perhaps the approached used with the javamail API might be of use here. The javamail Store APIs have a number of listener events that are broadcast (new mail arrived, message delete, folder created, etc.). Because there are similar concerns of havoc, the events get posted to a queue, and are dispatched on to a separate thread. The queue is only created (and the associated thread) are only created when there are listeners available to handle the events. This allows the events to very low overhead when there are no interested parties and prevents the listeners from interfering with normal javamail operations by being processed on a different thread. Thoughts? Suggestions?
Re: [AsyncHttpClient] data collection instrumentation
I like your idea of using the event listener as the main way of doing this. Basically no or multiple listeners would be invoked on a different thread when events occur. The event listener APIs would define those key methods which would contain all the necessary information about the events in an immutable fashion. We could provide a simple adapter that is no op so people can override necessary methods easily. Also, we could provide one implementation which is a counting listener that does the basic metrics collection. What do you think? Thanks, Sangjin On Jan 17, 2008 2:58 AM, Rick McGuire [EMAIL PROTECTED] wrote: Thunderbird is playing very strange games with me this morning, somehow deleting the original post. Anyway, here are my comments on this. I'd like to propose changes to enable some basic stat collection and/or instrumentation to have visibility into performance of AHC. For a given *AsyncHttpClient*, one might want to know metrics like - total request count - total success count - total exception count - total timeout count - connection attempt count - connection failure count - connect time average - connection close count - average response time (as measured from the invocation time to having the response ready) - and others? Collection of metric information would, I think, be a good thing. However, I think we should separate the consolidation of the information from the collection. That is, the client should just have different types of events for data collection, and the event listener would be responsible for presenting the information appropriately. For example, to create the list above, I'd see the following set of events needed: - request made - request completed - request failed - request timeout - connection attempt started - connection failed - connection closed All events would be timestamped, which would allow metrics like average request time to be calculated. This set of events would mean the client would not need to maintain any metric accumulators, and if the event information is done correctly, would even allow more fine grained monitoring (e.g., average connection time for requests to domain foo.bar.com). Collecting these metrics should have little effect on the overall performance. There would be an API to access these stats. I was initially thinking of an IoFilter to consolidate these hooks, but I realize some of these metrics are not readily available to an IoFilter (e.g. connect-related numbers). It might be unavoidable to spread the instrumentation in a couple of places (IoHandler, ConnectFutureListener, etc.). Taking this one step further, one might think of callbacks or listeners for various key events such as connect complete, request sent, etc., so callers can provide instrumenting/logging code via event notification. However, I think this should be used judiciously as such injected code may cause havoc. I think listeners would be the way to go. This would allow multiple monitoring types to be attached to the pipe to gather data as needed. Perhaps the approached used with the javamail API might be of use here. The javamail Store APIs have a number of listener events that are broadcast (new mail arrived, message delete, folder created, etc.). Because there are similar concerns of havoc, the events get posted to a queue, and are dispatched on to a separate thread. The queue is only created (and the associated thread) are only created when there are listeners available to handle the events. This allows the events to very low overhead when there are no interested parties and prevents the listeners from interfering with normal javamail operations by being processed on a different thread. Thoughts? Suggestions?