Re: [AsyncHttpClient] data collection instrumentation

2008-01-24 Thread Rick McGuire
With the additional patch for the start times and connect times and the 
TimeMonitor, I feel this is in good enough shape to be committed, so I 
checked it in.  I also wrote some additional tests for this. 

I did find a small anomaly with my tests, however.  I was looking for a 
case that would trigger the CONNECTION_CLOSED event, and discovered 
there are no places in the code where that event is raised.  It appears, 
in general, that close events are handled by the server closing the 
connection and these come through as CONNECTION_CLOSED_BY_SERVER 
events.  Ok, perhaps the CONNECTION_CLOSED event is not needed. 

However, when writing some tests for the redirection cases, it appears 
that when we receive a redirection response back from the server, we're 
not getting a CONNECTION_CLOSED_BY_SERVER event.  If the connection is 
not cached, we appear to end up with a dangling live connection.
Should something be done to explicitly close this connection in that
situation?


Rick

Sangjin Lee wrote:



On Jan 23, 2008 6:50 AM, Rick McGuire [EMAIL PROTECTED] wrote:
 


Only if it can be done without having to maintain the same sort of
request-to-start time map that you don't wish to do with the listener.
The process of adding data collection should not cause memory bloat there
either, particularly if monitoring is not being used (the more likely
case).  It seems more reasonable that this type of processing
should be
pushed into the monitoring implementation rather than have the async
client try to keep track of everything.  This way, the overhead is
only
introduced while metrics are being gathered.  A very simple and
relatively low cost means might be to add a couple of time stamp
fields
to the request object, but only for the most significant events.
Perhaps request start and connection start, but nothing else.  Another
possible approach would be to have a mechanism that would allow the
monitor to attach an annotation object to the request that could
be used
to implement a lifecycle memory if needed.  The cost of doing this is
relatively minor when this type of information is not needed, but it's
flexible enough to be tailored to any type of data collection.


I agree, the only things that make sense to be present are request 
start time and connect start time...  I'm going to upload a revised 
patch that shows this along with a proof-of-concept monitor that uses 
this to compute the average response time.


Thanks,
Sangjin
 




Re: [AsyncHttpClient] data collection instrumentation

2008-01-24 Thread Sangjin Lee
Thanks much!  I'm going over the changes, and will let you know if there is
anything I'd like to add/change.

On Jan 24, 2008 10:32 AM, Rick McGuire [EMAIL PROTECTED] wrote:

 With the additional patch for the start times and connect times and the
 TimeMonitor, I feel this is in good enough shape to be committed, so I
 checked it in.  I also wrote some additional tests for this.

 I did find a small anomaly with my tests, however.  I was looking for a
 case that would trigger the CONNECTION_CLOSED event, and discovered
 there are no places in the code where that event is raised.  It appears,
 in general, that close events are handled by the server closing the
 connection and these come through as CONNECTION_CLOSED_BY_SERVER
 events.  Ok, perhaps the CONNECTION_CLOSED event is not needed.


Yeah, I suspect CONNECTION_CLOSED_BY_SERVER would be enough.

BTW, I noticed that CONNECTION_CLOSED_BY_SERVER fires in
HttpIoHandler.exceptionCaught() as well.  I think it's redundant and will
cause double counting.  The exceptionCaught() method calls IoSession.close()
at the end, and HttpIoHandler.sessionClosed() will be invoked as a result,
which also fires the same event.  So I do not think we should fire the event
in exceptionCaught().
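
As an illustration of that point, a minimal sketch (assuming MINA 1.x-style
IoHandlerAdapter signatures; notifyMonitors() and the event names are
placeholders, not the actual AHC code) would fire the close notification only
from sessionClosed():

    import org.apache.mina.common.IoHandlerAdapter;
    import org.apache.mina.common.IoSession;

    // Sketch only: fire the connection-closed notification exactly once.
    public class SingleCloseNotificationSketch extends IoHandlerAdapter {

        public void exceptionCaught(IoSession session, Throwable cause) {
            // Report the failure, but do NOT fire CONNECTION_CLOSED_BY_SERVER
            // here; closing the session below triggers sessionClosed(), which
            // fires it.
            notifyMonitors("EXCEPTION", session);
            session.close();
        }

        public void sessionClosed(IoSession session) {
            // The single place the close event is raised, so a close initiated
            // from exceptionCaught() is counted only once.
            notifyMonitors("CONNECTION_CLOSED_BY_SERVER", session);
        }

        private void notifyMonitors(String event, IoSession session) {
            // Placeholder: dispatch to the registered monitoring listeners.
        }
    }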




 However, when writing some tests for the redirection cases, it appears
 that when we receive a redirection response back from the server, we're
 not getting a CONNECTION_CLOSED_BY_SERVER event.  If the connection is
 not cached, we appear to end up with a dangling live connection.
 Should something be done to explicitly close this connection in that
 situation?


Hmm, that's a good point.  The same behavior applies for other non-redirect
cases as well.  If a session cache is provided, we cache it.  But if not, we
do nothing.  Perhaps it is safer to close the session explicitly in all 3
cases in HttpIoHandler.messageReceived().

This hasn't been a problem so far because, if the session cache is not enabled,
we explicitly add Connection: close to the request headers, so servers
will close the connection anyway.  But I think it might be safer to close on our side
still...  What do you think?
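
A rough sketch of that explicit-close idea (SessionCache here is a placeholder
interface and the response handling is elided; this is not the actual
HttpIoHandler code):

    import org.apache.mina.common.IoSession;

    // Sketch only: close the session ourselves when it will not be reused,
    // rather than relying on the server to honor "Connection: close".
    public class ExplicitCloseSketch {

        interface SessionCache {               // stand-in for the real cache type
            void offer(IoSession session);
        }

        private final SessionCache sessionCache;   // null when caching is disabled

        ExplicitCloseSketch(SessionCache sessionCache) {
            this.sessionCache = sessionCache;
        }

        public void messageReceived(IoSession session, Object message) {
            // ... decode the response and complete (or follow the redirect) ...

            if (sessionCache != null) {
                sessionCache.offer(session);   // keep the connection for reuse
            } else {
                // No cache: close explicitly so a redirect response does not
                // leave a dangling live connection behind.
                session.close();
            }
        }
    }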




 Rick

 Sangjin Lee wrote:
 
 
  On Jan 23, 2008 6:50 AM, Rick McGuire [EMAIL PROTECTED] wrote:
 
 
  Only if it can be done without having to maintain the same sort of
  request-to-start time map that you don't wish to do with the
 listener.
  The process of adding data collection should not cause memory bloat
 there
  either, particularly if monitoring is not being used (the more
 likely
  case).  It seems more reasonable that this type of processing
  should be
  pushed into the monitoring implementation rather than have the async
  client try to keep track of everything.  This way, the overhead is
  only
  introduced while metrics are being gathered.  A very simple and
  relatively low cost means might be to add a couple of time stamp
  fields
  to the request object, but only for the most significant events.
  Perhaps request start and connection start, but nothing else.
  Another
  possible approach would be to have a mechanism that would allow the
  monitor to attach an annotation object to the request that could
  be used
  to implement a lifecycle memory if needed.  The cost of doing this
 is
  relatively minor when this type of information is not needed, but
 it's
  flexible enough to be tailored to any type of data collection.
 
 
  I agree, the only things that make sense to be present are request
  start time and connect start time...  I'm going to upload a revised
  patch that shows this along with a proof-of-concept monitor that uses
  this to compute the average response time.
 
  Thanks,
  Sangjin
 




Re: [AsyncHttpClient] data collection instrumentation

2008-01-23 Thread Rick McGuire

Sangjin Lee wrote:

The modified patch is there on JIRA.  Some follow-up discussions...

I think the current implementation works well, but one thing that's 
difficult to do is collecting timing data.  For example, some of 
the most important instrumentation data are things like average 
response time (from request start to request complete) and average 
connect time (from connect start to connect complete).


Currently the context object that's available to monitoring listeners 
is the request object, along with the timestamp of the event itself. 
 To be able to compute a response time for a given request, one would 
need to take the timestamp from the request start event, associate it 
with the request, and store it on the listener.  When the request 
complete event fires, then one would need to look up the stored data 
using the request object as a key to retrieve the timestamp for the 
request start event, compute the delta, and store the delta.


While all this is possible, it has a number of issues, not the least 
of which is that one would need to maintain a map of request to start 
time (as well as request to connect time).  This would bloat memory as 
well as other implications.


A substantially easier solution would be to provide the request start 
time and connect start time as part of the information that's passed 
to the monitoring listener.  Then listeners could simply compute the 
diff to get the elapsed time very easily with no need to maintain maps 
of any kind.  This could be either part of the request object itself, 
or if desirable, one could consider a separate context or event object 
that contains this information.  What do you think?


Thanks,
Sangjin

On Jan 22, 2008 1:33 PM, Sangjin Lee [EMAIL PROTECTED] wrote:


I took a look at the patch on GERONIMO-3761, and it looks great.
 Thanks.  I have modified your patch for several things, though,
and I'm nearly ready to add it to the JIRA report.  Comments about
the changes...

- I rewrote the EventQueue class to use an Executor.  Since the
Executor implementation provided by the JDK is basically a thread
pool associated with a task queue, it provides an identical
functionality to what was in EventQueue.  I think that it is good
to use the constructs from java.util.concurrent.* whenever it
makes sense, and I believe this is one of them.

- This change also enables us to remove synchronized from
notifyMonitoringListener().  The notify method will be called very
often and concurrently, and reducing the lock contention will be
important.  Using an Executor makes it possible to eliminate
synchronization, at least at that level.

- I associated a shared thread pool (Executor) for all
dispatchers.  I think it is desirable for dispatchers to share
this thread pool rather than each instance of dispatchers creating
and maintaining its own thread.

- Renamed EventQueue to EventDispatcher.

- I also moved the monitoring listener list to EventDispatcher.  I
also used CopyOnWriteArrayList as the implementation for the list.
 CopyOnWriteArrayList is an ideal choice for this as it is thread
safe and lock-free.  Also, our use case is heavy read-access but
very infrequent write-access, which CopyOnWriteArrayList is
suitable for.

- I moved the connection_failed notification to before the
getSession() call.  The getSession() call here always throws an
exception (by design), and thus notification needs to be done
before calling getSession().

- I rewrote the CountingMonitor to use AtomicIntegers.  This
should be slightly safer.

- I changed the timestamp calls from System.currentTimeMillis() to
System.nanoTime()/1000000.  The nanoTime() call has higher resolution,
as currentTimeMillis() may only be accurate to tens of milliseconds on
some platforms, and thus is not suitable for these measurements.

I also have some more follow-up questions, which I'll post soon.

Thanks,
Sangjin


On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] wrote:

I like your idea of using the event listener as the main way
of doing this.  Basically no or multiple listeners would be
invoked on a different thread when events occur.

The event listener APIs would define those key methods which
would contain all the necessary information about the events
in an immutable fashion.

We could provide a simple adapter that is no op so people can
override necessary methods easily.  Also, we could provide one
implementation which is a counting listener that does the
basic metrics collection.

What do you think?

Only if it can be done without having to maintain the same sort of 
request-to-start time map that you don't wish to do with the listener.  
The process of adding data collection should 

Re: [AsyncHttpClient] data collection instrumentation

2008-01-23 Thread Rick McGuire
The changes you made look really nice.  I think my only concern is 
whether the Executor used by the EventDispatcher needs to be cleaned up 
at some point.  If it gets finalized appropriately and doesn't leave 
dangling threads, then I guess I'm ok with it.


Rick


Sangjin Lee wrote:
I took a look at the patch on GERONIMO-3761, and it looks great. 
 Thanks.  I have modified your patch for several things, though, and 
I'm nearly ready to add it to the JIRA report.  Comments about the 
changes...


- I rewrote the EventQueue class to use an Executor.  Since the 
Executor implementation provided by the JDK is basically a thread pool 
associated with a task queue, it provides an identical functionality 
to what was in EventQueue.  I think that it is good to use the 
constructs from java.util.concurrent.* whenever it makes sense, and I 
believe this is one of them.


- This change also enables us to remove synchronized from 
notifyMonitoringListener().  The notify method will be called very 
often and concurrently, and reducing the lock contention will be 
important.  Using an Executor makes it possible to eliminate 
synchronization, at least at that level.


- I associated a shared thread pool (Executor) for all dispatchers.  I 
think it is desirable for dispatchers to share this thread pool rather 
than each instance of dispatchers creating and maintaining its own 
thread.


- Renamed EventQueue to EventDispatcher.

- I also moved the monitoring listener list to EventDispatcher.  I 
also used CopyOnWriteArrayList as the implementation for the list. 
 CopyOnWriteArrayList is an ideal choice for this as it is thread safe 
and lock-free.  Also, our use case is heavy read-access but very 
infrequent write-access, which CopyOnWriteArrayList is suitable for.


- I moved the connection_failed notification to before the 
getSession() call.  The getSession() call here always throws an 
exception (by design), and thus notification needs to be done before 
calling getSession().


- I rewrote the CountingMonitor to use AtomicIntegers.  This should be 
slightly safer.


- I changed the timestamp calls from System.currentTimeMillis() to 
System.nanoTime()/1000000.  The nanoTime() call has higher resolution, as 
currentTimeMillis() may only be accurate to tens of milliseconds on some 
platforms, and thus is not suitable for these measurements.


I also have some more follow-up questions, which I'll post soon.

Thanks,
Sangjin


On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] wrote:


I like your idea of using the event listener as the main way of
doing this.  Basically no or multiple listeners would be invoked
on a different thread when events occur.

The event listener APIs would define those key methods which would
contain all the necessary information about the events in an
immutable fashion.

We could provide a simple adapter that is no op so people can
override necessary methods easily.  Also, we could provide one
implementation which is a counting listener that does the basic
metrics collection.

What do you think?

Thanks,
Sangjin

On Jan 17, 2008 2:58 AM, Rick McGuire [EMAIL PROTECTED] wrote:

Thunderbird is playing very strange games with me this
morning, somehow
deleting the original post.   Anyway, here are my comments on
this.

 I'd like to propose changes to enable some basic stat collection
 and/or instrumentation to have visibility into performance
of AHC.
  For a given *AsyncHttpClient*, one might want to know
metrics like

 - total request count
 - total success count
 - total exception count
 - total timeout count
 - connection attempt count
 - connection failure count
 - connect time average
 - connection close count
 - average response time (as measured from the invocation time to
 having the response ready)
 - and others?
Collection of metric information would, I think, be a good thing.
However, I think we should separate the consolidation of the
information
from the collection.  That is, the client should just have
different
types of events for data collection, and the event listener
would be
responsible for presenting the information appropriately.

For example, to create the list above, I'd see the following
set of
events needed:

- request made
- request completed
- request failed
- request timeout
- connection attempt started
- connection failed
- connection closed

All events would be timestamped, which would allow metrics
like average
request time to be calculated.  This set of events would mean the
client would not need to maintain any metric 

Re: [AsyncHttpClient] data collection instrumentation

2008-01-23 Thread Sangjin Lee
The executor created in the EventDispatcher is a daemon thread pool (that's
why I added DaemonThreadFactory), so it will go away cleanly without
cleanup.
Thanks,
Sangjin

On Jan 23, 2008 6:54 AM, Rick McGuire [EMAIL PROTECTED] wrote:

 The changes you made look really nice.  I think my only concern is
 whether the Executor used by the EventDispatcher needs to be cleaned up
 at some point.  If it gets finalized appropriately and doesn't leave
 dangling threads, then I guess I'm ok with it.

 Rick


 Sangjin Lee wrote:
  I took a look at the patch on GERONIMO-3761, and it looks great.
   Thanks.  I have modified your patch for several things, though, and
  I'm nearly ready to add it to the JIRA report.  Comments about the
  changes...
 
  - I rewrote the EventQueue class to use an Executor.  Since the
  Executor implementation provided by the JDK is basically a thread pool
  associated with a task queue, it provides an identical functionality
  to what was in EventQueue.  I think that it is good to use the
  constructs from java.util.concurrent.* whenever it makes sense, and I
  believe this is one of them.
 
  - This change also enables us to remove synchronized from
  notifyMonitoringListener().  The notify method will be called very
  often and concurrently, and reducing the lock contention will be
  important.  Using an Executor makes it possible to eliminate
  synchronization, at least at that level.
 
  - I associated a shared thread pool (Executor) for all dispatchers.  I
  think it is desirable for dispatchers to share this thread pool rather
  than each instance of dispatchers creating and maintaining its own
  thread.
 
  - Renamed EventQueue to EventDispatcher.
 
  - I also moved the monitoring listener list to EventDispatcher.  I
  also used CopyOnWriteArrayList as the implementation for the list.
   CopyOnWriteArrayList is an ideal choice for this as it is thread safe
  and lock-free.  Also, our use case is heavy read-access but very
  infrequent write-access, which CopyOnWriteArrayList is suitable for.
 
  - I moved the connection_failed notification to before the
  getSession() call.  The getSession() call here always throws an
  exception (by design), and thus notification needs to be done before
  calling getSession().
 
  - I rewrote the CountingMonitor to use AtomicIntegers.  This should be
  slightly safer.
 
  - I changed the timestamp calls from System.currentTimeMillis() to
  System.nanoTime()/1000000.  The nanoTime() call has higher resolution, as
  currentTimeMillis() may only be accurate to tens of milliseconds on some
  platforms, and thus is not suitable for these measurements.
 
  I also have some more follow-up questions, which I'll post soon.
 
  Thanks,
  Sangjin
 
 
  On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] wrote:
 
  I like your idea of using the event listener as the main way of
  doing this.  Basically no or multiple listeners would be invoked
  on a different thread when events occur.
 
  The event listener APIs would define those key methods which would
  contain all the necessary information about the events in an
  immutable fashion.
 
  We could provide a simple adapter that is no op so people can
  override necessary methods easily.  Also, we could provide one
  implementation which is a counting listener that does the basic
  metrics collection.
 
  What do you think?
 
  Thanks,
  Sangjin
 
  On Jan 17, 2008 2:58 AM, Rick McGuire [EMAIL PROTECTED] wrote:
 
  Thunderbird is playing very strange games with me this
  morning, somehow
  deleting the original post.   Anyway, here are my comments on
  this.
 
   I'd like to propose changes to enable some basic stat
 collection
   and/or instrumentation to have visibility into performance
  of AHC.
For a given *AsyncHttpClient*, one might want to know
  metrics like
  
   - total request count
   - total success count
   - total exception count
   - total timeout count
   - connection attempt count
   - connection failure count
   - connect time average
   - connection close count
   - average response time (as measured from the invocation time
 to
   having the response ready)
   - and others?
  Collection of metric information would, I think, be a good
 thing.
  However, I think we should separate the consolidation of the
  information
  from the collection.  That is, the client should just have
  different
  types of events for data collection, and the event listener
  would be
  responsible for presenting the information appropriately.
 
  For example, to create the list above, I'd see the following
  set of
  events needed:
 
  - request 

Re: [AsyncHttpClient] data collection instrumentation

2008-01-23 Thread Sangjin Lee
On Jan 23, 2008 6:50 AM, Rick McGuire [EMAIL PROTECTED] wrote:


 Only if it can be done without having to maintain the same sort of
 request-to-start time map that you don't wish to do with the listener.
 The process of adding data collection should not cause memory bloat there
 either, particularly if monitoring is not being used (the more likely
 case).  It seems more reasonable that this type of processing should be
 pushed into the monitoring implementation rather than have the async
 client try to keep track of everything.  This way, the overhead is only
 introduced while metrics are being gathered.  A very simple and
 relatively low cost means might be to add a couple of time stamp fields
 to the request object, but only for the most significant events.
 Perhaps request start and connection start, but nothing else.  Another
 possible approach would be to have a mechanism that would allow the
 monitor to attach an annotation object to the request that could be used
 to implement a lifecycle memory if needed.  The cost of doing this is
 relatively minor when this type of information is not needed, but it's
 flexible enough to be tailored to any type of data collection.


I agree, the only things that make sense to be present are request start
time and connect start time...  I'm going to upload a revised patch that
shows this along with a proof-of-concept monitor that uses this to compute
the average response time.

Thanks,
Sangjin


Re: [AsyncHttpClient] data collection instrumentation

2008-01-22 Thread Sangjin Lee
I took a look at the patch on GERONIMO-3761, and it looks great.  Thanks.  I
have modified your patch for several things, though, and I'm nearly ready to
add it to the JIRA report.  Comments about the changes...
- I rewrote the EventQueue class to use an Executor.  Since the Executor
implementation provided by the JDK is basically a thread pool associated
with a task queue, it provides an identical functionality to what was in
EventQueue.  I think that it is good to use the constructs from
java.util.concurrent.* whenever it makes sense, and I believe this is one of
them.

- This change also enables us to remove synchronized from
notifyMonitoringListener().  The notify method will be called very often and
concurrently, and reducing the lock contention will be important.  Using an
Executor makes it possible to eliminate synchronization, at least at that
level.

- I associated a shared thread pool (Executor) for all dispatchers.  I think
it is desirable for dispatchers to share this thread pool rather than each
instance of dispatchers creating and maintaining its own thread.

- Renamed EventQueue to EventDispatcher.

- I also moved the monitoring listener list to EventDispatcher.  I also used
CopyOnWriteArrayList as the implementation for the list.
 CopyOnWriteArrayList is an ideal choice for this as it is thread safe and
lock-free.  Also, our use case is heavy read-access but very infrequent
write-access, which CopyOnWriteArrayList is suitable for.

- I moved the connection_failed notification to before the getSession()
call.  The getSession() call here always throws an exception (by design),
and thus notification needs to be done before calling getSession().

- I rewrote the CountingMonitor to use AtomicIntegers.  This should be
slightly safer.

- I changed the timestamp calls from System.currentTimeMillis() to
System.nanoTime()/1000000.  The nanoTime() call has higher resolution, as
currentTimeMillis() may only be accurate to tens of milliseconds on some platforms,
and thus is not suitable for these measurements.
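
A compact sketch of the dispatcher shape described in the list above
(MonitoringListener, MonitoringEvent, and the pool/thread names are assumptions
for illustration, not the committed classes):

    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.concurrent.Executor;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ThreadFactory;

    // Sketch only: Executor-based dispatcher with a shared daemon thread pool
    // and a CopyOnWriteArrayList of listeners.
    public class EventDispatcherSketch {

        public interface MonitoringListener {
            void onEvent(MonitoringEvent event);
        }

        public static class MonitoringEvent {
            public final String type;
            // Millisecond timestamp derived from nanoTime(); arbitrary origin,
            // so it is meaningful only for computing differences.
            public final long timestampMillis = System.nanoTime() / 1000000;
            public MonitoringEvent(String type) { this.type = type; }
        }

        // Shared by all dispatchers; daemon threads let the JVM exit without
        // explicit cleanup.
        private static final Executor SHARED_POOL =
            Executors.newCachedThreadPool(new ThreadFactory() {
                public Thread newThread(Runnable r) {
                    Thread t = new Thread(r, "monitoring-dispatch-sketch");
                    t.setDaemon(true);
                    return t;
                }
            });

        // Lock-free reads; listener add/remove is rare, so the copy-on-write
        // cost is negligible.
        private final List<MonitoringListener> listeners =
            new CopyOnWriteArrayList<MonitoringListener>();

        public void addListener(MonitoringListener listener) {
            listeners.add(listener);
        }

        // No synchronized needed: delivery is moved onto pool threads, keeping
        // the caller's fast path free of lock contention.
        public void notifyListeners(final MonitoringEvent event) {
            SHARED_POOL.execute(new Runnable() {
                public void run() {
                    for (MonitoringListener l : listeners) {
                        l.onEvent(event);
                    }
                }
            });
        }
    }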

I also have some more follow-up questions, which I'll post soon.

Thanks,
Sangjin


On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] wrote:

 I like your idea of using the event listener as the main way of doing
 this.  Basically no or multiple listeners would be invoked on a different
 thread when events occur.
 The event listener APIs would define those key methods which would contain
 all the necessary information about the events in an immutable fashion.

 We could provide a simple adapter that is no op so people can override
 necessary methods easily.  Also, we could provide one implementation which
 is a counting listener that does the basic metrics collection.

 What do you think?

 Thanks,
 Sangjin

 On Jan 17, 2008 2:58 AM, Rick McGuire  [EMAIL PROTECTED] wrote:

  Thunderbird is playing very strange games with me this morning, somehow
  deleting the original post.   Anyway, here are my comments on this.
 
   I'd like to propose changes to enable some basic stat collection
   and/or instrumentation to have visibility into performance of AHC.
For a given *AsyncHttpClient*, one might want to know metrics like
  
   - total request count
   - total success count
   - total exception count
   - total timeout count
   - connection attempt count
   - connection failure count
   - connect time average
   - connection close count
   - average response time (as measured from the invocation time to
   having the response ready)
   - and others?
  Collection of metric information would, I think, be a good thing.
  However, I think we should separate the consolidation of the information
  from the collection.  That is, the client should just have different
  types of events for data collection, and the event listener would be
  responsible for presenting the information appropriately.
 
  For example, to create the list above, I'd see the following set of
  events needed:
 
  - request made
  - request completed
  - request failed
  - request timeout
  - connection attempt started
  - connection failed
  - connection closed
 
  All events would be timestamped, which would allow metrics like average
 
  request time to be calculated.  This set of events would mean the
  client would not need to maintain any metric accumulators, and if the
  event information is done correctly, would even allow more fine grained
  monitoring (e.g., average connection time for requests to domain
  foo.bar.com).
 
 
  
   Collecting these metrics should have little effect on the overall
   performance.  There would be an API to access these stats.
  
   I was initially thinking of an IoFilter to consolidate these hooks,
   but I realize some of these metrics are not readily available to an
   IoFilter (e.g. connect-related numbers).  It might be unavoidable to
   spread the instrumentation in a couple of places (IoHandler,
   ConnectFutureListener, etc.).
  
   Taking this one step further, one might think of callbacks or
  

Re: [AsyncHttpClient] data collection instrumentation

2008-01-22 Thread Sangjin Lee
The modified patch is there on JIRA.  Some follow-up discussions...
I think the current implementation works well, but one thing that's
difficult to do is collecting timing data.  For example, some of the most
important instrumentation data are things like average response time (from
request start to request complete) and average connect time (from connect
start to connect complete).

Currently the context object that's available to monitoring listeners is the
request object, along with the timestamp of the event itself.  To be able to
compute a response time for a given request, one would need to take the
timestamp from the request start event, associate it with the request, and
store it on the listener.  When the request complete event fires, then one
would need to look up the stored data using the request object as a key to
retrieve the timestamp for the request start event, compute the delta, and
store the delta.

While all this is possible, it has a number of issues, not the least of
which is that one would need to maintain a map of request to start time (as
well as request to connect time).  This would bloat memory, among other
implications.

A substantially easier solution would be to provide the request start time
and connect start time as part of the information that's passed to the
monitoring listener.  Then listeners could simply compute the diff to get
the elapsed time very easily with no need to maintain maps of any kind.
 This could be either part of the request object itself, or if desirable,
one could consider a separate context or event object that contains this
information.  What do you think?

Thanks,
Sangjin
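
As a rough illustration of the approach proposed above, a listener could then
compute the average response time with nothing more than a diff, assuming a
hypothetical event type that carries the request start time (the names below
are illustrative, not the actual AHC API):

    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.atomic.AtomicLong;

    // Sketch only: averages response time with no request-to-start-time map,
    // because the event itself carries the request start timestamp.
    public class AverageResponseTimeSketch {

        /** Hypothetical event: completion time plus the request start time. */
        public static class RequestCompletedEvent {
            public final long requestStartMillis;
            public final long completedMillis;
            public RequestCompletedEvent(long requestStartMillis, long completedMillis) {
                this.requestStartMillis = requestStartMillis;
                this.completedMillis = completedMillis;
            }
        }

        private final AtomicLong totalResponseMillis = new AtomicLong();
        private final AtomicInteger completedCount = new AtomicInteger();

        public void onRequestCompleted(RequestCompletedEvent event) {
            // Elapsed time is a simple diff; no per-request state is kept here.
            long elapsed = event.completedMillis - event.requestStartMillis;
            totalResponseMillis.addAndGet(elapsed);
            completedCount.incrementAndGet();
        }

        public double averageResponseMillis() {
            int count = completedCount.get();
            return count == 0 ? 0.0 : (double) totalResponseMillis.get() / count;
        }
    }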

On Jan 22, 2008 1:33 PM, Sangjin Lee [EMAIL PROTECTED] wrote:

 I took a look at the patch on GERONIMO-3761, and it looks great.  Thanks.
  I have modified your patch for several things, though, and I'm nearly ready
 to add it to the JIRA report.  Comments about the changes...
 - I rewrote the EventQueue class to use an Executor.  Since the Executor
 implementation provided by the JDK is basically a thread pool associated
 with a task queue, it provides an identical functionality to what was in
 EventQueue.  I think that it is good to use the constructs from
 java.util.concurrent.* whenever it makes sense, and I believe this is one
 of them.

 - This change also enables us to remove synchronized from
 notifyMonitoringListener().  The notify method will be called very often and
 concurrently, and reducing the lock contention will be important.  Using an
 Executor makes it possible to eliminate synchronization, at least at that
 level.

 - I associated a shared thread pool (Executor) for all dispatchers.  I
 think it is desirable for dispatchers to share this thread pool rather than
 each instance of dispatchers creating and maintaining its own thread.

 - Renamed EventQueue to EventDispatcher.

 - I also moved the monitoring listener list to EventDispatcher.  I also
 used CopyOnWriteArrayList as the implementation for the list.
  CopyOnWriteArrayList is an ideal choice for this as it is thread safe and
 lock-free.  Also, our use case is heavy read-access but very infrequent
 write-access, which CopyOnWriteArrayList is suitable for.

 - I moved the connection_failed notification to before the getSession()
 call.  The getSession() call here always throws an exception (by design),
 and thus notification needs to be done before calling getSession().

 - I rewrote the CountingMonitor to use AtomicIntegers.  This should be
 slightly safer.

 - I changed the timestamp calls from System.currentTimeMillis() to
 System.nanoTime()/1000000.  The nanoTime() call has higher resolution, as
 currentTimeMillis() may only be accurate to tens of milliseconds on some platforms,
 and thus is not suitable for these measurements.

 I also have some more follow-up questions, which I'll post soon.

 Thanks,
 Sangjin


 On Jan 17, 2008 10:51 AM, Sangjin Lee [EMAIL PROTECTED] wrote:

  I like your idea of using the event listener as the main way of doing
  this.  Basically no or multiple listeners would be invoked on a different
  thread when events occur.
  The event listener APIs would define those key methods which would
  contain all the necessary information about the events in an immutable
  fashion.
 
  We could provide a simple adapter that is no op so people can override
  necessary methods easily.  Also, we could provide one implementation which
  is a counting listener that does the basic metrics collection.
 
  What do you think?
 
  Thanks,
  Sangjin
 
  On Jan 17, 2008 2:58 AM, Rick McGuire  [EMAIL PROTECTED] wrote:
 
   Thunderbird is playing very strange games with me this morning,
   somehow
   deleting the original post.   Anyway, here are my comments on this.
  
I'd like to propose changes to enable some basic stat collection
and/or instrumentation to have visibility into performance of AHC.
 For a given *AsyncHttpClient*, one might want to know metrics like
   
- total request 

Re: [AsyncHttpClient] data collection instrumentation

2008-01-17 Thread Rick McGuire
Thunderbird is playing very strange games with me this morning, somehow 
deleting the original post.   Anyway, here are my comments on this.


I'd like to propose changes to enable some basic stat collection 
and/or instrumentation to have visibility into performance of AHC. 
 For a given *AsyncHttpClient*, one might want to know metrics like


- total request count
- total success count
- total exception count
- total timeout count
- connection attempt count
- connection failure count
- connect time average
- connection close count
- average response time (as measured from the invocation time to 
having the response ready)

- and others?
Collection of metric information would, I think, be a good thing.  
However, I think we should separate the consolidation of the information 
from the collection.  That is, the client should just have different 
types of events for data collection, and the event listener would be 
responsible for presenting the information appropriately. 

For example, to create the list above, I'd see the following set of 
events needed:


- request made
- request completed
- request failed
- request timeout
- connection attempt started
- connection failed
- connection closed

All events would be timestamped, which would allow metrics like average 
request time to be calculated.  This set of events would mean the 
client would not need to maintain any metric accumulators, and if the 
event information is done correctly, would even allow more fine grained 
monitoring (e.g., average connection time for requests to domain 
foo.bar.com).
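
One hedged sketch of how such timestamped events might be surfaced to
listeners; the interface, enum, and field names are assumptions for
illustration, not the interface that was eventually committed:

    // Sketch only: a possible shape for a timestamped monitoring event and its
    // listener, covering the event types listed above.
    public interface MonitoringListenerSketch {

        enum EventType {
            REQUEST_MADE, REQUEST_COMPLETED, REQUEST_FAILED, REQUEST_TIMEOUT,
            CONNECTION_ATTEMPTED, CONNECTION_FAILED, CONNECTION_CLOSED
        }

        /** Immutable event: type, timestamp, and the request it belongs to. */
        final class Event {
            public final EventType type;
            public final long timestampMillis;
            public final Object request;   // placeholder for the request type

            public Event(EventType type, long timestampMillis, Object request) {
                this.type = type;
                this.timestampMillis = timestampMillis;
                this.request = request;
            }
        }

        /** Invoked for every event, off the client's I/O path. */
        void onEvent(Event event);
    }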





Collecting these metrics should have little effect on the overall 
performance.  There would be an API to access these stats.


I was initially thinking of an IoFilter to consolidate these hooks, 
but I realize some of these metrics are not readily available to an 
IoFilter (e.g. connect-related numbers).  It might be unavoidable to 
spread the instrumentation in a couple of places (IoHandler, 
ConnectFutureListener, etc.).


Taking this one step further, one might think of callbacks or 
listeners for various key events such as connect complete, request 
sent, etc., so callers can provide instrumenting/logging code via 
event notification.  However, I think this should be used judiciously 
as such injected code may cause havoc.
I think listeners would be the way to go.  This would allow multiple 
monitoring types to be attached to the pipe to gather data as needed.  
Perhaps the approach used with the javamail API might be of use here.
The javamail Store APIs have a number of listener events that are
broadcast (new mail arrived, message deleted, folder created, etc.).
Because there are similar concerns of havoc, the events get posted to a
queue and are dispatched on a separate thread.  The queue (and the
associated thread) is only created when there are listeners available
to handle the events.  This allows the events to have very low overhead
when there are no interested parties and prevents the listeners from
interfering with normal javamail operations by being processed on a
different thread.
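
A minimal sketch of that javamail-style pattern, where the queue and its
thread only come into existence once a listener registers (all names are
illustrative, not AHC or javamail code):

    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.concurrent.LinkedBlockingQueue;

    // Sketch only: lazily created dispatch queue and daemon thread.
    public class LazyEventQueueSketch {

        public interface Listener {
            void onEvent(Object event);
        }

        private final List<Listener> listeners = new CopyOnWriteArrayList<Listener>();
        private volatile LinkedBlockingQueue<Object> queue;   // null until needed

        public synchronized void addListener(Listener listener) {
            if (queue == null) {
                // The queue and its thread exist only once someone is listening.
                queue = new LinkedBlockingQueue<Object>();
                Thread dispatcher = new Thread(new Runnable() {
                    public void run() {
                        try {
                            while (true) {
                                Object event = queue.take();   // block until posted
                                for (Listener l : listeners) {
                                    l.onEvent(event);          // off the I/O path
                                }
                            }
                        } catch (InterruptedException interrupted) {
                            // exit quietly when interrupted
                        }
                    }
                }, "event-dispatch-sketch");
                dispatcher.setDaemon(true);   // does not keep the JVM alive
                dispatcher.start();
            }
            listeners.add(listener);
        }

        public void post(Object event) {
            LinkedBlockingQueue<Object> q = queue;
            if (q == null) {
                return;        // nobody listening: posting costs one null check
            }
            q.offer(event);    // delivered later on the dispatch thread
        }
    }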





Thoughts?  Suggestions?


Re: [AsyncHttpClient] data collection instrumentation

2008-01-17 Thread Sangjin Lee
I like your idea of using the event listener as the main way of doing this.
 Basically, zero or more listeners would be invoked on a different thread
when events occur.
The event listener APIs would define those key methods which would contain
all the necessary information about the events in an immutable fashion.

We could provide a simple adapter that is no op so people can override
necessary methods easily.  Also, we could provide one implementation which
is a counting listener that does the basic metrics collection.

What do you think?

Thanks,
Sangjin
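
For illustration, a no-op adapter and a counting listener along the lines
sketched in the message above might look like this (the callback names are
assumptions, not the committed listener API):

    import java.util.concurrent.atomic.AtomicInteger;

    // Sketch only: a no-op adapter plus a counting listener built on it.
    public class CountingListenerSketch {

        /** No-op adapter: subclasses override only the callbacks they need. */
        public static class MonitoringAdapter {
            public void requestCompleted(Object request) { }
            public void requestFailed(Object request) { }
            public void requestTimedOut(Object request) { }
        }

        /** Counting listener: thread-safe tallies via AtomicInteger. */
        public static class CountingMonitor extends MonitoringAdapter {
            private final AtomicInteger successes = new AtomicInteger();
            private final AtomicInteger failures = new AtomicInteger();
            private final AtomicInteger timeouts = new AtomicInteger();

            public void requestCompleted(Object request) { successes.incrementAndGet(); }
            public void requestFailed(Object request)    { failures.incrementAndGet(); }
            public void requestTimedOut(Object request)  { timeouts.incrementAndGet(); }

            public int getSuccessCount() { return successes.get(); }
            public int getFailureCount() { return failures.get(); }
            public int getTimeoutCount() { return timeouts.get(); }
        }
    }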

On Jan 17, 2008 2:58 AM, Rick McGuire [EMAIL PROTECTED] wrote:

 Thunderbird is playing very strange games with me this morning, somehow
 deleting the original post.   Anyway, here are my comments on this.

  I'd like to propose changes to enable some basic stat collection
  and/or instrumentation to have visibility into performance of AHC.
   For a given *AsyncHttpClient*, one might want to know metrics like
 
  - total request count
  - total success count
  - total exception count
  - total timeout count
  - connection attempt count
  - connection failure count
  - connect time average
  - connection close count
  - average response time (as measured from the invocation time to
  having the response ready)
  - and others?
 Collection of metric information would, I think, be a good thing.
 However, I think we should separate the consolidation of the information
 from the collection.  That is, the client should just have different
 types of events for data collection, and the event listener would be
 responsible for presenting the information appropriately.

 For example, to create the list above, I'd see the following set of
 events needed:

 - request made
 - request completed
 - request failed
 - request timeout
 - connection attempt started
 - connection failed
 - connection closed

 All events would be timestamped, which would allow metrics like average
 request time to be calculated.  This set of events would mean the
 client would not need to maintain any metric accumulators, and if the
 event information is done correctly, would even allow more fine grained
 monitoring (e.g., average connection time for requests to domain
 foo.bar.com).


 
  Collecting these metrics should have little effect on the overall
  performance.  There would be an API to access these stats.
 
  I was initially thinking of an IoFilter to consolidate these hooks,
  but I realize some of these metrics are not readily available to an
  IoFilter (e.g. connect-related numbers).  It might be unavoidable to
  spread the instrumentation in a couple of places (IoHandler,
  ConnectFutureListener, etc.).
 
  Taking this one step further, one might think of callbacks or
  listeners for various key events such as connect complete, request
  sent, etc., so callers can provide instrumenting/logging code via
  event notification.  However, I think this should be used judiciously
  as such injected code may cause havoc.
 I think listeners would be the way to go.  This would allow multiple
 monitoring types to be attached to the pipe to gather data as needed.
 Perhaps the approach used with the javamail API might be of use here.
 The javamail Store APIs have a number of listener events that are
 broadcast (new mail arrived, message deleted, folder created, etc.).
 Because there are similar concerns of havoc, the events get posted to a
 queue and are dispatched on a separate thread.  The queue (and the
 associated thread) is only created when there are listeners available
 to handle the events.  This allows the events to have very low overhead
 when there are no interested parties and prevents the listeners from
 interfering with normal javamail operations by being processed on a
 different thread.


 
  Thoughts?  Suggestions?