Hi Phil,
FWIW, if these patches haven't been committed yet, they look good to me :)
I am really backlogged on email.

auto-sm-tracking.patch:
-----------------------
At some point, new linked lists were added to track state machines that
are currently running within the server.  When an SM completes, it is
implicitly removed from the list.  However, SMs that were started
without a request (i.e., internal state machines) were not added to any
list.  This caused a segfault if any of these internal state machines
stopped (because the completion code assumes that all SMs should be
removed from a list).  This patch corrects the problem by just making an
extra linked list for state machine instances that are not associated
with a particular request.
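
For readers following along, the idea can be sketched roughly as below.
This is only an illustration, not the patch itself: struct sm,
struct sm_list, sm_track() and sm_complete() are hypothetical names, and
the real server structures differ.  The point is that every SM, with or
without a request, lands on some list, so the completion path can always
unlink it safely.

```c
#include <assert.h>
#include <stddef.h>

struct sm_list;  /* forward declaration */

/* Hypothetical, simplified state machine record (not the PVFS2 struct). */
struct sm {
    int id;
    int has_request;          /* nonzero if started by a client request */
    struct sm *prev, *next;   /* links within the owning list */
    struct sm_list *owner;    /* list this SM is currently tracked on */
};

struct sm_list {
    struct sm *head;
    size_t count;
};

static struct sm_list request_sms;   /* SMs associated with a request */
static struct sm_list internal_sms;  /* SMs started without a request */

/* Track a newly started SM on the list that matches how it was started. */
static void sm_track(struct sm *s)
{
    struct sm_list *l;

    assert(s != NULL);
    l = s->has_request ? &request_sms : &internal_sms;
    s->prev = NULL;
    s->next = l->head;
    if (l->head)
        l->head->prev = s;
    l->head = s;
    s->owner = l;
    l->count++;
}

/* Completion path: every SM is on exactly one list, so unlinking is safe. */
static void sm_complete(struct sm *s)
{
    struct sm_list *l;

    assert(s != NULL && s->owner != NULL);
    l = s->owner;
    if (s->prev)
        s->prev->next = s->next;
    else
        l->head = s->next;
    if (s->next)
        s->next->prev = s->prev;
    s->prev = s->next = NULL;
    s->owner = NULL;
    l->count--;
}
```

With this layout the completion code never needs to know whether an SM
came from a request; it only follows the owner pointer.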

Just curious: why/how did the internal state machine stop?  I recall
asking Sam this question when I added those lists, and he mentioned that
they shouldn't stop unless the server was being terminated.  The only
place where I thought the sm structure was being removed from the list
was in server_state_machine_complete(), which is invoked only on state
machines started via a request.
Or perhaps I misunderstood what he said :)
thanks,
Murali


mgmt-getconfig-assignment.patch:
--------------------------------
This patch adds some missing variable assignments in the mgmt-getconfig
state machine.  This problem might have previously caused pvfs2-validate
to complain about not being able to retrieve configuration data from
some servers.

mgmt-remove-dirent-handlecount.patch:
-------------------------------------
This patch updates mgmt-remove-dirent to match the normal rmdirent state
machine; a flag must now be set to inform trove when to update keyval
handle counts.  Otherwise the directory entry counting will get out of
whack when pvfs2-remove-object is used.

skip-retry-delay-on-cancel.patch:
---------------------------------
PVFS2 adds a "retry delay" between operation retries on the client side
to prevent busy spinning (and quick retry exhaustion) on network errors
that appear quickly.  However, it also applies this retry delay to
operations that have simply timed out.  This causes timeouts and retries
to take excessively long; if an operation fails because of a job
timeout, then you don't need to artificially delay before trying again.
This patch addresses the problem by having the client state machines
differentiate between job timeouts and normal error codes.
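
The distinction described above might look something like the following
sketch.  The enum values and retry_delay_ms() are made up for
illustration and are not PVFS2 identifiers or error codes; the actual
client state machines are structured differently.

```c
#include <assert.h>

/* Hypothetical outcome codes for a finished job (not PVFS2 values). */
enum job_outcome { JOB_OK, JOB_ERROR, JOB_TIMEOUT };

/*
 * Return how long (in milliseconds) to wait before the next retry.
 * A quick failure gets the configured delay to avoid busy spinning;
 * a timeout has already cost time, so no extra delay is added.
 */
static int retry_delay_ms(enum job_outcome outcome, int configured_delay_ms)
{
    assert(configured_delay_ms >= 0);

    switch (outcome) {
    case JOB_ERROR:
        return configured_delay_ms;
    case JOB_TIMEOUT:
        return 0;
    case JOB_OK:
    default:
        return 0;   /* nothing to retry */
    }
}
```

For example, with a 2000 ms configured delay, a fast network error would
wait the full 2000 ms before retrying, while a job timeout would retry
immediately.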




_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers


