Hi Phil,

FWIW, if these patches haven't been committed yet, they look good :) I am really backlogged with all my emails.
auto-sm-tracking.patch:
-----------------------
At some point, new linked lists were added to track state machines that are currently running within the server. When an SM completes, it is implicitly removed from the list. However, SMs that were started without a request (i.e., internal state machines) were not added to any list. This caused a segfault if any of these internal state machines stopped, because the completion code assumes that every SM should be removed from a list. This patch corrects the problem by adding an extra linked list for state machine instances that are not associated with a particular request.
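To make the fix described above concrete, here is a minimal sketch of the idea, assuming one tracking list per category plus an extra list for request-less SMs. All names (`sm_instance`, `sm_start`, `sm_complete`, `SM_LIST_INTERNAL`, and so on) are hypothetical and chosen for illustration; this is not the actual PVFS2 code or the contents of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the PVFS2 source. */
enum { SM_LIST_INTERNAL = 0, SM_LIST_REQUEST = 1, SM_LIST_COUNT = 2 };

struct sm_instance {
    int list_id;                     /* which tracking list holds this SM */
    struct sm_instance *prev, *next; /* doubly linked list pointers */
};

/* One list head per category; index 0 holds SMs started without a request. */
static struct sm_instance *sm_lists[SM_LIST_COUNT];

/* Every SM is pushed onto exactly one list when it starts. */
static void sm_start(struct sm_instance *sm, int has_request)
{
    sm->list_id = has_request ? SM_LIST_REQUEST : SM_LIST_INTERNAL;
    sm->prev = NULL;
    sm->next = sm_lists[sm->list_id];
    if (sm->next)
        sm->next->prev = sm;
    sm_lists[sm->list_id] = sm;
}

/* Completion can unlink unconditionally, since every SM is on some list. */
static void sm_complete(struct sm_instance *sm)
{
    assert(sm->list_id >= 0 && sm->list_id < SM_LIST_COUNT);
    if (sm->prev)
        sm->prev->next = sm->next;
    else
        sm_lists[sm->list_id] = sm->next;
    if (sm->next)
        sm->next->prev = sm->prev;
    sm->prev = sm->next = NULL;
}

/* Helper for inspection: count the SMs currently tracked on a list. */
static size_t sm_list_length(int list_id)
{
    size_t n = 0;
    for (struct sm_instance *p = sm_lists[list_id]; p != NULL; p = p->next)
        n++;
    return n;
}
```

In this sketch the completion path needs no special case for internal state machines, which is the property the patch description relies on.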
Just curious: why/how did the internal state machine stop? I recall asking Sam this question when I added those lists, and he mentioned that they shouldn't stop unless the server was being terminated. The only place where I thought the sm structure was being removed from the list was server_state_machine_complete(), which is called for state machines that were started via a request. Or perhaps I misunderstood what he said :)

thanks,
Murali
mgmt-getconfig-assignment.patch:
--------------------------------
This patch adds some missing variable assignments in the mgmt-getconfig state machine. This problem might have previously caused pvfs2-validate to complain about not being able to retrieve configuration data from some servers.

mgmt-remove-dirent-handlecount.patch:
-------------------------------------
This patch updates mgmt-remove-dirent to match the normal rmdirent state machine; a flag must now be set to inform trove when to update keyval handle counts. Otherwise the directory entry counting will get out of whack when pvfs2-remove-object is used.

skip-retry-delay-on-cancel.patch:
---------------------------------
PVFS2 adds a "retry delay" between operation retries on the client side to prevent busy spinning (and quick retry exhaustion) on network errors that appear quickly. However, it also tacked on this retry delay on operations that had simply timed out. This causes timeouts and retries to take excessively long; if an operation fails because of a job timeout, then you don't need to artificially delay before trying again. This patch addresses the problem by having the client state machines differentiate between job timeouts and normal error codes.

_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
