Re: [vdsm] flowID schema
On Mon, Feb 13, 2012 at 12:06:46PM -0500, Ayal Baron wrote:
- Original Message -
On 02/13/2012 02:28 PM, Ayal Baron wrote:
...

is that it (ab)uses an http header for carrying FlowID,

Yes, it certainly does appear to overload it. It would be nice to have something formal given to it by engine, but I can appreciate the difficulty implementing such a scheme.

Technically I disagree, this is a cross cutting concern which has nothing to do with any specific call hence it should be passed as a header, that is actually rather elegant. To the specific matter at hand though, what would really be nice is solving the real problem properly, and not contaminating the API and the log with things which have marginal benefit if at all.

Going back to the 'grep' issue: vdsm logs are verbose. They are multi-threaded as well. I think this should be more than just about finding the entry point of the flow, then identifying for this specific log format how to trace it, which would require writing a log analyzer with plugins for each component. Having all lines which are relevant to a flow with a flowid logged in them would make it much easier to get all (or most) of relevant parts of the flow (most, since something orthogonal to the flow may have happened affecting it, like loss of network).

I'm sorry but what you're proposing is to make the log even more difficult to read for absolutely NO reason. I haven't seen 1 good reason to add more to the log. What we should be focusing on is:
1. adding the relevant data that is needed to the engine log so that most of the time users wouldn't need to go to the host
2. reducing the verbosity of the vdsm log and increasing readability (the flow ID does exactly the opposite).

As opposed to most people here who are thinking that this sounds like a good idea, I actually have debugged at least dozens of issues in engine and vdsm and can assure you that not once would this have been beneficial to me.
When I debug an Engine-related issue, I tend to find a silly API call in Vdsm. Then I have to start correlating this to Engine logs. This step can be made quicker and less error-prone by logging FlowID in both Engine and Vdsm. To me, this is the 1 good reason for logging FlowID on API entry in Vdsm. However, I find adding FlowID to each and every log line a bit excessive. Our log is too cluttered as it is. Logging FlowID whenever a new thread is spawned makes more sense to me.

What was mostly missing in the engine logs was understanding what thread in engine called what operation in vdsm and what vdsm's response was. In 3.0 my understanding is that engine fixed this so this entire feature will be counterproductive (will make logs less readable and harder to decipher, adds complexity to the API and adds complexity to users of the REST API). All cross-host issues stem from *different* flows, so this would not help in this case and single host issues are easily traceable today (and you *never* need to follow an entire flow, it's entirely redundant and inefficient). I'm more than willing to show this on any set of logs by the way and would be happy to be proven wrong. More often than not by the way, the issue is that inside a specific call (i.e. 1 verb, not a flow) people are not proficient in finding the offending line (which is why I wrote the 'how to read the vdsm log' wiki).

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel
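For illustration only, here is a minimal sketch of the narrower option raised in this thread: logging the FlowID once at API entry instead of on every log line. The header name `FlowID`, the logger name `vds.api` and the helper `log_api_entry` are all assumptions made for this example; none of this is taken from the Vdsm code.

```python
import logging

# Hypothetical logger name; not taken from Vdsm.
log = logging.getLogger("vds.api")

# Assumed header name; the thread only says the flow ID travels in an HTTP header.
FLOW_ID_HEADER = "FlowID"


def log_api_entry(verb, headers):
    """Log the flow ID once, at API entry, and return the logged message.

    `headers` is a plain dict of request headers. A missing flow ID is
    logged as "-" so the entry line keeps a fixed shape for grep.
    """
    flow_id = headers.get(FLOW_ID_HEADER, "-")
    message = "API entry: verb=%s flowID=%s" % (verb, flow_id)
    log.info(message)
    return message
```

With one such line per call, an Engine flow can be matched to its Vdsm entry point by grepping for the ID, and the rest of the call is followed by thread name as today.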
Re: [vdsm] vdsm hangs in SamplingMethod after reinstall
On Sun, Feb 12, 2012 at 06:46:25PM -0500, Ayal Baron wrote:
- Original Message -
On Thu, Feb 09, 2012 at 07:15:48PM -0500, Ayal Baron wrote:
- Original Message -

Hi. I am running into a very annoying problem when working on vdsm lately. My development process involves stopping vdsm, replacing files, and restarting it. I do this pretty frequently. Sometimes, after restarting vdsm the XMLRPC call getStorageDomainsList() hangs. The following line is the last to

Can you post the exact flow you're running?

Still working on this. It isn't reproducing reliably -- only when I really need to get some work done :)

print in the log:

Thread-18::DEBUG::2012-02-09 17:11:46,793::misc::1017::SamplingMethod::(__call__) Trying to enter sampling method (storage.sdc.refreshStorage)

The only solution I've been able to come up with is restarting my machine. When stopping vdsm I search for any stale threads but I am unable to find them. Do you know what else might be causing DynamicBarrier.enter() to hang for a long period of time? Do the threading primitives use some sort of temporary disk storage that needs to be cleaned up? Thanks for the help!

Try to add some logging in sdc.py:

def refreshStorage(self):
    ADD LOG HERE

Yep have done this and I am not even getting into the refreshStorage function. We actually hang in DynamicBarrier.enter(). I am going to add some debugging to determine which locking operation gets stuck.

On the face of it it sounds like a python bug. Is supervdsm running? did you try killing it as well? Are you sure there is no 'Got in to sampling method' line in the log? Have you tried adding logging in 'enter' to see at what stage exactly you get stuck?
(side note - code should probably be updated with 'with' as it was originally written for use with python 2.4)

    multipath.rescan()

I have a feeling that your issue is not with SamplingMethod

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
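One way to "determine which locking operation gets stuck" is to wrap each acquire in a timed, logged helper. The sketch below is an assumption-laden illustration: `acquire_logged` and the logger name are hypothetical, it relies on the `timeout` argument that `threading.Lock.acquire` accepts in Python 3, and it does not reproduce how DynamicBarrier is actually written.

```python
import logging
import threading

# Hypothetical logger name used only for this sketch.
log = logging.getLogger("lockdebug")


def acquire_logged(lock, name, timeout=5.0):
    """Try to take `lock`, logging before and after the attempt.

    Returns True if the lock was acquired, False if the wait timed out.
    The last "Trying to acquire" line without a matching "Acquired" line
    points at the operation that is stuck.
    """
    log.debug("Trying to acquire %s", name)
    if lock.acquire(timeout=timeout):
        log.debug("Acquired %s", name)
        return True
    log.warning("Timed out after %.1fs waiting for %s", timeout, name)
    return False
```

The caller is still responsible for releasing the lock after a successful acquire; the helper only makes the waiting step visible in the log.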
Re: [vdsm] vdsm hangs in SamplingMethod after reinstall
- Original Message -
On Sun, Feb 12, 2012 at 06:46:25PM -0500, Ayal Baron wrote:
- Original Message -
On Thu, Feb 09, 2012 at 07:15:48PM -0500, Ayal Baron wrote:
- Original Message -

Hi. I am running into a very annoying problem when working on vdsm lately. My development process involves stopping vdsm, replacing files, and restarting it. I do this pretty frequently. Sometimes, after restarting vdsm the XMLRPC call getStorageDomainsList() hangs. The following line is the last to

Can you post the exact flow you're running?

Still working on this. It isn't reproducing reliably -- only when I really need to get some work done :)

So try finalizing those MOM patches and you should see this in no time ;)

print in the log:

Thread-18::DEBUG::2012-02-09 17:11:46,793::misc::1017::SamplingMethod::(__call__) Trying to enter sampling method (storage.sdc.refreshStorage)

The only solution I've been able to come up with is restarting my machine. When stopping vdsm I search for any stale threads but I am unable to find them. Do you know what else might be causing DynamicBarrier.enter() to hang for a long period of time? Do the threading primitives use some sort of temporary disk storage that needs to be cleaned up? Thanks for the help!

Try to add some logging in sdc.py:

def refreshStorage(self):
    ADD LOG HERE

Yep have done this and I am not even getting into the refreshStorage function. We actually hang in DynamicBarrier.enter(). I am going to add some debugging to determine which locking operation gets stuck.

On the face of it it sounds like a python bug. Is supervdsm running? did you try killing it as well? Are you sure there is no 'Got in to sampling method' line in the log? Have you tried adding logging in 'enter' to see at what stage exactly you get stuck?
(side note - code should probably be updated with 'with' as it was originally written for use with python 2.4)

    multipath.rescan()

I have a feeling that your issue is not with SamplingMethod