Re: [vdsm] flowID schema

2012-02-20 Thread Keith Robertson

On 02/19/2012 05:32 AM, Ayal Baron wrote:

FlowID does absolutely nothing towards that end.  What is required for that 
(and should indeed be done) is work on improving the existing logs:
1. reduce verbosity
2. make sure that required points in the flow are logged
3. improve log messages

Don't forget the MSGID change that Keith promised to push as well.
Keith hasn't forgotten.  He is just trying to get a standardized format 
and tooling that are *easy* to use and which is minimally intrusive to 
development.


Had a few meetings in Brno last week to discuss logging and I think that 
we have a fairly good plan now.  As soon as I finish the wiki, I'll send 
a link.


Cheers,
Keith

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-19 Thread Ayal Baron
[SNIP]

> > 
> > The approach to this thing has been wrong from the word go.
> > Someone suggested a *solution* (flowid) without stating the
> > problem.
> > What Saggi and I are trying to figure out is the problem and from
> > there devise a proper solution.
> > Because the problem has not been well defined (real use case with
> > real logs to back it up) then this discussion is up in the air.
> > We can argue about it till kingdom come but it's all theoretical
> > and
> > irrelevant.
> > The fact of the matter is though that adding more IDs will clutter
> > the log and that this is also adding code so it's more places for
> > bugs, so there had better be a good reason for it otherwise we're
> > making things more difficult to debug, not easier.
> > The case where you had to spend 30m finding the right log is simply
> > because you did not have enough info in the engine log before.  Now
> > that we do have this info (every call to vdsm and every result),
> > there is absolutely no reason for you to waste so much time.
> > 
> > Wrt end users debugging flows - end users should get all the
> > information they need in the engine log, the second they have to
> > reach the host we've failed.  Improving the logs on the hosts for
> > the sake of the end users is like putting a band aid on a severed
> > limb.
> 
> Unfortunately, this almost never happens. with 3.0 things improved,
> but debugging an ovirt issue is still a complicated business. If you
> want to define a problem and start working on the solution - this is
> it. The ultimate goal is to provide logging, that are
> 1. Uncluttered

FlowId does the opposite

> 2. easy to read and understand

Adding more (flowID) to each line makes things less legible.

> 3. do not require special training (you and Saggi know every verb,
> and you know what to look for. Imagine for a split second, that
> someone is not this vdsm-savvy)

FlowID does absolutely nothing towards that end.  What is required for that 
(and should indeed be done) is work on improving the existing logs:
1. reduce verbosity
2. make sure that required points in the flow are logged
3. improve log messages

Don't forget the MSGID change that Keith promised to push as well.

> 4. FAST! (And by this I mean I am able to pinpoint the exact point in

Simple matter of grepping the message that was sent to engine (you anyway start 
from the engine log, otherwise you wouldn't have the 'flowID')

> the log where things happened, or failed). Think of this in terms of
> kernel messages log - something goes wrong - you just look for the
> word "ERROR" and possibly something else (hinted in the GUI/API) in
> the log, and that puts you right at the spot. Can we somehow do the
> same? Can we, under the given conditions (lots of different entities
> cluttering the log with very verbose output; complex flows that for
> every action would prepare, check sanity, execute, rollback, finish,
> report; constant polling \hello repostat!\ that also adds to the
> clutter; etc etc) provide easily readable and quickly searchable
> logs, that a rookie can get through?
> 5. Easily follow-able (yes, I do want to be able to see exactly what
> the system does when I click a button in the GUI, flowID or whatever
> else, but it's something people do to understand and learn the
> system, think of additional scenarios, and to avoid shooting

Which is why engine logs now contain every call to vdsm and return value 
(except when return value is too long in which case I hope what they do is 
simply truncate it)

[SNIP]

> 
> This is what I would want to see. I am still convinced a flowID could
> provide at least a partial solution to all of the above, one way or
> another, but if you can provide a better solution, one that will
> adhere to #3, I'd love to hear of it

So let's bring this conversation down to earth.  Please provide example logs 
with a real problem and let's see what flowID would do to these and what out of 
the above would be solved.

[SNIP]

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-16 Thread Saggi Mizrahi
Pinning down specific moments where stuff went horribly wrong is usually quite 
simple. The only reason I can't think of for someone to track a problem is if 
there was a bug in the storage subsystem that cause a corruption that only 
became apparent later on. But this can't benefit from flowID.

I'm going to to day this again. I want to hear someone to give a case where 
this is useful. Don't say things like.
"It will allow me to do X"
or
"I know of a guy who spent 2 years doing X to VDSM\Engine logs"

I want an example like:

The user complained about X. So I had to do Y to figure out what is wrong and 
it was a pain.

The Only reason I can think of is someone trying to figure out what wrong 
knowing nothing about how VDSM\Engine works.
While I just jump from place to place because I know what is going on, other 
people would just want to go step by step to get the complete picture.

But even with that I don't see why you would want to keep going from the Engine 
down to VDSM and back out again. 

Other then that I just simply can't imagine a use case where I'd need to do the 
stuff you talk about.
But again, I usually get my bug reports from QA and QE, and they are more 
skilled at bug reporting then the general public.

Further more, WHAT IS A FLOW? When does it start? When does it end? create 
Image is a flow? is the connect to the domain included? isn't it all just part 
of a big flow to create a VM? Does the engine even track it as a flow?

Flows depend on the debugger the problem and the scope. Throwing another 
useless ID in the pile will give you nothing. What you want is to be able to 
map sophisticated connections between resources and operations inside every 
component and between them. For instance, in VDSM. To track a resource locking 
issues you use the resourceID. It crosses "flows". To debug connection issues 
you use the connection information (and soon, the reference ID). An it crosses 
"flows" as well.

I'm not saying that debugging Ovirt is easy. I'm just saying that this is not 
the solution IMHO. Good anchors to resources like taskIDs, connection reference 
IDs, domain IDs and resource IDs give you the ability to track whatever you 
want. You just need better tools to cross reference them across log files so 
that the log tricks I *know* are implemented in a way that everyone can use 
them.

This is when good tools come in to play. Think about it like a DB. Should 
something be a table with an index or should it just be a view.

(more inline)

- Original Message -
> From: "Simon Grinberg" 
> To: "Saggi Mizrahi" 
> Cc: vdsm-devel@lists.fedorahosted.org, "Dan Yasny" , "Ayal 
> Baron" 
> Sent: Thursday, February 16, 2012 11:07:34 AM
> Subject: Re: [vdsm] flowID schema
> 
> 
> 
> - Original Message -
> > From: "Saggi Mizrahi" 
> > To: "Simon Grinberg" 
> > Cc: vdsm-devel@lists.fedorahosted.org, "Dan Yasny"
> > , "Ayal Baron" 
> > Sent: Thursday, February 16, 2012 5:18:26 PM
> > Subject: Re: [vdsm] flowID schema
> > 
> > You could just cross check flowID (that is printed in RHEV-M to all
> > the call IDs in that flow, It's a simple enough tool to make).
> > 
> > Also if you look at the way VDSM is heading you will see that just
> > grepping for the flow ID will gradually give you a smaller and
> > smaller picture of the actual flow.
> > 
> > - The new connection management means that the actual connect is
> > not
> > a direct part of the flow anymore.
> > - We plan to have a similar mechanism for domains.
> > - The new image code will do a lot of things "out of band".
> > - There will be more flows using multiple hosts because of SDM and
> > the increased safety gained by sanlock and they will not share the
> > flow ID in internal communication.
> > - Engine actually polls all tasks in a single thread (with it's own
> > flow ID? No flowID?) so even the actual result for async tasks
> > might
> > have a different flowID in the VDSM log.
> 
> OK so they may be a terminology mismatch here:
> A flow as an end user would see it is: Everything that happened since
> the moment he clicked "OK" and to the moment the operation failed
> succeeded.
> Same goes for internal/auto triggered actions in response to an event
> (Storage failure etc).
> 
> It may be composed as you notes above of several connections,
> multiple tasks etc.
> The flow ID as I see it is the linkage that makes sense in all of
> these. Everything that you've wrote above just convinces me that
> such an ID is a must.
I understood that. The point is that the system is too complex to be able t

Re: [vdsm] flowID schema

2012-02-16 Thread Saggi Mizrahi
You could just cross check flowID (that is printed in RHEV-M to all the call 
IDs in that flow, It's a simple enough tool to make).

Also if you look at the way VDSM is heading you will see that just grepping for 
the flow ID will gradually give you a smaller and smaller picture of the actual 
flow.

- The new connection management means that the actual connect is not a direct 
part of the flow anymore.
- We plan to have a similar mechanism for domains.
- The new image code will do a lot of things "out of band".
- There will be more flows using multiple hosts because of SDM and the 
increased safety gained by sanlock and they will not share the flow ID in 
internal communication.
- Engine actually polls all tasks in a single thread (with it's own flow ID? No 
flowID?) so even the actual result for async tasks might have a different 
flowID in the VDSM log.

In order to keep flowID sane in that kind of environment, there will have to be 
a lot of code moving that value around just to log it.

What we really need is something like the systemd journal 
(http://bit.ly/vbZEBQ).
The project is still in it's infancy but the actual ideas are solid.
* Always log in UTC
* Log in a query-able format with typed key\value pairs instead of just free 
text messages
* Use MSGID for *some* of the more important log lines for easy matching (and 
use UUID for MSGID instead of a rolling number)

With this kind of format you can easily write tools that sync up logs and do 
any other post processing.
It will also allow us to easily migrate to the systemd journal when it's 
available. 

We could just log to an sqlite db but I have a felling the constant 
fsyncing\flocking will be a performance issue.

So I suggest implementing this by having a log target which writes json objects 
and have a tool to put it all in an sqlite DB when necessary. This way you 
don't get a performance hit for doing something drastically different then 
appending to text files but you can later easily "pack" the logs to a db file 
that can easily be navigated, merged and processed.

You could get the "classic view" by just using the msg field from the json 
object for each line. You could do that without going through the "packing" 
process so it's instantly available.

I've been pondering this for quite some time and this might be a good excuse to 
give it a shot. As long as all the components in OVIRT write json objects we 
could merge all logs together to a single log db.
You could then try and detect small time time drifts using heuristics and have 
a unified view of everything.

I can do a more in depth write up if people think it's a good direction to go 
to.

- Original Message -
> From: "Simon Grinberg" 
> To: "Ayal Baron" , "Saggi Mizrahi" 
> Cc: vdsm-devel@lists.fedorahosted.org, "Dan Yasny" 
> Sent: Wednesday, February 15, 2012 5:50:34 AM
> Subject: Re: [vdsm] flowID schema
> 
> Hi Ayal,
> 
>  > > actual use case where someone would have actually
>  > > benefited
>  > > from
>  > > this.
> 
> 
> Sorry joining that late into the discussion.
> 
> Forget the automatic analyses/past issues recognition that may or may
> not be possible if we do have flow ID. (Personally I believe it's
> possible for known issues)
> 
> Think of large setups, not the few hosts setups that are running in
> the labs.
> Dozens of hosts, hundreds of VMs and tasks, running for few months.
> --->>> Huge logs (yes we even got 50G sized logs)
> 
> Now try to do what you and Saggy are suggesting said, yes it will
> work!!! Sometimes it is the only approach that will be possible.
> 
> But what about efficiency?
> 
> Some times I found myself spending 30 minutes just getting to the
> files that I need to view and to the approximate location of the
> error.
> If you have a flow ID you can have a tool that will extract all the
> relevant information from all the hosts participating in that flow
> and correlate that with the rhevm.log as the anchor.
> 
> (Valdik actually wrote a tool that follows the SPM and creates an SPM
> log no matter on which host it resides - this allowed tracking flows
> and proved itself useful and time saving, especially since when
> storage issues begin SPM tends to move around :))
> 
> In addition to that think about skill levels,
> For you it's very easy to go and debug, you wrote the code, you know
> what to expect and what is the normal behavior (most of the time).
> But what about others?
> When encountering an issue within a flow for a front line or a
> customer self debug the easiest way will be to actually compare it
> to the same flow that succeed (if exist) and understand what was
> supposed to happen next (or before) and did

Re: [vdsm] flowID schema

2012-02-15 Thread Simon Grinberg
Hi Ayal,

 > > actual use case where someone would have actually
 > > benefited
 > > from
 > > this.


Sorry joining that late into the discussion.

Forget the automatic analyses/past issues recognition that may or may not be 
possible if we do have flow ID. (Personally I believe it's possible for known 
issues)

Think of large setups, not the few hosts setups that are running in the labs.
Dozens of hosts, hundreds of VMs and tasks, running for few months. 
--->>> Huge logs (yes we even got 50G sized logs)

Now try to do what you and Saggy are suggesting said, yes it will work!!! 
Sometimes it is the only approach that will be possible.

But what about efficiency?

Some times I found myself spending 30 minutes just getting to the files that I 
need to view and to the approximate location of the error. 
If you have a flow ID you can have a tool that will extract all the relevant 
information from all the hosts participating in that flow and correlate that 
with the rhevm.log as the anchor.

(Valdik actually wrote a tool that follows the SPM and creates an SPM log no 
matter on which host it resides - this allowed tracking flows and proved itself 
useful and time saving, especially since when storage issues begin SPM tends to 
move around :))

In addition to that think about skill levels, 
For you it's very easy to go and debug, you wrote the code, you know what to 
expect and what is the normal behavior (most of the time). But what about 
others?
When encountering an issue within a flow for a front line or a customer self 
debug the easiest way will be to actually compare it to the same flow that 
succeed (if exist) and understand what was supposed to happen next (or before) 
and did not. I actually used that technique a lot. Again a flow ID will save 
time. All you have to do is extract an identical flow that worked using the 
flow-extractor utility and then compare.


I agree that when you get to the really tough cases, they will span across 
flows and your only alternative may be just to dig in. However for many others, 
flows tracking will save support calls, support escalations and will 
considerably help with education. For 2.2 we had Marina going into the code and 
documenting flows, so support people can learn what to expect and what to look 
for in the logs.

I believe it will also help engineering, lot's of times I had to gather 3 
engineers and sat together just to understand a flow in order to debug an issue 
or understand what to expect if someone complained about a weird behavior. 
Since each of them only new what his part is doing - all had to go into their 
code and discuss. With flow ID you can just extract the flow and easily see 
what is being done (you may find out as happened before that it's not exactly 
what they meant to do) 

In short, if we aim to look at RHEV as a system, flows are helpful in many 
aspects. Instead of having discrete components and requiring a high level of 
expertise and spending a lot of time (finding the right spot) to debug even 
minor issues you can look at a system and easily get (most of the time), 
understand what to expect and get to the most relevant logs fairly fast.

Simon.










- Original Message -
> From: "Dan Yasny" 
> To: "Ayal Baron" 
> Cc: "Simon Grinberg" , vdsm-devel@lists.fedorahosted.org
> Sent: Friday, February 10, 2012 4:01:01 PM
> Subject: Re: [vdsm] flowID schema
> 
> 
> 
> - Original Message -
> > From: "Ayal Baron" 
> > To: "Dan Yasny" 
> > Cc: "Simon Grinberg" ,
> > vdsm-devel@lists.fedorahosted.org
> > Sent: Friday, 10 February, 2012 3:51:01 PM
> > Subject: Re: [vdsm] flowID schema
> > 
> > 
> > 
> > - Original Message -
> > > 
> > > 
> > > - Original Message -
> > > > From: "Ayal Baron" 
> > > > To: "Dan Yasny" 
> > > > Cc: "Simon Grinberg" ,
> > > > vdsm-devel@lists.fedorahosted.org
> > > > Sent: Friday, 10 February, 2012 2:55:46 PM
> > > > Subject: Re: [vdsm] flowID schema
> > > > 
> > > > 
> > > > 
> > > > - Original Message -
> > > > > 
> > > > > 
> > > > > - Original Message -
> > > > > > From: "Ayal Baron" 
> > > > > > To: "Dan Yasny" 
> > > > > > Cc: "Simon Grinberg" ,
> > > > > > vdsm-devel@lists.fedorahosted.org
> > > > > > Sent: Friday, 10 February, 2012 12:50:04 AM
> > > > > > Subject: Re: [vdsm] flowID schema
> > > > > > 
> > > > > > 
> > > > > > 
> > &

Re: [vdsm] flowID schema

2012-02-14 Thread Dan Yasny


- Original Message -
> From: "Dan Kenigsberg" 
> To: "Ayal Baron" 
> Cc: "VDSM Project Development" 
> Sent: Tuesday, 14 February, 2012 10:40:40 AM
> Subject: Re: [vdsm] flowID schema
> 
> On Mon, Feb 13, 2012 at 12:06:46PM -0500, Ayal Baron wrote:
> > 
> > 
> > - Original Message -
> > > On 02/13/2012 02:28 PM, Ayal Baron wrote:
> > > >
> > > ...
> > > >>> is that it (ab)uses an http header for carrying FlowID,
> > > >> Yes, it certainly does appear to overload it.  I would be nice
> > > >> to
> > > >> have
> > > >> something formal given to it by engine, but I can appreciate
> > > >> the
> > > >> difficulty implementing such a scheme.
> > > >
> > > > Technically I disagree, this is a cross cutting concern which
> > > > has
> > > > nothing to do with any specific call hence it should be passed
> > > > as
> > > > a header, that is actually rather elegant.
> > > >
> > > > To the specific matter at hand though. what would really be
> > > > nice is
> > > > solving the real problem properly, and not contaminating the
> > > > API
> > > > and the log with things which have marginal benefit if at all.
> > > 
> > > going back to the 'grep' issue.
> > > vdsm logs are verbose. they are multi-threaded as well.
> > > I think this should be more than just about finding the entry
> > > point
> > > of
> > > the flow, then identifying for this specific log format how to
> > > trace
> > > it,
> > > which would require writing a log analyzer with plugins for each
> > > component.
> > > having all lines which are relevant to a flow with a flowid
> > > logged in
> > > them would make it much easier to get all (or most) of relevant
> > > parts
> > > of
> > > the flow (most, since something orthogonal to the flow may have
> > > happened
> > > affecting it, like loss of network)
> > > 
> > I'm sorry but what you're proposing is to make the log even more
> > difficult to read for absolutely NO reason.
> > I haven't seen 1 good reason to add more to the log.
> > What we should be focusing on is:
> > 1. adding the relevant data that is needed to the engine log so
> > that most of the time users wouldn't need to go the host
> > 2. reducing the verbosity of the vdsm log and increasing
> > readability (the flow ID does exactly the opposite).
> > 
> > As opposed to most people here who are thinking that this sounds
> > like a good idea, I actually have debugged at least dozens of
> > issues in engine and vdsm and can assure you that not once would
> > this have been beneficial to me.
> 
> When I debug an Engine-related issue, I tend to find a silly API call
> in
> Vdsm. Then I have to start correlating this to Engine logs. This step
> can be made quicker and less error-prone by logging FlowID both
> Engine
> and Vdsm. To me, this is the 1 good reason for logging FlowID on API
> entry in Vdsm.

What I've been saying all along. But Ayal will just say he didn't see a good 
reason brought up :)

> 
> However, I find adding FlowID to each and every log line a bit
> excessive. Our log is too cluttered as it is. Logging FlowID whenever
> a new thread is spawned makes more sense to me.

In that case, this needs to be done in a way that is easily grep-able and 
clearly visible when skimming over a logfile, in a bird's eye view, sort of 
speak. So far, looking for the "run and protect" line for a specific thread, 
making sure it pertains to the same thing the engine kicked off in it's own 
log, and then following that thread through was never very easy. Manageable 
with some experience, but not easy.

> 
> > What was mostly missing in the engine logs was understanding what
> > thread in engine called what operation in vdsm and what vdsm's
> > response was.  In 3.0 my understanding is that engine fixed this
> > so this entire feature will be counter productive (will make logs
> > less readable and harder to decipher, adds complexity to the API
> > and adds complexity to users of the rest API).
> > All cross hosts issues stem from *different* flows, so this would
> > not help in this case and single host issues are easily traceable
> > today (and you *never* need to follow an entire flow, it's
> > entirely redundant and inefficient).
> > 
> > I'm more than willing to show this on any set of logs by the way
> > and would be happy to be proven wrong.
> > 
> > More often than not by the way, the issue is that inside a specific
> > call (i.e. 1 verb, not a flow) people are not proficient in
> > finding the offending line (which is why I wrote the 'how to read
> > the vdsm log' wiki).
> ___
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> 

-- 



Regards, 

Dan Yasny 
Red Hat Israel 
+972 9769 2280
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-14 Thread Dan Kenigsberg
On Mon, Feb 13, 2012 at 12:06:46PM -0500, Ayal Baron wrote:
> 
> 
> - Original Message -
> > On 02/13/2012 02:28 PM, Ayal Baron wrote:
> > >
> > ...
> > >>> is that it (ab)uses an http header for carrying FlowID,
> > >> Yes, it certainly does appear to overload it.  I would be nice to
> > >> have
> > >> something formal given to it by engine, but I can appreciate the
> > >> difficulty implementing such a scheme.
> > >
> > > Technically I disagree, this is a cross cutting concern which has
> > > nothing to do with any specific call hence it should be passed as
> > > a header, that is actually rather elegant.
> > >
> > > To the specific matter at hand though. what would really be nice is
> > > solving the real problem properly, and not contaminating the API
> > > and the log with things which have marginal benefit if at all.
> > 
> > going back to the 'grep' issue.
> > vdsm logs are verbose. they are multi-threaded as well.
> > I think this should be more than just about finding the entry point
> > of
> > the flow, then identifying for this specific log format how to trace
> > it,
> > which would require writing a log analyzer with plugins for each
> > component.
> > having all lines which are relevant to a flow with a flowid logged in
> > them would make it much easier to get all (or most) of relevant parts
> > of
> > the flow (most, since something orthogonal to the flow may have
> > happened
> > affecting it, like loss of network)
> > 
> I'm sorry but what you're proposing is to make the log even more difficult to 
> read for absolutely NO reason.
> I haven't seen 1 good reason to add more to the log.
> What we should be focusing on is:
> 1. adding the relevant data that is needed to the engine log so that most of 
> the time users wouldn't need to go the host
> 2. reducing the verbosity of the vdsm log and increasing readability (the 
> flow ID does exactly the opposite).
> 
> As opposed to most people here who are thinking that this sounds like a good 
> idea, I actually have debugged at least dozens of issues in engine and vdsm 
> and can assure you that not once would this have been beneficial to me.

When I debug an Engine-related issue, I tend to find a silly API call in
Vdsm. Then I have to start correlating this to Engine logs. This step
can be made quicker and less error-prone by logging FlowID both Engine
and Vdsm. To me, this is the 1 good reason for logging FlowID on API
entry in Vdsm.

However, I find adding FlowID to each and every log line a bit
excessive. Our log is too cluttered as it is. Logging FlowID whenever a
new thread is spawned makes more sense to me.

> What was mostly missing in the engine logs was understanding what thread in 
> engine called what operation in vdsm and what vdsm's response was.  In 3.0 my 
> understanding is that engine fixed this so this entire feature will be 
> counter productive (will make logs less readable and harder to decipher, adds 
> complexity to the API and adds complexity to users of the rest API).
> All cross hosts issues stem from *different* flows, so this would not help in 
> this case and single host issues are easily traceable today (and you *never* 
> need to follow an entire flow, it's entirely redundant and inefficient).
> 
> I'm more than willing to show this on any set of logs by the way and would be 
> happy to be proven wrong.
> 
> More often than not by the way, the issue is that inside a specific call 
> (i.e. 1 verb, not a flow) people are not proficient in finding the offending 
> line (which is why I wrote the 'how to read the vdsm log' wiki).
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-13 Thread Ayal Baron


- Original Message -
> On 02/13/2012 02:28 PM, Ayal Baron wrote:
> >
> ...
> >>> is that it (ab)uses an http header for carrying FlowID,
> >> Yes, it certainly does appear to overload it.  I would be nice to
> >> have
> >> something formal given to it by engine, but I can appreciate the
> >> difficulty implementing such a scheme.
> >
> > Technically I disagree, this is a cross cutting concern which has
> > nothing to do with any specific call hence it should be passed as
> > a header, that is actually rather elegant.
> >
> > To the specific matter at hand though. what would really be nice is
> > solving the real problem properly, and not contaminating the API
> > and the log with things which have marginal benefit if at all.
> 
> going back to the 'grep' issue.
> vdsm logs are verbose. they are multi-threaded as well.
> I think this should be more than just about finding the entry point
> of
> the flow, then identifying for this specific log format how to trace
> it,
> which would require writing a log analyzer with plugins for each
> component.
> having all lines which are relevant to a flow with a flowid logged in
> them would make it much easier to get all (or most) of relevant parts
> of
> the flow (most, since something orthogonal to the flow may have
> happened
> affecting it, like loss of network)
> 
I'm sorry but what you're proposing is to make the log even more difficult to 
read for absolutely NO reason.
I haven't seen 1 good reason to add more to the log.
What we should be focusing on is:
1. adding the relevant data that is needed to the engine log so that most of 
the time users wouldn't need to go the host
2. reducing the verbosity of the vdsm log and increasing readability (the flow 
ID does exactly the opposite).

As opposed to most people here who are thinking that this sounds like a good 
idea, I actually have debugged at least dozens of issues in engine and vdsm and 
can assure you that not once would this have been beneficial to me.
What was mostly missing in the engine logs was understanding what thread in 
engine called what operation in vdsm and what vdsm's response was.  In 3.0 my 
understanding is that engine fixed this so this entire feature will be counter 
productive (will make logs less readable and harder to decipher, adds 
complexity to the API and adds complexity to users of the rest API).
All cross hosts issues stem from *different* flows, so this would not help in 
this case and single host issues are easily traceable today (and you *never* 
need to follow an entire flow, it's entirely redundant and inefficient).

I'm more than willing to show this on any set of logs by the way and would be 
happy to be proven wrong.

More often than not by the way, the issue is that inside a specific call (i.e. 
1 verb, not a flow) people are not proficient in finding the offending line 
(which is why I wrote the 'how to read the vdsm log' wiki).
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-13 Thread Itamar Heim

On 02/13/2012 12:52 AM, Ayal Baron wrote:
...


if the FlowID isn't logged for logs which are part of the action, it
will be easy to detect the entry point.
but still wouldn't make it easy to grep all related logs to a flow



If the flow id is present in all lines it would make the log a lot less 
readable.
Writing a script that accepts the flowid and 'grep's the 'flow' should be 
simple enough.

However, if this is to become (remotely) useful, the flowID should always 
contain an engine generated generation.  The thing is, really complex debug 
scenarios which span multiple hosts never stem from the same flow, so the only 
added benefit here would be if we could at least know what happened before what 
from logs across multiple hosts.  To achieve this we would need the engine to 
add a generation id to each unique flowID (the problem being that currently the 
flow ID would be determined by the UI and not by engine).
Engine would have to generate such an ID for all internal flows as well!


I thought it is clear the UI/api can pass the flow id for the rest of 
the components (engine and below)




I still have not heard even one good reason for having this though.


___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-13 Thread Dan Yasny


- Original Message -
> From: "Itamar Heim" 
> To: "Ayal Baron" 
> Cc: "VDSM Project Development" 
> Sent: Monday, 13 February, 2012 6:55:41 PM
> Subject: Re: [vdsm] flowID schema
> 
> On 02/13/2012 02:28 PM, Ayal Baron wrote:
> >
> ...
> >>> is that it (ab)uses an http header for carrying FlowID,
> >> Yes, it certainly does appear to overload it.  I would be nice to
> >> have
> >> something formal given to it by engine, but I can appreciate the
> >> difficulty implementing such a scheme.
> >
> > Technically I disagree, this is a cross cutting concern which has
> > nothing to do with any specific call hence it should be passed as
> > a header, that is actually rather elegant.
> >
> > To the specific matter at hand though. what would really be nice is
> > solving the real problem properly, and not contaminating the API
> > and the log with things which have marginal benefit if at all.
> 
> going back to the 'grep' issue.
> vdsm logs are verbose. they are multi-threaded as well.
> I think this should be more than just about finding the entry point
> of
> the flow, then identifying for this specific log format how to trace
> it,
> which would require writing a log analyzer with plugins for each
> component.
> having all lines which are relevant to a flow with a flowid logged in
> them would make it much easier to get all (or most) of relevant parts
> of the flow 

+1

> (most, since something orthogonal to the flow may have
> happened
> affecting it, like loss of network)
> ___
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> 

-- 



Regards, 

Dan Yasny 
Red Hat Israel 
+972 9769 2280
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-13 Thread Itamar Heim

On 02/13/2012 02:28 PM, Ayal Baron wrote:



...

is that it (ab)uses an http header for carrying FlowID,

Yes, it certainly does appear to overload it.  I would be nice to
have
something formal given to it by engine, but I can appreciate the
difficulty implementing such a scheme.


Technically I disagree, this is a cross cutting concern which has nothing to do 
with any specific call hence it should be passed as a header, that is actually 
rather elegant.

To the specific matter at hand though. what would really be nice is solving the 
real problem properly, and not contaminating the API and the log with things 
which have marginal benefit if at all.


going back to the 'grep' issue.
vdsm logs are verbose. they are multi-threaded as well.
I think this should be more than just about finding the entry point of 
the flow, then identifying for this specific log format how to trace it, 
which would require writing a log analyzer with plugins for each component.
having all lines which are relevant to a flow with a flowid logged in 
them would make it much easier to get all (or most) of relevant parts of 
the flow (most, since something orthogonal to the flow may have happened 
affecting it, like loss of network)

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-13 Thread Keith Robertson

On 02/13/2012 07:28 AM, Ayal Baron wrote:

Technically I disagree, this is a cross cutting concern which has nothing to do 
with any specific call hence it should be passed as a header, that is actually 
rather elegant.
Overloading a header with this?  Seems like a clunky bolt on to me.  
Further, what if you wanted to use some other form of managment 
interface that didn't allow for this?


IMHO, the ideal case would be for all verbs to inherit from a  base type 
(e.g. a base XSD) that defined all attributes that each verb should 
have.  Then each specific verb would extend that verb with the 
attributes necessary to perform a specific action.


[BaseVDSMType.xsd]
  |
[verb1.xsd, verb2.xsd,...,verbN.xsd]

To the specific matter at hand though. what would really be nice is solving the 
real problem properly, and not contaminating the API and the log with things 
which have marginal benefit if at all.
I agree.  Is this a solicitation for proposals or is there already one 
being put forward? ;)


Cheers,
Keith
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-13 Thread Ayal Baron


- Original Message -
> On 02/12/2012 02:50 AM, Dan Kenigsberg wrote:
> > On Thu, Feb 09, 2012 at 03:23:19PM -0500, Ayal Baron wrote:
> >>
> >> - Original Message -
> >>> -1
> >>>
> >>> I agree that for messaging environment having a Message ID is a
> >>> must
> >>> because you sometimes don't have a particular target so when you
> >>> get
> >>> a response you need to know what this node is actually responding
> >>> to.
> >>>
> >>> The message ID could be composed with  so you can
> >>> reuse the field.
> >>>
> >>> But that is all besides the point.
> >>>
> >>> I understand that someone might find it fun to go on following
> >>> the
> >>> entire flow in the Engine and in VDSM. But I would like to hear
> >>> an
> >>> actual use case where someone would have actually benefited from
> >>> this.
> >>> As I see it having VSDM return the task ID with every response
> >>> (and
> >>> not just for async tasks) is a lot more useful and correct.
> >>>
> >>> A generic debugging scenario as I see it.
> >>>
> >>> 1. Something went wrong
> >>> 2. You go looking in the ENGINE log trying to figure out what
> >>> happend.
> >>> 3. You see that ENGINE got SomeError.
> >>> 4. Check to see if this error makes sense imagining that VDSM is
> >>> always right and is a black box.
> >>> 5. You did your digging and now you think that VDSM is as fault.
> >>> 6. Go look for the call that failed. (If we returned the taskID
> >>> it's
> >>> pretty simple to find that call).
> >>> 7. Look around the call to check VDSM state.
> >>> 8. Profit.
> >>>
> >>> There is never a point where you want to follow a whole flow call
> >>> by
> >>> call going back and forth, and even if you did having the VDSM
> >>> taskID is a better anchor then flowID.
> >>>
> >>> VDSM is built in a way that every call takes in to account the
> >>> current state only. Debugging it with an engine flow mindset is
> >>> just
> >>> wrong and distracting. I see it doing more harm the good by
> >>> reinforcing bad debugging practices.
> >> I don't know about harm, but, today the engine logs every call and
> >> return value to and from vdsm.  This means that all the info that
> >> is needed to follow a flow is already present in the engine log
> >> (which was not the case previously) so I believe that the flow id
> >> is redundant.
> >> In addition, instead of focusing on how to track a flow between
> >> components, we should focus on how to improve the engine log so
> >> that the users don't need to go to the hosts in the first place.
> >> My biggest problem with it is that it changes each and every verb
> >> in the API and makes the log itself also more verbose and less
> >> readable.
> > The good thing about the currently suggested implementation
> >
> > http://gerrit.ovirt.org/#patch,sidebyside,1221,6,vdsm/BindingXMLRPC.py
> >
> > is that it (ab)uses an http header for carrying FlowID,
> Yes, it certainly does appear to overload it.  I would be nice to
> have
> something formal given to it by engine, but I can appreciate the
> difficulty implementing such a scheme.

Technically I disagree, this is a cross cutting concern which has nothing to do 
with any specific call hence it should be passed as a header, that is actually 
rather elegant.

To the specific matter at hand though. what would really be nice is solving the 
real problem properly, and not contaminating the API and the log with things 
which have marginal benefit if at all.

> 
> >   thus keeping the
> > formal API intact. FlowID is logged only on API entry point, so it
> > would
> > not clutter the logs too much.
> > Keith, Dan, I will need your support in Gerrit for the patch.
> +1 if I am asked to review.
> > Dan.
> 
> 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-12 Thread Keith Robertson

On 02/12/2012 02:50 AM, Dan Kenigsberg wrote:

On Thu, Feb 09, 2012 at 03:23:19PM -0500, Ayal Baron wrote:


- Original Message -

-1

I agree that for messaging environment having a Message ID is a must
because you sometimes don't have a particular target so when you get
a response you need to know what this node is actually responding
to.

The message ID could be composed with  so you can
reuse the field.

But that is all besides the point.

I understand that someone might find it fun to go on following the
entire flow in the Engine and in VDSM. But I would like to hear an
actual use case where someone would have actually benefited from
this.
As I see it having VSDM return the task ID with every response (and
not just for async tasks) is a lot more useful and correct.

A generic debugging scenario as I see it.

1. Something went wrong
2. You go looking in the ENGINE log trying to figure out what
happend.
3. You see that ENGINE got SomeError.
4. Check to see if this error makes sense imagining that VDSM is
always right and is a black box.
5. You did your digging and now you think that VDSM is as fault.
6. Go look for the call that failed. (If we returned the taskID it's
pretty simple to find that call).
7. Look around the call to check VDSM state.
8. Profit.

There is never a point where you want to follow a whole flow call by
call going back and forth, and even if you did having the VDSM
taskID is a better anchor then flowID.

VDSM is built in a way that every call takes in to account the
current state only. Debugging it with an engine flow mindset is just
wrong and distracting. I see it doing more harm the good by
reinforcing bad debugging practices.

I don't know about harm, but, today the engine logs every call and return value 
to and from vdsm.  This means that all the info that is needed to follow a flow 
is already present in the engine log (which was not the case previously) so I 
believe that the flow id is redundant.
In addition, instead of focusing on how to track a flow between components, we 
should focus on how to improve the engine log so that the users don't need to 
go to the hosts in the first place.
My biggest problem with it is that it changes each and every verb in the API 
and makes the log itself also more verbose and less readable.

The good thing about the currently suggested implementation

http://gerrit.ovirt.org/#patch,sidebyside,1221,6,vdsm/BindingXMLRPC.py

is that it (ab)uses an http header for carrying FlowID,
Yes, it certainly does appear to overload it.  I would be nice to have 
something formal given to it by engine, but I can appreciate the 
difficulty implementing such a scheme.



  thus keeping the
formal API intact. FlowID is logged only on API entry point, so it would
not clutter the logs too much.
Keith, Dan, I will need your support in Gerrit for the patch.

+1 if I am asked to review.

Dan.


___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-12 Thread Ayal Baron


- Original Message -
> On 02/12/2012 09:50 AM, Dan Kenigsberg wrote:
> > On Thu, Feb 09, 2012 at 03:23:19PM -0500, Ayal Baron wrote:
> >>
> >>
> >> - Original Message -
> >>> -1
> >>>
> >>> I agree that for messaging environment having a Message ID is a
> >>> must
> >>> because you sometimes don't have a particular target so when you
> >>> get
> >>> a response you need to know what this node is actually responding
> >>> to.
> >>>
> >>> The message ID could be composed with  so you can
> >>> reuse the field.
> >>>
> >>> But that is all besides the point.
> >>>
> >>> I understand that someone might find it fun to go on following
> >>> the
> >>> entire flow in the Engine and in VDSM. But I would like to hear
> >>> an
> >>> actual use case where someone would have actually benefited from
> >>> this.
> >>> As I see it having VSDM return the task ID with every response
> >>> (and
> >>> not just for async tasks) is a lot more useful and correct.
> >>>
> >>> A generic debugging scenario as I see it.
> >>>
> >>> 1. Something went wrong
> >>> 2. You go looking in the ENGINE log trying to figure out what
> >>> happend.
> >>> 3. You see that ENGINE got SomeError.
> >>> 4. Check to see if this error makes sense imagining that VDSM is
> >>> always right and is a black box.
> >>> 5. You did your digging and now you think that VDSM is as fault.
> >>> 6. Go look for the call that failed. (If we returned the taskID
> >>> it's
> >>> pretty simple to find that call).
> >>> 7. Look around the call to check VDSM state.
> >>> 8. Profit.
> >>>
> >>> There is never a point where you want to follow a whole flow call
> >>> by
> >>> call going back and forth, and even if you did having the VDSM
> >>> taskID is a better anchor then flowID.
> >>>
> >>> VDSM is built in a way that every call takes in to account the
> >>> current state only. Debugging it with an engine flow mindset is
> >>> just
> >>> wrong and distracting. I see it doing more harm the good by
> >>> reinforcing bad debugging practices.
> >>
> >> I don't know about harm, but, today the engine logs every call and
> >> return value to and from vdsm.  This means that all the info that
> >> is needed to follow a flow is already present in the engine log
> >> (which was not the case previously) so I believe that the flow id
> >> is redundant.
> >> In addition, instead of focusing on how to track a flow between
> >> components, we should focus on how to improve the engine log so
> >> that the users don't need to go to the hosts in the first place.
> >
> >> My biggest problem with it is that it changes each and every verb
> >> in the API and makes the log itself also more verbose and less
> >> readable.
> >
> > The good thing about the currently suggested implementation
> >
> > http://gerrit.ovirt.org/#patch,sidebyside,1221,6,vdsm/BindingXMLRPC.py
> >
> > is that it (ab)uses an http header for carrying FlowID, thus
> > keeping the
> > formal API intact. FlowID is logged only on API entry point, so it
> > would
> > not clutter the logs too much.
> 
> if the FlowID isn't logged for logs which are part of the action, it
> will be easy to detect the entry point.
> but still wouldn't make it easy to grep all related logs to a flow
> 

If the flow id is present in all lines it would make the log a lot less 
readable.
Writing a script that accepts the flowid and 'grep's the 'flow' should be 
simple enough.

However, if this is to become (remotely) useful, the flowID should always 
contain an engine generated generation.  The thing is, really complex debug 
scenarios which span multiple hosts never stem from the same flow, so the only 
added benefit here would be if we could at least know what happened before what 
from logs across multiple hosts.  To achieve this we would need the engine to 
add a generation id to each unique flowID (the problem being that currently the 
flow ID would be determined by the UI and not by engine).
Engine would have to generate such an ID for all internal flows as well!

I still have not heard even one good reason for having this though.
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-12 Thread Itamar Heim

On 02/12/2012 09:50 AM, Dan Kenigsberg wrote:

On Thu, Feb 09, 2012 at 03:23:19PM -0500, Ayal Baron wrote:



- Original Message -

-1

I agree that for messaging environment having a Message ID is a must
because you sometimes don't have a particular target so when you get
a response you need to know what this node is actually responding
to.

The message ID could be composed with  so you can
reuse the field.

But that is all besides the point.

I understand that someone might find it fun to go on following the
entire flow in the Engine and in VDSM. But I would like to hear an
actual use case where someone would have actually benefited from
this.
As I see it having VSDM return the task ID with every response (and
not just for async tasks) is a lot more useful and correct.

A generic debugging scenario as I see it.

1. Something went wrong
2. You go looking in the ENGINE log trying to figure out what
happend.
3. You see that ENGINE got SomeError.
4. Check to see if this error makes sense imagining that VDSM is
always right and is a black box.
5. You did your digging and now you think that VDSM is as fault.
6. Go look for the call that failed. (If we returned the taskID it's
pretty simple to find that call).
7. Look around the call to check VDSM state.
8. Profit.

There is never a point where you want to follow a whole flow call by
call going back and forth, and even if you did having the VDSM
taskID is a better anchor then flowID.

VDSM is built in a way that every call takes in to account the
current state only. Debugging it with an engine flow mindset is just
wrong and distracting. I see it doing more harm the good by
reinforcing bad debugging practices.


I don't know about harm, but, today the engine logs every call and return value 
to and from vdsm.  This means that all the info that is needed to follow a flow 
is already present in the engine log (which was not the case previously) so I 
believe that the flow id is redundant.
In addition, instead of focusing on how to track a flow between components, we 
should focus on how to improve the engine log so that the users don't need to 
go to the hosts in the first place.



My biggest problem with it is that it changes each and every verb in the API 
and makes the log itself also more verbose and less readable.


The good thing about the currently suggested implementation

http://gerrit.ovirt.org/#patch,sidebyside,1221,6,vdsm/BindingXMLRPC.py

is that it (ab)uses an http header for carrying FlowID, thus keeping the
formal API intact. FlowID is logged only on API entry point, so it would
not clutter the logs too much.


if the FlowID isn't logged for logs which are part of the action, it 
will be easy to detect the entry point.

but still wouldn't make it easy to grep all related logs to a flow
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-11 Thread Dan Kenigsberg
On Thu, Feb 09, 2012 at 03:23:19PM -0500, Ayal Baron wrote:
> 
> 
> - Original Message -
> > -1
> > 
> > I agree that for messaging environment having a Message ID is a must
> > because you sometimes don't have a particular target so when you get
> > a response you need to know what this node is actually responding
> > to.
> > 
> > The message ID could be composed with  so you can
> > reuse the field.
> > 
> > But that is all besides the point.
> > 
> > I understand that someone might find it fun to go on following the
> > entire flow in the Engine and in VDSM. But I would like to hear an
> > actual use case where someone would have actually benefited from
> > this.
> > As I see it having VSDM return the task ID with every response (and
> > not just for async tasks) is a lot more useful and correct.
> > 
> > A generic debugging scenario as I see it.
> > 
> > 1. Something went wrong
> > 2. You go looking in the ENGINE log trying to figure out what
> > happend.
> > 3. You see that ENGINE got SomeError.
> > 4. Check to see if this error makes sense imagining that VDSM is
> > always right and is a black box.
> > 5. You did your digging and now you think that VDSM is as fault.
> > 6. Go look for the call that failed. (If we returned the taskID it's
> > pretty simple to find that call).
> > 7. Look around the call to check VDSM state.
> > 8. Profit.
> > 
> > There is never a point where you want to follow a whole flow call by
> > call going back and forth, and even if you did having the VDSM
> > taskID is a better anchor then flowID.
> > 
> > VDSM is built in a way that every call takes in to account the
> > current state only. Debugging it with an engine flow mindset is just
> > wrong and distracting. I see it doing more harm the good by
> > reinforcing bad debugging practices.
> 
> I don't know about harm, but, today the engine logs every call and return 
> value to and from vdsm.  This means that all the info that is needed to 
> follow a flow is already present in the engine log (which was not the case 
> previously) so I believe that the flow id is redundant.
> In addition, instead of focusing on how to track a flow between components, 
> we should focus on how to improve the engine log so that the users don't need 
> to go to the hosts in the first place.

> My biggest problem with it is that it changes each and every verb in the API 
> and makes the log itself also more verbose and less readable.

The good thing about the currently suggested implementation

http://gerrit.ovirt.org/#patch,sidebyside,1221,6,vdsm/BindingXMLRPC.py

is that it (ab)uses an http header for carrying FlowID, thus keeping the
formal API intact. FlowID is logged only on API entry point, so it would
not clutter the logs too much.

Keith, Dan, I will need your support in Gerrit for the patch.

Dan.
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-10 Thread Dan Yasny


- Original Message -
> From: "Ayal Baron" 
> To: "Dan Yasny" 
> Cc: "Simon Grinberg" , vdsm-devel@lists.fedorahosted.org
> Sent: Friday, 10 February, 2012 3:51:01 PM
> Subject: Re: [vdsm] flowID schema
> 
> 
> 
> - Original Message -
> > 
> > 
> > - Original Message -
> > > From: "Ayal Baron" 
> > > To: "Dan Yasny" 
> > > Cc: "Simon Grinberg" ,
> > > vdsm-devel@lists.fedorahosted.org
> > > Sent: Friday, 10 February, 2012 2:55:46 PM
> > > Subject: Re: [vdsm] flowID schema
> > > 
> > > 
> > > 
> > > - Original Message -
> > > > 
> > > > 
> > > > - Original Message -
> > > > > From: "Ayal Baron" 
> > > > > To: "Dan Yasny" 
> > > > > Cc: "Simon Grinberg" ,
> > > > > vdsm-devel@lists.fedorahosted.org
> > > > > Sent: Friday, 10 February, 2012 12:50:04 AM
> > > > > Subject: Re: [vdsm] flowID schema
> > > > > 
> > > > > 
> > > > > 
> > > > > - Original Message -
> > > > > > > From: "Saggi Mizrahi" 
> > > > > > > To: "Keith Robertson" 
> > > > > > > Cc: "VDSM Project Development"
> > > > > > > 
> > > > > > > Sent: Thursday, February 9, 2012 2:24:44 PM
> > > > > > > Subject: Re: [vdsm] flowID schema
> > > > > > >
> > > > > > > -1
> > > > > > >
> > > > > > > I agree that for messaging environment having a Message
> > > > > > > ID
> > > > > > > is
> > > > > > > a
> > > > > > > must
> > > > > > > because you sometimes don't have a particular target so
> > > > > > > when
> > > > > > > you
> > > > > > > get
> > > > > > > a response you need to know what this node is actually
> > > > > > > responding
> > > > > > > to.
> > > > > > >
> > > > > > > The message ID could be composed with  so
> > > > > > > you
> > > > > > > can
> > > > > > > reuse the field.
> > > > > > >
> > > > > > > But that is all besides the point.
> > > > > > >
> > > > > > > I understand that someone might find it fun to go on
> > > > > > > following
> > > > > > > the
> > > > > > > entire flow in the Engine and in VDSM. But I would like
> > > > > > > to
> > > > > > > hear
> > > > > > > an
> > > > > > > actual use case where someone would have actually
> > > > > > > benefited
> > > > > > > from
> > > > > > > this.
> > > > > > > As I see it having VSDM return the task ID with every
> > > > > > > response
> > > > > > > (and
> > > > > > > not just for async tasks) is a lot more useful and
> > > > > > > correct.
> > > > > > 
> > > > > > Actually, the only way to understand what happened in a
> > > > > > certain
> > > > > > flow
> > > > > > is to follow it through. From the engine log where an
> > > > > > action
> > > > > > was
> > > > > > initiated, down to the hosts that did the execution.
> > > > > > Everything
> > > > > > RHEV
> > > > > > does is a flow, and with no correlation between hosts
> > > > > > executing
> > > > > > parts of the same flow, troubleshooting turns into
> > > > > > guesswork,
> > > > > > because the only contact point left is time, which is
> > > > > > useless
> > > > > > when
> > > > > > you're talking about vdsm - there are sometimes hundreds of
> > > > > > log
> > > > > > records in a single second, and not every host is in
> > > > > > absolute
> > > > > > sync
> > > > > > with every other.
> > >

Re: [vdsm] flowID schema

2012-02-10 Thread Ayal Baron


- Original Message -
> 
> 
> - Original Message -
> > From: "Ayal Baron" 
> > To: "Dan Yasny" 
> > Cc: "Simon Grinberg" ,
> > vdsm-devel@lists.fedorahosted.org
> > Sent: Friday, 10 February, 2012 2:55:46 PM
> > Subject: Re: [vdsm] flowID schema
> > 
> > 
> > 
> > - Original Message -
> > > 
> > > 
> > > - Original Message -
> > > > From: "Ayal Baron" 
> > > > To: "Dan Yasny" 
> > > > Cc: "Simon Grinberg" ,
> > > > vdsm-devel@lists.fedorahosted.org
> > > > Sent: Friday, 10 February, 2012 12:50:04 AM
> > > > Subject: Re: [vdsm] flowID schema
> > > > 
> > > > 
> > > > 
> > > > - Original Message -
> > > > > > From: "Saggi Mizrahi" 
> > > > > > To: "Keith Robertson" 
> > > > > > Cc: "VDSM Project Development"
> > > > > > 
> > > > > > Sent: Thursday, February 9, 2012 2:24:44 PM
> > > > > > Subject: Re: [vdsm] flowID schema
> > > > > >
> > > > > > -1
> > > > > >
> > > > > > I agree that for messaging environment having a Message ID
> > > > > > is
> > > > > > a
> > > > > > must
> > > > > > because you sometimes don't have a particular target so
> > > > > > when
> > > > > > you
> > > > > > get
> > > > > > a response you need to know what this node is actually
> > > > > > responding
> > > > > > to.
> > > > > >
> > > > > > The message ID could be composed with  so
> > > > > > you
> > > > > > can
> > > > > > reuse the field.
> > > > > >
> > > > > > But that is all besides the point.
> > > > > >
> > > > > > I understand that someone might find it fun to go on
> > > > > > following
> > > > > > the
> > > > > > entire flow in the Engine and in VDSM. But I would like to
> > > > > > hear
> > > > > > an
> > > > > > actual use case where someone would have actually benefited
> > > > > > from
> > > > > > this.
> > > > > > As I see it having VSDM return the task ID with every
> > > > > > response
> > > > > > (and
> > > > > > not just for async tasks) is a lot more useful and correct.
> > > > > 
> > > > > Actually, the only way to understand what happened in a
> > > > > certain
> > > > > flow
> > > > > is to follow it through. From the engine log where an action
> > > > > was
> > > > > initiated, down to the hosts that did the execution.
> > > > > Everything
> > > > > RHEV
> > > > > does is a flow, and with no correlation between hosts
> > > > > executing
> > > > > parts of the same flow, troubleshooting turns into guesswork,
> > > > > because the only contact point left is time, which is useless
> > > > > when
> > > > > you're talking about vdsm - there are sometimes hundreds of
> > > > > log
> > > > > records in a single second, and not every host is in absolute
> > > > > sync
> > > > > with every other.
> > > > 
> > > > What are you talking about? you know exactly what operation the
> > > > engine ran at vdsm level.
> > > 
> > > Not always true. Haven't had much chance to deeply dive into 3.0
> > > logs, and if things changed there, it's already a huge step in
> > > the
> > > right direction.
> > 
> > engine logs now always log calls to vdsm.
> 
> finally! :)
> 
> > 
> > > 
> > > > If it's a task then you also have a task
> > > > id which is a uuid so you don't need anything else.
> > > 
> > > Right, but 1. not everything is a task and
> > 
> > Perhaps that should be fixed...
> > 
> > > 2. tasks spawn other tasks, and need to be followed through for
> > > that
> > 
> > In vdsm? no they don't
> 
> Err, ok, in the engine then. The poin

Re: [vdsm] flowID schema

2012-02-10 Thread Dan Yasny


- Original Message -
> From: "Ayal Baron" 
> To: "Dan Yasny" 
> Cc: "Simon Grinberg" , vdsm-devel@lists.fedorahosted.org
> Sent: Friday, 10 February, 2012 2:55:46 PM
> Subject: Re: [vdsm] flowID schema
> 
> 
> 
> - Original Message -
> > 
> > 
> > - Original Message -
> > > From: "Ayal Baron" 
> > > To: "Dan Yasny" 
> > > Cc: "Simon Grinberg" ,
> > > vdsm-devel@lists.fedorahosted.org
> > > Sent: Friday, 10 February, 2012 12:50:04 AM
> > > Subject: Re: [vdsm] flowID schema
> > > 
> > > 
> > > 
> > > - Original Message -----
> > > > > From: "Saggi Mizrahi" 
> > > > > To: "Keith Robertson" 
> > > > > Cc: "VDSM Project Development"
> > > > > 
> > > > > Sent: Thursday, February 9, 2012 2:24:44 PM
> > > > > Subject: Re: [vdsm] flowID schema
> > > > >
> > > > > -1
> > > > >
> > > > > I agree that for messaging environment having a Message ID is
> > > > > a
> > > > > must
> > > > > because you sometimes don't have a particular target so when
> > > > > you
> > > > > get
> > > > > a response you need to know what this node is actually
> > > > > responding
> > > > > to.
> > > > >
> > > > > The message ID could be composed with  so you
> > > > > can
> > > > > reuse the field.
> > > > >
> > > > > But that is all besides the point.
> > > > >
> > > > > I understand that someone might find it fun to go on
> > > > > following
> > > > > the
> > > > > entire flow in the Engine and in VDSM. But I would like to
> > > > > hear
> > > > > an
> > > > > actual use case where someone would have actually benefited
> > > > > from
> > > > > this.
> > > > > As I see it having VSDM return the task ID with every
> > > > > response
> > > > > (and
> > > > > not just for async tasks) is a lot more useful and correct.
> > > > 
> > > > Actually, the only way to understand what happened in a certain
> > > > flow
> > > > is to follow it through. From the engine log where an action
> > > > was
> > > > initiated, down to the hosts that did the execution. Everything
> > > > RHEV
> > > > does is a flow, and with no correlation between hosts executing
> > > > parts of the same flow, troubleshooting turns into guesswork,
> > > > because the only contact point left is time, which is useless
> > > > when
> > > > you're talking about vdsm - there are sometimes hundreds of log
> > > > records in a single second, and not every host is in absolute
> > > > sync
> > > > with every other.
> > > 
> > > What are you talking about? you know exactly what operation the
> > > engine ran at vdsm level.
> > 
> > Not always true. Haven't had much chance to deeply dive into 3.0
> > logs, and if things changed there, it's already a huge step in the
> > right direction.
> 
> engine logs now always log calls to vdsm.

finally! :)

> 
> > 
> > > If it's a task then you also have a task
> > > id which is a uuid so you don't need anything else.
> > 
> > Right, but 1. not everything is a task and
> 
> Perhaps that should be fixed...
> 
> > 2. tasks spawn other tasks, and need to be followed through for
> > that
> 
> In vdsm? no they don't

Err, ok, in the engine then. The point being, a flow like MoveMultipleImages 
(pardon my bad memory for spelling) spawns several tasks, and even within those 
tasks there's usually more than a single flow.

> 
> > 3. long running tasks are hell to debug, because they span several
> > log files and thousands lines of logs
> 
> flowid would not solve this in any way.
> All you need to do to see the entire task is grep the taskID as the
> thread name is the taskID

which causes me to
1. jump between engine and vdsm logs, comparing task IDs and actions. 
2. do it many times in case of a complex flow

Seeing the entire thing on RHEV/oVirt as a system would make a better picture, 
and being able to grep the entire set of logs for a single ID is a huge plus, 
since it

Re: [vdsm] flowID schema

2012-02-10 Thread Ayal Baron


- Original Message -
> 
> 
> - Original Message -
> > From: "Ayal Baron" 
> > To: "Dan Yasny" 
> > Cc: "Simon Grinberg" ,
> > vdsm-devel@lists.fedorahosted.org
> > Sent: Friday, 10 February, 2012 12:50:04 AM
> > Subject: Re: [vdsm] flowID schema
> > 
> > 
> > 
> > - Original Message -
> > > > From: "Saggi Mizrahi" 
> > > > To: "Keith Robertson" 
> > > > Cc: "VDSM Project Development"
> > > > 
> > > > Sent: Thursday, February 9, 2012 2:24:44 PM
> > > > Subject: Re: [vdsm] flowID schema
> > > >
> > > > -1
> > > >
> > > > I agree that for messaging environment having a Message ID is a
> > > > must
> > > > because you sometimes don't have a particular target so when
> > > > you
> > > > get
> > > > a response you need to know what this node is actually
> > > > responding
> > > > to.
> > > >
> > > > The message ID could be composed with  so you
> > > > can
> > > > reuse the field.
> > > >
> > > > But that is all besides the point.
> > > >
> > > > I understand that someone might find it fun to go on following
> > > > the
> > > > entire flow in the Engine and in VDSM. But I would like to hear
> > > > an
> > > > actual use case where someone would have actually benefited
> > > > from
> > > > this.
> > > > As I see it having VSDM return the task ID with every response
> > > > (and
> > > > not just for async tasks) is a lot more useful and correct.
> > > 
> > > Actually, the only way to understand what happened in a certain
> > > flow
> > > is to follow it through. From the engine log where an action was
> > > initiated, down to the hosts that did the execution. Everything
> > > RHEV
> > > does is a flow, and with no correlation between hosts executing
> > > parts of the same flow, troubleshooting turns into guesswork,
> > > because the only contact point left is time, which is useless
> > > when
> > > you're talking about vdsm - there are sometimes hundreds of log
> > > records in a single second, and not every host is in absolute
> > > sync
> > > with every other.
> > 
> > What are you talking about? you know exactly what operation the
> > engine ran at vdsm level.
> 
> Not always true. Haven't had much chance to deeply dive into 3.0
> logs, and if things changed there, it's already a huge step in the
> right direction.

engine logs now always log calls to vdsm.

> 
> > If it's a task then you also have a task
> > id which is a uuid so you don't need anything else.
> 
> Right, but 1. not everything is a task and

Perhaps that should be fixed...

> 2. tasks spawn other tasks, and need to be followed through for that

In vdsm? no they don't

> 3. long running tasks are hell to debug, because they span several
> log files and thousands lines of logs

flowid would not solve this in any way.
All you need to do to see the entire task is grep the taskID as the thread name 
is the taskID

> 
> > In addition, now that engine logs results, you can just grep that
> > instead of a flow id and land at the exact correct command and not
> > have to figure out which out of the 5 run in this flow is the
> > relevant one.
> 
> Haven't seen that yet, but again, what are results? When the failure
> is somewhere in the middle of the flow, the resulting failure can be
> totally irrelevant.

When you have the failure message then all you have to do is grep it in the log 
and reach the exact call that failed.  Just search for "ERROR|FAILED" above 
that and you reach the place of error.

> 
> > 
> > If you could give a real example where this would be beneficial
> > (i.e.
> > log excerpts, how you correlated them and how flow id would have
> > eased your job) that would be great.
> 
> Don't have these handy, guess Vladik, who was collecting interesting
> fail flows could have helped.
> 
> 
> > Note that I've also discussed this with Yaniv from qe who said they
> > don't really need it.
> 
> I'm not saying I want to see a flow ID as such, what I _am_ saying is
> that flows are important, and we need an easy way of following them
> through.
> When a user starts a process in the engine, it should be clearly
> logged and

Re: [vdsm] flowID schema

2012-02-10 Thread Dan Yasny


- Original Message -
> From: "Ayal Baron" 
> To: "Dan Yasny" 
> Cc: "Simon Grinberg" , vdsm-devel@lists.fedorahosted.org
> Sent: Friday, 10 February, 2012 12:50:04 AM
> Subject: Re: [vdsm] flowID schema
> 
> 
> 
> - Original Message -
> > > From: "Saggi Mizrahi" 
> > > To: "Keith Robertson" 
> > > Cc: "VDSM Project Development"
> > > 
> > > Sent: Thursday, February 9, 2012 2:24:44 PM
> > > Subject: Re: [vdsm] flowID schema
> > >
> > > -1
> > >
> > > I agree that for messaging environment having a Message ID is a
> > > must
> > > because you sometimes don't have a particular target so when you
> > > get
> > > a response you need to know what this node is actually responding
> > > to.
> > >
> > > The message ID could be composed with  so you can
> > > reuse the field.
> > >
> > > But that is all besides the point.
> > >
> > > I understand that someone might find it fun to go on following
> > > the
> > > entire flow in the Engine and in VDSM. But I would like to hear
> > > an
> > > actual use case where someone would have actually benefited from
> > > this.
> > > As I see it having VSDM return the task ID with every response
> > > (and
> > > not just for async tasks) is a lot more useful and correct.
> > 
> > Actually, the only way to understand what happened in a certain
> > flow
> > is to follow it through. From the engine log where an action was
> > initiated, down to the hosts that did the execution. Everything
> > RHEV
> > does is a flow, and with no correlation between hosts executing
> > parts of the same flow, troubleshooting turns into guesswork,
> > because the only contact point left is time, which is useless when
> > you're talking about vdsm - there are sometimes hundreds of log
> > records in a single second, and not every host is in absolute sync
> > with every other.
> 
> What are you talking about? you know exactly what operation the
> engine ran at vdsm level. 

Not always true. Haven't had much chance to deeply dive into 3.0 logs, and if 
things changed there, it's already a huge step in the right direction.

> If it's a task then you also have a task
> id which is a uuid so you don't need anything else.

Right, but 1. not everything is a task and
2. tasks spawn other tasks, and need to be followed through for that
3. long running tasks are hell to debug, because they span several log files 
and thousands lines of logs

> In addition, now that engine logs results, you can just grep that
> instead of a flow id and land at the exact correct command and not
> have to figure out which out of the 5 run in this flow is the
> relevant one.

Haven't seen that yet, but again, what are results? When the failure is 
somewhere in the middle of the flow, the resulting failure can be totally 
irrelevant.

> 
> If you could give a real example where this would be beneficial (i.e.
> log excerpts, how you correlated them and how flow id would have
> eased your job) that would be great.

Don't have these handy, guess Vladik, who was collecting interesting fail flows 
could have helped. 


> Note that I've also discussed this with Yaniv from qe who said they
> don't really need it.

I'm not saying I want to see a flow ID as such, what I _am_ saying is that 
flows are important, and we need an easy way of following them through.
When a user starts a process in the engine, it should be clearly logged and 
marked, then it should consistently report progress and interim outputs, say 
which host was picked for what action, and how that action can be identified in 
the vdsm logs.

We cannot rely on timing. We cannot rely on everyone knowing obscure 
engine/vdsm action naming conventions, that are not exactly the same as they 
look in the GUI. A person with no understanding of engine and vdsm internals, 
should be able to easily follow an action through to conclusion, and understand 
what was done at each step, what the system got in return and how it reacted. 
And all of these actions should be clearly inter-related, so a single grep can 
select a flow.

If you prefer to go in the other direction, like turning everything into a 
task, that can also work, but each task, when spawning another, should very 
clearly show the relation as well, and sets of tasks should be marked as sets.

> 
> > 
> > >
> > > A generic debugging scenario as I see it.
> > >
> > > 1. Something went wrong
> > > 2. You go looking in the ENGINE log try

Re: [vdsm] flowID schema

2012-02-09 Thread Ayal Baron


- Original Message -
> > From: "Saggi Mizrahi" 
> > To: "Keith Robertson" 
> > Cc: "VDSM Project Development" 
> > Sent: Thursday, February 9, 2012 2:24:44 PM
> > Subject: Re: [vdsm] flowID schema
> >
> > -1
> >
> > I agree that for messaging environment having a Message ID is a
> > must
> > because you sometimes don't have a particular target so when you
> > get
> > a response you need to know what this node is actually responding
> > to.
> >
> > The message ID could be composed with  so you can
> > reuse the field.
> >
> > But that is all besides the point.
> >
> > I understand that someone might find it fun to go on following the
> > entire flow in the Engine and in VDSM. But I would like to hear an
> > actual use case where someone would have actually benefited from
> > this.
> > As I see it having VSDM return the task ID with every response (and
> > not just for async tasks) is a lot more useful and correct.
> 
> Actually, the only way to understand what happened in a certain flow
> is to follow it through. From the engine log where an action was
> initiated, down to the hosts that did the execution. Everything RHEV
> does is a flow, and with no correlation between hosts executing
> parts of the same flow, troubleshooting turns into guesswork,
> because the only contact point left is time, which is useless when
> you're talking about vdsm - there are sometimes hundreds of log
> records in a single second, and not every host is in absolute sync
> with every other.

What are you talking about? you know exactly what operation the engine ran at 
vdsm level. If it's a task then you also have a task id which is a uuid so you 
don't need anything else.
In addition, now that engine logs results, you can just grep that instead of a 
flow id and land at the exact correct command and not have to figure out which 
out of the 5 run in this flow is the relevant one.

If you could give a real example where this would be beneficial (i.e. log 
excerpts, how you correlated them and how flow id would have eased your job) 
that would be great.
Note that I've also discussed this with Yaniv from qe who said they don't 
really need it.

> 
> >
> > A generic debugging scenario as I see it.
> >
> > 1. Something went wrong
> > 2. You go looking in the ENGINE log trying to figure out what
> > happend.
> > 3. You see that ENGINE got SomeError.
> 
> ok, the rest are all downhill.
> 
> 4. You follow the failure back to the start of the flow, then go with
> the flow to the point where the engine exited to vdsm
> 5. switch over to vdsm logs, make sure you have the timing right
> (with no flow ID that's the olny orientation after all)
> 6. find the start of the vdsm-side flow, follow it to the failure,
> pray the error makes sense.
> 
> In many cases the answer is not in the vdsm failure traceback but
> somewhere in the middle of the flow, with no errors reported, this
> is why we need a way to easily follow things through. Moreover, the
> logs should be readable enough to make sense to a typical sysadmin,
> and not a RHEV expert.
> 
> > 4. Check to see if this error makes sense imagining that VDSM is
> > always right and is a black box.
> > 5. You did your digging and now you think that VDSM is as fault.
> > 6. Go look for the call that failed. (If we returned the taskID
> > it's
> > pretty simple to find that call).
> > 7. Look around the call to check VDSM state.
> > 8. Profit.
> >
> > There is never a point where you want to follow a whole flow call
> > by
> > call going back and forth, and even if you did having the VDSM
> > taskID is a better anchor then flowID.
> 
> not everything is a task, flow IDs would unify entire flows, and make
> following them easy.
> 
> >
> > VDSM is built in a way that every call takes in to account the
> > current state only. Debugging it with an engine flow mindset is
> > just
> > wrong and distracting. I see it doing more harm the good by
> > reinforcing bad debugging practices.
> 
> Maybe you're right, though I can't see how from my experience so far,
> but following the flows is the only thing that got cases resolved.
> Not event IDs making every possible error, and not task IDs (though
> these do have their uses) - slow and meticulous mapping of flows to
> log records.
> 
> >
> > - Original Message -
> > > From: "Keith Robertson" 
> > > To: "VDSM Project Development"
> > > 
> > > Sent: Thursday, February 9, 201

Re: [vdsm] flowID schema

2012-02-09 Thread Dan Yasny
> From: "Saggi Mizrahi" 
> To: "Keith Robertson" 
> Cc: "VDSM Project Development" 
> Sent: Thursday, February 9, 2012 2:24:44 PM
> Subject: Re: [vdsm] flowID schema
>
> -1
>
> I agree that for messaging environment having a Message ID is a must
> because you sometimes don't have a particular target so when you get
> a response you need to know what this node is actually responding
> to.
>
> The message ID could be composed with  so you can
> reuse the field.
>
> But that is all besides the point.
>
> I understand that someone might find it fun to go on following the
> entire flow in the Engine and in VDSM. But I would like to hear an
> actual use case where someone would have actually benefited from
> this.
> As I see it having VSDM return the task ID with every response (and
> not just for async tasks) is a lot more useful and correct.

Actually, the only way to understand what happened in a certain flow is to 
follow it through. From the engine log where an action was initiated, down to 
the hosts that did the execution. Everything RHEV does is a flow, and with no 
correlation between hosts executing parts of the same flow, troubleshooting 
turns into guesswork, because the only contact point left is time, which is 
useless when you're talking about vdsm - there are sometimes hundreds of log 
records in a single second, and not every host is in absolute sync with every 
other.

>
> A generic debugging scenario as I see it.
>
> 1. Something went wrong
> 2. You go looking in the ENGINE log trying to figure out what
> happend.
> 3. You see that ENGINE got SomeError.

ok, the rest are all downhill. 

4. You follow the failure back to the start of the flow, then go with the flow 
to the point where the engine exited to vdsm 
5. switch over to vdsm logs, make sure you have the timing right (with no flow 
ID that's the olny orientation after all)
6. find the start of the vdsm-side flow, follow it to the failure, pray the 
error makes sense.

In many cases the answer is not in the vdsm failure traceback but somewhere in 
the middle of the flow, with no errors reported, this is why we need a way to 
easily follow things through. Moreover, the logs should be readable enough to 
make sense to a typical sysadmin, and not a RHEV expert.

> 4. Check to see if this error makes sense imagining that VDSM is
> always right and is a black box.
> 5. You did your digging and now you think that VDSM is as fault.
> 6. Go look for the call that failed. (If we returned the taskID it's
> pretty simple to find that call).
> 7. Look around the call to check VDSM state.
> 8. Profit.
>
> There is never a point where you want to follow a whole flow call by
> call going back and forth, and even if you did having the VDSM
> taskID is a better anchor then flowID.

not everything is a task, flow IDs would unify entire flows, and make following 
them easy.

>
> VDSM is built in a way that every call takes in to account the
> current state only. Debugging it with an engine flow mindset is just
> wrong and distracting. I see it doing more harm the good by
> reinforcing bad debugging practices.

Maybe you're right, though I can't see how from my experience so far, but 
following the flows is the only thing that got cases resolved. Not event IDs 
making every possible error, and not task IDs (though these do have their uses) 
- slow and meticulous mapping of flows to log records.

>
> - Original Message -
> > From: "Keith Robertson" 
> > To: "VDSM Project Development" 
> > Sent: Thursday, February 9, 2012 1:34:43 PM
> > Subject: Re: [vdsm] flowID schema
> >
> > On 02/09/2012 12:18 PM, Andrew Cathrow wrote:
> > >
> > > - Original Message -
> > >> From: "Ayal Baron"
> > >> To: "Dan Kenigsberg"
> > >> Cc: "VDSM Project
> > >> Development"
> > >> Sent: Monday, February 6, 2012 10:35:54 AM
> > >> Subject: Re: [vdsm] flowID schema
> > >>
> > >>
> > >>
> > >> - Original Message -
> > >>> On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:
> > >>>> flowID makes no sense after the initial API call as stuff like
> > >>>> cacheing\threadpools\samplingtasks\resources\asyncTasks so
> > >>>> flowing
> > >>>> a flow like that will not give you the entire picture while
> > >>>> debugging.
> > >>>>
> > >>>> Also adding it now will make everything even more ugly.
> > >>>> You know what, just imagine I wrote one of my long rambles
> > >&

Re: [vdsm] flowID schema

2012-02-09 Thread Ayal Baron


- Original Message -
> 
> 
> - Original Message -
> > From: "Saggi Mizrahi" 
> > To: "Keith Robertson" 
> > Cc: "VDSM Project Development" 
> > Sent: Thursday, February 9, 2012 2:24:44 PM
> > Subject: Re: [vdsm] flowID schema
> > 
> > -1
> > 
> > I agree that for messaging environment having a Message ID is a
> > must
> > because you sometimes don't have a particular target so when you
> > get
> > a response you need to know what this node is actually responding
> > to.
> > 
> > The message ID could be composed with  so you can
> > reuse the field.
> > 
> > But that is all besides the point.
> > 
> > I understand that someone might find it fun to go on following the
> > entire flow in the Engine and in VDSM. But I would like to hear an
> > actual use case where someone would have actually benefited from
> > this.
> > As I see it having VSDM return the task ID with every response (and
> > not just for async tasks) is a lot more useful and correct.
> 
> I'd like to hear from some folks who've had to support rhev and do
> exactly this.
> Dan, Simon, care to chip in?

Agreed.  However, I can personally say that I've debugged dozens of issues 
cross components and having an anchor in the engine log (result from vdsm) was 
all I needed to find the relevant part in the vdsm log.

> 
> 
> 
> > 
> > A generic debugging scenario as I see it.
> > 
> > 1. Something went wrong
> > 2. You go looking in the ENGINE log trying to figure out what
> > happend.
> > 3. You see that ENGINE got SomeError.
> > 4. Check to see if this error makes sense imagining that VDSM is
> > always right and is a black box.
> > 5. You did your digging and now you think that VDSM is as fault.
> > 6. Go look for the call that failed. (If we returned the taskID
> > it's
> > pretty simple to find that call).
> > 7. Look around the call to check VDSM state.
> > 8. Profit.
> > 
> > There is never a point where you want to follow a whole flow call
> > by
> > call going back and forth, and even if you did having the VDSM
> > taskID is a better anchor then flowID.
> > 
> > VDSM is built in a way that every call takes in to account the
> > current state only. Debugging it with an engine flow mindset is
> > just
> > wrong and distracting. I see it doing more harm the good by
> > reinforcing bad debugging practices.
> > 
> > - Original Message -
> > > From: "Keith Robertson" 
> > > To: "VDSM Project Development"
> > > 
> > > Sent: Thursday, February 9, 2012 1:34:43 PM
> > > Subject: Re: [vdsm] flowID schema
> > > 
> > > On 02/09/2012 12:18 PM, Andrew Cathrow wrote:
> > > >
> > > > - Original Message -
> > > >> From: "Ayal Baron"
> > > >> To: "Dan Kenigsberg"
> > > >> Cc: "VDSM Project
> > > >> Development"
> > > >> Sent: Monday, February 6, 2012 10:35:54 AM
> > > >> Subject: Re: [vdsm] flowID schema
> > > >>
> > > >>
> > > >>
> > > >> - Original Message -
> > > >>> On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi
> > > >>> wrote:
> > > >>>> flowID makes no sense after the initial API call as stuff
> > > >>>> like
> > > >>>> cacheing\threadpools\samplingtasks\resources\asyncTasks so
> > > >>>> flowing
> > > >>>> a flow like that will not give you the entire picture while
> > > >>>> debugging.
> > > >>>>
> > > >>>> Also adding it now will make everything even more ugly.
> > > >>>> You know what, just imagine I wrote one of my long rambles
> > > >>>> about
> > > >>>> why I don't agree with doing this.
> > > >>> I cannot imagine you write anything like that. Really. I do
> > > >>> not
> > > >>> understand why you object logging flowID on API entry point.
> > > >> The question is, what problem is this really trying to solve
> > > >> and
> > > >> is
> > > >> there a simpler and less obtrusive solution to that problem?
> > > > correlating logs between ovirt

Re: [vdsm] flowID schema

2012-02-09 Thread Andrew Cathrow


- Original Message -
> From: "Saggi Mizrahi" 
> To: "Keith Robertson" 
> Cc: "VDSM Project Development" 
> Sent: Thursday, February 9, 2012 2:24:44 PM
> Subject: Re: [vdsm] flowID schema
> 
> -1
> 
> I agree that for messaging environment having a Message ID is a must
> because you sometimes don't have a particular target so when you get
> a response you need to know what this node is actually responding
> to.
> 
> The message ID could be composed with  so you can
> reuse the field.
> 
> But that is all besides the point.
> 
> I understand that someone might find it fun to go on following the
> entire flow in the Engine and in VDSM. But I would like to hear an
> actual use case where someone would have actually benefited from
> this.
> As I see it having VSDM return the task ID with every response (and
> not just for async tasks) is a lot more useful and correct.

I'd like to hear from some folks who've had to support rhev and do exactly this.
Dan, Simon, care to chip in?



> 
> A generic debugging scenario as I see it.
> 
> 1. Something went wrong
> 2. You go looking in the ENGINE log trying to figure out what
> happend.
> 3. You see that ENGINE got SomeError.
> 4. Check to see if this error makes sense imagining that VDSM is
> always right and is a black box.
> 5. You did your digging and now you think that VDSM is as fault.
> 6. Go look for the call that failed. (If we returned the taskID it's
> pretty simple to find that call).
> 7. Look around the call to check VDSM state.
> 8. Profit.
> 
> There is never a point where you want to follow a whole flow call by
> call going back and forth, and even if you did having the VDSM
> taskID is a better anchor then flowID.
> 
> VDSM is built in a way that every call takes in to account the
> current state only. Debugging it with an engine flow mindset is just
> wrong and distracting. I see it doing more harm the good by
> reinforcing bad debugging practices.
> 
> - Original Message -
> > From: "Keith Robertson" 
> > To: "VDSM Project Development" 
> > Sent: Thursday, February 9, 2012 1:34:43 PM
> > Subject: Re: [vdsm] flowID schema
> > 
> > On 02/09/2012 12:18 PM, Andrew Cathrow wrote:
> > >
> > > - Original Message -
> > >> From: "Ayal Baron"
> > >> To: "Dan Kenigsberg"
> > >> Cc: "VDSM Project
> > >> Development"
> > >> Sent: Monday, February 6, 2012 10:35:54 AM
> > >> Subject: Re: [vdsm] flowID schema
> > >>
> > >>
> > >>
> > >> - Original Message -
> > >>> On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:
> > >>>> flowID makes no sense after the initial API call as stuff like
> > >>>> cacheing\threadpools\samplingtasks\resources\asyncTasks so
> > >>>> flowing
> > >>>> a flow like that will not give you the entire picture while
> > >>>> debugging.
> > >>>>
> > >>>> Also adding it now will make everything even more ugly.
> > >>>> You know what, just imagine I wrote one of my long rambles
> > >>>> about
> > >>>> why I don't agree with doing this.
> > >>> I cannot imagine you write anything like that. Really. I do not
> > >>> understand why you object logging flowID on API entry point.
> > >> The question is, what problem is this really trying to solve and
> > >> is
> > >> there a simpler and less obtrusive solution to that problem?
> > > correlating logs between ovirt engine and potentially multiple
> > > vdsm
> > > nodes is a nightmare. It requires a lot skill to follow a
> > > transaction through from the front end all the way to the node,
> > > and even multiple nodes (eg actions on spm, then actions on other
> > > node to run a vm).
> > > Having a way to correlate the logs and follow a single event/flow
> > > is vital.
> > >
> > +1
> > 
> > Knowing what command caused a sequence of events in VDSM would be
> > really
> > helpful particularly in a threaded environment.  Further, wouldn't
> > such
> > an ID be helpful in an asynchronous request/response model?  I'm
> > not
> > sure what the plans are for AMQP or even if there are plans, but
> > I'd
> > think that something like this would be crucial for an async
> > response.
> > So, if you implemented it you might 

Re: [vdsm] flowID schema

2012-02-09 Thread Ayal Baron


- Original Message -
> -1
> 
> I agree that for messaging environment having a Message ID is a must
> because you sometimes don't have a particular target so when you get
> a response you need to know what this node is actually responding
> to.
> 
> The message ID could be composed with  so you can
> reuse the field.
> 
> But that is all besides the point.
> 
> I understand that someone might find it fun to go on following the
> entire flow in the Engine and in VDSM. But I would like to hear an
> actual use case where someone would have actually benefited from
> this.
> As I see it having VSDM return the task ID with every response (and
> not just for async tasks) is a lot more useful and correct.
> 
> A generic debugging scenario as I see it.
> 
> 1. Something went wrong
> 2. You go looking in the ENGINE log trying to figure out what
> happend.
> 3. You see that ENGINE got SomeError.
> 4. Check to see if this error makes sense imagining that VDSM is
> always right and is a black box.
> 5. You did your digging and now you think that VDSM is as fault.
> 6. Go look for the call that failed. (If we returned the taskID it's
> pretty simple to find that call).
> 7. Look around the call to check VDSM state.
> 8. Profit.
> 
> There is never a point where you want to follow a whole flow call by
> call going back and forth, and even if you did having the VDSM
> taskID is a better anchor then flowID.
> 
> VDSM is built in a way that every call takes in to account the
> current state only. Debugging it with an engine flow mindset is just
> wrong and distracting. I see it doing more harm the good by
> reinforcing bad debugging practices.

I don't know about harm, but, today the engine logs every call and return value 
to and from vdsm.  This means that all the info that is needed to follow a flow 
is already present in the engine log (which was not the case previously) so I 
believe that the flow id is redundant.
In addition, instead of focusing on how to track a flow between components, we 
should focus on how to improve the engine log so that the users don't need to 
go to the hosts in the first place.
My biggest problem with it is that it changes each and every verb in the API 
and makes the log itself also more verbose and less readable.

> 
> - Original Message -
> > From: "Keith Robertson" 
> > To: "VDSM Project Development" 
> > Sent: Thursday, February 9, 2012 1:34:43 PM
> > Subject: Re: [vdsm] flowID schema
> > 
> > On 02/09/2012 12:18 PM, Andrew Cathrow wrote:
> > >
> > > - Original Message -
> > >> From: "Ayal Baron"
> > >> To: "Dan Kenigsberg"
> > >> Cc: "VDSM Project
> > >> Development"
> > >> Sent: Monday, February 6, 2012 10:35:54 AM
> > >> Subject: Re: [vdsm] flowID schema
> > >>
> > >>
> > >>
> > >> - Original Message -
> > >>> On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:
> > >>>> flowID makes no sense after the initial API call as stuff like
> > >>>> cacheing\threadpools\samplingtasks\resources\asyncTasks so
> > >>>> flowing
> > >>>> a flow like that will not give you the entire picture while
> > >>>> debugging.
> > >>>>
> > >>>> Also adding it now will make everything even more ugly.
> > >>>> You know what, just imagine I wrote one of my long rambles
> > >>>> about
> > >>>> why I don't agree with doing this.
> > >>> I cannot imagine you write anything like that. Really. I do not
> > >>> understand why you object logging flowID on API entry point.
> > >> The question is, what problem is this really trying to solve and
> > >> is
> > >> there a simpler and less obtrusive solution to that problem?
> > > correlating logs between ovirt engine and potentially multiple
> > > vdsm
> > > nodes is a nightmare. It requires a lot skill to follow a
> > > transaction through from the front end all the way to the node,
> > > and even multiple nodes (eg actions on spm, then actions on other
> > > node to run a vm).
> > > Having a way to correlate the logs and follow a single event/flow
> > > is vital.
> > >
> > +1
> > 
> > Knowing what command caused a sequence of events in VDSM would be
> > really
> > helpful particularly in a threaded environment.  Further, wouldn't
> > such
> > an ID be h

Re: [vdsm] flowID schema

2012-02-09 Thread Saggi Mizrahi
-1

I agree that for messaging environment having a Message ID is a must because 
you sometimes don't have a particular target so when you get a response you 
need to know what this node is actually responding to.

The message ID could be composed with  so you can reuse the 
field.

But that is all besides the point.

I understand that someone might find it fun to go on following the entire flow 
in the Engine and in VDSM. But I would like to hear an actual use case where 
someone would have actually benefited from this.
As I see it having VSDM return the task ID with every response (and not just 
for async tasks) is a lot more useful and correct.

A generic debugging scenario as I see it.

1. Something went wrong
2. You go looking in the ENGINE log trying to figure out what happend.
3. You see that ENGINE got SomeError.
4. Check to see if this error makes sense imagining that VDSM is always right 
and is a black box.
5. You did your digging and now you think that VDSM is as fault.
6. Go look for the call that failed. (If we returned the taskID it's pretty 
simple to find that call).
7. Look around the call to check VDSM state.
8. Profit.

There is never a point where you want to follow a whole flow call by call going 
back and forth, and even if you did having the VDSM taskID is a better anchor 
then flowID.

VDSM is built in a way that every call takes in to account the current state 
only. Debugging it with an engine flow mindset is just wrong and distracting. I 
see it doing more harm the good by reinforcing bad debugging practices.

- Original Message -
> From: "Keith Robertson" 
> To: "VDSM Project Development" 
> Sent: Thursday, February 9, 2012 1:34:43 PM
> Subject: Re: [vdsm] flowID schema
> 
> On 02/09/2012 12:18 PM, Andrew Cathrow wrote:
> >
> > - Original Message -
> >> From: "Ayal Baron"
> >> To: "Dan Kenigsberg"
> >> Cc: "VDSM Project Development"
> >> Sent: Monday, February 6, 2012 10:35:54 AM
> >> Subject: Re: [vdsm] flowID schema
> >>
> >>
> >>
> >> - Original Message -
> >>> On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:
> >>>> flowID makes no sense after the initial API call as stuff like
> >>>> cacheing\threadpools\samplingtasks\resources\asyncTasks so
> >>>> flowing
> >>>> a flow like that will not give you the entire picture while
> >>>> debugging.
> >>>>
> >>>> Also adding it now will make everything even more ugly.
> >>>> You know what, just imagine I wrote one of my long rambles about
> >>>> why I don't agree with doing this.
> >>> I cannot imagine you write anything like that. Really. I do not
> >>> understand why you object logging flowID on API entry point.
> >> The question is, what problem is this really trying to solve and
> >> is
> >> there a simpler and less obtrusive solution to that problem?
> > correlating logs between ovirt engine and potentially multiple vdsm
> > nodes is a nightmare. It requires a lot skill to follow a
> > transaction through from the front end all the way to the node,
> > and even multiple nodes (eg actions on spm, then actions on other
> > node to run a vm).
> > Having a way to correlate the logs and follow a single event/flow
> > is vital.
> >
> +1
> 
> Knowing what command caused a sequence of events in VDSM would be
> really
> helpful particularly in a threaded environment.  Further, wouldn't
> such
> an ID be helpful in an asynchronous request/response model?  I'm not
> sure what the plans are for AMQP or even if there are plans, but I'd
> think that something like this would be crucial for an async
> response.
> So, if you implemented it you might be killing 2 birds with 1 stone.
> 
> FYI: If you want to see examples of other systems that use similar
> concepts, take a look at the correlation ID in JMS.
> 
> Cheers,
> Keith
> 
> 
> >>> ___
> >>> vdsm-devel mailing list
> >>> vdsm-devel@lists.fedorahosted.org
> >>> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> >>>
> >> ___
> >> vdsm-devel mailing list
> >> vdsm-devel@lists.fedorahosted.org
> >> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> >>
> > ___
> > vdsm-devel mailing list
> > vdsm-devel@lists.fedorahosted.org
> > https://fedorahosted.org/mailman/listinfo/vdsm-devel
> 
> ___
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-09 Thread Keith Robertson

On 02/09/2012 12:18 PM, Andrew Cathrow wrote:


- Original Message -

From: "Ayal Baron"
To: "Dan Kenigsberg"
Cc: "VDSM Project Development"
Sent: Monday, February 6, 2012 10:35:54 AM
Subject: Re: [vdsm] flowID schema



- Original Message -

On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:

flowID makes no sense after the initial API call as stuff like
cacheing\threadpools\samplingtasks\resources\asyncTasks so
flowing
a flow like that will not give you the entire picture while
debugging.

Also adding it now will make everything even more ugly.
You know what, just imagine I wrote one of my long rambles about
why I don't agree with doing this.

I cannot imagine you write anything like that. Really. I do not
understand why you object logging flowID on API entry point.

The question is, what problem is this really trying to solve and is
there a simpler and less obtrusive solution to that problem?

correlating logs between ovirt engine and potentially multiple vdsm nodes is a 
nightmare. It requires a lot skill to follow a transaction through from the 
front end all the way to the node, and even multiple nodes (eg actions on spm, 
then actions on other node to run a vm).
Having a way to correlate the logs and follow a single event/flow is vital.


+1

Knowing what command caused a sequence of events in VDSM would be really 
helpful particularly in a threaded environment.  Further, wouldn't such 
an ID be helpful in an asynchronous request/response model?  I'm not 
sure what the plans are for AMQP or even if there are plans, but I'd 
think that something like this would be crucial for an async response.  
So, if you implemented it you might be killing 2 birds with 1 stone.


FYI: If you want to see examples of other systems that use similar 
concepts, take a look at the correlation ID in JMS.


Cheers,
Keith



___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-09 Thread Lee Yarwood
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 02/09/2012 05:18 PM, Andrew Cathrow wrote:
>> > The question is, what problem is this really trying to solve and is
>> > there a simpler and less obtrusive solution to that problem?
> correlating logs between ovirt engine and potentially multiple vdsm nodes is 
> a nightmare. It requires a lot skill to follow a transaction through from the 
> front end all the way to the node, and even multiple nodes (eg actions on 
> spm, then actions on other node to run a vm).
> Having a way to correlate the logs and follow a single event/flow is vital.
> 

+1 anything that allows us to easily follow flows between engine, vdsm
and back again would be welcomed by those supporting it!

Lee
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPNAiFAAoJELymbjP2ci12PFYH/09f/c7xlSMZ1QKJo6QHgBmC
fucAUiKDWFxpJ3zyOJsoJfCL6TmXdvdUAJQXhmy2Its3KGYdhKZLezqJHpIGM/AR
f694/O5a1jUYGJnxtQknr1H4wJar1ot0gIJEuiDxEM9haR4XTTYbhsyl8ApQ8wHP
EERHi6eTzLY0j3ohbDOqRSrGdsofTYQ653MNQXggfBa41R1KbOPkp8XIf4RJ7+0J
8e7xGqU3SwITDeJ9c7yj5bwWdpuzUGvwPZif8VeuR0RajjKHCagiexWbRoLmoGYi
dqc+QcxOL8zs1e9mOiB1Cgfc36bbCjGFBNYe8N3gpnod9NCQZ1f9FzbQ82+Dosg=
=CMZJ
-END PGP SIGNATURE-
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-09 Thread Andrew Cathrow


- Original Message -
> From: "Ayal Baron" 
> To: "Dan Kenigsberg" 
> Cc: "VDSM Project Development" 
> Sent: Monday, February 6, 2012 10:35:54 AM
> Subject: Re: [vdsm] flowID schema
> 
> 
> 
> - Original Message -
> > On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:
> > > flowID makes no sense after the initial API call as stuff like
> > > cacheing\threadpools\samplingtasks\resources\asyncTasks so
> > > flowing
> > > a flow like that will not give you the entire picture while
> > > debugging.
> > > 
> > > Also adding it now will make everything even more ugly.
> > > You know what, just imagine I wrote one of my long rambles about
> > > why I don't agree with doing this.
> > 
> > I cannot imagine you write anything like that. Really. I do not
> > understand why you object logging flowID on API entry point.
> 
> The question is, what problem is this really trying to solve and is
> there a simpler and less obtrusive solution to that problem?

correlating logs between ovirt engine and potentially multiple vdsm nodes is a 
nightmare. It requires a lot skill to follow a transaction through from the 
front end all the way to the node, and even multiple nodes (eg actions on spm, 
then actions on other node to run a vm).
Having a way to correlate the logs and follow a single event/flow is vital.

> 
> > 
> > ___
> > vdsm-devel mailing list
> > vdsm-devel@lists.fedorahosted.org
> > https://fedorahosted.org/mailman/listinfo/vdsm-devel
> > 
> ___
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-06 Thread Ayal Baron


- Original Message -
> On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:
> > flowID makes no sense after the initial API call as stuff like
> > cacheing\threadpools\samplingtasks\resources\asyncTasks so flowing
> > a flow like that will not give you the entire picture while
> > debugging.
> > 
> > Also adding it now will make everything even more ugly.
> > You know what, just imagine I wrote one of my long rambles about
> > why I don't agree with doing this.
> 
> I cannot imagine you write anything like that. Really. I do not
> understand why you object logging flowID on API entry point.

The question is, what problem is this really trying to solve and is there a 
simpler and less obtrusive solution to that problem?

> 
> ___
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-06 Thread Dan Kenigsberg
On Thu, Feb 02, 2012 at 10:32:49AM -0500, Saggi Mizrahi wrote:
> flowID makes no sense after the initial API call as stuff like 
> cacheing\threadpools\samplingtasks\resources\asyncTasks so flowing a flow 
> like that will not give you the entire picture while debugging.
> 
> Also adding it now will make everything even more ugly.
> You know what, just imagine I wrote one of my long rambles about why I don't 
> agree with doing this.

I cannot imagine you write anything like that. Really. I do not
understand why you object logging flowID on API entry point.

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-06 Thread Dan Kenigsberg
On Thu, Feb 02, 2012 at 12:00:44AM -0500, Douglas Landgraf wrote:
> Hello,
> 
>  flowID is schema that we will be including in vdsm API to oVirt
> Engine people share the ID of engine transaction to vdsm.
> With this in hands, we will add the ID of transactions to our log.
> 
> I would like to know your opinion how we could do that without break
> our API, like include new parameters to our calls.
> 
> Should we add at BindingXMLRPC.py -> wrapper() a code to search for
> a 'flowID' key into functions which use dict as parameter (like
> create)? [1] Maybe change at other level inside BindingXMLRPC ?

Moti came up an idea from that department: we can abuse one of the http
headers (say, X-FlowID) and it in BindingXMLRPC._createXMLRPCServer,

class LoggingHandle:

def parse_request(self):
...
threadLocal.flowID = self.headers.get('X-FlowID')

I hate this crossing of layers, but it could keep our formal API intact.

Dan.
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] flowID schema

2012-02-02 Thread Saggi Mizrahi
flowID makes no sense after the initial API call as stuff like 
cacheing\threadpools\samplingtasks\resources\asyncTasks so flowing a flow like 
that will not give you the entire picture while debugging.

Also adding it now will make everything even more ugly.
You know what, just imagine I wrote one of my long rambles about why I don't 
agree with doing this.

As you plan on going on anyway here is my suggestion on how to push this in.

XMLRPC doesn't support named parameter, this means that you can't just ad-hoc a 
new arg to all the API calls called flow-id.
For simplicity's sake lets assume they always pass the last arg as flowID if it 
is a string that starts with "__FLOWID__".
What you do then is in dispatcher take the last arg and put it on the task 
object.
Have the logger print this value even when the task is in prepare next to the 
threadID.

You will have to make the clientIF calls use *another* dispatcher but the same 
task thread pool to have this supported at the clientIF verbs as well but I 
think it should have been done anyways.

- Original Message -
> From: "Douglas Landgraf" 
> To: "VDSM Project Development" 
> Sent: Thursday, February 2, 2012 12:00:44 AM
> Subject: [vdsm] flowID schema
> 
> Hello,
> 
>   flowID is schema that we will be including in vdsm API to oVirt
> Engine people share the ID of engine transaction to vdsm.
> With this in hands, we will add the ID of transactions to our log.
> 
> I would like to know your opinion how we could do that without break
> our
> API, like include new parameters to our calls.
> 
> Should we add at BindingXMLRPC.py -> wrapper() a code to search for a
> 'flowID' key into functions which use dict as parameter (like
> create)?
> [1] Maybe change at other level inside BindingXMLRPC ?
> 
> Ideas/Thoughts?
> 
> [1]
> http://gerrit.ovirt.org/#patch,sidebyside,1221,3,vdsm/BindingXMLRPC.py
> 
> Thanks!
> 
> --
> Cheers
> Douglas
> 
> ___
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> 
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel