Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-09 Thread Borislav Petkov
On Sun, Jun 08, 2014 at 09:10:15PM -0400, Chen, Gong wrote:
> BTW, any comments from other guys? Boris, Tony? If not, I will send out the
> new version tomorrow.

Looks ok at a first glance - I'll take a deeper look at your new version with Steve's comments incorporated tomorrow - today's holiday

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-08 Thread Chen, Gong
On Fri, Jun 06, 2014 at 11:21:27AM -0400, Steven Rostedt wrote:
> Date: Fri, 6 Jun 2014 11:21:27 -0400
> From: Steven Rostedt
> To: "Chen, Gong"
> Cc: "Luck, Tony", Borislav Petkov,
> "m.che...@samsung.com",
> "linux-a...@vger.kernel.org

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-06 Thread Steven Rostedt
On Fri, 6 Jun 2014 02:51:41 -0400 "Chen, Gong" wrote:
> +/*
> + * MCE Extended Error Log trace event
> + *
> + * These events are generated when hardware detects a corrected or
> + * uncorrected event.
> + */
> +
> +/* memory trace event */
> +
> +TRACE_EVENT(extlog_mem_event,
> +

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-06 Thread Chen, Gong
On Tue, Jun 03, 2014 at 10:35:44AM -0400, Steven Rostedt wrote:
> Note, there's a pointer to a trace_seq structure "p" that is available.
> Hmm, I should add a get_dynamic_array_len(field), to give you the
> length. I'll add that now. I also don't like the trace_seq being "p" as
> that is too generic.

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-04 Thread Steven Rostedt
On Tue, 3 Jun 2014 10:35:44 -0400 Steven Rostedt wrote:
> I'll still need to add that __get_dynamic_array_len() helper. I'll send
> you something tonight.
>

I got caught up in other work, but I wrote it this morning and I'm adding it to my 3.16 queue. Thus, you can use this:

-- Steve

From

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-03 Thread Steven Rostedt
On Tue, 3 Jun 2014 04:36:07 -0400 "Chen, Gong" wrote:
> On Mon, Jun 02, 2014 at 12:57:48PM -0400, Steven Rostedt wrote:
> > Also matters how big you expect these events to be. If you get a
> > "christmas tree" set of flags, how big will that event grow with all
> > the descriptions attached?
> >

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-03 Thread Chen, Gong
On Mon, Jun 02, 2014 at 12:57:48PM -0400, Steven Rostedt wrote:
> Also matters how big you expect these events to be. If you get a
> "christmas tree" set of flags, how big will that event grow with all
> the descriptions attached?
>
> The max event size after all headers is 4056 bytes. If you go over

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-02 Thread Steven Rostedt
On Mon, 2 Jun 2014 16:22:19 +0000 "Luck, Tony" wrote:
> To which I'll counter that the trace ring buffer can handle tracing of
> events like page faults and context switches (can't it?) that happen
> at a rate of thousands per second. Our eMCA records will normally
> happen at a rate of X per

RE: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-02 Thread Luck, Tony
>> All of this stuff only applies to server systems - so quibbling over
>> a handful of *bytes* in an error record on a system that has tens,
>> hundreds or even thousands of *gigabytes* of memory seems
>> a bit pointless.
>
> But there's still only a limited number of bytes in the ring buffer no matter

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-30 Thread Steven Rostedt
On Fri, 30 May 2014 23:03:27 +0000 "Luck, Tony" wrote:
> All of this stuff only applies to server systems - so quibbling over
> a handful of *bytes* in an error record on a system that has tens,
> hundreds or even thousands of *gigabytes* of memory seems
> a bit pointless.

But there's still

RE: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-30 Thread Luck, Tony
>> For memory error location, I will utilize type offset to save one
>> more byte, furthermore, I want to drop requestor_id, responder_id
>> and target_id. 1) They are very rare (I've never seen them by now)
>
> My concern is, are we sure we're never going to need them at all? Tony,
> what's your take on

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-30 Thread Borislav Petkov
On Fri, May 30, 2014 at 02:16:06PM -0700, Tony Luck wrote:
> On Fri, May 30, 2014 at 3:07 AM, Borislav Petkov wrote:
> > Please elaborate, what conditions? DIMM silk screen labels or so? Maybe
> > we can generate a mapping between text labels and indices and we can
> > dump the indices in the

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-30 Thread Tony Luck
On Fri, May 30, 2014 at 3:07 AM, Borislav Petkov wrote:
> Please elaborate, what conditions? DIMM silk screen labels or so? Maybe
> we can generate a mapping between text labels and indices and we can
> dump the indices in the tracepoint and do the mapping back to strings in
> userspace...?

The

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-30 Thread Borislav Petkov
On Fri, May 30, 2014 at 05:22:32AM -0400, Chen, Gong wrote:
> We have two big chunk string. One for memory error location, the other
> for DIMM error location. Since DIMM error location depends on some
> other conditions, how about just converting memory error location to a
> compact mode but leaving

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-30 Thread Chen, Gong
On Wed, May 28, 2014 at 12:56:25PM -0400, Steven Rostedt wrote:
> Instead of making that a huge string, what about a dynamic array of
> special structures?
>
>
> struct __attribute__((__packed__)) cper_sec_mem_rec {
> 	short type;
> 	int data;
> };
>
>

HI, Steven & Boris

We have two big chunk

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-29 Thread Chen, Gong
On Thu, May 29, 2014 at 09:12:51AM -0400, Steven Rostedt wrote:
> What do you think gets recorded in the ring buffer? The pointer to the
> string? No! You copy the entire string into the ring buffer, with
> markers and all. How big is that string? 60 chars? 80? I see you
> recording meta data there

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-29 Thread Steven Rostedt
On Thu, 29 May 2014 03:43:45 -0400 "Chen, Gong" wrote:
> On Wed, May 28, 2014 at 12:56:25PM -0400, Steven Rostedt wrote:
> > My concern is passing in a large string and wasting a lot of the ring
> > buffer space. The max you can hold per event is just under a page size
> > (4k). And all these

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-29 Thread Borislav Petkov
On Thu, May 29, 2014 at 03:43:45AM -0400, Chen, Gong wrote:
> On Wed, May 28, 2014 at 12:56:25PM -0400, Steven Rostedt wrote:
> > My concern is passing in a large string and wasting a lot of the ring
> > buffer space. The max you can hold per event is just under a page size
> > (4k). And all these

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-29 Thread Chen, Gong
On Wed, May 28, 2014 at 12:56:25PM -0400, Steven Rostedt wrote:
> My concern is passing in a large string and wasting a lot of the ring
> buffer space. The max you can hold per event is just under a page size
> (4k). And all these strings add up. If it happens to be 512bytes, then
> you end up with

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-28 Thread Steven Rostedt
On Wed, 28 May 2014 18:34:52 +0200 Borislav Petkov wrote:
> Well, they're constructed from a bunch of values which are checked for
> validity first:
>
> http://lkml.kernel.org/r/1400142646-10127-4-git-send-email-gong.c...@linux.intel.com

OK, looks like you are saving a bunch of integers.

>

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-28 Thread Borislav Petkov
On Wed, May 28, 2014 at 11:28:32AM -0400, Steven Rostedt wrote:
> > +static void __trace_mem_error(const uuid_le *fru_id, char *fru_text,
> > +			      u32 err_number, u8 severity,
> > +			      struct cper_sec_mem_err *mem)
> > +{
> > +	u8 etype = ~0, pa_mask_lsb = ~0;

Re: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-28 Thread Steven Rostedt
Added LKML

On Tue, 27 May 2014 23:32:18 -0400 "Chen, Gong" wrote:
> Add trace interface to elaborate all H/W error related information.
>
> v6 -> v5: format adjustment.
> v5 -> v4: Add physical mask(LSB) in trace.
> v4 -> v3: change ras trace dependency rule.
> v3 -> v2: minor adjustment
