Hi Scott,
Thanks for the review, please see some responses inline..
(responded only to the parts that i implemented)
Scott Davenport wrote:
On Wed, 2009-09-02 at 12:25 -0400, Tom Pothier wrote:
The webrev is now also available on opensolaris.org:
http://cr.opensolaris.org/~pothier/onnv-x86gentopo-pb/
Tom,
I've gotten through some of the code. Still have to review
the x86pi enumerator itself and most of the "uts" changes.
-scott
==usr/src/uts/common/sys/smbios.h==
1094 /*
1095 * SMBIOS OEM-specific (Type 144) Memory Array Extended Information.
1096 */
1097 typedef struct smbios_memarray_ext {
1098 uint16_t smbmae_ma; /* memory array handle */
1099 uint16_t smbmae_comp; /* component parent handle */
1100 uint16_t smbmae_bdf; /* Bus/Dev/Funct (PCI) */
1101 } smbios_memarray_ext_t;
==usr/src/common/smbios/smb_info.c==
896 int
897 smbios_info_extmemarray(smbios_hdl_t *shp, id_t id, smbios_memarray_ext_t
*emap)
898 {
...
911 emap->smbmae_ma = exma.smbmarre_ma;
912 emap->smbmae_comp = exma.smbmarre_component;
913 emap->smbmae_bdf = exma.smbmarre_bdf;
==usr/src/cmd/smbios/smbios.c==
880 static void
881 print_extmemarray(smbios_hdl_t *shp, id_t id, FILE *fp)
882 {
...
888 (void) smbios_info_extmemarray(shp, id, &em);
889
890 oprintf(fp, " Physical Memory Array Handle: %u\n", em.smbmae_ma);
891 oprintf(fp, " Component Parent Handle: %u\n", em.smbmae_comp);
892 oprintf(fp, " BDF: 0x%x\n", em.smbmae_bdf);
In the PRMS (BTW...is this available to external
reviewers yet?) there's also a "Segment Group" field
in the extended structure. This isn't being used in
enumeration.
But assuming it's important to someone, the field
should be grabbed and printed via smbios. If it's
vestigial, yank it from the PRMS.
==usr/src/lib/fm/topo/maps/i86pc/i86pc-legacy-hc-topology.xml==
28 <!--
29 This map file is loaded by the generic enumerator (x86pi.so) when
30 an FMA-compliance SMBIOS can't be found.
31 -->
Nit. "compliance" --> "compliant"
==usr/src/lib/fm/topo/modules/common/pcibus/did_props.c==
132 { NULL, &protocol_pgroup, TOPO_PROP_ASRU, ASRU_set },
133 /*
134 * These props need to be put at the end of table. x86pi has its
135 * own way to set them.
136 */
137 { NULL, &protocol_pgroup, TOPO_PROP_LABEL, label_set },
138 { NULL, &protocol_pgroup, TOPO_PROP_FRU, FRU_set }
The comment is intriguing. What makes label and FRU
special that the ordering in the props[] arrays matters?
Can the comment refer to functions/comments in the x86pi
code that can clarify?
==usr/src/lib/fm/topo/modules/i86pc/chip/chip.h==
40 /* Below should match the definitions in x86pi_impl.h */
41 #define X86PI_FULL 1
42 #define X86PI_PARTIAL 2
43 #define X86PI_NONE 3
44
45 /*
46 * FM_AWARE_SMBIOS means SMBIOS meets FMA needs
47 * X86PI_FULL is defined as 1 in x86pi.so
48 * X86PI_PARTIAL is defined as 2 in x86pi.so
49 * And passed from x86pi.so to chip.so as module
50 * private data
51 */
52 #define FM_AWARE_SMBIOS(mod) \
53 (topo_mod_getspecific(mod) != NULL && \
54 (*(int *)topo_mod_getspecific(mod) == X86PI_FULL || \
55 *(int *)topo_mod_getspecific(mod) == X86PI_PARTIAL))
I don't understand the usage of _PARTIAL here. In scanning
the rest of the code (particularly x86pi_check_comp()),
_PARTIAL is never a valid value for the x86pi enumerator.
Makes logical sense too. Either the SMBIOS has what the
enum needs to create the hierarchy or it doesn't.
I understand that certain fields (serial numbers, labels)
don't prevent construction of a valid hierarchy from
SMBIOS records. But it does not look like _PARTIAL is
used to convey this.
X86PI_PARTIAL needs to be removed in x86pi_impl.h & chip.h
Was in the pending list, but got missed..
I will file a pre-rti CR to fix this.
==usr/src/lib/fm/topo/modules/i86pc/chip/chip.c==
216 (void) topo_node_fru_set(strand, NULL, 0, &perr);
217 /*
218 * From the inherited FRU, extract the Serial
219 * number(if SMBIOS donates) and set it in the ASRU
220 */
221 if (FM_AWARE_SMBIOS(mod)) {
222 char *val = NULL;
223
224 if (topo_prop_get_fmri(strand, TOPO_PGROUP_PROTOCOL,
225 TOPO_PROP_FRU, &fmri, &err) != 0)
226 whinge(mod, NULL,
227 "create_strand: topo_prop_get_fmri failed\n");
228 if (nvlist_lookup_string(fmri, FM_FMRI_HC_SERIAL_ID, &val)
!= 0)
229 whinge(mod, NULL,
230 "create_strand: nvlist_lookup_string failed:
\n");
231 else
232 serial = topo_mod_strdup(mod, val);
233 nvlist_free(fmri);
234 }
Is this correct? The inherited FRU of a strand may be a baseboard,
in which case the FRU serial number does not make sense for an
ASRU. It should be the serial number from the Type 4 record (if
provided). And I think that's universal - whether or not the
socket itself is a FRU.
Agree, this needs to be fixed, ASRU's serial number should not be
dependent on the FRUness of the Socket.
Will raise a pre-rti CR and fix it.
271 if (FM_AWARE_SMBIOS(mod)) {
272 (void) topo_node_label_set(strand, NULL, &perr);
273
274 if (topo_node_resource(strand, &fmri, &perr) != 0)
275 whinge(mod, &nerr, "create_strand: "
276 "topo_node_resource failed\n");
277
278 perr += nvlist_lookup_string(fmri,
279 FM_FMRI_HC_PART, &part);
280 perr += nvlist_lookup_string(fmri,
281 FM_FMRI_HC_REVISION, &rev);
282
283 if (perr != 0)
284 whinge(mod, NULL,
285 "create_strand: nvlist_add_string failed\n");
286
287 perr += topo_prop_set_string(strand, PGNAME(STRAND),
288 FM_FMRI_HC_SERIAL_ID, TOPO_PROP_IMMUTABLE, serial,
&perr);
289 perr += topo_prop_set_string(strand, PGNAME(STRAND),
290 FM_FMRI_HC_PART, TOPO_PROP_IMMUTABLE, part, &perr);
291 perr += topo_prop_set_string(strand, PGNAME(STRAND),
292 FM_FMRI_HC_REVISION, TOPO_PROP_IMMUTABLE, rev, &perr);
293
294 if (perr != 0)
295 whinge(mod, NULL, "create_strand:
topo_prop_set_string"
296 "failed\n");
297
298 nvlist_free(fmri);
299 topo_mod_strfree(mod, serial);
300 }
Does any of this code belong? This is enumerating a strand,
which from a label perspective will be the same as its parent
core (which will be the same as its parent chip). Why isn't
a single topo_node_label_set(strand, NULL, &perr) sufficient?
After setting the Label for the Strand node, we also add Serial,
Part & Revision in the "strand-properties" ( - this was based upon
a comment from the portfolio-review that - chip/core/strand should
all carry the serial, part, revision properties separately.. )
Should this be reconsidered ?
If it does belong, line 285, copy/paste error? Debug message
should reference nvlist_lookup_string().
Yes, will fix it.
305 static int
306 create_core(topo_mod_t *mod, tnode_t *pnode, nvlist_t *cpu,
307 nvlist_t *auth, uint16_t chip_smbiosid)
Same questions from the strand logic above (ASRU serial,
label) apply to the core.
Yes, will fix it.
514 } else {
515 if (topo_node_resource(chip, &fmri, &perr)
!= 0)
516 whinge(mod, &nerr, "create_chip: "
517 "topo_node_resource failed\n");
...
527 perr += nvlist_lookup_string(fmri,
528 FM_FMRI_HC_SERIAL_ID, &serial);
529 perr += nvlist_lookup_string(fmri,
530 FM_FMRI_HC_PART, &part);
531 perr += nvlist_lookup_string(fmri,
532 FM_FMRI_HC_REVISION, &rev);
533
534 if (perr != 0)
535 whinge(mod, NULL,
536 "create_chip:
nvlist_add_string"
537 "failed\n");
First the nit. Line 536 I think should be nvlist_lookup_string.
Will fix.
Now, wrt perr. At 514, perr may be set if topo_node_resource()
fails. But the code keeps on marching. First, should it?
Yes, thats the way most of the Serial, Part, Revision related failures
(failures
related to SMBIOS derivations or other) is treated.. we complain
via debug messages, but we march, so that even without
Serials, Part, Revisions, Labels etc, a bare topology can still get
enumerated.
I can add a debug message for each nvlist_lookup_string() failure...
Second,
if it does and 514 has failed, you're guaranteed a misleading
debug msg at 535.
True, will fix.
This same theme continues through this portion
of the if/else body.
587 /*
588 * STATUS
589 * CPU Socket Populated
590 * CPU Socket Unpopulated
591 * Populated : Enabled
592 * Populated : Disabled by BIOS (Setup)
593 * Populated : Disabled by BIOS (Error)
594 * Populated : Idle
In addition to adding in where these comes from (DMTF spec),
I think this comment is better served colocated with
topo_status_smbios_get(). Essentially, that routine is
deciding whether or not to report the socket as somethin
to be enumerated.
ok.
==usr/src/lib/fm/topo/modules/i86pc/chip/chip_smbios.c==
General comment. The routines beginning with topo_
may cause confusion. When I saw these in other areas
of the chip enum, I thought these were calls into
libtopo.
ok. will pick function names that will start with chip_ instead of topo_
280 /*
281 * This could be defined as topo_mod_strlen()
282 */
283 size_t
284 topo_chip_strlen(const char *str)
Then should it be? If other enum plugins would make use
of this, then do so (note: likely additional ARC work)
I had raised a need for topo_mod_strlen() earlier, did not get feedback
and there was no other consumer in the x86gentopo project, apart
from chip_smbios.c, so confined it here.
Will take this change outside the scope of this project.
294 static const char *
295 chip_cleanup_smbios_str(topo_mod_t *mod, const char *begin, int str_type)
296 {
I think a comment here summarizing what string cleanup is
being done is warranted.
Will add a comment.
383 if (buf != NULL) {
384 if (label != NULL) {
385 (void) strcpy(buf, label);
386 (void) strcat(buf, delim);
387 /*
388 * If we are working on a DIMM
389 * smbi_location has the Device
Locator.
390 * add the Device Locator ex: CPU 0
391 * add the Bank Locator ex: DIMM 0
392 */
393 (void) strcat(buf,
c.smbi_location);
394 } else
395 (void) strcpy(buf,
c.smbi_location);
It looks like strcpy and strcat are OK here....generally get
nervous when these are around. Any reason not to use their
bounds-checking variants?
Yes, will use strlcpy, strlcat & check for buffer overruns..
also noted now that,
lines
377 buf = topo_mod_alloc(mod, topo_chip_strlen(label) +
378 topo_chip_strlen(delim) +
379 topo_chip_strlen(c.smbi_location) +
380 topo_chip_strlen(blank) +
381 topo_chip_strlen(dimm_bank));
should be
377 buf = topo_mod_alloc(mod, topo_chip_strlen(label) +
...
...
381 topo_chip_strlen(dimm_bank) + 1);
will fix it.
(will fix as review-feedback continues and will keep updating the
changes to Tom)
Thank you
Srihari
==usr/src/uts/common/os/fm.c==
1280 /*
1281 * same code as in fm_fmri_hc_set_common
1282 */
1283 if (version != FM_HC_SCHEME_VERSION) {
...
Then why not call fm_fmri_hc_set_common() vs. duplicating
code?
_______________________________________________
fm-discuss mailing list
fm-discuss@opensolaris.org
_______________________________________________
fm-discuss mailing list
fm-discuss@opensolaris.org