Laszlo, Thanks for the detailed response.
I agree that use of VOID * prevents the compiler from helping with type checking. This is why you see many existing structures that do use VOID * have a Signature field at the beginning and functions that use those structures check the Signature value. Since this is a runtime check, it is not as good as a compile type check, but is an improvement over a VOID * struct with no Signature at all. I agree that proposed approach #1 is better than proposed approach #0. Let's go with #1. The APIs in your library class can be summarized as: Init(), UnInit(), Insert(), Delete, IsEmpty(), Next(), Prev(), Min(), Max(), Find(), UserStruct(), Validate(). To me, this looks like a set of APIs to manage an ordered collection of items. There are many ways to implement an ordered collection. Depending on the frequency of the different actions, different internal implementations may have different performance, memory overhead, or code size. The Red/Black one is great when the number of nodes is large and there are frequent calls to Find(). I would like to suggest a more generic library class name such as OrderedCollectionLib.h and the structs/APIs in the lib class also be generic. The library instance can be something like BaseOrderedCollectionRedBlackTreeLib. The developer can select the lib instance for a platform/module in their DSC file that meets the needs of that platform/module. I do not think the class name should imply any specific size/performance criteria. We have existing examples of this in the MdePkg today. For example, the lib class BaseMemoryLib.h that contains APIs such as CopyMem()/SetMem()/ZeroMem()/CompareMem() has 10 different implementations with different compatibility, size, and performance profiles. Library Class: MdePKg/Include/Library/BaseMemoryLib.h Library Instances: ArmPkg\Library\BaseMemoryLibStm\BaseMemoryLibStm.inf ArmPkg\Library\BaseMemoryLibVstm\BaseMemoryLibVstm.inf MdePkg\Library\BaseMemoryLib\BaseMemoryLib.inf MdePkg\Library\BaseMemoryLibMmx\BaseMemoryLibMmx.inf MdePkg\Library\BaseMemoryLibOptDxe\BaseMemoryLibOptDxe.inf MdePkg\Library\BaseMemoryLibOptPei\BaseMemoryLibOptPei.inf MdePkg\Library\BaseMemoryLibRepStr\BaseMemoryLibRepStr.inf MdePkg\Library\BaseMemoryLibSse2\BaseMemoryLibSse2.inf MdePkg\Library\PeiMemoryLib\PeiMemoryLib.inf MdePkg\Library\UefiMemoryLib\UefiMemoryLib.inf Thanks, Mike -----Original Message----- From: Laszlo Ersek [mailto:ler...@redhat.com] Sent: Monday, August 04, 2014 9:55 AM To: edk2-devel@lists.sourceforge.net Subject: Re: [edk2] [PATCH 1/2] MdePkg: introduce RbTreeLib On 08/04/14 17:02, Kinney, Michael D wrote: > The incomplete types between lib class .h and lib instance is not > something we done anywhere else. I prefer the lib class .h file to > be able to stand on its own. It is able. > Meaning it can be included by a module, but if not lib instance is > linked to the module, the module should still be able to compile. It is able. The term "incomplete type" is a term from the C standard. Incomplete types are the standard way to introduce the name of a type, but not its contents. 6.2.5 Types 1 The meaning of a value stored in an object or returned by a function is determined by the /type/ of the expression used to access it. (An identifier declared to be an object is the simplest such expression; the type is specified in the declaration of the identifier.) Types are partitioned into /object types/ (types that fully describe objects), /function types/ (types that describe functions), and /incomplete types/ (types that describe objects but lack information needed to determine their sizes). Please compile and run the following C program: ----*---- #include <stdio.h> typedef struct HELLO_WORLD HELLO_WORLD; int main(void) { HELLO_WORLD *pointer; pointer = NULL; fprintf(stdout, "hello world\n"); return pointer == NULL ? 0 : 1; } ----*---- $ gcc -ansi -pedantic -Wall -Wextra -Werror -o hello hello.c $ ./hello hello world $ echo $? 0 Indeed in my personal opinion, hiding types by forcing everything to (VOID*) is sub-optimal in edk2. For example, in "MdePkg/Include/Uefi/UefiBaseType.h": /// /// A collection of related interfaces. /// typedef VOID *EFI_HANDLE; /// /// Handle to an event structure. /// typedef VOID *EFI_EVENT; This prevents the compiler from catching errors when someone mixes up an EFI_EVENT variable (of type pointer-to-void) with an EFI_HANDLE variable (also of type pointer-to-void). The following typedefs on the other hand: typedef struct INTERNAL_EFI_HANDLE *EFI_HANDLE; typedef struct INTERNAL_EFI_EVENT *EFI_EVENT; would: - hide information from client code just the same, - allow the client code to build independently just the same, - allow different library instances to define INTERNAL_EFI_HANDLE and INTERNAL_EFI_EVENT in their own way just the same, - and allow the compiler to catch type errors at build time (pointer-to-INTERNAL_EFI_HANDLE is incompatible with pointer-to-INTERNAL_EFI_EVENT, even though both of the pointed-to types are incomplete). > The allocation of the root node issue could be handled in a similar > way to the CryproPkg lib classes where there is a lib API to get the > size of the opaque struct. The caller calls the size function. > Does the allocation, and then is able to use that allocated buffer > for all other lib APIs. This may be a way to get to VOID * for both > ROOT and NODE opaque types. In my opinion, (VOID*) is not a goal. The goal is information hiding, and independence of interface and implementation, and (VOID*) is just a (suboptimal) means to that goal. Consider the following three patterns: * Current patch (approach 0, let's call it "complete type"): Header file: typedef struct { RB_TREE_NODE *Root; RB_TREE_USER_COMPARE UserStructCompare; RB_TREE_KEY_COMPARE KeyCompare; } RB_TREE; Library implementation: VOID EFIAPI RbTreeInit ( OUT RB_TREE *Tree, IN RB_TREE_USER_COMPARE UserStructCompare, IN RB_TREE_KEY_COMPARE KeyCompare ) { Tree->Root = NULL; Tree->UserStructCompare = UserStructCompare; Tree->KeyCompare = KeyCompare; } Client code: int main ( IN int ArgC, IN char **ArgV ) { RB_TREE Tree; /* ... */ RbTreeInit (&Tree, UserStructCompare, KeyCompare); /* ... */ } Properties of this approach: Information Client code can Compiler enforces Interface and hiding allocate auto type safety at build implementation are and static time independent ----------- --------------- -------------------- ------------------ no (bad) yes (good) yes (good) no (bad) * Proposed approach 1 (let's call it "incomplete type"): Header file: typedef struct RB_TREE RB_TREE; Library implementation: struct RB_TREE { RB_TREE_NODE *Root; RB_TREE_USER_COMPARE UserStructCompare; RB_TREE_KEY_COMPARE KeyCompare; }; RB_TREE * EFIAPI RbTreeInit ( IN RB_TREE_USER_COMPARE UserStructCompare, IN RB_TREE_KEY_COMPARE KeyCompare ) { RB_TREE *Tree; Tree = AllocatePool (sizeof *Tree); if (Tree == NULL) { return NULL; } Tree->Root = NULL; Tree->UserStructCompare = UserStructCompare; Tree->KeyCompare = KeyCompare; return Tree; } Client code: int main ( IN int ArgC, IN char **ArgV ) { RB_TREE *Tree; /* ... */ Tree = RbTreeInit (UserStructCompare, KeyCompare); /* ... */ } Properties of this approach: Information Client code can Compiler enforces Interface and hiding allocate auto type safety at build implementation are and static time independent ----------- --------------- -------------------- ------------------ yes (good) no (bad) yes (good) yes (good) * Proposed approach 2 (let's call it "no type"): Header file: typedef VOID *RB_TREE; Library implementation: typedef struct { RB_TREE_NODE *Root; RB_TREE_USER_COMPARE UserStructCompare; RB_TREE_KEY_COMPARE KeyCompare; } INTERNAL_RB_TREE; UINTN EFIAPI RbTreeSize ( VOID ) { return sizeof (INTERNAL_RB_TREE); } VOID EFIAPI RbTreeInit ( OUT RB_TREE Tree, IN RB_TREE_USER_COMPARE UserStructCompare, IN RB_TREE_KEY_COMPARE KeyCompare ) { INTERNAL_RB_TREE *InternalTree; InternalTree = Tree; InternalTree->Root = NULL; InternalTree->UserStructCompare = UserStructCompare; InternalTree->KeyCompare = KeyCompare; } Client code: int main ( IN int ArgC, IN char **ArgV ) { RB_TREE Tree; /* ... */ Tree = AllocatePool (RbTreeSize ()); RbTreeInit (Tree, UserStructCompare, KeyCompare); /* ... */ } Properties of this approach: Information Client code can Compiler enforces Interface and hiding allocate auto type safety at build implementation are and static time independent ----------- --------------- -------------------- ------------------ yes (good) no (bad) no (bad) yes (good) In summary: Information Client code can Compiler enforces Interface and hiding allocate auto type safety at build implementation are and static time independent ----------- --------------- -------------------- ------------------ 0 no (bad) yes (good) yes (good) no (bad) 1 yes (good) no (bad) yes (good) yes (good) 2 yes (good) no (bad) no (bad) yes (good) I'll accept that my current patch, with approach #0, is inferior to #1, and I'm willing to fix that by moving to approach #1. But I honestly think that #2 is also inferior to #1, because it throws away type safety that the compiler could otherwise enforce. I don't see what approach #2 gives us in return; it looks like a straight downgrade from approach #1. > I was also thinking that the use of Red Black algorithm is an > implementation choice. Does it make more sense to change the name of > the lib class to be more generic (no references to red black), and > one of many possible lib instances uses the read black algorithm? The behavior (esp. the time complexity) of the various functions is important. We'd have to find a generic name that captures: - worst case O(log(n)) for all of lookup, next, prev, min, max, insert, delete, - and ordered traversal (full traversal in O(n) time). I guess WORST_CASE_OLOGN_SELF_BALANCING_BINARY_SEARCH_TREE would be an appropriate generic type name: http://en.wikipedia.org/wiki/Self-balancing_binary_search_tree The article lists a number of implementations, but for example AVL and RB differ from Splay and Treap that the former two are worst case O(log(n)), while the latter two are "only amortized O(log(n))". If we pick a specific name, then the number of possible implementations becomes small. RbTreeLib is the extreme of this, although I don't really understand why one would implement AVL if RB is available, or vice versa. If we pick a generic name, then the number of implementations becomes larger, but the generic name might not reflect characteristics that are important to the user. We could look at how container libraries are organized (boost, C++ STL, Qt etc) but I think this is hugely overkill. I just wanted a fast dictionary for edk2. *Any* red-black tree implementation will be an improvement, compared to "lists and arrays only". Thanks Laszlo ------------------------------------------------------------------------------ Infragistics Professional Build stunning WinForms apps today! Reboot your WinForms applications with our WinForms controls. Build a bridge from your legacy apps to the future. http://pubads.g.doubleclick.net/gampad/clk?id=153845071&iu=/4140/ostg.clktrk _______________________________________________ edk2-devel mailing list edk2-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/edk2-devel ------------------------------------------------------------------------------ Infragistics Professional Build stunning WinForms apps today! Reboot your WinForms applications with our WinForms controls. Build a bridge from your legacy apps to the future. http://pubads.g.doubleclick.net/gampad/clk?id=153845071&iu=/4140/ostg.clktrk _______________________________________________ edk2-devel mailing list edk2-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/edk2-devel