Re: [edk2] [PATCH 1/2] MdePkg: introduce RbTreeLib

Kinney, Michael D Mon, 04 Aug 2014 13:51:14 -0700

Laszlo,

Thanks for the detailed response.

I  agree that use of VOID * prevents the compiler from helping with type 
checking.  This is why you see many existing structures that do use VOID * have 
a Signature field at the beginning and functions that use those structures 
check the Signature value.  Since this is a runtime check, it is not as good as 
a compile type check, but is an improvement over a VOID  * struct with no 
Signature at all.

I agree that proposed approach #1 is better than proposed approach #0.  Let's 
go with #1.

The APIs in your library class can be summarized as: Init(), UnInit(), 
Insert(), Delete, IsEmpty(), Next(), Prev(), Min(), Max(), Find(), 
UserStruct(), Validate().

To me, this looks like a set of APIs to manage an ordered collection of items.  
There are many ways to implement an ordered collection. Depending on the 
frequency of the different actions, different internal implementations may have 
different performance, memory overhead, or code size.  The Red/Black one is 
great when the number of nodes is large and there are frequent calls to Find(). 
 I would like to suggest a more generic library class name such as 
OrderedCollectionLib.h and the structs/APIs in the lib class also be generic.  
The library instance can be something like 
BaseOrderedCollectionRedBlackTreeLib.  The developer can select the lib 
instance for a platform/module in their DSC file that meets the needs of that 
platform/module.  I do not think the class name should imply any specific 
size/performance criteria.  We have existing examples of this in the MdePkg 
today.  For example, the lib class BaseMemoryLib.h that contains APIs such as 
CopyMem()/SetMem()/ZeroMem()/CompareMem() 
 has 10 different implementations with different compatibility, size, and 
performance profiles.

Library Class:
        MdePKg/Include/Library/BaseMemoryLib.h

Library Instances:
        ArmPkg\Library\BaseMemoryLibStm\BaseMemoryLibStm.inf 
        ArmPkg\Library\BaseMemoryLibVstm\BaseMemoryLibVstm.inf 
        MdePkg\Library\BaseMemoryLib\BaseMemoryLib.inf 
        MdePkg\Library\BaseMemoryLibMmx\BaseMemoryLibMmx.inf 
        MdePkg\Library\BaseMemoryLibOptDxe\BaseMemoryLibOptDxe.inf 
        MdePkg\Library\BaseMemoryLibOptPei\BaseMemoryLibOptPei.inf 
        MdePkg\Library\BaseMemoryLibRepStr\BaseMemoryLibRepStr.inf 
        MdePkg\Library\BaseMemoryLibSse2\BaseMemoryLibSse2.inf 
        MdePkg\Library\PeiMemoryLib\PeiMemoryLib.inf
        MdePkg\Library\UefiMemoryLib\UefiMemoryLib.inf

Thanks,

Mike

-----Original Message-----
From: Laszlo Ersek [mailto:ler...@redhat.com] 
Sent: Monday, August 04, 2014 9:55 AM
To: edk2-devel@lists.sourceforge.net
Subject: Re: [edk2] [PATCH 1/2] MdePkg: introduce RbTreeLib

On 08/04/14 17:02, Kinney, Michael D wrote:

> The incomplete types between lib class .h and lib instance is not 
> something we done anywhere else.  I prefer the lib class .h file to 
> be able to stand on its own.

It is able.

> Meaning it can be included by a module, but if not lib instance is
> linked to the module, the module should still be able to compile.

It is able.

The term "incomplete type" is a term from the C standard. Incomplete
types are the standard way to introduce the name of a type, but not its
contents.

  6.2.5 Types

  1 The meaning of a value stored in an object or returned by a
    function is determined by the /type/ of the expression used to
    access it. (An identifier declared to be an object is the
    simplest such expression; the type is specified in the declaration
    of the identifier.) Types are partitioned into /object types/
    (types that fully describe objects), /function types/ (types
    that describe functions), and /incomplete types/ (types that
    describe objects but lack information needed to determine their
    sizes).

Please compile and run the following C program:

----*----
#include <stdio.h>

typedef struct HELLO_WORLD HELLO_WORLD;

int
main(void)
{
  HELLO_WORLD *pointer;

  pointer = NULL;
  fprintf(stdout, "hello world\n");
  return pointer == NULL ? 0 : 1;
}
----*----

$ gcc -ansi -pedantic -Wall -Wextra -Werror -o hello hello.c
$ ./hello
hello world
$ echo $?
0

Indeed in my personal opinion, hiding types by forcing everything to
(VOID*) is sub-optimal in edk2. For example, in
"MdePkg/Include/Uefi/UefiBaseType.h":

///
/// A collection of related interfaces.
///
typedef VOID                      *EFI_HANDLE;
///
/// Handle to an event structure.
///
typedef VOID                      *EFI_EVENT;

This prevents the compiler from catching errors when someone mixes up
an EFI_EVENT variable (of type pointer-to-void) with an EFI_HANDLE
variable (also of type pointer-to-void). The following typedefs on the
other hand:

typedef struct INTERNAL_EFI_HANDLE *EFI_HANDLE;
typedef struct INTERNAL_EFI_EVENT  *EFI_EVENT;

would:
- hide information from client code just the same,

- allow the client code to build independently just the same,

- allow different library instances to define INTERNAL_EFI_HANDLE and
  INTERNAL_EFI_EVENT in their own way just the same,

- and allow the compiler to catch type errors at build time
  (pointer-to-INTERNAL_EFI_HANDLE is incompatible with
  pointer-to-INTERNAL_EFI_EVENT, even though both of the pointed-to
  types are incomplete).

> The allocation of the root node issue could be handled in a similar 
> way to the CryproPkg lib classes where there is a lib API to get the 
> size of the opaque struct.  The caller calls the size function.
> Does the allocation, and then is able to use that allocated buffer
> for all other lib APIs.  This may be a way to get to VOID * for both
> ROOT and NODE opaque types.

In my opinion, (VOID*) is not a goal. The goal is information hiding,
and independence of interface and implementation, and (VOID*) is just a
(suboptimal) means to that goal.

Consider the following three patterns:

* Current patch (approach 0, let's call it "complete type"):

Header file:

  typedef struct {
    RB_TREE_NODE          *Root;
    RB_TREE_USER_COMPARE  UserStructCompare;
    RB_TREE_KEY_COMPARE   KeyCompare;
  } RB_TREE;

Library implementation:

  VOID
  EFIAPI
  RbTreeInit (
    OUT RB_TREE             *Tree,
    IN RB_TREE_USER_COMPARE UserStructCompare,
    IN RB_TREE_KEY_COMPARE  KeyCompare
    )
  {
    Tree->Root              = NULL;
    Tree->UserStructCompare = UserStructCompare;
    Tree->KeyCompare        = KeyCompare;
  }

Client code:

  int
  main (
    IN int  ArgC,
    IN char **ArgV
    )
  {
    RB_TREE Tree;
    /* ... */

    RbTreeInit (&Tree, UserStructCompare, KeyCompare);
    /* ... */
  }

Properties of this approach:

  Information  Client code can  Compiler enforces     Interface and
  hiding       allocate auto    type safety at build  implementation are
               and static       time                  independent
  -----------  ---------------  --------------------  ------------------
  no (bad)     yes (good)       yes (good)            no (bad)

* Proposed approach 1 (let's call it "incomplete type"):

Header file:

  typedef struct RB_TREE RB_TREE;

Library implementation:

  struct RB_TREE {
    RB_TREE_NODE          *Root;
    RB_TREE_USER_COMPARE  UserStructCompare;
    RB_TREE_KEY_COMPARE   KeyCompare;
  };

  RB_TREE *
  EFIAPI
  RbTreeInit (
    IN RB_TREE_USER_COMPARE UserStructCompare,
    IN RB_TREE_KEY_COMPARE  KeyCompare
    )
  {
    RB_TREE *Tree;

    Tree = AllocatePool (sizeof *Tree);
    if (Tree == NULL) {
      return NULL;
    }

    Tree->Root              = NULL;
    Tree->UserStructCompare = UserStructCompare;
    Tree->KeyCompare        = KeyCompare;
    return Tree;
  }

Client code:

  int
  main (
    IN int  ArgC,
    IN char **ArgV
    )
  {
    RB_TREE *Tree;
    /* ... */

    Tree = RbTreeInit (UserStructCompare, KeyCompare);
    /* ... */
  }

Properties of this approach:

  Information  Client code can  Compiler enforces     Interface and
  hiding       allocate auto    type safety at build  implementation are
               and static       time                  independent
  -----------  ---------------  --------------------  ------------------
  yes (good)   no (bad)         yes (good)            yes (good)

* Proposed approach 2 (let's call it "no type"):

Header file:

  typedef VOID *RB_TREE;

Library implementation:

  typedef struct {
    RB_TREE_NODE          *Root;
    RB_TREE_USER_COMPARE  UserStructCompare;
    RB_TREE_KEY_COMPARE   KeyCompare;
  } INTERNAL_RB_TREE;

  UINTN
  EFIAPI
  RbTreeSize (
    VOID
    )
  {
    return sizeof (INTERNAL_RB_TREE);
  }

  VOID
  EFIAPI
  RbTreeInit (
    OUT RB_TREE             Tree,
    IN RB_TREE_USER_COMPARE UserStructCompare,
    IN RB_TREE_KEY_COMPARE  KeyCompare
    )
  {
    INTERNAL_RB_TREE *InternalTree;

    InternalTree = Tree;
    InternalTree->Root              = NULL;
    InternalTree->UserStructCompare = UserStructCompare;
    InternalTree->KeyCompare        = KeyCompare;
  }

Client code:

  int
  main (
    IN int  ArgC,
    IN char **ArgV
    )
  {
    RB_TREE Tree;
    /* ... */

    Tree = AllocatePool (RbTreeSize ());
    RbTreeInit (Tree, UserStructCompare, KeyCompare);
    /* ... */
  }

Properties of this approach:

  Information  Client code can  Compiler enforces     Interface and
  hiding       allocate auto    type safety at build  implementation are
               and static       time                  independent
  -----------  ---------------  --------------------  ------------------
  yes (good)   no (bad)         no (bad)              yes (good)

In summary:

  Information  Client code can  Compiler enforces     Interface and
  hiding       allocate auto    type safety at build  implementation are
               and static       time                  independent
  -----------  ---------------  --------------------  ------------------
0 no  (bad)    yes (good)       yes (good)            no  (bad)
1 yes (good)   no  (bad)        yes (good)            yes (good)
2 yes (good)   no  (bad)        no  (bad)             yes (good)

I'll accept that my current patch, with approach #0, is inferior to #1,
and I'm willing to fix that by moving to approach #1.

But I honestly think that #2 is also inferior to #1, because it throws
away type safety that the compiler could otherwise enforce. I don't see
what approach #2 gives us in return; it looks like a straight downgrade
from approach #1.

> I was also thinking that the use of Red Black algorithm is an
> implementation choice.  Does it make more sense to change the name of
> the lib class to be more generic (no references to red black), and
> one of many possible lib instances uses the read black algorithm?

The behavior (esp. the time complexity) of the various functions is
important. We'd have to find a generic name that captures:
- worst case O(log(n)) for all of lookup, next, prev, min, max,
  insert, delete,
- and ordered traversal (full traversal in O(n) time).

I guess WORST_CASE_OLOGN_SELF_BALANCING_BINARY_SEARCH_TREE would be an
appropriate generic type name:

http://en.wikipedia.org/wiki/Self-balancing_binary_search_tree

The article lists a number of implementations, but for example AVL and
RB differ from Splay and Treap that the former two are worst case
O(log(n)), while the latter two are "only amortized O(log(n))".

If we pick a specific name, then the number of possible implementations
becomes small. RbTreeLib is the extreme of this, although I don't really
understand why one would implement AVL if RB is available, or vice versa.

If we pick a generic name, then the number of implementations becomes
larger, but the generic name might not reflect characteristics that are
important to the user.

We could look at how container libraries are organized (boost, C++ STL,
Qt etc) but I think this is hugely overkill. I just wanted a fast
dictionary for edk2. *Any* red-black tree implementation will be an
improvement, compared to "lists and arrays only".

Thanks
Laszlo

------------------------------------------------------------------------------
Infragistics Professional
Build stunning WinForms apps today!
Reboot your WinForms applications with our WinForms controls. 
Build a bridge from your legacy apps to the future.
http://pubads.g.doubleclick.net/gampad/clk?id=153845071&iu=/4140/ostg.clktrk
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/edk2-devel

------------------------------------------------------------------------------
Infragistics Professional
Build stunning WinForms apps today!
Reboot your WinForms applications with our WinForms controls. 
Build a bridge from your legacy apps to the future.
http://pubads.g.doubleclick.net/gampad/clk?id=153845071&iu=/4140/ostg.clktrk
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/edk2-devel

Re: [edk2] [PATCH 1/2] MdePkg: introduce RbTreeLib

Reply via email to