TOI: introducing Split Lists hybrids of doubly linked lists within the Linux Kernel

Mitchell Erblich Tue, 10 Mar 2015 21:31:54 -0700

Please note that this proposal is from this engineer and not from the company 
he works for.


This SHOULD also fulfills any legal notification of work done, but not 
submitted to the Linux Kernel.


Transfer of Information  : Introduction to what I term as split lists
——————————-

                Double link list support has been available for decades within 
UNIX OSs.  There is a perceived perception that double link list support is 
slow because of walking the list from the head or from the tail. This proposal 
suggests a first a simple change to the logic to average an immediate 
approximate 2x performance improvement with creation of ordered double link 
lists. This achievement is for all operations of insertions, deletions, and 
searches of elements within the link list.  There are increasing difficulty 
versions that benefit only as the size of the link list significantly 
increases. The size of the base link list structure increases and additional 
checks are necessary as to where the start of the search is to begin.   The 
suggested implementation follow the syscall(system call) layouts, with a 
split1_ll_xxx, split2_ll_xxx, split3_ll_xxx, where xxx are the operations on 
the split link lists  and so on.  Only the first two will be mentioned in this 
TOI, but the logic starts being fairly simple and should end with the well 
known skip list functionality. The functionality is proposed to first be 
incorporated into a new header file within the linux kernel with just the 1st 
order split list support. Later support can be added to support the 2nd order 
split list and finally a skip list with dynamic / run time support to modify 
the list conversion from a split linked list to then the skip linked list.

                To keep the terminology simple, split1_ll will be known as just 
a split link list. The simplest performance change to an ordered doubly link 
list is to just search half the list. Yes, if it an ordered doubly linked list 
and the value that we search is fairly large, and we have increasing values 
from head to tail, with the doubly link list we can search from the tail. 
However, the distribution of values within the list that we search can be 
sparse and not equally distributed, thus the value MAY still be closer to the 
head. Thus we can a level of additional guarantee by increasing the size of the 
structure and adding additional logic. 

                Thus, for the simplest split linked list is  basically achieved 
by identifying a middle and then adding support to additionally search from the 
middle to the head or middle to the tail. This is of course additional to the 
original searching from the head or from the tail. I term this new ordered 
double link list as a split doubly link list. There is an implied assumption 
that almost100% of the elements to be added are not just increasing/decreasing, 
 else just default and add at the tail or head and walk the list for the very 
few exceptional unordered items.  This will in general allow us to search 1/2 
of the ordered link list and thus a major performance in a guarantee in walking 
at most just 1/2 of the items. This was originally done by this engineer for a 
different UNIX OS many years ago.

        Operations are based on first searching for the location for the new 
element to be added or deleted, thus the approximate maximum list size is 
important to be known before the set of functions are called. In the past the 
list was assumed to be anywhere from 64 to 256 elements in size. If one is 
aware that this list could reach an approximate 4,096 elements in size, then a 
more complicated version of the split list SHOULD be used to keep major 
performance improvements. To support this I add additional checks at the 1/4 
and 3/4 points to the list and allows the searching to only walk a maximum of 
1/4 of the number of elements in the double split list, with 4 sections.

        This split ordered link list could also be implemented with 
synchronization for each of the separate sections of the split list to minimize 
contention for the list.  The determination of how to find the middle in the 
simplest split list is based on the well known skip list algorithm. On every 
two insertions into the skip list, the middle is adjusted if both insertions 
fall on 1st or 2nd half of the list. If they are inserted into the first half, 
then the middle moves one position to the head. The reverse is true for every 
two deletions from the 1st half. The movement of the middle comparison item 
follows simple logic.

        If the number of elements are expected or grown beyond say 16k elements 
, then it makes sense to scale the split list to an actual skip list. The skip 
list and the split list will stay balanced, without additional overhead and 
will not decrease in performance as if some periodic order of the elements are 
inserted into the lists. Trees will need rebalancing, rotations, etc and thus 
are not being included in this suggestion. For the skeptical, add elements into 
a tree with the values from 1 to 100. Is the tree balanced without doing extra 
work? Dynamic upgrades from the split lists into the skip list should be a 
simple matter and can be configured via a /proc tunable if wanted.  A proper 
split list or skip list should show 16x, 32x, 64x or more higher performance 
finding the location to locate/search, add, or delete an element from the 
currently supported link list.

        Of course, all new functions will co-exist and only engineers who are 
comfortable with using the new functions will then possibly see performance 
improvements. Yes, only if these lists are frequently walked (possibly a 
bottleneck)  will the performance improvements be noticeable. 

        This engineer is willing to submit an initial prototype of the above 
logic functions to be incorporated within the Linux Kernel if/when there is a 
request to do so.

        Sincerely, Mitchell Erblich
                        Kernel Engineer--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

TOI: introducing Split Lists hybrids of doubly linked lists within the Linux Kernel

Reply via email to