>
> Any code that keeps data aligned to memory page and disk page sizes is 
> automatically significantly faster, because misalignment automatically 
> doubles the amount of memory that has to be accessed to satisfy a request. 
> This is why Binary Heaps are way slower than B-heaps.


My opinion is that without measurement you will miss the knowledge hidden 
between your assumptions. The testing package's benchmark support has 
non-obvious features for producing repeatable data: 
https://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go

I see you have logic tests 
(https://github.com/calibrae-project/bast/blob/master/pkg/bast/bast_test.go), 
and benchmarks are another step.
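As a sketch of what such a benchmark might look like (the map here is only a stand-in for the tree type under test, since the bast API isn't settled; `testing.Benchmark` lets the sketch run outside `go test`):

```go
package main

import (
	"fmt"
	"math/rand"
	"testing"
)

// benchmarkInsert measures insertion cost with repeatable input data.
// Swap the map for the real tree type once its API is settled.
func benchmarkInsert(b *testing.B) {
	rng := rand.New(rand.NewSource(1)) // fixed seed => repeatable data
	keys := make([]uint32, b.N)
	for i := range keys {
		keys[i] = rng.Uint32()
	}
	b.ResetTimer() // exclude setup from the timed region
	tree := make(map[uint32]struct{}, b.N)
	for i := 0; i < b.N; i++ {
		tree[keys[i]] = struct{}{}
	}
}

func main() {
	res := testing.Benchmark(benchmarkInsert)
	fmt.Println(res.N > 0)
}
```

In a real project this would be `func BenchmarkInsert(b *testing.B)` in a `_test.go` file, run with `go test -bench=.`.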

Matt

On Wednesday, April 25, 2018 at 9:37:48 AM UTC-5, Louki Sumirniy wrote:
>
> Elements are always inserted according to greater-than and less-than 
> comparisons; equal keys can't happen. The first value inserted becomes the 
> root, but if the tree gets heavy on one side, you rotate the root to 
> rebalance. From any given node, you know which side the element you are 
> looking for must be on, according to its greater-than/less-than test. You 
> can know this is always true because that's how the inserts populate the 
> tree, and the 'completeness' constraint enforces a 'bunching up' of the 
> data. So long as there is space between two nodes, they can sit as high up 
> in the tree as the distance between them allows; each row downwards doubles 
> the number of available positions. The 'completeness' constraint is 
> extended by not allowing orphans: every child has a parent, all the way up 
> to the root. Conversely, you know you will find every element in the tree 
> by walking downwards and going left or right depending on greater-than and 
> less-than; this is the comparability constraint. You should be able to see 
> that a number of architectural features in this data storage system imply 
> heuristics for optimisation.
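The downward walk described above is the standard binary-search-tree descent; a minimal sketch (the pointer-based node layout is an assumption for illustration, not the package's actual array-backed representation):

```go
package main

import "fmt"

// node is a hypothetical BST node; the real package may use an
// array-backed layout instead of pointers.
type node struct {
	key         uint32
	left, right *node
}

// search walks down from the root, going left or right on each
// comparison, exactly as the comparability constraint describes.
func search(n *node, key uint32) bool {
	for n != nil {
		switch {
		case key == n.key:
			return true
		case key < n.key:
			n = n.left
		default:
			n = n.right
		}
	}
	return false
}

func main() {
	root := &node{
		key:   8,
		left:  &node{key: 4},
		right: &node{key: 12},
	}
	fmt.Println(search(root, 12), search(root, 5)) // true false
}
```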
>
> Just to update: reading about B-heaps and how they structure storage, I 
> realised that I need to think of the tree in terms of 4kb blocks (well, on 
> most platforms), as this is the amount of memory that is accessed in one 
> operation. However many rows of a given payload size fit into 4kb is the 
> size of each subtree. So for 32-bit values, I have one page at the top, and 
> then 1024 pages that are the left and right child pairs of the 512 elements 
> at the bottom of the first page, and, of course, this repeats again. This 
> changes how I have to implement the walk functions, as when a walk 
> overflows past 512, I know it has moved to the second page-row.
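Working through the arithmetic for 32-bit payloads in a 4 KiB page (a sketch of the numbers quoted above):

```go
package main

import "fmt"

// pageLayout computes, for a given page and payload size, the slots per
// page, how many complete tree rows fit in one page, the size of the
// deepest in-page row, and how many child subtree pages hang off it.
func pageLayout(pageSize, payload int) (slots, rows, bottom, children int) {
	slots = pageSize / payload
	// A complete binary tree of r rows holds 2^r - 1 nodes.
	for (1<<(rows+1))-1 <= slots {
		rows++
	}
	bottom = 1 << (rows - 1) // nodes in the last in-page row
	children = bottom * 2    // each bottom node roots two child subtrees
	return
}

func main() {
	// 4 KiB pages, 4-byte (32-bit) payloads, as in the text above.
	fmt.Println(pageLayout(4096, 4)) // 1024 10 512 1024
}
```

So one page holds a 10-row subtree of 1023 nodes, whose 512 bottom elements fan out to 1024 child pages, matching the figures in the mail.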
>
> So... I have to kinda go back to the drawing board a little on the 
> structuring of the algorithm, since it now works on 4kb page sizes. I may 
> have to switch up the design so that the pages are the primary indices of 
> the first dimension, and then the pages themselves, the subtrees, can be 
> allocated as fixed-length arrays, which also simplifies how Go will deal 
> with them. Further, instead of allocating whole rows at a time, only those 
> rows that have been breached require allocation of a new page for a 
> subtree. In this respect the question about search cost becomes a little 
> different: since we use 4k pages, and we know walking within them goes on 
> inside SRAM cache at around 14 GB/s, performance optimisation by tree 
> balancing has a somewhat different character than with the Binary Heap's 
> linear mapping.
>
> Why do I seem so confident about the potential performance of this? Look 
> up B-heaps and the Varnish reverse proxy. Varnish is designed at the low 
> level with the same basic concept in mind, using B-heaps: structure the 
> heap segments according to memory pages. If you are familiar with 
> filesystems, you may know about block alignment. It's not quite so 
> important for spinning disks, but for flash and S/DRAM, memory is only ever 
> handled at the hardware level in these block sizes. Most disks use 512-byte 
> blocks at the FS level; some have 1k, 2k, 4k or 8k blocks. Most 
> current-generation filesystems use 4k blocks as the default size, because 
> that is a 1:1 mapping with memory pages.
>
> The difference with what I am designing is that it does not order 
> vertically, it orders horizontally. It's basically fractal, each subtree 
> has the same number of potential subtrees below it.
>
> Any code that keeps data aligned to memory page and disk page sizes is 
> automatically significantly faster, because misalignment automatically 
> doubles the amount of memory that has to be accessed to satisfy a request. 
> This is why Binary Heaps are way slower than B-heaps.
>
> Anyway, because many of my assumptions based on my conceptual model have 
> changed, and big thanks to you guys, I have to rework the design to account 
> for this. Thanks :p
>
> On Wednesday, 25 April 2018 15:11:15 UTC+3, rog wrote:
>>
>> On 25 April 2018 at 10:24, Louki Sumirniy 
>> <louki.sumir...@gmail.com> wrote: 
>> > As you look deeper into the link discussing the B-heap, you can see 
>> that I am actually following pretty much exactly the same general 
>> structure in my algorithm. The structure will align neatly with page 
>> boundaries, which means fewer page faults and reduced pressure on the 
>> virtual memory mapping system, which in turn means the tree can grow a lot 
>> larger while cutting down delays on retrieving data. I intuitively 
>> understood this in my own visual model, and this picture in particular is 
>> exactly how I visualised the data storage map: 
>>
>> To expand slightly on why this data structure isn't good for 
>> searching, imagine you're searching for the number 40 in either of 
>> those pictured data structures. When you're looking at the root, you 
>> have to make a decision whether to move to the left or the right 
>> child, but you can only see 2 and 3. How do you decide that you need 
>> to go left? Now imagine you're searching for 31 - how do you decide 
>> that the correct direction is right? 
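rog's point can be made concrete: heap order only relates a parent to its children, so a lookup cannot prune left versus right and degenerates to a scan. A sketch using the standard implicit array layout for a binary min-heap (children of index i at 2i+1 and 2i+2):

```go
package main

import "fmt"

// heapContains must consider both children at every node, because heap
// order says nothing about left versus right: worst case O(n). The only
// pruning available is skipping subtrees whose root already exceeds key.
func heapContains(h []int, i, key int) bool {
	if i >= len(h) || h[i] > key {
		return false
	}
	if h[i] == key {
		return true
	}
	return heapContains(h, 2*i+1, key) || heapContains(h, 2*i+2, key)
}

func main() {
	// A valid min-heap: every parent <= its children, siblings unordered.
	h := []int{2, 3, 31, 10, 40, 35, 33}
	fmt.Println(heapContains(h, 0, 40), heapContains(h, 0, 41)) // true false
}
```

Contrast this with a search tree, where every comparison discards half the remaining nodes.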
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
