05-Jul-2014 19:08, Wanderer пишет:
On Saturday, 5 July 2014 at 14:20:33 UTC, Dmitry Olshansky wrote:
Provision some extra space in each record. DBs do this all the time,
regardless of layout.

Which means waste of space you complained about just below.

There are trade-offs. The world is not black and white and I don't follow 'one rule everywhere'.

Besides, you understand this is not a solution: one byte more than that
reserve can provide, and the reordering should be done.

Which is what happens at a times, regardless of how DB was organized.

Stuff like FAT
was created exactly because linear storages are terribly inefficient
unless they are read-only.

Frankly FAT is junk. But in any case file maintenance problem is all about balancing wasting some memory on gaps with the need defragment linked chains. There is no silver bullet you seem to imply.

It's funny to see how an _ability_ to avoid indirection can be framed
as a problem. Nothing prevents having an array of pointers to structs
or just using classes if desired.

Pointers to structs? Manually add indirection for sorting? That's
reinventing the wheel. Indeed better (and simpler) using classes
instead. They provide the same functionality and without using unsafe
features.

Pointers are perfectly fine as long as there is no pointer arithmetic.
Also there is no problem just constructing a sorted index (e.g. there is function for that in Phobos) for the array of big structs, and voila - the same reference to the item, but the stuff is neatly packed in a contiguous array.

On the contrary sorting an array of pairs of integers in say Java is
shamefully inefficient. No wonder packed objects (=value types)  are
coming to Java, to address these sorts of need.

For pair of integers, you can use long and sort an array of longs.

Which is hopelessly admitting defeat. A pair may have non-trivial comparator. And a minor step beyond that such as a pair of double and integer and it fails completely.

But
if your structure is any more complex (or has nontrivial comparison
algorithm), one should really consider using classes first.

And there Java-style indirection-happy classes "shine". For instance modeling any complex stuff with classes alone would lead to things like:
House--reference->Bedroom---reference-->Bed--reference-->Pillow

Which is incredible waste of memory and speed on something so simple.

Ehm. One can always redesign struct layout to push stuff out through
an indirection. And it's not like structs grow organically on their
own ;)

You mean structs with pointers? That violates the whole principle of
structs IMHO.

What? But again it seems to be the general direction of you post - puristic one size fits all. The idea that there is true holy way, is very much province of OOP language proponents.

Ideally, when you assign/copy a struct, you assign/copy a
*value*. Usage of pointers would add undue aliasing to the copy process,
leading to bugs.

Copy constructors or in D parlance postblits. There is also CoW and whatnot. In fact swap doesn't create a copy in D, it just does a bitwise swap in-place.

In the same vane design of ref-to-instance is not scalable with
millions of items scattered across the memory causing awful cache miss
rate (as the indirection got to be crossed to compare things), not to
mention the memory wasted per each object.

It depends. With classes, you have to read and write only small portions
of memory - read only fields to compare

Strictly more then with structs to be precise as long as it's just compare we talk about. A reference + fields vs just fields. Also the cost of extra indirection is very much non-negligible.

write only references to swap.

That much is true.

Even with cache misses, this is still very small and efficient
operations.

Disagree - the moment a cache line is loaded it doesn't matter how much you read in this line. Even a tiny read missing a cache is paid in full.

With structs, you have to read and write much bigger regions
of memory - to swap two items, you have to read both of them fully, then
> write both of them fully into new locations -

I bet compare will load the whole thing into cache anyway, plus contiguous structs may fit into a single line. Sequential storage has advantages.

which would make caching
totally useless, whether structs are located sequentially or not.

How's that?

It's
like copying a large file - the cache gets exhausted instantly, and
doesn't speed things up at all.

And the only thing worse then copying a large file is copying a large file fragmented in tiny splinters.

--
Dmitry Olshansky

Reply via email to