Re: In-place extension of arrays only for certain alignment?

2022-08-16 Thread Steven Schveighoffer via Digitalmars-d-learn

On 8/16/22 4:53 PM, Ali Çehreli wrote:

On 8/16/22 12:31, Steven Schveighoffer wrote:
 >
 > No, it's based on 2 factors:
 >
 > 1. Is it a page-size-or-greater block?

I assume the length of the new block.


No, the length of the *existing* block.

Everything in the memory allocator is in terms of pages. A pool is a 
section of pages. The large blocks are a *multiple* of pages, whereas 
the small blocks are pages that are *divided* into same-sized chunks.


When you want to allocate a block, if it's a half-page or less, then it 
goes into the small pool, where you can't glue 2 blocks together.


If it's greater than 1/2 page, then it needs page+ sized blocks. These 
are too big really to set aside, so the allocator will just grab some 
pool that has enough free pages, and return it. These blocks can be 
stitched together depending on what is needed. Once freed, they can also 
be split up into pages.


It's here where the capacity can grow without copying.


 > The reason why your `bad` version fails is because when it must
 > reallocate, it still is only allocating 1 element.

That part I still don't understand. The same block of e.g. 16 bytes 
still has room. Why not use that remaining portion?


It does until it doesn't. Then it needs to reallocate.

Note that if you remove the front element, then you are pointing at a 
*slice* of the array.


Maybe some ASCII art? `A` is "used by the current slice", `x` is 
"allocated, but not referenced by the array". `.` is "part of the block 
but not used (i.e. can grow into this)". `M` is "metadata"


```d
auto arr = new ubyte[10];  //   AA.. ...M
arr = arr[1 .. $]; // xAAA  AA.. ...M
arr ~= 0;  // xAAA  AAA. ...M
arr ~= 0;  // xAAA   ...M
arr = arr[3 .. $]; //    ...M
arr ~= cast(ubyte[])[1, 2, 3]; //    AAAM // full!
arr ~= 1;  //    ...M // reallocated! 
arr.ptr has changed

```

Note in the last instance, it has now moved into a different 16-byte 
block -- the original is still intact, but just not pointed at. It will 
be garbage collected eventually.


But you can see that it only allocates enough to hold what the slice 
already contains + the new elements.




Here is a simpler test with better-than-expected results. Array c is 
what I want to do. In this test it works and I don't even need to call 
assumeSafeAppend() (this must be what you call "end slice"):


import std.stdio;

void main() {
   ubyte[] a;
   a ~= 0;
   a.length = 0;
   a.assumeSafeAppend(); // Needed
   assert(a.capacity == 15);

   // Essentially, the same as above
   ubyte[] b;
   b ~= 0;
   b = b[0..0];
   b.assumeSafeAppend();    // Needed
   assert(b.capacity == 15);

   ubyte[] c;
   c ~= 0;
   c = c[1..$];
   // c.assumeSafeAppend();
   assert(c.capacity == 14);
}


Yes, the "appendability" of an array slice depends on whether it *ends* 
at the end of the "used" space. Otherwise, if it ends earlier, appending 
would stomp on memory possibly referenced elsewhere. If it ends later, 
then you did something wrong in your code, and now the situation is 
corrupted.



 > metadata (e.g. typeinfo for destruction and append capacity).

I think it is 16 bytes total (on 64 bits): void* + size_t and I see this 
when I print .ptr: The change is always 0x10.


Metadata isn't always stored. Plus, it's optimized for the block size. 
For example, any block that is 256 bytes or less only needs a single 
byte to store the "used" space.


The TypeInfo needed for destruction of array elements is only stored if 
needed (e.g. an array of `int` does not need one).


.reserve sounds promising but it will sometimes allocate memory and move 
elements even if I will not really need e.g. more than just one more 
element. (In my case, I may not know how many will be needed.) In other 
words, I really don't know how much to reserve.


What is your focus? Why do you really want this "optimization" of gluing 
together items to happen?




What I seem to need is this function:

void increaseCapacityWithoutAllocating(T)(ref T[] arr) {
   // ...
}


You can make one with:

https://dlang.org/phobos/core_memory.html#.GC.extend

-Steve


Re: In-place extension of arrays only for certain alignment?

2022-08-16 Thread Salih Dincer via Digitalmars-d-learn

On Tuesday, 16 August 2022 at 18:11:54 UTC, Ali Çehreli wrote:

```d
version (good) {
  // Dropping front elements equaling 2048 bytes works.
  // (Likely a GC magic constant.)

  enum magic = 2048;
  enum elementsPerPage = magic / S.sizeof;

  if (arr.length == elementsPerPage) {
arr = arr[elementsPerPage..$];
  }
}
```


First I tried it on an older version (2.0.87).  And with 
version(good) I got interesting results:

```d
 (length == 1, capacity: 2031 -> 6127)
 i = 4079, 8175

 (length == 2, capacity: 1015 -> 3063)
 i = 2039, 4087, 6135, 8183

 (length == 3, null)
 i > 10.000 = ?

 (length == 4, capacity: 507 -> 1531)
 i = 1019, 2043, 3067, 4091, 5115, 6139, 7163, 8187, 9211

 (length == 5, 6, 7..., null)
 i > 10.000 = ?
```

For some reason the capacity changed frequently when data.length 
== 4 .  And I didn't see a result in 3, 5, 6, 7 since I tested 
less than 10k loops. I got more interesting results with the 
extra version:


```d
version(extra) arr.length++; /*

2087
291 elements, capacity: 582 -> 1167, ptr: 7F1306346010
1169 elements, capacity: 2338 -> 4093, ptr: 7F1306346010
2339 elements, capacity: 4678 -> 8189, ptr: 7F1306346010
4387 elements, capacity: 8774 -> 14626, ptr: 7F1306346010
7313 elements, capacity: 14626 -> 23403, ptr: 7F1306346010

Process finished.
*/
```

version(neither) is concurrent with i:

```d
  (length == 1, capacity: i)
  i = 4079, 8175..16367

  (length == 2)
  i = 2039, 4087, 8183..14327

  (length == 3, capacity: i)
  i = 1359, 2725, 5455, 9551..16378

  (length == 4, capacity: i)
  i = 1019, 2043, 4091, 7163..12283

  (length == 5, capacity: i)
  i = 815, 1635, 3273, 5731, 9827..16380
```

SDB@79


Re: In-place extension of arrays only for certain alignment?

2022-08-16 Thread Ali Çehreli via Digitalmars-d-learn

Thank you for the quick response.

On 8/16/22 12:31, Steven Schveighoffer wrote:
> On 8/16/22 2:11 PM, Ali Çehreli wrote:
>> Related to my DConf 2022 lightning talk, I am noticing that D
>> runtime's in-place array extension optimization is available only for
>> array data that are at certain memory alignments.
>>
>
> No, it's based on 2 factors:
>
> 1. Is it a page-size-or-greater block?

I assume the length of the new block.

> 2. Is there a free page after it?

Makes sense.

> The reason why your `bad` version fails is because when it must
> reallocate, it still is only allocating 1 element.

That part I still don't understand. The same block of e.g. 16 bytes 
still has room. Why not use that remaining portion?


Here is a simpler test with better-than-expected results. Array c is 
what I want to do. In this test it works and I don't even need to call 
assumeSafeAppend() (this must be what you call "end slice"):


import std.stdio;

void main() {
  ubyte[] a;
  a ~= 0;
  a.length = 0;
  a.assumeSafeAppend(); // Needed
  assert(a.capacity == 15);

  // Essentially, the same as above
  ubyte[] b;
  b ~= 0;
  b = b[0..0];
  b.assumeSafeAppend();// Needed
  assert(b.capacity == 15);

  ubyte[] c;
  c ~= 0;
  c = c[1..$];
  // c.assumeSafeAppend();
  assert(c.capacity == 14);
}

> metadata (e.g. typeinfo for destruction and append capacity).

I think it is 16 bytes total (on 64 bits): void* + size_t and I see this 
when I print .ptr: The change is always 0x10.


> Note that once you reach page size, the optimization can happen, *even
> if you are only appending to an end slice*. Here's something to try:
> when the capacity is less than the "magic" number of elements, `reserve`
> that number of elements. Then the "drop one element each loop" should
> use the optimization.

.reserve sounds promising but it will sometimes allocate memory and move 
elements even if I will not really need e.g. more than just one more 
element. (In my case, I may not know how many will be needed.) In other 
words, I really don't know how much to reserve.


What I seem to need is this function:

void increaseCapacityWithoutAllocating(T)(ref T[] arr) {
  // ...
}

Only the runtime seems to be able to implement that function. Can I call 
something in the runtime similar to how assumeSafeAppend() calls 
_d_arrayshrinkfit() in object.d?


>
> -Steve

Ali



Re: In-place extension of arrays only for certain alignment?

2022-08-16 Thread Steven Schveighoffer via Digitalmars-d-learn

On 8/16/22 2:11 PM, Ali Çehreli wrote:
Related to my DConf 2022 lightning talk, I am noticing that D runtime's 
in-place array extension optimization is available only for array data 
that are at certain memory alignments.




No, it's based on 2 factors:

1. Is it a page-size-or-greater block?
2. Is there a free page after it?

The smaller blocks are stored in page-sized pools and *cannot* be 
combined together. e.g. you can't have 2 16-byte blocks joined to form a 
32 byte block.


The reason why your `bad` version fails is because when it must 
reallocate, it still is only allocating 1 element.


Now a page is 4096 bytes typically. So why does the 2048 amount work? 
Because in order to store a 2048 byte array, you need 2048 bytes AND the 
metadata (e.g. typeinfo for destruction and append capacity). This 
requires a full page.


Note that once you reach page size, the optimization can happen, *even 
if you are only appending to an end slice*. Here's something to try: 
when the capacity is less than the "magic" number of elements, `reserve` 
that number of elements. Then the "drop one element each loop" should 
use the optimization.


-Steve


In-place extension of arrays only for certain alignment?

2022-08-16 Thread Ali Çehreli via Digitalmars-d-learn
Related to my DConf 2022 lightning talk, I am noticing that D runtime's 
in-place array extension optimization is available only for array data 
that are at certain memory alignments.


I used the following program to test it. In addition to repeatedly 
adding an element to an array,


- 'version = neither' does not drop any element (this is "good" as well)

- 'version = bad' case drops the front element (a moving window of 1)

- 'version = good' case drops elements only when element data will hit a 
certain alignment value.


Is this expected?

import std.stdio;
import std.range;

// PLEASE UNCOMMENT ONLY ONE:
// version = bad;// No in-place extension
// version = good;   // In-place extension happens
// version = neither;// In-place extension happens

mixin assertSingleVersionOf!("bad", "good", "neither");

void main() {
  struct S {
ubyte[4] data;  // Try different reasonable sizes.
// For example, 5 will not work even
// for the "good" case.
  }

  S[] arr;

  foreach (i; 0 .. 100_000) {
const oldCap = arr.capacity;
const oldPtr = arr.ptr;

arr ~= S.init;

if (arr.capacity != oldCap) {
  // The array needed to be extended...
  if (arr.ptr == oldPtr) {
// ... but the pointer did not change
writefln!"In-place extension -- element: %,s  capacity: %,s -> 
%,s  ptr: %s"(

  i, oldCap, arr.capacity, arr.ptr);
  }
}

version (neither) {
  // Do not remove any element; just extend
}

version (bad) {
  // Dropping 1 element inhibits in-place extension
  // (Many values other than 1 will inhibit as well.)
  arr = arr[1..$];

  // Even this does not help
  arr.assumeSafeAppend();
}

version (good) {
  // Dropping front elements equaling 2048 bytes works.
  // (Likely a GC magic constant.)

  enum magic = 2048;
  enum elementsPerPage = magic / S.sizeof;

  if (arr.length == elementsPerPage) {
arr = arr[elementsPerPage..$];
  }
}
  }
}

// A useful template that has nothing to do with this problem.
mixin template assertSingleVersionOf(args...) {
  import std.format : format;

  static assert (1 >= {
size_t count = 0;
static foreach (arg; args) {
  static assert (is (typeof(arg) == string));
  mixin (format!q{
version (%s) {
  ++count;
}
  }(arg));
}
return count;
  }(), format!"Please pick only one or none of %(%s, %)"([args]));
}

Ali


Re: How long will DUB update a package from github release?

2022-08-16 Thread Steven Schveighoffer via Digitalmars-d-learn

On 8/15/22 10:12 PM, Domain wrote:
The project [Lumars](https://code.dlang.org/packages/lumars) has 
released a new version 10 days ago in 
[github](https://github.com/BradleyChatha/lumars). But still unavailable 
in DUB.


It should be done automatically, but this has been an intermittent 
problem for some time. The author can manually request a refresh, and it 
should work.


-Steve


Re: Programs in D are huge

2022-08-16 Thread Steven Schveighoffer via Digitalmars-d-learn

On 8/16/22 4:25 AM, Diego wrote:

Hello everyone,

I'm a Java programmer at work but i'm learning D for pleasure. I'm 
reading _The D Programming Language by Ali Çehreli_.


I noticed that DMD creates very huge executable, for example an empty 
program:


```
empty.d:

void main() {

}
```

after a compilation with these flags `dmd -de -w empty.d` i have an 
executable of 869KiB

It seams huge in my opinion for an empty program

What are the best practices to reduce the size?


By default (but changing in some cases), D links the standard library in 
statically. This means that every D executable has a copy of the library 
(at least the used functions).


If you use C for a comparison, you have to include the C library size as 
well when comparing. On my system, libc is 2MB.


You can reduce size by using dynamic linking, but the drawback is that 
then you have to ensure the library exists on the target system in order 
to run.


If you are shipping a package with lots of D binaries it might make 
sense to do this, but for one-offs, probably not.


-Steve


Re: Programs in D are huge

2022-08-16 Thread Salih Dincer via Digitalmars-d-learn

On Tuesday, 16 August 2022 at 08:25:18 UTC, Diego wrote:
after a compilation with these flags `dmd -de -w empty.d` i 
have an executable of 869KiB

It seams huge in my opinion for an empty program

What are the best practices to reduce the size?


If compile speed and verbose error codes are not important, use 
ldc2 with the -O1 parameter. There is also strip for Linux. For 
example, a simple program:


LCD2+strip: 138.5KB.
DMD: 1.9 MB.

SDB@79


Re: Programs in D are huge

2022-08-16 Thread Dennis via Digitalmars-d-learn

On Tuesday, 16 August 2022 at 08:25:18 UTC, Diego wrote:

It seams huge in my opinion for an empty program

What are the best practices to reduce the size?


The problem is that the druntime, the run time library needed to 
support many D features, is large and linked in its entirety by 
default. The linker could strip unused functions, but even in an 
empty program, a lot is done before `main` that pulls in most of 
it:


- initializing floating point settings, signal handlers, stdout 
and stderr
- parsing --DRT command line options for configuring the Garbage 
Collector

- running module constructors / unit tests

There is a goal to make druntime more 'pay as you go' so these 
things only happen when they're needed, but progress is slow. In 
the mean time, if you can live without a lot of D features that 
require the runtime, you can use `-betterC`:


https://dlang.org/spec/betterc.html

With the LDC2 compiler, you can use `--link-defaultlib-shared`, 
so your program dynamically links with the run time library. This 
doesn't help for a single D program, but multiple D programs can 
reuse a single shared library.


Finally, you could look at customized versions of the runtime, 
such as Light Weight D Runtime: https://github.com/hmmdyl/LWDR


Re: Programs in D are huge

2022-08-16 Thread user1234 via Digitalmars-d-learn

On Tuesday, 16 August 2022 at 08:25:18 UTC, Diego wrote:

Hello everyone,

I'm a Java programmer at work but i'm learning D for pleasure. 
I'm reading _The D Programming Language by Ali Çehreli_.


I noticed that DMD creates very huge executable, for example an 
empty program:


```
empty.d:

void main() {

}
```

after a compilation with these flags `dmd -de -w empty.d` i 
have an executable of 869KiB

It seams huge in my opinion for an empty program

What are the best practices to reduce the size?


This is normal, by default you have plenty of typeinfo (static 
data allowing dynamic introspection), code located in implicit 
import object.d, the runtime (e.g code for the GC).


You can really get rid of that, excepted using -betterC, but you 
see all the stuff that make the default main program big will 
actually be useful in a real program.


It's just that D is not as much "pay as you go" as that, for now.



Programs in D are huge

2022-08-16 Thread Diego via Digitalmars-d-learn

Hello everyone,

I'm a Java programmer at work but i'm learning D for pleasure. 
I'm reading _The D Programming Language by Ali Çehreli_.


I noticed that DMD creates very huge executable, for example an 
empty program:


```
empty.d:

void main() {

}
```

after a compilation with these flags `dmd -de -w empty.d` i have 
an executable of 869KiB

It seams huge in my opinion for an empty program

What are the best practices to reduce the size?