Thank you, this was very informative!

On Thu, Apr 5, 2018 at 6:41 PM, David Chisnall <gnus...@theravensnest.org> wrote:

> On 5 Apr 2018, at 17:01, Ivan Vučica <i...@vucica.net> wrote:
>>
>> Layman question: does it make sense to optimize for space, too, and have a
>> smaller structure for tiny constant strings?
>
> With the new ABI, we get much better deduplication across compilation units
> for selectors and protocols, which should extend to constant strings.
>
> At run time, on 64-bit platforms, we generate GSTinyString instances, which
> are 64 bits and are hidden inside a pointer. I’m tempted to make the
> compiler generate those directly.
>
>> For 32bit ptrs and longs, this would be 20 bytes without the string itself.
>> I don't think that's a lot, but I thought I'd ask.
>
> 20 bytes isn’t too bad, 36 (for 64-bit platforms) is a bit more. On a
> CHERI-like platform, it grows to 52 bytes, which starts to feel a bit
> excessive.
>
> The absolute minimum structure is an isa pointer immediately followed by the
> character data, with a null terminator. That’s not a great idea, because the
> isa pointer needs to be mutable, which would make the constant string also
> accidentally mutable.
>
> The next smallest would be an isa pointer and a null-terminated string
> pointer, so 8 / 16 / 32 bytes on the respective architectures.
>
> The cost of recomputing the hash is sufficiently expensive that it’s probably
> worth using at least the 28 bits that we provide already for string hashes.
>
> I’ve done some measurements in -base. In the compiled binary, we have a
> total of 84976 bytes of strings, in 3307 strings, so an average of just under
> 26 bytes per string, so 36 bytes of overhead seems quite a lot, and even 20
> is quite noticeable. If we exclude strings of 8 or fewer characters, this
> gives us 81637 bytes in 2586 strings, so an average length of just under 32
> bytes, so 36 bytes is still more than 100% overhead and adds up to about 90KB
> in the final binary.
>
> With the current encoding, each constant string is 24 bytes, so that adds up
> to about 60KB (excluding the string data itself) on 64-bit platforms. That’s
> about 0.5% of the total binary size, so I’m not too worried about making it
> bigger. Even making it 80KB is a lot of overhead per string (roughly 100%),
> but isn’t that much of the total binary size.
>
> David
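
For anyone who wants to check the arithmetic in the quoted message, here is a small Python sketch that re-derives the figures. It is only an illustration: the function names and the pointer-width table are made up for this example and are not part of GNUstep or the runtime. The 16-byte pointer for the CHERI-like case is inferred from the quoted "8 / 16 / 32 bytes" for a two-pointer structure. The totals assume the 2586 strings longer than 8 characters, since that count matches the quoted "about 90KB" and "about 60KB".

```python
# Sketch only: re-derives the size figures quoted in the message above.
# All names here are illustrative assumptions, not GNUstep or runtime APIs.

# Pointer widths in bytes for the three cases discussed. The CHERI-like
# value (16) is inferred from the quoted "8 / 16 / 32 bytes".
POINTER_BYTES = {"32-bit": 4, "64-bit": 8, "CHERI-like": 16}


def two_pointer_size(pointer_bytes):
    """Size of a structure holding only an isa pointer and a data pointer."""
    return 2 * pointer_bytes


def average_length(total_bytes, string_count):
    """Mean string length in bytes."""
    return total_bytes / string_count


def total_overhead(string_count, header_bytes):
    """Bytes spent on per-string headers, excluding the character data."""
    return string_count * header_bytes


if __name__ == "__main__":
    for name, width in POINTER_BYTES.items():
        print(f"{name}: isa + string pointer = {two_pointer_size(width)} bytes")

    # Measurements quoted for -base.
    print(f"all strings: {average_length(84976, 3307):.1f} bytes on average")
    print(f"over 8 chars: {average_length(81637, 2586):.1f} bytes on average")

    # Header cost over the 2586 strings longer than 8 characters.
    for header in (20, 24, 36):
        print(f"{header}-byte header: {total_overhead(2586, header)} bytes")
```

With these inputs, a 36-byte header over 2586 strings comes to 93096 bytes and a 24-byte header to 62064 bytes, which lines up with the rounded 90KB and 60KB figures in the message.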
_______________________________________________
Gnustep-dev mailing list
Gnustep-dev@gnu.org
https://lists.gnu.org/mailman/listinfo/gnustep-dev