Re: [swift-dev] Reducing the size of Swift binaries by shortening symbols

Nadav Rotem via swift-dev Sun, 20 Dec 2015 22:08:37 -0800

Hi Steve, 

> On Dec 20, 2015, at 7:35 AM, Stephen Canon <sca...@apple.com> wrote:
> 
> Nadav, can you clarify what we’re really trying to accomplish here?  “Smaller 
> binaries” isn’t too important of a goal in and of itself.
> 
> Are we trying to:
> – reduce storage used on disk
> – reduce load time
> – reduce loaded memory footprint
> – make emitting swift binaries more efficient
> – something else?
> 
> Yes, I know, “all of the above”, but understanding something about what’s 
> most important would help evaluate the proposal.


> 
> It’s also worth keeping in mind that iOS and OS X have been aggressively 
> adopting pervasive system-wide compression both on disk and in memory.  This 
> trend will continue, and it makes it quite a bit less important for 
> individual components to explicitly adopt compression techniques themselves, 
> except in cases where there’s a lot of special structure that those 
> components can leverage to get better compression than a general-purpose 
> lossless compressor can manage (images and sound are the two obvious examples 
> of this, but also cases like huge arrays of floating-point data where the 
> low-order bits don’t matter, etc).  Linux hasn’t been as aggressive about 
> doing this yet, but pervasive system-level compression is The Future.

Swift is a systems programming language. We’d like to be able to build the 
whole operating system in Swift. This means that one day your phone will have 
hundreds of shared libraries (written in Swift) loaded all at the same time. 
Thousands of shared libraries will be saved on disk, and updated every time you 
upgrade the OS or some apps. The string table (linkedit section) is loaded into 
memory (shared, copy-on-write). In a world where every single process uses 
multiple Swift libraries, reducing the size of this section is very beneficial. 

Disk and network compression can help. I believe that we have domain-specific 
information that will allow us to do a better job of compressing this section. 

Thanks,
-Nadav

> 
> – Steve
> 
>> On Dec 20, 2015, at 5:17 AM, Dmitri Gribenko <griboz...@gmail.com> wrote:
>> 
>> + Stephen Canon, because he probably has good ideas in this domain.
>> 
>> On Fri, Dec 18, 2015 at 3:42 PM, Nadav Rotem via swift-dev 
>> <swift-dev@swift.org> wrote:
>> 
>> What’s next?
>> 
>> The small experiment I described above showed that compressing the names in 
>> the string table has a huge potential for reducing the size of swift 
>> binaries. I’d like for us (swift-developers) to talk about the implications 
>> of this change and start working on the two tasks of tightening our existing 
>> mangling format and on implementing a new compression layer on top. 
>> 
>> Hi Nadav,
>> 
>> This is a great start that shows that there is a potential for improvement 
>> in our mangled names!
>> 
>> To make this effort more visible, I would suggest creating a bug on 
>> https://bugs.swift.org/.
>> 
>> I think we should survey existing solutions that industry has developed for 
>> compressing short messages.  What comes to mind:
>> 
>> - header compression in HTTP2:
>> https://http2.github.io/http2-spec/compression.html
>> 
>> - PPM algorithms are among the best-performing compression algorithms for 
>> text.
>> 
>> - Arithmetic coding is also a natural starting point for experimentation.
>> 
>> Since the input mangled name also comes in a restricted character set, we 
>> could also remove useless bits first, and try an existing compression 
>> algorithm on the resulting binary string.
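
A minimal sketch of the "remove useless bits first" idea, assuming the mangled
names draw on a 64-symbol alphabet (letters, digits, `_` and `$`). That alphabet
is an assumption for illustration and the real character set may differ. The
hypothetical helpers `pack6` and `unpack6` store each character in 6 bits
instead of 8; the character count has to be kept separately because the last
byte is zero-padded.

```python
import string

# Assumed 64-symbol alphabet; the real mangling character set may differ.
ALPHABET = string.ascii_letters + string.digits + "_$"
CODE = {ch: i for i, ch in enumerate(ALPHABET)}


def pack6(name: str) -> bytes:
    """Pack each character into 6 bits; the final byte is zero-padded."""
    bits = 0
    nbits = 0
    out = bytearray()
    for ch in name:
        bits = (bits << 6) | CODE[ch]
        nbits += 6
        while nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
    if nbits:
        # Flush the remaining low bits into the top of one last byte.
        out.append((bits << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack6(data: bytes, length: int) -> str:
    """Inverse of pack6; `length` is the original character count."""
    bits = 0
    nbits = 0
    chars = []
    for byte in data:
        bits = (bits << 8) | byte
        nbits += 8
        while nbits >= 6 and len(chars) < length:
            nbits -= 6
            chars.append(ALPHABET[(bits >> nbits) & 0x3F])
    return "".join(chars)
```

On its own this saves a quarter of the bytes; the packed output could then be
fed to a general-purpose compressor to see whether the two steps combine well.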
>> 
>> We should also build a scheme that uses the shorter of the compressed and 
>> non-compressed names.
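
One way to sketch the "use the shorter of the two" scheme is a one-byte marker
in front of each name. Everything here is an assumption for illustration: the
marker values `R` and `Z` are made up, and `zlib` is only a stand-in for
whatever compressor is eventually chosen. The sketch also ignores the
requirement that symbol names stay within a legal character set.

```python
import zlib

# Hypothetical one-byte markers: raw name vs. compressed name.
RAW, PACKED = b"R", b"Z"


def encode(name: bytes) -> bytes:
    """Emit whichever form is shorter, prefixed with a marker byte."""
    compressed = zlib.compress(name, 9)  # stand-in compressor
    if len(compressed) < len(name):
        return PACKED + compressed
    return RAW + name


def decode(blob: bytes) -> bytes:
    """Recover the original name from either form."""
    tag, body = blob[:1], blob[1:]
    if tag == PACKED:
        return zlib.decompress(body)
    if tag == RAW:
        return body
    raise ValueError("unknown marker byte")
```

With this shape the worst case costs one extra byte per name, so short names
that do not compress are never made much larger.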
>> 
>> For running experiments it would be useful to publish a sample corpus of 
>> mangled names that we will be using for comparing the algorithms and 
>> approaches.
>> 
>> I also have a concern about making mangled names completely unreadable.  
>> Today, I can frequently at least get a gist of what the referenced entity is 
>> without a demangler.  What we could do is make the name consist of a 
>> human-readable prefix that encodes just the base name and a compressed 
>> suffix that encodes the rest of the information.
>> 
>> _T<length><class name><length><method name><compressed suffix>
>> 
>> We would be able to use references to the class and the method name from the 
>> compressed part, so that character data isn't completely wasted.
>> 
>> This scheme that injects human-readable parts will also allow the debugger 
>> to quickly match the names without the need to decompress them.
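
A rough sketch of how a tool could read the base names out of the proposed
layout without touching the compressed part. It assumes decimal lengths and
identifiers that never start with a digit; the function name
`split_readable_prefix` and the sample symbol in the usage note are made up for
illustration.

```python
DECIMAL = "0123456789"


def split_readable_prefix(symbol: str):
    """Split _T<length><class name><length><method name><compressed suffix>
    into (class_name, method_name, suffix) without decoding the suffix."""
    if not symbol.startswith("_T"):
        raise ValueError("not in the assumed format")
    pos = 2
    names = []
    for _ in range(2):  # class name, then method name
        start = pos
        while pos < len(symbol) and symbol[pos] in DECIMAL:
            pos += 1
        if pos == start:
            raise ValueError("missing length")
        length = int(symbol[start:pos])
        if pos + length > len(symbol):
            raise ValueError("length runs past the end of the symbol")
        names.append(symbol[pos:pos + length])
        pos += length
    return names[0], names[1], symbol[pos:]
```

For example, `split_readable_prefix("_T3Foo3barXYZ")` yields the class name
`Foo`, the method name `bar`, and leaves `XYZ` as the opaque suffix.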
>> 
>> We should also investigate improving the existing mangling scheme to produce 
>> shorter results.  For example, one idea that comes to mind is using base-60 
>> instead of base-10 for single-digit numbers that specify identifier 
>> length, falling back to base-10 for longer numbers to avoid ambiguity.  This 
>> would save one character for every identifier longer than 9 characters and 
>> shorter than 60, which is actually the common case.
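
The message does not spell out how the base-10 fallback would be told apart
from a single base-60 digit, so the following is only one possible reading.
It assumes a digit set of 0-9, a-z, A-X, reserves the digit `0` as an escape
that introduces a decimal length, and assumes identifiers are non-empty and
never start with a decimal digit. All names and choices here are hypothetical.

```python
import string

# Hypothetical digit set: 0-9, a-z, A-X (10 + 26 + 24 = 60 symbols).
DIGITS = string.digits + string.ascii_lowercase + string.ascii_uppercase[:24]


def encode_ident(ident: str) -> str:
    """Length-prefix an identifier: one base-60 digit for lengths 1..59,
    otherwise the escape '0' followed by the length in base 10."""
    n = len(ident)
    if n == 0:
        raise ValueError("empty identifiers are not representable here")
    if n < 60:
        return DIGITS[n] + ident
    return "0" + str(n) + ident


def decode_ident(text: str, pos: int = 0):
    """Return (identifier, position just past it)."""
    first = text[pos]
    pos += 1
    if first != "0":
        n = DIGITS.index(first)  # single base-60 digit
    else:
        start = pos  # escaped form: decimal digits follow
        while pos < len(text) and text[pos] in string.digits:
            pos += 1
        n = int(text[start:pos])
    return text[pos:pos + n], pos + n
```

Under these assumptions a 12-character identifier gets the one-character prefix
`c` instead of `12`, which is the saving described above, while identifiers of
60 characters or more pay one extra character for the escape.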
>> 
>> Dmitri
>> 
>> -- 
>> main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
>> (j){printf("%d\n",i);}}} /*Dmitri Gribenko <griboz...@gmail.com 
>> <mailto:griboz...@gmail.com>>*/
>

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev
