On Tuesday, 24 May 2016 at 20:25:18 UTC, Era Scarecrow wrote:
64 bytes? Hmm I had the impression it was going to be far
larger...
Well it's been a while since I last touched this topic (or code)
so I'll post it, but there's 3 formulas written.
0) compress.c - control code, we are comparin
Am Tue, 24 May 2016 19:16:07 +
schrieb Anon :
> Upping it to sixteen levels:
>
> current: Generate a 3_688_679 byte symbol
> lz4 -9: Generate a 3_688_679 byte symbol, then compress it to
> 15535 bytes
> lzma -9: Generate a 3_688_679 byte symbol, then compress it to
> 840 bytes
> BRNT: Gener
On 5/24/2016 2:34 PM, Stefan Koch wrote:
But would one ever have to be able to reconstruct the mangle?
Symbolic debugging, and dealing with various linker errors.
Templates cannot be exported, nor can instantiations.
Or can they?
Of course they can!
On Tuesday, 24 May 2016 at 21:26:58 UTC, Walter Bright wrote:
On 5/24/2016 2:06 PM, Stefan Koch wrote:
A serious question: do these templates even need a mangle?
Is anyone ever going to call them from a different module?
I don't think you could do that, because those are Voldemort
types.
Tho
On 5/24/2016 2:06 PM, Stefan Koch wrote:
On Tuesday, 24 May 2016 at 20:39:41 UTC, Walter Bright wrote:
On 5/24/2016 1:27 PM, Guillaume Boucher wrote:
There's no reason not to use the compression in the deco name.
Just make sure
the references are relative and you're set.
Not totally, since y
On Tuesday, 24 May 2016 at 20:39:41 UTC, Walter Bright wrote:
On 5/24/2016 1:27 PM, Guillaume Boucher wrote:
There's no reason not to use the compression in the deco name.
Just make sure
the references are relative and you're set.
Not totally, since you'd like to refer to types that are commo
On 5/24/2016 1:27 PM, Guillaume Boucher wrote:
There's no reason not to use the compression in the deco name. Just make sure
the references are relative and you're set.
Not totally, since you'd like to refer to types that are common across deco
names.
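For illustration, component-level back-referencing of the kind discussed here can be sketched in a few lines. This is a toy scheme with made-up syntax (`mangle`, the `Q<distance>` token), not dmd's actual mangler: remember the output offset of each component and emit a relative back-reference when one repeats. Relative references keep a symbol self-describing, though, as noted above, they cannot reach types shared across different deco names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy component-level back-referencing (illustrative only, not dmd's
 * scheme): each length-prefixed component is emitted once; a repeat is
 * replaced by "Q<distance>", the distance back from the current output
 * position to the component's first occurrence. Assumes ncomp <= 64. */
static size_t mangle(const char *const *comp, size_t ncomp,
                     char *out, size_t cap)
{
    size_t len = 0;
    size_t seen_at[64];          /* output offset of each first occurrence */
    for (size_t i = 0; i < ncomp; i++) {
        size_t j;
        for (j = 0; j < i; j++)  /* smallest j is the first occurrence */
            if (strcmp(comp[j], comp[i]) == 0)
                break;
        if (j < i) {             /* repeat: emit a relative back-reference */
            len += snprintf(out + len, cap - len, "Q%zu", len - seen_at[j]);
        } else {                 /* new component: length-prefixed text */
            seen_at[i] = len;
            len += snprintf(out + len, cap - len, "%zu%s",
                            strlen(comp[i]), comp[i]);
        }
    }
    return len;
}
```

Repeats then cost a couple of bytes regardless of how large the repeated component is, which is the property the exponentially nested template examples in this thread need.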
On Tuesday, 24 May 2016 at 20:16:32 UTC, Walter Bright wrote:
On 5/24/2016 9:22 AM, Timon Gehr wrote:
Yes, it does. The compiler does not use exponential space to
store the AST.
BTW, all types in dmd have a 'deco' string which is the AST
linearized as a string. This string is also used to bui
On Tuesday, 24 May 2016 at 20:07:02 UTC, Walter Bright wrote:
On 5/24/2016 4:55 AM, Era Scarecrow wrote:
So Walter, big question: How big before the compressor kicks
in? What's the minimum size of the input string the compressor
should expect?
It's currently set at 64 in the PR. But that numb
On Monday, 23 May 2016 at 20:34:19 UTC, Walter Bright wrote:
On 5/23/2016 12:32 PM, Timon Gehr wrote:
Isn't it enough to create
references to previously embedded mangled symbols (i-th symbol
already mangled)
while generating the mangled string?
That's been covered upthread, and VC++/g++ do t
On 5/24/2016 9:22 AM, Timon Gehr wrote:
Yes, it does. The compiler does not use exponential space to store the AST.
BTW, all types in dmd have a 'deco' string which is the AST linearized as a
string. This string is also used to build the mangled names.
All the decos are put into a hashtable
On 5/24/2016 4:55 AM, Era Scarecrow wrote:
So Walter, big question: How big before the compressor kicks in? What's the
minimum size of the input string the compressor should expect?
It's currently set at 64 in the PR. But that number is just picked out of the
air.
I assume speed is more i
On 5/24/2016 12:16 PM, Anon wrote:
As if winning the compactness war while still being a symbol linkers won't have
problems with wasn't enough, this approach also avoids generating giant symbols
in the first place. Compression and/or hashing cannot do that. If D really wants
to, it can compress s
On Tuesday, 24 May 2016 at 16:22:56 UTC, Timon Gehr wrote:
It's _exponential_ growth. We don't even want to spend the time
and memory required to generate the strings.
The reason we have this discussion is that the worst case isn't
rare enough to make this argument. Why compress in the first
On Tuesday, 24 May 2016 at 18:30:36 UTC, Stefan Koch wrote:
On Tuesday, 24 May 2016 at 16:22:56 UTC, Timon Gehr wrote:
On 24.05.2016 01:02, Walter Bright wrote:
[...]
Speed. The title of this thread is "Need a faster compressor".
[...]
No. Just increase the recursion depth
On Tuesday, 24 May 2016 at 16:22:56 UTC, Timon Gehr wrote:
On 24.05.2016 01:02, Walter Bright wrote:
[...]
Speed. The title of this thread is "Need a faster compressor".
[...]
No. Just increase the recursion depth by a small number of
levels to completely kill the speedups
How is that essentially different from running a compression pass?
Speed. The title of this thread is "Need a faster compressor".
The
only real problem with the compressor was the compression speed, and the
LZ4 test shows this is solvable.
...
No. Just increase the recursion depth by a small number of levels to
completely ki
On Monday, 23 May 2016 at 20:36:04 UTC, Walter Bright wrote:
On 5/23/2016 9:00 AM, Stefan Koch wrote:
The 64k limit does not apply to the input string.
The starting line of the compression function posted here tests
for it and aborts if larger.
So Walter, big question: How big before the c
On 5/23/2016 4:49 PM, Stefan Koch wrote:
Just a heads up on the LZ4.
I have spent roughly 3 hours optimizing my decompressor.
And while I had stunning success (a speed-up of about 400%),
I am still about 600x slower than the C variant.
It is still a mystery to me why that is :)
Since the generate
Just a heads up on the LZ4.
I have spent roughly 3 hours optimizing my decompressor.
And while I had stunning success (a speed-up of about 400%),
I am still about 600x slower than the C variant.
It is still a mystery to me why that is :)
Since the generated code is both smaller and works almost witho
On 5/23/2016 2:17 PM, Timon Gehr wrote:
Then don't do that. I.e. re-mangle recursively from scratch for each mangled
name and allow sharing parts between unrelated components within that mangled
name.
How is that essentially different from running a compression pass? The only real
problem with
On 23.05.2016 22:34, Walter Bright wrote:
On 5/23/2016 12:32 PM, Timon Gehr wrote:
Instead, compression should be performed while generating the mangled
string.
The trouble with that is the mangled string is formed from component
pieces.
Then don't do that. I.e. re-mangle recursively from sc
On 5/23/2016 9:00 AM, Stefan Koch wrote:
On Monday, 23 May 2016 at 15:33:45 UTC, Walter Bright wrote:
Also, the LZ4 compressor posted here has a 64K string limit, which won't work
for D because there are reported 8Mb identifier strings.
This is only partially true.
The 64k limit does not appl
On 5/23/2016 12:32 PM, Timon Gehr wrote:
Instead, compression should be performed while generating the mangled string.
The trouble with that is the mangled string is formed from component pieces.
Those component pieces may have common substrings with each other, which won't
be detected until
On 22.05.2016 00:07, Walter Bright wrote:
On 5/21/2016 2:37 PM, Timon Gehr wrote:
Why is longest_match Ω(nm) instead of O(n+m) (e.g. KMP)?
I don't understand the terms you use, but as to the "why" it is based on
what I knew about LZ77 compression. I don't pretend to be an expert on
compressio
Am Mon, 23 May 2016 12:04:48 +
schrieb Stefan Koch :
> The method for achieving perfect compression is outlined here:
> https://github.com/Cyan4973/lz4/issues/183
Nice, if it can keep the compression speed up.
--
Marco
On Monday, 23 May 2016 at 16:00:20 UTC, Stefan Koch wrote:
On Monday, 23 May 2016 at 15:33:45 UTC, Walter Bright wrote:
Also, the LZ4 compressor posted here has a 64K string limit,
which won't work for D because there are reported 8Mb
identifier strings.
This is only partially true.
The 64k
On Monday, 23 May 2016 at 15:33:45 UTC, Walter Bright wrote:
Also, the LZ4 compressor posted here has a 64K string limit,
which won't work for D because there are reported 8Mb
identifier strings.
This is only partially true.
The 64k limit does not apply to the input string. It does only
app
On 5/23/2016 5:04 AM, Stefan Koch wrote:
Am Sun, 22 May 2016 23:42:33 -0700
schrieb Walter Bright :
The file format: http://cyan4973.github.io/lz4/lz4_Block_format.html
It doesn't look too difficult. If we implement our own LZ4 compressor based on
that, from scratch, we can Boost-license it.
On Monday, 23 May 2016 at 14:47:31 UTC, Wyatt wrote:
Maybe consider lz4 instead?
Disregard that: I see it's come up already.
On Saturday, 21 May 2016 at 21:27:37 UTC, Era Scarecrow wrote:
I assume this is related to compressing symbols thread? I
mentioned possibly considering the LZO library.
Maybe consider lz4 instead? Tends to be a bit faster, and it's
BSD instead of GPL. https://cyan4973.github.io/lz4/
-Wyat
Am Sun, 22 May 2016 23:42:33 -0700
schrieb Walter Bright :
The file format:
http://cyan4973.github.io/lz4/lz4_Block_format.html
It doesn't look too difficult. If we implement our own LZ4
compressor based on that, from scratch, we can Boost-license it.
That is right. It's pretty simple.
On
On Monday, 23 May 2016 at 01:46:40 UTC, Era Scarecrow wrote:
On Sunday, 22 May 2016 at 19:44:08 UTC, Era Scarecrow wrote:
...
Well here's the rundown of some numbers. min_compress uses a
tiny window, big_compress was my original algorithm but
modified to use 16k total for a window. reduced_
Am Sun, 22 May 2016 23:42:33 -0700
schrieb Walter Bright :
> The file format: http://cyan4973.github.io/lz4/lz4_Block_format.html
>
> It doesn't look too difficult. If we implement our own LZ4 compressor based
> on
> that, from scratch, we can Boost-license it.
Ok, any volunteers? I'm not pers
The file format: http://cyan4973.github.io/lz4/lz4_Block_format.html
It doesn't look too difficult. If we implement our own LZ4 compressor based on
that, from scratch, we can Boost-license it.
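For reference, a from-scratch decoder for that block format really is small. A minimal sketch (assuming a well-formed block and a caller-supplied output buffer; error handling reduced to returning -1; this is not the eventual dmd implementation):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal LZ4 block decoder per the linked format description.  Each
 * sequence is: token | literal-length extension | literals |
 * 2-byte little-endian offset | match-length extension.
 * Returns bytes written to dst, or -1 on malformed input. */
ptrdiff_t lz4_block_decode(const unsigned char *src, size_t srclen,
                           unsigned char *dst, size_t dstcap)
{
    size_t s = 0, d = 0;
    while (s < srclen) {
        unsigned token = src[s++];
        /* literal length: high nibble; 15 means "read extension bytes" */
        size_t litlen = token >> 4;
        if (litlen == 15) {
            unsigned char b;
            do { if (s >= srclen) return -1; b = src[s++]; litlen += b; }
            while (b == 255);
        }
        if (s + litlen > srclen || d + litlen > dstcap) return -1;
        memcpy(dst + d, src + s, litlen);
        s += litlen; d += litlen;
        if (s == srclen) break;              /* last sequence: literals only */
        if (s + 2 > srclen) return -1;
        size_t offset = src[s] | (src[s + 1] << 8);   /* little-endian */
        s += 2;
        if (offset == 0 || offset > d) return -1;
        /* match length: low nibble + 4 (minimum match); 15 -> extension */
        size_t matchlen = token & 15;
        if (matchlen == 15) {
            unsigned char b;
            do { if (s >= srclen) return -1; b = src[s++]; matchlen += b; }
            while (b == 255);
        }
        matchlen += 4;
        if (d + matchlen > dstcap) return -1;
        /* byte-by-byte copy: overlapping matches (offset < matchlen)
         * must replicate freshly written output, so memcpy would be wrong */
        for (size_t i = 0; i < matchlen; i++, d++)
            dst[d] = dst[d - offset];
    }
    return (ptrdiff_t)d;
}
```

Note the 64K figure in this thread comes from the 2-byte match offset: it limits how far back a match may reach, not how long the input may be.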
On 5/22/2016 10:52 PM, Marco Leise wrote:
Can you elaborate on how copying the source overrules your
above license concerns?
It does not. It just avoids having to link in a library that may or may not
exist on every machine we want to create demanglers on.
I would create a directory "lz4" f
Am Sun, 22 May 2016 20:22:06 +0200
schrieb Rainer Schuetze :
> The 3 consecutive ?s are also in the object file, so they are actually
> part of the mangled symbol. VC++ uses only ASCII characters for mangling
> (which D should do, too, IMO).
In general I agree. The extended characters may be sw
Am Sun, 22 May 2016 13:02:37 -0700
schrieb Walter Bright :
> On 5/22/2016 9:06 AM, Marco Leise wrote:
> > Are there any objections
> > against the benchmarking method or the license?
>
> BSD may cause problems.
>
> > Can the implementation be in D?
>
> Not yet.
>
> > If not, should we link
On Sunday, 22 May 2016 at 19:44:08 UTC, Era Scarecrow wrote:
...
Well here's the rundown of some numbers. min_compress uses a tiny
window, big_compress was my original algorithm but modified to
use 16k total for a window. reduced_id_compress is the original
except reduced to a 220 window an
On 5/22/2016 9:06 AM, Marco Leise wrote:
Are there any objections
against the benchmarking method or the license?
BSD may cause problems.
Can the implementation be in D?
Not yet.
If not, should we link against the
system liblz4.so/a
No.
or copy the C code into the backend?
Yes.
Is
On 5/22/2016 10:44 AM, Rainer Schuetze wrote:
You are right about the symbols using the VC mangling. The test case
"1.s.s.s.s.s" in the bugreport translated to C++ yields
?foo@Result@?1???$s@UResult@?1???$s@UResult@?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@Y
On Sunday, 22 May 2016 at 19:21:13 UTC, Walter Bright wrote:
On 5/22/2016 3:32 AM, Era Scarecrow wrote:
Curiously the extra 95 symbols actually is just enough to keep
compression up and bring speed to about 37% speed gain, with
it looks like no disadvantages.
You're doing the gods' work, Era!
On 5/22/2016 3:32 AM, Era Scarecrow wrote:
[...]
My idea for speeding things up is every time a new character c is matched in
pattern[], a bit is set in mask:
ulong mask;
...
mask |= 1UL << (c & 63);   // 1UL, so the shift is done in 64 bits
Then, when scanning for longer matches, test:
if (!((1UL << (id[i + matchlen - 1] & 63)) & mask))
    // no longer match is possible here
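The filtering idea can be sketched as follows (in C rather than dmd's source; `char_mask` and `may_contain` are names I made up). The 64-bit word acts as a tiny Bloom filter over the pattern's bytes: a clear bit proves the byte cannot occur anywhere in the pattern, so most non-extending candidates are rejected with a single AND.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long u64;

/* Build a 64-bit "character seen" mask over a pattern: hash each byte
 * into one of 64 buckets and set its bit. */
static u64 char_mask(const char *pat, size_t len)
{
    u64 mask = 0;
    for (size_t i = 0; i < len; i++)
        mask |= 1ULL << (pat[i] & 63);
    return mask;
}

/* Could byte c occur in the pattern?  No false negatives, but false
 * positives are possible (distinct bytes may share a bucket), so a
 * positive answer still requires the real comparison. */
static int may_contain(u64 mask, char c)
{
    return (mask & (1ULL << (c & 63))) != 0;
}
```

The lookup and test compile to a shift, an AND, and a branch, which is why a 64-bit vector beats a large lookup table on cache behavior.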
On 5/22/2016 3:32 AM, Era Scarecrow wrote:
Curiously the extra 95 symbols actually is just enough to keep compression up
and bring speed to about 37% speed gain, with it looks like no disadvantages.
Then again I still need a larger set to test things on, but it's very promising!
You're doing t
On 5/22/2016 2:57 AM, Era Scarecrow wrote:
Question Walter (or Andrei): Would it be terrible to have lower characters as
part of the identifier string? I'm referring to characters under 32 (control
characters, especially the \r, \n and \t).
Currently there's a bug in id_compress() where it doe
On 5/22/2016 5:50 AM, deadalnix wrote:
On Sunday, 22 May 2016 at 01:36:39 UTC, Walter Bright wrote:
Just a note: large lookup tables can have cold cache performance problems.
16k lookup tables are the gold standard.
I like 64 bit vectors because the lookup and test can be done in one
instruction.
On 5/22/2016 2:07 AM, Era Scarecrow wrote:
On Sunday, 22 May 2016 at 08:50:47 UTC, Walter Bright wrote:
On 5/21/2016 11:26 PM, Era Scarecrow wrote:
With 1 Million iterations:
new_compress: TickDuration(311404604)
id_compress TickDuration(385806589)
Approx 20% increase in speed (if I'm read
On Sunday, 22 May 2016 at 18:18:46 UTC, Marco Leise wrote:
If you provide the decompressor, we're set. :)
I am already working on a non-CTFE-able version that does not use
slices, and which should therefore be faster.
But decompression is only needed for demangling.
On 22.05.2016 20:12, Era Scarecrow wrote:
On Sunday, 22 May 2016 at 18:04:25 UTC, Rainer Schuetze wrote:
On 22.05.2016 19:56, Era Scarecrow wrote:
Unless ? is the proper/valid character, those are far more likely to
be failed UTF-8 decoding. Care to double check?
? is an allowed character i
Am Sun, 22 May 2016 17:19:38 +
schrieb Stefan Koch :
> Here are my answers; please take them with a grain of ignorance.
> […]
Maybe you are right about the dictionary. I haven't put much
thought into it. Anyway, I was looking for input from the
core devs who are affected by this.
> Howeve
On Sunday, 22 May 2016 at 18:04:25 UTC, Rainer Schuetze wrote:
On 22.05.2016 19:56, Era Scarecrow wrote:
Unless ? is the proper/valid character, those are far more
likely to be failed UTF-8 decoding. Care to double check?
? is an allowed character in VC++ and for example permits
unambiguous d
On 22.05.2016 19:56, Era Scarecrow wrote:
On Sunday, 22 May 2016 at 17:44:22 UTC, Rainer Schuetze wrote:
You are right about the symbols using the VC mangling. The test case
"1.s.s.s.s.s" in the bugreport translated to C++ yields
?foo@Result@?1???$s@UResult@?1???$s@UResult@?1???$s
i.e. 936
On Sunday, 22 May 2016 at 17:44:22 UTC, Rainer Schuetze wrote:
You are right about the symbols using the VC mangling. The test
case "1.s.s.s.s.s" in the bugreport translated to C++ yields
?foo@Result@?1???$s@UResult@?1???$s@UResult@?1???$s
i.e. 936 characters. I think this is due to the very
On 22.05.2016 00:58, Walter Bright wrote:
On 5/21/2016 3:41 PM, Guillaume Boucher wrote:
Sorry if I didn't memorize everything in this forum from the last 20
years, can
you give a link to some reasoning?
DMC++ matches the Microsoft name mangling scheme, which includes such
compression. It pr
On Sunday, 22 May 2016 at 16:06:07 UTC, Marco Leise wrote:
There are a few open questions, for me at least. Are there
other proposals with good properties? Are there any objections
against the benchmarking method or the license? Can the
implementation be in D? If not, should we link against the
On Sunday, 22 May 2016 at 16:06:07 UTC, Marco Leise wrote:
There are a few open questions, for me at least. Are there
other proposals with good properties? Are there any objections
against the benchmarking method or the license? Can the
implementation be in D? If not, should we link against
Am Sun, 22 May 2016 14:06:52 +
schrieb Stefan Koch :
> On Sunday, 22 May 2016 at 14:05:23 UTC, Marco Leise wrote:
> > ⬅ Download proof of concept
> >
> > This is a proof of concept micro-benchmark compressing the
> > 37_640 bytes symbol from the bug report linked above with both
> > id_compr
⬅ Download proof of concept
This is a proof of concept micro-benchmark compressing the
37_640 bytes symbol from the bug report linked above with both
id_compress from the dmd backend and lz4 (without dictionary).
The lz4 result is additionally transformed to 7-bits similar
to base-64.
Original si
On Sunday, 22 May 2016 at 14:05:23 UTC, Marco Leise wrote:
⬅ Download proof of concept
This is a proof of concept micro-benchmark compressing the
37_640 bytes symbol from the bug report linked above with both
id_compress from the dmd backend and lz4 (without dictionary).
The lz4 result is add
On Sunday, 22 May 2016 at 12:12:09 UTC, Martin Nowak wrote:
Yes, LZO, LZ4, or snappy would be good choices. They are
real-time compression algorithms, i.e. it's faster to compress
and write to main memory than w/o compression.
Yes, the LZ compressors are very good for this use case.
Please do
On Sunday, 22 May 2016 at 01:36:39 UTC, Walter Bright wrote:
Just a note: large lookup tables can have cold cache
performance problems.
16k lookup tables are the gold standard.
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the
format is fair game, as well as down and dirty assembler if
t
On 5/22/16 6:40 AM, qznc wrote:
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the format is
fair game, as well as do
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Yes, LZO, LZ4, or snappy would be good choices. They are
real-time compression algorithms, i.e. it's faster to compress
an
Am Sun, 22 May 2016 10:55:40 +
schrieb Era Scarecrow :
> On Sunday, 22 May 2016 at 10:40:46 UTC, qznc wrote:
> > On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
> > It links to SMAZ:
> >
> > https://github.com/antirez/smaz
> >
> > It uses a codebook primed for English text. Sinc
Am Sat, 21 May 2016 14:12:15 -0700
schrieb Walter Bright :
> The current one is effective, but slow:
>
>https://github.com/dlang/dmd/blob/master/src/backend/compress.c
>
> Anyone want to make a stab at making it faster? Changing the format is fair
> game, as well as down and dirty assembler
On Sunday, 22 May 2016 at 10:40:46 UTC, qznc wrote:
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
It links to SMAZ:
https://github.com/antirez/smaz
It uses a codebook primed for English text. Since this is for
name mangling, priming for Phobos might be a good idea. E.g.
put "
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the
format is fair game, as well as down and dirty assembler if
t
On Sunday, 22 May 2016 at 10:12:11 UTC, Marco Leise wrote:
Question Walter (or Andrei): Would it be terrible to have
lower characters as part of the identifier string? I'm
referring to characters under 32 (control characters,
especially the \r, \n and \t).
That would be my question, too. What
Am Sun, 22 May 2016 09:57:20 +
schrieb Era Scarecrow :
> On Sunday, 22 May 2016 at 09:48:32 UTC, Stefan Koch wrote:
> > Nice results.
>
> This is with one sample to test against. I need a much bigger
> sample, but I'm not sure how to generate/find it to give it a
> full run for its mon
On Sunday, 22 May 2016 at 09:48:32 UTC, Stefan Koch wrote:
Nice results.
This is with one sample to test against. I need a much bigger
sample, but I'm not sure how to generate/find it to give it a
full run for its money. Not to mention there's a huge memory
leak while doing the tests since
On Sunday, 22 May 2016 at 09:43:29 UTC, Era Scarecrow wrote:
And shrinking the lookback to a mere 127 increases it faster
but compression starts taking a hit. Can't shrink it anymore
anyways.
modified id_compress2: TickDuration(230528074) 41% faster!!
Nice results.
On Sunday, 22 May 2016 at 09:19:54 UTC, Era Scarecrow wrote:
On 5/21/2016 11:26 PM, Era Scarecrow wrote:
With 1 Million iterations:
id_compress: TickDuration(385806589)
(original/baseline)
modified id_compress: TickDuration(269275629) 31% faster
And shrinking the lookback to a me
On Sunday, 22 May 2016 at 09:07:07 UTC, Era Scarecrow wrote:
On Sunday, 22 May 2016 at 08:50:47 UTC, Walter Bright wrote:
On 5/21/2016 11:26 PM, Era Scarecrow wrote:
With 1 Million iterations:
new_compress: TickDuration(311404604)
id_compress TickDuration(385806589)
Approx 20% increase in
On Sunday, 22 May 2016 at 08:50:47 UTC, Walter Bright wrote:
On 5/21/2016 11:26 PM, Era Scarecrow wrote:
With 1 Million iterations:
new_compress: TickDuration(311404604)
id_compress TickDuration(385806589)
Approx 20% increase in speed (if I'm reading and did this
right).
It is promising.
On 5/22/2016 1:50 AM, Walter Bright wrote:
It is promising. Need more!
I did figure out an improvement to the existing algorithm whereby it could skip
forward by the length of the existing match rather than test every character
(when it's looking for a longer match). But I have no clue how mu
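That improvement can be sketched like this (a standalone illustration, not the dmd code): before running the full comparison at a candidate position, test the one byte that would make the candidate longer than the best match so far, and skip the position outright if it differs.

```c
#include <assert.h>
#include <stddef.h>

/* Find the longest match for `s` (length n) starting anywhere in the
 * window `win` (length w).  A candidate at position i can only beat the
 * current best if win[i + best] == s[best], so test that single byte
 * first and skip the candidate otherwise -- the "skip ahead" idea. */
static size_t longest_match(const char *win, size_t w,
                            const char *s, size_t n)
{
    size_t best = 0;
    for (size_t i = 0; i < w; i++) {
        if (best >= n)
            break;                 /* cannot possibly do better */
        /* cheap one-byte rejection before the full comparison */
        if (best > 0 && (i + best >= w || win[i + best] != s[best]))
            continue;
        size_t len = 0;
        while (i + len < w && len < n && win[i + len] == s[len])
            len++;
        if (len > best)
            best = len;
    }
    return best;
}
```

The worst case is unchanged, but on typical identifier text most window positions die on the single-byte check instead of a full prefix comparison.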
On 5/21/2016 11:26 PM, Era Scarecrow wrote:
On Sunday, 22 May 2016 at 04:35:15 UTC, Era Scarecrow wrote:
Now I just need to stress test it against some really large input and check
for failures. Regardless it has a compression and decompression functions,
along with benchmarking it vs the compr
On Sunday, 22 May 2016 at 04:35:15 UTC, Era Scarecrow wrote:
Now I just need to stress test it against some really large
input and check for failures. Regardless it has a compression
and decompression functions, along with benchmarking it vs the
compression algorithm I'm trying to beat.
With
On 22/5/2016 05:51, Stefan Koch wrote:
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the format is
fair game, as wel
On Sunday, 22 May 2016 at 01:46:23 UTC, Era Scarecrow wrote:
I know, if it doesn't fit in 64k then you can't rely on the
cache...
Finished. It's blazing fast, and in my test example it even
compresses as well as or better than gzip/zlib (although maybe
that's because it's headerless).
Now I jus
On Sunday, 22 May 2016 at 01:36:39 UTC, Walter Bright wrote:
On 5/21/2016 5:52 PM, Era Scarecrow wrote:
I'm going to try for a lookback and scan which uses quite a
bit of memory.
Just a note: large lookup tables can have cold cache
performance problems.
I know, if it doesn't fit in 64k th
On 5/21/2016 5:52 PM, Era Scarecrow wrote:
I'm going to try for a lookback and scan which uses quite a bit of memory. On
the other hand you can reuse the memory over and over again, so that isn't an
issue. Although it will take up a Meg in memory. Also written entirely in D,
so...
I think I'l
On Saturday, 21 May 2016 at 22:01:40 UTC, Walter Bright wrote:
Give it a try, see if it pans out.
Other possibilities are the lookback dictionary is fixed at
1023 characters. Experimenting and benchmarking can indicate if
it should be more or less. Perhaps 512 will yield just as good
compress
On 5/21/2016 3:41 PM, Guillaume Boucher wrote:
Sorry if I didn't memorize everything in this forum from the last 20 years, can
you give a link to some reasoning?
DMC++ matches the Microsoft name mangling scheme, which includes such
compression. It proved hopelessly inadequate, which is why I i
On Saturday, 21 May 2016 at 22:41:49 UTC, Guillaume Boucher wrote:
It's almost as if using information about the actual data
leads to a better compression scheme.
Exactly, it always is, since you know where to look for
redundancy instead of blindly comparing bytes.
On Saturday, 21 May 2016 at 22:07:27 UTC, Walter Bright wrote:
As mentioned elsewhere, the C++ mangling scheme has a primitive
and ineffective LZ77 scheme built in, so I wouldn't waste time
on that.
Sorry if I didn't memorize everything in this forum from the last
20 years, can you give a lin
On 5/21/2016 3:13 PM, Stefan Koch wrote:
Could you post a link to code you use for benchmarking ?
In the comments:
https://github.com/dlang/dmd/pull/5793
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the
format is fair game, as well as down and dirty assembler if
t
On 5/21/2016 2:37 PM, Timon Gehr wrote:
Why is longest_match Ω(nm) instead of O(n+m) (e.g. KMP)?
I don't understand the terms you use, but as to the "why" it is based on what I
knew about LZ77 compression. I don't pretend to be an expert on compression, and
this is what I churned out. As ment
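For context on Timon's terms: Ω(nm) describes re-comparing the pattern from scratch at every window position, while Knuth-Morris-Pratt reaches O(n+m) by precomputing, for each pattern prefix, its longest border (a proper prefix that is also a suffix), so the text pointer never moves backwards. A minimal sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Knuth-Morris-Pratt: first occurrence of pat (length m) in txt
 * (length n), O(n + m) time.  border[i] = length of the longest proper
 * prefix of pat[0..i] that is also a suffix of it.  Assumes m <= 256
 * to keep the sketch allocation-free. */
static ptrdiff_t kmp_search(const char *txt, size_t n,
                            const char *pat, size_t m)
{
    if (m == 0) return 0;
    size_t border[256];
    border[0] = 0;
    for (size_t i = 1, k = 0; i < m; i++) {
        while (k > 0 && pat[i] != pat[k]) k = border[k - 1];
        if (pat[i] == pat[k]) k++;
        border[i] = k;
    }
    for (size_t i = 0, k = 0; i < n; i++) {
        /* on mismatch, fall back along borders instead of restarting */
        while (k > 0 && txt[i] != pat[k]) k = border[k - 1];
        if (txt[i] == pat[k]) k++;
        if (k == m) return (ptrdiff_t)(i + 1 - m);
    }
    return -1;
}
```

Each text byte is inspected O(1) amortized times, since k only grows by one per byte and every fallback shrinks it.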
On 5/21/2016 2:27 PM, Era Scarecrow wrote:
I assume this is related to compressing symbols thread?
Yes.
I mentioned possibly considering the LZO library.
Give it a try, see if it pans out.
Other possibilities are the lookback dictionary is fixed at 1023 characters.
Experimenting and benc
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the
format is fair game, as well as down and dirty assembler if
t
On 21.05.2016 23:37, Timon Gehr wrote:
On 21.05.2016 23:12, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the format is
fair game, as well as down and dirt
On 21.05.2016 23:12, Walter Bright wrote:
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the format is
fair game, as well as down and dirty assembler if that's what it takes.
So
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
So really, how good are you at fast code?
I recall being able to get a marginal boost in speed by assigning
the first 1-2 bytes to preassigned buckets (256 & 65536
respectively), so you have immediate 1- or 2-byte matches
(skippin
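The bucket idea reads like the classic hash-chain head table used by LZ-style matchers. A minimal sketch of the 256-bucket (first byte) variant, with illustrative names: every position is filed under its first byte, so a match search only visits positions that already agree on byte 0 instead of scanning the whole window.

```c
#include <assert.h>
#include <stddef.h>

#define NPOS ((size_t)-1)

/* Index every position of buf by its first byte: head[b] is the most
 * recent position starting with byte b, and prev[i] chains back to the
 * previous such position (NPOS terminates the chain). */
static void build_index(const unsigned char *buf, size_t n,
                        size_t head[256], size_t *prev)
{
    for (int b = 0; b < 256; b++) head[b] = NPOS;
    for (size_t i = 0; i < n; i++) {
        prev[i] = head[buf[i]];
        head[buf[i]] = i;
    }
}

/* How many candidate start positions exist for byte c?  This is the
 * chain a match search would walk instead of the full window. */
static size_t chain_length(const size_t head[256], const size_t *prev,
                           unsigned char c)
{
    size_t count = 0;
    for (size_t i = head[c]; i != NPOS; i = prev[i])
        count++;
    return count;
}
```

The 65536-bucket variant is the same structure keyed on the first two bytes, buying an immediate 2-byte match at the cost of a larger (and cache-colder) head table, which is the trade-off discussed elsewhere in this thread.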
The current one is effective, but slow:
https://github.com/dlang/dmd/blob/master/src/backend/compress.c
Anyone want to make a stab at making it faster? Changing the format is fair
game, as well as down and dirty assembler if that's what it takes.
So really, how good are you at fast code?