On 02/17/2015 06:22 AM, janhein.vanderb...@gmail.com wrote:
In http://optarbvalintenc.blogspot.nl/ I propose a new way to encode 
arbitrarily valued integers, together with Python code that can be used as a 
reference for practical implementations of codecs.

The encoding algorithm itself is optimized for transmission and storage 
requirements, without any consideration of processor load whatsoever.

The next step is the development of the Python code that minimizes processor 
requirements without compromising the algorithm.

Is this the right forum to ask for help with this step or to obtain references 
to more suitable platforms?


This is a fine forum for such a discussion. I for one would love to participate. However, note that it isn't necessarily true that "the smaller the better" makes for a good algorithm. In context, there are frequently a number of tradeoffs, even ignoring processor time (as you imply).

Many years ago I worked on some code that manipulated the intermediate files for a compiler. I had no say in the format; I just had to deal with it.

They had a field type called a "compressed integer." It could vary between one byte and I think about six. The theory was that it took less space than the equivalent set of fixed-size integers. The catch, from my point of view, was that these integers could appear in the middle of a struct, so access to the later fields of the struct required a dynamic offset calculation. This put a huge onus on my code to read or write the data serially. I ended up solving it by writing code that generated 40 thousand lines of C++ header and body code, so that the rest of the code didn't care.
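
For anyone who hasn't met the idea, one common shape for such a "compressed integer" is a LEB128-style varint: seven value bits per byte, with the high bit flagging that more bytes follow. This is only an illustrative sketch in Python, not that compiler's actual format:

    def encode_uint(value):
        # Emit 7 value bits per byte, low-order group first; the high
        # bit is set on every byte except the last (LEB128-style).
        out = bytearray()
        while True:
            byte = value & 0x7F
            value >>= 7
            if value:
                out.append(byte | 0x80)   # more bytes follow
            else:
                out.append(byte)          # final byte
                return bytes(out)

    def decode_uint(data, offset=0):
        # Decode starting at `offset`; return (value, next_offset).
        value = shift = 0
        while True:
            byte = data[offset]
            offset += 1
            value |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return value, offset
            shift += 7

The decoder is where the pain shows: you cannot know where the next struct field starts until you have scanned every byte of the varint in front of it, which is exactly the dynamic offset calculation I was stuck with.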

Was it worth it? To reduce the size of some files that only lived a few seconds on disk? I seriously doubt it. But I learned a lot in the process.

On another project, the goal was to be able to recover data from archives in spite of physical damage to some files, so I had to add selective redundancy. In the process, I also compressed the data, but confined the compression to relatively small pieces of data and labeled those pieces independently, so that any single decompression error affects only one unit of data.
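
The details of that format don't matter here, but the chunking idea is easy to sketch. Here is a minimal Python illustration (zlib per chunk, each piece framed with its compressed length and a CRC32; the chunk size and framing are invented for the example, not the project's actual layout):

    import struct
    import zlib

    CHUNK = 64 * 1024  # illustrative chunk size

    def write_chunks(stream, payload):
        # Compress the payload in independent pieces; each piece carries
        # its own length and checksum, so a damaged piece can be skipped
        # without losing the rest of the archive.
        for i in range(0, len(payload), CHUNK):
            piece = zlib.compress(payload[i:i + CHUNK])
            stream.write(struct.pack("<II", len(piece), zlib.crc32(piece)))
            stream.write(piece)

    def read_chunks(stream):
        # Yield each decompressed piece, skipping any whose CRC fails.
        while True:
            header = stream.read(8)
            if len(header) < 8:
                return
            length, crc = struct.unpack("<II", header)
            piece = stream.read(length)
            if zlib.crc32(piece) == crc:
                yield zlib.decompress(piece)
            # else: that one unit is lost, but later units remain readable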

So going back to your problem, and assuming that the other issues are moot, what's your proposal? Are you compressing relative to a straight binary form of storage? Are you assuming anything about the relative likelihood of various integer values? Do you provide anything to allow for the possibility that your predicted probability distribution isn't met?
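
To make those last two questions concrete: a variable-length encoding only wins if small values dominate, and it can lose to a plain fixed-width field when they don't. A throwaway comparison against fixed four-byte integers, using a generic 7-bits-per-byte varint as a stand-in (your encoding may well behave differently):

    import struct

    def varint_len(value):
        # Bytes needed by a generic 7-bits-per-byte varint.
        return max(1, (value.bit_length() + 6) // 7)

    for value in (5, 300, 70000, 2**31):
        fixed = len(struct.pack("<I", value))   # plain 4-byte unsigned int
        print("%12d: fixed %d bytes, varint %d bytes"
              % (value, fixed, varint_len(value)))

Small values come out well ahead, but 2**31 costs five bytes instead of four, which is why the assumed distribution of values matters so much.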

--
DaveA