It sounds to me like you are looking for data compression. Of course, there are many options available for this. One of the best-known compression schemes is Huffman coding, an entropy coding algorithm used for lossless data compression. You can read about it here:

Good general description of data compression principles:
http://www.arturocampos.com/cp_ch1.html

An excellent article providing a detailed survey of many compression techniques, including variants of Huffman coding and arithmetic coding. I suggest reading the whole article:
http://www.ics.uci.edu/~dan/pubs/DataCompression.html
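To make the idea of Huffman coding a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a production coder: the function names (huffman_codes, encode, decode) and the sample values are made up for this example, and the "bits" are kept as a string of '0' and '1' characters rather than packed into bytes.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a dict mapping each distinct symbol in data to a bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # Edge case: a single distinct symbol still needs one bit per symbol.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data, codes):
    """Concatenate the code of every symbol in data."""
    return "".join(codes[s] for s in data)


def decode(bits, codes):
    """Invert encode(); works because Huffman codes are prefix-free."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Hypothetical sample data: frequent values get shorter codes.
values = [0, 0, 0, 0, 1, 1, 2, 3]
codes = huffman_codes(values)
bits = encode(values, codes)
restored = decode(bits, codes)
print(codes, bits, restored == values)
```

Note that the code table (or the symbol frequencies) has to be stored along with the encoded bits so the data can be decoded later, which eats into the savings for small inputs.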

Another well-known scheme is arithmetic coding, which is also an entropy coding algorithm for lossless compression but uses a somewhat different approach. It is capable of compressing data to very nearly its entropy. Shannon's source coding theorem shows that no lossless scheme can, on average, compress data below its entropy, so reaching the entropy is as good as it gets. In other words, you will never take a million integers, stuff them into a 64-bit integer, and be able to losslessly reconstruct the original data later (as you had hoped). A 64-bit value can distinguish only 2^64 different inputs, and there are vastly more possible lists of a million integers than that. The only exceptions are special cases such as the data being all zeros. It certainly will not work for typical random or even non-random data.
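If you want to see what the entropy limit means for your own data, here is a rough sketch in Python that estimates it. The assumptions are mine: it treats each value as independent and uses only the observed frequencies (so it ignores patterns between neighbouring values), the function name entropy_bits_per_symbol is made up, and the sample is a hypothetical list of one million random byte-sized integers.

```python
import math
import random
from collections import Counter


def entropy_bits_per_symbol(data):
    """Empirical Shannon entropy of data, in bits per symbol."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample: one million integers drawn uniformly from 0..255.
rng = random.Random(0)
sample = [rng.randrange(256) for _ in range(1_000_000)]

h = entropy_bits_per_symbol(sample)
min_bits = h * len(sample)  # rough lower bound under this simple model

print(f"entropy: {h:.4f} bits per symbol")
print(f"lower bound: about {min_bits:,.0f} bits (versus 64 available)")

# The all-zeros special case really does have zero entropy.
zeros_entropy = entropy_bits_per_symbol([0] * 1000)
```

For data like this the bound comes out at close to 8 bits per value, so millions of bits in total, which is why 64 bits cannot be enough.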

Although arithmetic coding can often achieve better results than Huffman coding, the results are usually very, very close. For this reason Huffman coding is usually used: there are no restrictions on the use of basic Huffman coding (see below), whereas arithmetic coding has been covered by patents and may require a license to use. Read the basics of arithmetic coding here:

http://sachingarg.com/compression/entropy_coding/acm87_arithmetic_coding.pdf
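To get a feel for how close Huffman coding comes to the entropy, here is a small Python sketch that compares the two for one message. It relies on the fact that the total length of a Huffman-encoded message equals the sum of the weights created by each merge while building the tree. The function name huffman_total_bits and the sample text are my own choices for illustration, and the entropy is the simple frequency-based estimate.

```python
import heapq
import math
from collections import Counter


def huffman_total_bits(counts):
    """Total Huffman-encoded length in bits for the given symbol counts."""
    heap = list(counts.values())
    if len(heap) == 1:
        return heap[0]  # a lone symbol still costs one bit each time
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        # Each merge adds one bit to every symbol underneath it.
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total


text = "abracadabra alakazam"  # hypothetical sample message
counts = Counter(text)
n = len(text)

entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
huffman_avg = huffman_total_bits(counts) / n

print(f"entropy:     {entropy:.3f} bits per symbol")
print(f"Huffman avg: {huffman_avg:.3f} bits per symbol")
```

The Huffman average is always at least the entropy and less than one bit per symbol above it. Arithmetic coding can close most of that remaining gap, which matters most when one symbol is far more common than all the others.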

I should also note that many variants of Huffman coding have been developed, but beware: many of these variants are patented, so do a patent search to verify whether you need a license before you use one.

I hope that helped a little...

Ken Fleisher
