It sounds to me like you are looking for data compression, and there
are many options available for this. One of the best-known compression
schemes is Huffman coding, an entropy coding algorithm used for
lossless data compression. You can read about it here:
Good general description of data compression principles:
http://www.arturocampos.com/cp_ch1.html
Excellent article providing a detailed survey of many compression
techniques, including variants of Huffman coding and arithmetic coding.
I suggest reading the whole article:
http://www.ics.uci.edu/~dan/pubs/DataCompression.html
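
To make the idea concrete, here is a minimal sketch of Huffman coding in
Python. It is only an illustration, not a tuned implementation: the
function names (huffman_code, encode, decode) are my own, and the "bits"
are kept as a string of '0' and '1' characters for readability rather
than packed into bytes.

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Return a dict mapping each symbol in data to its bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(data, code):
    """Concatenate the code words for every symbol in data."""
    return "".join(code[s] for s in data)

def decode(bits, code):
    """Invert encode(); works because no code word is a prefix of another."""
    reverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return out

if __name__ == "__main__":
    text = "abracadabra"
    code = huffman_code(text)
    bits = encode(text, code)
    print(code)
    print(len(bits), "bits instead of", 8 * len(text))
    print("".join(decode(bits, code)) == text)
```

Frequent symbols get short code words and rare ones get long code words,
which is where the saving comes from. A real compressor also has to
store the code table (or the symbol counts) alongside the bits so the
decoder can rebuild it.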
Another well-known scheme is arithmetic coding, which is also an
entropy coding algorithm for lossless compression but uses a somewhat
different approach. It can compress data to within a very small
overhead of the entropy of the data. Shannon's source coding theorem
shows that no lossless code can, on average, use fewer bits than the
entropy of the data, so reaching the entropy is as good as it gets. In
other words, you will never take a million integers and stuff them into
a 64-bit integer and be able to losslessly reconstruct the original
data later (as you had hoped), unless it is a special case such as the
data being all zeros. A 64-bit value can distinguish at most 2^64
different inputs, which is far fewer than the number of possible lists
of a million integers, so this cannot work for typical data, random or
not.
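
To put a number on that limit, here is a small Python sketch that
estimates the entropy of a list of values from their observed
frequencies. The function names are my own, and the estimate assumes
each value is drawn independently (a zeroth-order model), so treat it as
a rough guide rather than an exact bound for data with structure.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(data):
    """Empirical (zeroth-order) Shannon entropy of data, in bits per symbol."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def min_bits(data):
    """Approximate lower bound on the total bits an entropy coder needs
    under this zeroth-order model."""
    return math.ceil(entropy_bits_per_symbol(data) * len(data))

if __name__ == "__main__":
    all_zeros = [0] * 1_000_000
    # Every byte value equally often: 256 * 3907 = 1,000,192 values.
    uniform_bytes = list(range(256)) * 3907
    print(min_bits(all_zeros))      # the degenerate special case
    print(min_bits(uniform_bytes))  # millions of bits, nowhere near 64
```

The all-zeros list is the special case mentioned above: its entropy is
zero, so only the length needs to be stored. Values spread evenly over
256 possibilities need about 8 bits each, no matter which lossless
scheme you pick.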
Although arithmetic coding is often able to achieve better results than
Huffman coding, the results are often very, very close. For this reason
Huffman coding is usually used, because there are no restrictions on
its use (but see below), whereas arithmetic coding has been covered by
a number of patents and may require a license to use. Read the basics
of arithmetic coding here:
http://sachingarg.com/compression/entropy_coding/acm87_arithmetic_coding.pdf
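
As a rough illustration of when the two differ, the Python sketch below
compares the average Huffman code length with the entropy for a given
set of symbol probabilities. The entropy is roughly what a good
arithmetic coder approaches on long inputs. The function names and the
two example distributions are made up for illustration.

```python
import heapq
import math

def huffman_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    if len(probs) == 1:
        return {next(iter(probs)): 1}
    # Heap entries are (probability, tie_breaker, symbols_in_subtree).
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    next_id = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Each merge pushes every symbol in both subtrees one level deeper.
        for sym in s1 + s2:
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, next_id, s1 + s2))
        next_id += 1
    return lengths

def average_length(probs):
    """Expected Huffman code length in bits per symbol."""
    lengths = huffman_lengths(probs)
    return sum(p * lengths[s] for s, p in probs.items())

def entropy(probs):
    """Shannon entropy of the distribution in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

if __name__ == "__main__":
    dyadic = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
    skewed = {"a": 0.9, "b": 0.1}
    for name, probs in (("dyadic", dyadic), ("skewed", skewed)):
        print(name, average_length(probs), round(entropy(probs), 3))
```

When the probabilities are powers of 1/2, Huffman coding matches the
entropy exactly. The gap opens up when one symbol is very likely,
because Huffman coding cannot spend less than one whole bit on a
symbol, while arithmetic coding effectively can.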
I should also note that many variants of Huffman coding have been
developed, but beware: many of these variants are patented, so do a
patent search to verify whether you need a license before you use one.
I hope that helped a little...
Ken Fleisher