This starts with a review of Hadamard matrix stuff, continues through
some Hadamard-matrix-like stuff about random matrices which is
probably well-known, and finishes up with what I believe to be a novel
method of embedding multiple images, holograph-like, into a single
output image, which can then be viewed through a transparent mask from
various angles to reveal the various input images.  I think this
method makes it possible to construct full-color "holograms" viewable
by incoherent white light, entirely from macroscopic objects.

Lots of assertions here are unproved, partly because I don't have the
time; I hope all of them are true.  And several connections are left
unmade.  I apologize for the nonrigorous presentation and lack of
references.

All of this occurred to me during a one-hour drive from San Francisco
down to Mountain View, and it took much longer to write down and
implement than to come up with in the first place.

Hadamard matrix basics
----------------------

A Hadamard matrix is a square matrix over {-1, 1} where each row and
each column is perfectly uncorrelated with every other row or column
and sums to 0 (except that the first row and column do not sum to 0),
and, in the symmetric form used here, the rows are the same as the
columns.  So all of them, considered as vectors, are orthogonal.  So
if you have a vector that is the sum of several of these columns, each
multiplied by a different constant, you can recover any of the
constants by taking the dot product of the resultant vector with the
corresponding original column, then dividing by that column's dot
product with itself --- which will be the number of elements.

Here is a 4x4 Hadamard matrix:

  [[1   1   1   1]   
   [1  -1   1  -1]  
   [1   1  -1  -1]  
   [1  -1  -1   1]]

(I posted a program to kragen-hacks a couple of years ago that
produces Hadamard matrices by taking the bitwise AND of the
coordinates and setting a -1 where the AND result has an odd number of
bits set.)
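
That construction is easy to restate in modern NumPy (a sketch; the
original kragen-hacks program is not reproduced here):

```python
import numpy as np

def hadamard(n):
    """Sylvester-type Hadamard matrix of size n (n must be a power of 2).

    Entry (i, j) is -1 exactly where i AND j has an odd number of 1 bits.
    """
    i, j = np.indices((n, n))
    anded = i & j
    parity = np.zeros_like(anded)
    while anded.any():              # XOR together the bits of each entry
        parity ^= anded & 1
        anded >>= 1
    return 1 - 2 * parity

print(hadamard(4))
```

Running this for n = 4 reproduces the matrix shown above.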

If we take the constants 2, 3, 4 and multiply them by the last three
columns of the matrix, we get [2 -2 2 -2], [3 3 -3 -3], and 
[4 -4 -4 4].  Summing these vectors gives [9 -3 -5 -1], which sums to
0, as you can see; and its dot product with the second column is 9 -
-3 + -5 - -1, or 8, which is indeed 2 * 4.  Likewise, its dot product
with the third column is 9 + -3 - -5 - -1, or 12, which is indeed 
3 * 4.
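
This arithmetic is easy to check by machine; here's a NumPy sketch
(not part of the original post):

```python
import numpy as np

# The 4x4 Hadamard matrix from the text.
H = np.array([[1,  1,  1,  1],
              [1, -1,  1, -1],
              [1,  1, -1, -1],
              [1, -1, -1,  1]])

# Encode the constants 2, 3, 4 into the last three columns.
v = 2 * H[:, 1] + 3 * H[:, 2] + 4 * H[:, 3]
print(v)                     # [ 9 -3 -5 -1]

# Recover each constant: dot with its column, then divide by the
# column's dot product with itself (the number of elements, 4).
print(v @ H[:, 1] // 4)      # 2
print(v @ H[:, 2] // 4)      # 3
print(v @ H[:, 3] // 4)      # 4

# Since H is symmetric, H @ H is 4 times the identity, so H/sqrt(4)
# is its own inverse.
print(H @ H)
```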

If the first column is not disregarded, this property still holds:
multiplying a vector by the matrix gives a new vector which, when
multiplied by the matrix again, gives back the original vector, scaled
by the number of rows (or columns) in the matrix.

So a symmetric Hadamard matrix like this one, divided by the square
root of its number of rows or columns, is its own inverse.

An interesting thing which may or may not be known follows.

If you replace the -1s by 0s in your matrix, you can still recover the
original numbers from the vector you made with the real Hadamard
matrix, but most of your dot products will only be half as big.  The
earlier [9 -3 -5 -1] example gives 9 + -5 = 4, 9 + -3 = 6, and 9 + -1
= 8 --- twice the 2, 3, 4 we put in.  This doesn't work quite so
uniformly if you don't disregard the first row or column; it adds a
uniform bias to each result that must be subtracted.
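
Continuing the 4x4 example in NumPy (a sketch, not from the
original):

```python
import numpy as np

H = np.array([[1,  1,  1,  1],
              [1, -1,  1, -1],
              [1,  1, -1, -1],
              [1, -1, -1,  1]])
v = 2 * H[:, 1] + 3 * H[:, 2] + 4 * H[:, 3]   # [ 9 -3 -5 -1]

B = (H > 0).astype(int)      # the -1s replaced by 0s
print(v @ B[:, 1])           # 4: twice the 2 we put in
print(v @ B[:, 2])           # 6: twice the 3
print(v @ B[:, 3])           # 8: twice the 4

# The first, all-ones column just gives the sum of v, which is 0:
print(v @ B[:, 0])
```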

Random "Hadamard" matrices
--------------------------

A matrix over {-1, 1} in which the elements are chosen randomly with
equal probability will have the same properties in an approximate
form: any two rows or columns will be very nearly uncorrelated, and
each row or column will sum to approximately 0.  So you should be able
to perform the same feat in an approximate manner.  Consider[0]:

[[ 1 -1  1  1  1  1  1  1 -1  1  1 -1 -1 -1 -1]
 [-1  1 -1  1 -1 -1  1 -1 -1  1  1 -1 -1 -1 -1]
 [-1  1 -1 -1  1 -1  1  1  1 -1 -1 -1  1  1 -1]
 [-1  1  1 -1 -1 -1  1  1 -1 -1  1 -1  1  1 -1]
 [ 1  1  1  1  1 -1 -1  1 -1  1 -1 -1  1  1  1]
 [-1  1  1  1 -1 -1  1  1 -1  1  1 -1 -1 -1 -1]
 [ 1  1  1  1 -1  1  1  1  1  1 -1 -1  1 -1  1]
 [ 1  1  1 -1 -1  1 -1 -1  1 -1  1  1 -1  1  1]
 [-1 -1 -1 -1  1 -1  1 -1  1 -1 -1  1  1 -1 -1]
 [ 1  1  1 -1  1 -1 -1 -1 -1 -1  1  1  1 -1  1]
 [ 1 -1 -1  1 -1  1  1 -1 -1 -1 -1 -1 -1  1  1]
 [-1 -1 -1 -1  1 -1  1  1 -1 -1  1  1  1  1  1]
 [-1  1  1  1 -1 -1 -1 -1  1  1  1  1 -1 -1 -1]
 [ 1 -1 -1 -1  1  1 -1  1  1 -1  1  1  1 -1  1]
 [-1  1  1 -1  1 -1  1  1 -1  1 -1 -1  1 -1 -1]]

Suppose we multiply this array by [0 0 0 5 0 0 0 8 0 0 0 0 0 0 0].
The values to be summed are as follows [1]: 
[[ 0  0  0  5  0  0  0  8  0  0  0  0  0  0  0]
 [ 0  0  0  5  0  0  0 -8  0  0  0  0  0  0  0]
 [ 0  0  0 -5  0  0  0  8  0  0  0  0  0  0  0]
 [ 0  0  0 -5  0  0  0  8  0  0  0  0  0  0  0]
 [ 0  0  0  5  0  0  0  8  0  0  0  0  0  0  0]
 [ 0  0  0  5  0  0  0  8  0  0  0  0  0  0  0]
 [ 0  0  0  5  0  0  0  8  0  0  0  0  0  0  0]
 [ 0  0  0 -5  0  0  0 -8  0  0  0  0  0  0  0]
 [ 0  0  0 -5  0  0  0 -8  0  0  0  0  0  0  0]
 [ 0  0  0 -5  0  0  0 -8  0  0  0  0  0  0  0]
 [ 0  0  0  5  0  0  0 -8  0  0  0  0  0  0  0]
 [ 0  0  0 -5  0  0  0  8  0  0  0  0  0  0  0]
 [ 0  0  0  5  0  0  0 -8  0  0  0  0  0  0  0]
 [ 0  0  0 -5  0  0  0  8  0  0  0  0  0  0  0]
 [ 0  0  0 -5  0  0  0  8  0  0  0  0  0  0  0]]

This gives us the following resultant vector [2]:
[ 13  -3   3   3  13  13  13 -13 -13 -13  -3   3  -3   3   3]

Multiplying this by the original random matrix and dividing by fifteen
gives us this resultant vector [3]:
[ 0.46666667  0.86666667  2.6         4.46666667  0.33333333  0.46666667
       3.          7.66666667 -2.6         5.26666667 -0.86666667 -6.06666667
       0.73333333 -0.46666667 -0.86666667]

This is approximately 0 in most places --- fully half of the values
have an absolute value less than one --- but where the original vector
had a 5, it has 4.4667, and where the original vector had an 8, it has
7.6667.

Multiplying by 1 where the original matrix was 1 and 0 where it was
-1, then dividing by 7.5, we get the following vector instead [4]:
[ 1.73333333  2.13333333  3.86666667  5.73333333  1.6         1.73333333
       4.26666667  8.93333333 -1.33333333  6.53333333  0.4        -4.8       
       2.          0.8         0.4       ]

Here we have 5.7 for 5, 8.9 for 8, and mostly 0 elsewhere.  The errors
are somewhat larger here, which is probably unsurprising --- only half
as many elements participated, on average, in each sum.

These results are fairly typical; sometimes results are worse,
sometimes better.  In general, the dot product between two random
length-N vectors over {-1, 1} is a sum of N independent values of 1 or
-1, so it has an approximately Gaussian distribution with a mean of 0
and a standard deviation of sqrt(N).  The dot product between the
columns to which the 5 and the 8 belong determines how much they
interfere with one another, and whether the interference is
constructive (leading to both "5" and "8" coming out a little larger)
or destructive (leading to both "5" and "8" coming out a little
smaller).  In a real Hadamard matrix, the dot product is exactly 0.

[0]
import Numeric, RandomArray

def randsquare(size):
    return RandomArray.randint(0, 2, (size, size)) * 2 - 1

x = randsquare(15)

[1]
myvec = Numeric.zeros(15)
myvec[3] = 5
myvec[7] = 8
x * myvec

[2]
res = Numeric.matrixmultiply(x, myvec)

[3]
Numeric.matrixmultiply(res, x) / 15.

[4]
Numeric.matrixmultiply(res, x > 0) / 7.5
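
The footnotes above use the long-obsolete Numeric and RandomArray
modules; a modern NumPy translation of the same experiment (the
numbers will differ, since the matrix is random) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def randsquare(size):
    """Random matrix over {-1, 1}."""
    return rng.integers(0, 2, (size, size)) * 2 - 1

x = randsquare(15)                  # [0]
myvec = np.zeros(15, dtype=int)     # [1]
myvec[3] = 5
myvec[7] = 8
res = x @ myvec                     # [2]
print(res @ x / 15.0)               # [3]: near 0 except at indices 3 and 7
print(res @ (x > 0) / 7.5)          # [4]: same peaks, larger errors
```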

The visual one-time pad
-----------------------

I think I posted this to kragen-hacks some time ago.  By representing
a bitmap with hourglasses --- vertical hourglasses for 1 bits,
horizontal hourglasses for 0 bits --- we can get the interesting
property that overlaying two bitmaps printed on sheets of paper (or,
better, transparency film) visually produces their XOR.  Where the
hourglasses are in the same orientation, some light shines through,
but where they are in the opposite orientation, no light can pass.  So
if you print out a bitmap of random noise and a bitmap consisting of a
one-bit image XORed with the random noise, by overlaying them, you can
see the original image at half intensity --- but either of the two
printouts alone conveys no information about the original image.

I didn't realize this at the time, but the hourglass representation is
somewhat unnecessary if only details larger than a single pixel are
important.  1 bits can be represented simply as white and 0 bits as
black.  Where the noise and the xored image are the same --- because
the image was 0 --- half the pixels will be black on both images and
half the pixels will be white on both images, so that area will be
half black and half white. Where they are opposite, because the image
was 1, half the pixels will be black on one image and half will be
black on the other, so that area will be all black.

Effectively, the overlay computes the OR of the two bitmaps' black
pixels, and OR is close enough to XOR that it works.
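
Here's a small simulation of that (a sketch; the physical overlay is
modeled as a pixelwise AND of transparency, with 1 = white/transparent
and 0 = black):

```python
import numpy as np

rng = np.random.default_rng(1)

# A one-bit "image": a checkerboard of 50x50 blocks; 1 = white.
image = np.kron(np.array([[1, 0], [0, 1]]), np.ones((50, 50), dtype=int))

noise = rng.integers(0, 2, image.shape)   # the random one-time pad
encoded = noise ^ image                   # the pad XORed with the image

# Overlaying two transparencies passes light only where both are white:
overlay = noise & encoded

# Where the image bit is 0, noise == encoded, so the area comes out
# half white; where the image bit is 1, they are complements, so the
# area comes out all black.
print(overlay[image == 0].mean())         # about 0.5
print(overlay[image == 1].mean())         # exactly 0.0
```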

The novel technique
-------------------

This means that the method can be extended to grayscale and full-color
images; although it no longer provides meaningful security, it still
results in an "encoded" image that is visually flat gray.  In the
pixels where the mask is black, the complementary color of the image
pixel is printed; in the pixels where the mask is transparent, the
image pixel's real color is printed.  When the mask is placed on the
image, the image is revealed, albeit at half intensity.

(Actually getting the resultant image to appear flat gray will
probably require compensating for nonlinearities in the output device
--- for example, for monitors, it will require gamma correction.
Failing to do this properly will result in images "bleeding through"
--- when one image is being displayed, artifacts from particularly
bright or dark parts of other images will be evident.)

If several different images are encoded thus with different,
uncorrelated masks, they can be summed --- just as the columns of the
Hadamard matrix could be summed when they had been multiplied by
constants --- to produce a resultant image from which any of the
component images can be decoded.  

The resultant image will probably have a larger color gamut than any
of the input images, and will therefore suffer more from being
compressed into a device gamut, resulting in dimmer output images with
less contrast than the input image.

Since rows and columns of a random mask will be uncorrelated with one
another, the different input images can actually be encoded with the
same mask, but shifted by one or two pixels in different directions.
Then the mask can be slid over the image to produce the different
images in sequence.  (This could actually be useful for increasing
effective video display frame rates.)  If the mask is held some
distance away from the encoded resultant image, parallax rather than
sliding can provide the offsets.  (This could also allow video
displays to produce real-time three-dimensional displays by means of
eye-to-eye parallax.)

This is actually a lot like a SIRDS in its construction, although not
in the mechanics of how it is viewed.

Here's a Python program that will demo this technique on a (raw PGM
grayscale) image and its upside-down version.  It requires Python,
Numerical Python, and ImageMagick to be installed.  It will use either
a (non-cryptographically-strong) random array or a Hadamard matrix;
each approach has its problems.  The Hadamard matrix, as you might
expect, produces sharper images with less bleed-through, but produces
very visible Hadamard-matrix artifacts in the output image.

I haven't tested it with Python 1.5.2 because I don't have Numerical
Python installed for 1.5.2.

It completely fails to deal with gamma correction, and I suspect it
may have other numerical problems as well.

import Numeric, RandomArray, string, os, random

def randary(shape):
    "Return a random matrix over {-1, 1}."
    return RandomArray.randint(0, 2, shape) * 2 - 1

def randsquare(size):
    return randary((size, size))

def sample(size=20, verbose=0):
    "Show how the random encoding works with some sample scalars."
    x = randsquare(size)
    if verbose: print x
    myvec = Numeric.zeros(size)
    myvec[3] = 5
    myvec[4] = 10
    myvec[5] = 11
    myvec[7] = 8
    if verbose: print x * myvec
    res = Numeric.matrixmultiply(x, myvec)
    if verbose: print res
    both = Numeric.matrixmultiply(res, x) / float(size)
    if verbose: print both
    print both[3:8]
    ones = Numeric.matrixmultiply(res, x > 0) / (float(size)/2)
    if verbose: print ones
    print ones[3:8]

class BadPGM(Exception): pass

def rdpgm(filename):
    """Reads raw-format PGM files produced by djpeg and the GIMP.

    I didn't bother to read the PNM spec to learn how to read PGM or
    PNM files in general.
    """
    # djpeg on a grayscale jpeg produces a grayscale PNM.
    # It seems to look like this.
    pgmfile = open(filename, 'rb')
    format = pgmfile.readline()
    if format != 'P5\n': raise BadPGM('format', format)
    size = pgmfile.readline()
    # Apparently the GIMP puts a comment here, and blank lines can be here too
    while size[0] not in string.digits:
        size = pgmfile.readline()
    sizeline = string.split(size)
    if len(sizeline) != 2: raise BadPGM('size line', sizeline)
    width = int(sizeline[0])
    height = int(sizeline[1])
    ncolors = pgmfile.readline()
    data = pgmfile.read()
    pgmfile.close()
    array = Numeric.fromstring(data, 'b')
    array.shape = (height, width)
    return array

def wrpgm(filename, data):
    """Writes a raw-format PGM file."""
    height, width = data.shape
    pgmfile = open(filename, 'wb')
    pgmfile.write("P5\n%d %d\n255\n" % (width, height))
    pgmfile.write(data.astype('b').tostring())
    pgmfile.close()

# this assumes you have ImageMagick installed
def show(data):
    """Displays a Numeric array of grayscale data.

    Writes to "tmp.foo.pgm", which could be a problem if you're
    in a world-writable directory.

    """
    fname = "tmp.foo.pgm"
    wrpgm(fname, data)
    os.system("display %s" % fname)
    os.unlink(fname)

def labshow(text, data):
    "Print a string and display an array of data."
    print text
    show(data)

# This is probably not the most efficient way to calculate this!
def parity(number, cache = {}):
    "Return whether the number of 1 bits in a number is odd, slowly."
    if number == 0: return 0
    if not cache.has_key(number):
        rv = parity(number >> 1)
        if number & 1: rv = not rv
        cache[number] = rv
    return cache[number]

def hadamard(shape):
    "Produce a Hadamard matrix of the requested shape."
    result = Numeric.ones(shape)
    for ii in range(shape[0]):
        for jj in range(shape[1]):
            if parity(ii & jj):
                result[ii][jj] = -1
    return result

def demo(filename, use_hadamard = 0, shuffle=0):
    """Simulate encoding two images and decoding them.

    By default, use a random array; can use a Hadamard matrix if
    desired.
    """
    img = rdpgm(filename)
    labshow("The image", img)
    if use_hadamard:
        mask = hadamard((img.shape[0] + 1, img.shape[1] + 1))[1:,1:]
    else:
        mask = randary(img.shape)
    if shuffle:
        random.shuffle(mask)
    labshow("The mask", mask)
    outimg = (img - 128) * mask + 128
    labshow("The encoded image (needs gamma correction!)", outimg)
    outmasked = outimg * (mask > 0)
    labshow("The masked encoded image", outmasked)
    upsidedown = img[::-1, ::-1]
    labshow("The image upside down", upsidedown)
    mask2 = Numeric.concatenate([mask[1:], mask[:1]])  # mask shifted one row
    outimgud = (upsidedown - 128) * mask2 + 128
    labshow("Encoded image upside down", outimgud)
    combimg = (outimg + outimgud) / 2
    labshow("Combined encoded image", combimg)
    labshow("Extracting first image", combimg * (mask > 0))
    labshow("Extracting upsidedown image", combimg * (mask2 > 0))

The contrast problem
--------------------

In its pure form, this technique produces dim images rather quickly.
The sums of N images whose pixel values range from -1 to 1 will be a
combined image whose pixel values range from -N to N, so when this
image's colors are scaled to fit in a device gamut, it will be N times
lower in contrast than the input images.

But the pixel values in this image should approach a normal
distribution as N grows, and its standard deviation should grow as
sqrt(N).  So if the image's colors are scaled not so that the minimum
and maximum pixels fit in the device gamut, but so that, say, 2.5
standard deviations on either side of the mean fit in the device
gamut, then only a few pixels will be affected, and N images will
combine to produce an output image lower in contrast by a factor of
sqrt(N) instead of N.
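
The sqrt(N) growth is easy to check numerically.  Here's a sketch
(not from the original post) that assumes input pixel values uniform
in [-1, 1] and independent random {-1, 1} masks:

```python
import numpy as np

rng = np.random.default_rng(2)
N, npixels = 16, 100_000

# N input images with pixel values in [-1, 1], each encoded with its
# own random {-1, 1} mask, then summed.
images = rng.uniform(-1, 1, (N, npixels))
masks = rng.integers(0, 2, (N, npixels)) * 2 - 1
combined = (images * masks).sum(axis=0)

# The worst-case range is [-N, N], but the actual spread only grows
# like sqrt(N): here the standard deviation is about sqrt(N/3) ~ 2.3.
sd = combined.std()
print(sd)

# Clipping at 2.5 standard deviations affects only about 1% of pixels:
print((np.abs(combined) > 2.5 * sd).mean())
```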

Those few pixels will not be uniformly distributed, though; they will
tend to occur more often in places where several of the input images
are either very bright or very dark.  The result will be that bright
or dark places in one input image will "bleed through" as dim spots in
bright areas of other input images.  As you turn up the contrast, this
effect will become more pronounced.

I think it should be possible to trade off these artifacts for
effective resolution by the following method.  Generate the output
image and note which pixels are being "clipped" in either the positive
or negative direction.  Find which source image pixels are
contributing to the "clipping" --- if the image is being clipped
positively, they will be darker-than-average source image pixels being
multiplied by a -1 in their mask, or lighter-than-average source
image pixels being multiplied by 1 in their mask.  Change these pixel
values by moving them toward the average, distributing the difference
to their neighboring pixels, Floyd-Steinberg style.  Now repeat the
encoding, using the same masks.  Now there should be fewer "clipped"
pixels in the output.

My intuition tells me this will not work for N < 3.

The saturation problem
----------------------

Although I haven't tried it, I think this process will tend to
desaturate color images.  I'm not sure how to fix that.

One general technique that might reduce the damage done by this
process at the cost of brightness and resolution is to change the
distribution of the values in the "mask".  Instead of multiplying half
of the image pixels by 1 and the other half by -1, you can multiply
one fourth of them by 1 and three fourths of them by -(1/3).  I
haven't tried this, so I don't know if it helps.
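
A quick sanity check of that unbalanced mask (a sketch, not something
the original post tried): values of 1 with probability 1/4 and -1/3
with probability 3/4 still average to zero, but their mean square is
1/3 rather than 1, so decoding must divide by N/3 instead of N:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000

# Mask entries: 1 with probability 1/4, -1/3 with probability 3/4.
mask = np.where(rng.random(N) < 0.25, 1.0, -1.0 / 3.0)

print(mask.mean())            # about 0: the mask is still unbiased
print((mask ** 2).mean())     # about 1/3, so decode with N/3, not N

# Encode one value with the mask and recover it:
value = 5.0
encoded = value * mask
print(encoded @ mask / (N / 3.0))    # about 5
```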


-- 
<[EMAIL PROTECTED]>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
I don't do .INI, .BAT, .DLL or .SYS files. I don't assign apps to files. I 
don't configure peripherals or networks before using them. I have a computer 
to do all that. I have a Macintosh, not a hobby. -- Fritz Anderson