Re: [R] Error Correcting Codes, Simplex

2006-10-17 Thread Richard Graham
On 10/16/06, Björn Egert [EMAIL PROTECTED] wrote:
 On 10/8/06, Egert, Bjoern [EMAIL PROTECTED] wrote:
  Hello,
 
Is there a way in R to construct an (error correcting) binary code
e.g. for an source alphabet containing integers from 1 to say 255
with the property that each pair of distinct codewords of length m
is at Hamming distance exactly m/2 ?
 
I was suggested to use so called simplex codes, which should be
fairly standard, but I haven't found a direct way via R packages
to do so, that's why I ask whether there might be in indirect way
to solve this problem.
 
Example:
v1  =c(1,2,3,4)
v2  =c(1,2,5,6)
similarity(v1,v2)=0.5, (because 2 out of 4 elements are equal).
Obviously, a binary representation of would yield a different
similarity of:
binary(v1) =001 010 011 100
binary(v1) =001 010 101 110
similarity(binary(v1),binary(v2))= 9/12
 
Remark: The focus here is not on error correction, but rather the
binary encoding retaining similarity of the elements of vectors.
 
  Many thanks,
  Bjoern
 
  Bjoern,
 
  NB:  I'm an R newbie and I only know a bit about error correcting codes.
 
  I haven't seen any responses to your questions and I don't know if you
  still
  have a need, but it is certainly possible to construct forward error
  correction
  codes with all the great math capability in R.
 
  It seems you want to generate code words that still have the original
  bits
  present.  These are systematic codes and there are lots of them available
  to use.  Many codes are specified by the code word length (n), number
  of original data
  bits in each code word (k), and the minimum Hamming distance of the
  code words (d)
  as a [n,k,d] code. Simplex Codes have these parameters: [2^k - 1, k,
  2^(k - 1)].  These
  codes could be generated as a simple matrix multiply in R, but are you
  sure that's what
  you want?  The code words will be quite long.
 
  Regards,
  Richard Graham


 Hello,

   thank you.

   yes, basically, that's what I want.
   Just a binary encoding of an arbitrary integer value (or vector of
 integers)
   with the property that each pair of distinct integer values have an
 equal Hamming-
   distance (m/2), so as to be able to a similarity search

   I got the idea from: Gionis: Efficient and Tunable Similar Set
 Retrieval (Chap 3.2)

 regards
 Bjoern



Bjoern,

I read only the section of the paper you mention and I'll trust that
the stated properties of Simplex Codes are true.  I haven't researched
or verified it.

[from http://magma.maths.usyd.edu.au/]
Magma is a large, well-supported software package designed to solve
computationally hard problems in algebra, number theory, geometry and
combinatorics. It provides a mathematically rigorous environment for
computing with algebraic, number-theoretic, combinatoric and geometric
objects.

I don't understand a fraction of its capability but I still find it to
be very useful.  In fact, they have an online calculator that will
give you the generator matrix you want.  The online Magma calculator
is at:

http://magma.maths.usyd.edu.au/calc/

To calculate the generator matrix I think you are asking for, go to
the above URL and cut/paste the following command:

ExtendCode(SimplexCode(8));

Click Evaluate and the output window will contain a [256, 8, 128]
Linear Code over GF(2).  You'll need to massage this a bit to use
it as a matrix for R.  I'd use Ruby to do this, but anything will do.
If you want to encode more/less than 8 bits, you can modify the above
argument to SimplexCode.

I used ExtendCode so that the codeword length == Dmin * 2

The Gionis claim I'll research or verify sometime is that _every_ pair
of Simplex Code words  of length m have Hamming distance == m/2.  If
you have a reference to a proof, I'd like to read it (like I said, I
only know a bit about ECC).

Good Luck with your work!
Richard Graham

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error Correcting Codes, Simplex

2006-10-17 Thread Thomas Lumley

On Tue, 17 Oct 2006, Richard Graham wrote:


On 10/16/06, Björn Egert [EMAIL PROTECTED] wrote:

On 10/8/06, Egert, Bjoern [EMAIL PROTECTED] wrote:

Hello,

  Is there a way in R to construct an (error correcting) binary code
  e.g. for an source alphabet containing integers from 1 to say 255
  with the property that each pair of distinct codewords of length m
  is at Hamming distance exactly m/2 ?

  I was suggested to use so called simplex codes, which should be
  fairly standard, but I haven't found a direct way via R packages
  to do so, that's why I ask whether there might be in indirect way
  to solve this problem.



The survey package has a function hadamard() to construct Hadamard 
matrices, which are what simplex codes come from.


-thomas__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error Correcting Codes, Simplex

2006-10-15 Thread Richard Graham
On 10/8/06, Egert, Bjoern [EMAIL PROTECTED] wrote:
 Hello,

   Is there a way in R to construct an (error correcting) binary code
   e.g. for an source alphabet containing integers from 1 to say 255
   with the property that each pair of distinct codewords of length m
   is at Hamming distance exactly m/2 ?

   I was suggested to use so called simplex codes, which should be
   fairly standard, but I haven't found a direct way via R packages
   to do so, that's why I ask whether there might be in indirect way
   to solve this problem.

   Example:
   v1  =c(1,2,3,4)
   v2  =c(1,2,5,6)
   similarity(v1,v2)=0.5, (because 2 out of 4 elements are equal).
   Obviously, a binary representation of would yield a different
   similarity of:
   binary(v1) =001 010 011 100
   binary(v1) =001 010 101 110
   similarity(binary(v1),binary(v2))= 9/12

   Remark: The focus here is not on error correction, but rather the
   binary encoding retaining similarity of the elements of vectors.

 Many thanks,
 Bjoern

Bjoern,

NB:  I'm an R newbie and I only know a bit about error correcting codes.

I haven't seen any responses to your questions and I don't know if you still
have a need, but it is certainly possible to construct forward error correction
codes with all the great math capability in R.

It seems you want to generate code words that still have the original bits
present.  These are systematic codes and there are lots of them available
to use.  Many codes are specified by the code word length (n), number
of original data
bits in each code word (k), and the minimum Hamming distance of the
code words (d)
as a [n,k,d] code. Simplex Codes have these parameters: [2^k - 1, k,
2^(k - 1)].  These
codes could be generated as a simple matrix multiply in R, but are you
sure that's what
you want?  The code words will be quite long.

Regards,
Richard Graham

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.