Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-23 Thread Gregory Edigarov
On Wed, 18 May 2011 13:07:32 -0700
Landon Stewart lstew...@superb.net wrote:

 Lets say you had a file that was 1,000,000,000 characters consisting
 of 8,000,000,000bits.  What if instead of transferring that file
 through the interwebs you transmitted a mathematical equation to tell
 a computer on the other end how to *construct* that file.  First
 you'd feed the file into a cruncher of some type to reduce the
 pattern of 8,000,000,000 bits into an equation somehow.  Sure this
 would take time, I realize that.  The equation would then be
 transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically
 be the same pattern of bits.  Thus the same file would emerge on the
 other end.
 
 The real question here is how long would it take for a regular
 computer to do this kind of math?
 
 Just a weird idea I had.  If it's a good idea then please consider
 this intellectual property.  LOL

42 :-)



Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.

2011-05-21 Thread Marshall Eubanks

On May 20, 2011, at 5:16 PM, Sudeep Khuraijam wrote:

 I could not help but admire nanog in its full form ;)  and I cannot resist 
 anymore.  Allow me to suggest the EPR paradox machine.
 
 The cost of regenerating  unpredictable information is inefficient  by orders 
 of magnitude, but wait... isn't it what we are trying to solve?
 
 On May 20, 2011, at 1:32 PM, Paul Timmins wrote:
 
 On 05/20/2011 03:34 PM, Paul Graydon wrote:
 On 05/20/2011 08:53 AM, Brett Frankenberger wrote:
 Even if those problems were solved, you'd need (on average) just as
 many bits to represent which digit of pi to start with as you'd need to
 represent the original message.
 
-- Brett
 Not quite sure I follow that. Start at position xyz, carry on for
 1 bits shouldn't be as long as telling it all 1 bits?

Yes, it will be as long or longer (on average), because you have to represent 
position XYZ in some fashion, 
and send that representation to the decoder, and it can easily be longer than 
the original message. Suppose that it takes 20,000 bits to
represent XYZ. How have you saved bits ? Having XYZ be longer than the original 
message is just as likely as having it be shorter. 

The same problem applies to the original suggestion. You will not (on average) 
save bits. 

If typical messages are not totally random, you can compress by considering the 
nature of that non-randomness, and tailoring your compression accordingly. 
These schemes are using random strings / hashes 


   for their compression, and thus will (on average) not 
save
bits even if a message is highly non-random.

Regards
Marshall

 
 Currently we have a compression algorithm for doing this already in
 widespread use. We create a list of numbers ranging from 1 to 255 and
 then provide an index into that array. We save space by assuming it's a
 single character.
 
 
 
 Sudeep Khuraijam | Netops | liveops | Office 408 8442511 | Mobile 408 666 9987
 skhurai...@liveops.commailto:skhurai...@liveops.com | aim: skhuraijam
 
 
 
 
 
 
 
 
 




Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-20 Thread Eu-Ming Lee
To do this, you only need 2 numbers: the nth digit of pi and the number of 
digits.

Simply convert your message into a single extremely long integer. Somewhere, 
in the digits of pi, you will find a matching series of digits the same as 
your integer!

Decompressing the number is relatively easy after some sort-of recent 
advances in our understanding of pi.

Finding out what those 2 numbers are--- well, we still have a ways to go 
on that.


Despite the ridiculousness of this example, it does illustrate to the 
author that there are extremes of compression. The single mathematical 
formula compression method is possible, even trivial. However, the 
computation time for compression may be unreasonably large.

Here is another ridiculous way to compress data: Convert your data 
into a series 

of coordinates to a mandelbrot fractal set. The final picture is a 
fixed size, regardless of the size of your starting data set. From that 
final picture, you should be able to retrieve your original starting 
data set.

Good luck!





Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.

2011-05-20 Thread Brett Frankenberger
On Fri, May 20, 2011 at 06:46:45PM +, Eu-Ming Lee wrote:
 To do this, you only need 2 numbers: the nth digit of pi and the number of 
 digits.
 
 Simply convert your message into a single extremely long integer. Somewhere, 
 in the digits of pi, you will find a matching series of digits the same as 
 your integer!
 
 Decompressing the number is relatively easy after some sort-of recent 
 advances in our understanding of pi.
 
 Finding out what those 2 numbers are--- well, we still have a ways to go 
 on that.

Even if those problems were solved, you'd need (on average) just as
many bits to represent which digit of pi to start with as you'd need to
represent the original message.

 -- Brett



Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.

2011-05-20 Thread Paul Graydon

On 05/20/2011 08:53 AM, Brett Frankenberger wrote:

On Fri, May 20, 2011 at 06:46:45PM +, Eu-Ming Lee wrote:

To do this, you only need 2 numbers: the nth digit of pi and the number of
digits.

Simply convert your message into a single extremely long integer. Somewhere,
in the digits of pi, you will find a matching series of digits the same as
your integer!

Decompressing the number is relatively easy after some sort-of recent
advances in our understanding of pi.

Finding out what those 2 numbers are--- well, we still have a ways to go
on that.

Even if those problems were solved, you'd need (on average) just as
many bits to represent which digit of pi to start with as you'd need to
represent the original message.

  -- Brett
Not quite sure I follow that. Start at position xyz, carry on for 1 
bits shouldn't be as long as telling it all 1 bits?


Paul



Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.

2011-05-20 Thread Ken Chase
On Fri, May 20, 2011 at 09:34:59AM -1000, Paul Graydon said:
 Not quite sure I follow that. Start at position xyz, carry on for 1  
 bits shouldn't be as long as telling it all 1 bits?

what position # do you think your exact 1 bits will appear at?

(infact, mathies, whats the probability density function for
string of digits length N appearing in pi's digits per M digits?)

find M/N and there's your answer - might well be cheaper to
express the 1 bits themselves, than a 100,000 bit long position #
in pi.

you cant exabyte-attack all possible integers, ya know.

/kc
-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-20 Thread mikea
On Fri, May 20, 2011 at 09:34:59AM -1000, Paul Graydon wrote:
 On 05/20/2011 08:53 AM, Brett Frankenberger wrote:
 On Fri, May 20, 2011 at 06:46:45PM +, Eu-Ming Lee wrote:
 To do this, you only need 2 numbers: the nth digit of pi and the number of
 digits.
 
 Simply convert your message into a single extremely long integer. 
 Somewhere,
 in the digits of pi, you will find a matching series of digits the same as
 your integer!
 
 Decompressing the number is relatively easy after some sort-of recent
 advances in our understanding of pi.
 
 Finding out what those 2 numbers are--- well, we still have a ways to go
 on that.
 Even if those problems were solved, you'd need (on average) just as
 many bits to represent which digit of pi to start with as you'd need to
 represent the original message.
 
   -- Brett
 Not quite sure I follow that. Start at position xyz, carry on for 1 
 bits shouldn't be as long as telling it all 1 bits?

This depends strongly on the size of the number expressing position xyz.
Pi is infinitely long, so there is no guarantee that for some random string
which can be found starting at position xyz in, say, the binary, decimal,
or hexadecimal expansion of pi, xyz can be expressed in fewer than 1
(or indeed any fixed number N) bits.

-- 
Mike Andrews, W5EGO
mi...@mikea.ath.cx
Tired old sysadmin 



Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.

2011-05-20 Thread Valdis . Kletnieks
On Fri, 20 May 2011 09:34:59 -1000, Paul Graydon said:

 Not quite sure I follow that. Start at position xyz, carry on for 1 
 bits shouldn't be as long as telling it all 1 bits?

The problem is that the length of 'xyz' will probably be on the same order of
magnitude as the length of your message - if it's only 2-3 digits long, you'll
probably find a matching location in the first 100 or so digits.  But if you
need a run that's an exact match for an entire 20K email, you're going to have
to go down a long ways to find a match. (protip - compressing the email text
first is a big win here)



pgpnzvqmPYJNH.pgp
Description: PGP signature


Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.

2011-05-20 Thread Paul Timmins

On 05/20/2011 03:34 PM, Paul Graydon wrote:

On 05/20/2011 08:53 AM, Brett Frankenberger wrote:

Even if those problems were solved, you'd need (on average) just as
many bits to represent which digit of pi to start with as you'd need to
represent the original message.

  -- Brett
Not quite sure I follow that. Start at position xyz, carry on for 
1 bits shouldn't be as long as telling it all 1 bits?


Currently we have a compression algorithm for doing this already in 
widespread use. We create a list of numbers ranging from 1 to 255 and 
then provide an index into that array. We save space by assuming it's a 
single character.




Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.

2011-05-20 Thread Sudeep Khuraijam
I could not help but admire nanog in its full form ;)  and I cannot resist 
anymore.  Allow me to suggest the EPR paradox machine.

The cost of regenerating  unpredictable information is inefficient  by orders 
of magnitude, but wait... isn't it what we are trying to solve?

On May 20, 2011, at 1:32 PM, Paul Timmins wrote:

On 05/20/2011 03:34 PM, Paul Graydon wrote:
On 05/20/2011 08:53 AM, Brett Frankenberger wrote:
Even if those problems were solved, you'd need (on average) just as
many bits to represent which digit of pi to start with as you'd need to
represent the original message.

 -- Brett
Not quite sure I follow that. Start at position xyz, carry on for
1 bits shouldn't be as long as telling it all 1 bits?

Currently we have a compression algorithm for doing this already in
widespread use. We create a list of numbers ranging from 1 to 255 and
then provide an index into that array. We save space by assuming it's a
single character.



Sudeep Khuraijam | Netops | liveops | Office 408 8442511 | Mobile 408 666 9987
skhurai...@liveops.commailto:skhurai...@liveops.com | aim: skhuraijam










Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.

2011-05-20 Thread Doug Barton

On 5/20/2011 12:44 PM, Ken Chase wrote:

On Fri, May 20, 2011 at 09:34:59AM -1000, Paul Graydon said:

Not quite sure I follow that. Start at position xyz, carry on for 1
bits shouldn't be as long as telling it all 1 bits?


what position # do you think your exact 1 bits will appear at?

(infact, mathies, whats the probability density function for
string of digits length N appearing in pi's digits per M digits?)

find M/N and there's your answer - might well be cheaper to
express the 1 bits themselves, than a 100,000 bit long position #
in pi.


Blah. I seriously hate extending this silliness but I can't resist 
pointing out something that might be useful to someone to solve a real 
problem someday.


Who in their right mind would represent a string of 10**3000 numbers as 
the full string in what's supposed to be a compression algorithm? And 
yes, I'm pretty sure I just suggested the proper solution, as did the 
reference to a 255 bit array, but just in case ...


Assume that your string starts at precisely digit number 18,000,000. 
Advance to position 2**24, advance 1,222,784 digits further, begin 
recording. Obviously better/more interesting models could be developed 
by someone who actually cared. :)



Doug (you're welcome)

--

Nothin' ever doesn't change, but nothin' changes much.
-- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/




Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-19 Thread Leo Bicknell
In a message written on Wed, May 18, 2011 at 09:52:22PM -0500, Brett 
Frankenberger wrote:
 That will work.  Of course, the CPU usage will be overwhelming --
 longer than the age of the universe to do a large file -- but,
 theoretically, with enough CPU power, it will work.

You have a different definition of work than I do.  If it can't finish
before the universe ends I don't think it works. :)

-- 
   Leo Bicknell - bickn...@ufp.org - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/


pgpU3xzDOd4rF.pgp
Description: PGP signature


RE: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-19 Thread Leigh Porter


 -Original Message-
 From: Leo Bicknell [mailto:bickn...@ufp.org]
 Sent: 19 May 2011 14:10
 To: nanog
 Subject: Re: Had an idea - looking for a math buff to tell me if it's
 possible with today's technology.
 
 In a message written on Wed, May 18, 2011 at 09:52:22PM -0500, Brett
 Frankenberger wrote:
  That will work.  Of course, the CPU usage will be overwhelming --
  longer than the age of the universe to do a large file -- but,
  theoretically, with enough CPU power, it will work.
 
 You have a different definition of work than I do.  If it can't
 finish before the universe ends I don't think it works. :)

You obviously do not read enough SciFi. By then (whenever then is)  sub 
picoseconds optical quantum computers will be able to solve such problems 
before you knew they were problems ;-)

--
Leigh Porter



__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email 
__



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-19 Thread Andrew Mulholland
On Wed, May 18, 2011 at 9:33 PM, Christopher Morrow
morrowc.li...@gmail.com wrote:


 no no no.. it's simply, since the OP posited a math solution, md5.
 ship the size of file + hash, compute file on the other side. All
 files can be moved anywhere regardless of the size of the file in a
 single packet.

only problem is that of hash collision then.



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-19 Thread Landon Stewart
On Thu, May 19, 2011 at 5:05 AM, Vitkovsky, Adam avitkov...@emea.att.comwrote:

 inverse problem
 This is what I believe Landon meant in his original post

 Everybody started talking about compression -but that is I believe sending
 the result of the function -where both nodes know the function

 But how hard if at all possible is to figure out a function(or set of
 functions) and variables that describe the given data

 And than just send those functions and variables to the other node
 And let it to recompute the original file

 Complex function can be represented by simple numbers to shrink down the
 amount of data to be sent over the wire

 If the file is: 1048576

 -than that coule be represneted via:
 1*1
 X=2
 Y=10
 Where both nodes would know that 1 = x^y


Just wanted to say yes, this is entirely what I meant.  Of course the
smaller the file the more pointless it gets but still...  If the file was
1GB instead of just 7 bytes I'm wondering if a regular old workstation could
put it back together in any reasonable amount of time with the equation.


-- 
Landon Stewart lstew...@superb.net
SuperbHosting.Net by Superb Internet Corp.
Toll Free (US/Canada): 888-354-6128 x 4199
Direct: 206-438-5879
Web hosting and more Ahead of the Rest: http://www.superbhosting.net


Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-19 Thread Warren Kumari

On May 19, 2011, at 11:42 AM, Landon Stewart wrote:

 On Thu, May 19, 2011 at 5:05 AM, Vitkovsky, Adam 
 avitkov...@emea.att.comwrote:
 
 inverse problem
 This is what I believe Landon meant in his original post
 
 Everybody started talking about compression -but that is I believe sending
 the result of the function -where both nodes know the function
 
 But how hard if at all possible is to figure out a function(or set of
 functions) and variables that describe the given data
 
 And than just send those functions and variables to the other node
 And let it to recompute the original file
 
 Complex function can be represented by simple numbers to shrink down the
 amount of data to be sent over the wire
 
 If the file is: 1048576
 
 -than that coule be represneted via:
 1*1
 X=2
 Y=10
 Where both nodes would know that 1 = x^y
 
 
 Just wanted to say yes, this is entirely what I meant.  Of course the
 smaller the file the more pointless it gets but still...  If the file was
 1GB instead of just 7 bytes I'm wondering if a regular old workstation could
 put it back together in any reasonable amount of time with the equation.

While many folk have said You've just invented compression, I'm going to be a 
little more specific -- Wavelet compression.

W
 
 
 -- 
 Landon Stewart lstew...@superb.net
 SuperbHosting.Net by Superb Internet Corp.
 Toll Free (US/Canada): 888-354-6128 x 4199
 Direct: 206-438-5879
 Web hosting and more Ahead of the Rest: http://www.superbhosting.net
 




Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-19 Thread Christopher Morrow
On Thu, May 19, 2011 at 4:34 PM, Warren Kumari war...@kumari.net wrote:
 Just wanted to say yes, this is entirely what I meant.  Of course the
 smaller the file the more pointless it gets but still...  If the file was
 1GB instead of just 7 bytes I'm wondering if a regular old workstation could
 put it back together in any reasonable amount of time with the equation.

 While many folk have said You've just invented compression, I'm going to be 
 a little more specific -- Wavelet compression.

'quantum entanglement'!!



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-19 Thread Adrian Chadd
On Thu, May 19, 2011, Warren Kumari wrote:

  Just wanted to say yes, this is entirely what I meant.  Of course the
  smaller the file the more pointless it gets but still...  If the file was
  1GB instead of just 7 bytes I'm wondering if a regular old workstation could
  put it back together in any reasonable amount of time with the equation.
 
 While many folk have said You've just invented compression, I'm going to be 
 a little more specific -- Wavelet compression.

Well, yes. There's other types of function driven compression rather than
dictionary driven compression (which is just function driven compression :-),
eg iterated function systems.

The problem is finding a method that works for a variety of data. From what I
understand, (lossless) wavelet compression isn't fantastic for arbitrary types
of data.

I'd suggest the original poster pull up some literature introducing them
to information theory and compression techniques in general. Heck, even the
wikipedia article on lossless compression is a good starting point.

I think once the original poster understands some of the basics of information
theory and coding as it relates to representing say 1GB from 7 bytes as given
above, they may be better equipped to ask more specific (and useful!) questions.

HTH,



Adrian




Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread John Adams
We call that Compression.

-j


On Wed, May 18, 2011 at 1:07 PM, Landon Stewart lstew...@superb.net wrote:

 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through the
 interwebs you transmitted a mathematical equation to tell a computer on the
 other end how to *construct* that file.  First you'd feed the file into a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
 equation somehow.  Sure this would take time, I realize that.  The equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be the
 same pattern of bits.  Thus the same file would emerge on the other end.

 The real question here is how long would it take for a regular computer to
 do this kind of math?

 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.  LOL


 --
 Landon Stewart lstew...@superb.net
 SuperbHosting.Net by Superb Internet Corp.
 Toll Free (US/Canada): 888-354-6128 x 4199
 Direct: 206-438-5879
 Web hosting and more Ahead of the Rest: http://www.superbhosting.net



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Jack Carrozzo
That's basically what compression is. Except rarely (read: never) does your
Real Data (tm) fit just one equation, hence the various compression
algorithms that look for patterns etc etc.

-J

On Wed, May 18, 2011 at 4:07 PM, Landon Stewart lstew...@superb.net wrote:

 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through the
 interwebs you transmitted a mathematical equation to tell a computer on the
 other end how to *construct* that file.  First you'd feed the file into a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
 equation somehow.  Sure this would take time, I realize that.  The equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be the
 same pattern of bits.  Thus the same file would emerge on the other end.

 The real question here is how long would it take for a regular computer to
 do this kind of math?

 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.  LOL


 --
 Landon Stewart lstew...@superb.net
 SuperbHosting.Net by Superb Internet Corp.
 Toll Free (US/Canada): 888-354-6128 x 4199
 Direct: 206-438-5879
 Web hosting and more Ahead of the Rest: http://www.superbhosting.net



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Michael Holstein

 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.
   

It's easy .. the zeros are fatter than the ones.

http://dilbert.com/strips/comic/2004-12-09/

~Mike.



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Dorn Hetzel
On Wed, May 18, 2011 at 4:07 PM, Landon Stewart lstew...@superb.net wrote:

 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through the
 interwebs you transmitted a mathematical equation to tell a computer on the
 other end how to *construct* that file.  First you'd feed the file into a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
 equation somehow.  Sure this would take time, I realize that.  The equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be the
 same pattern of bits.  Thus the same file would emerge on the other end.

 The real question here is how long would it take for a regular computer to
 do this kind of math?

 The real question is whether this is possible.  And the short answer is No,
at least not in general.

Now if your file has patterns that make it compressible, you can make it
smaller, but not
all files can be compressed this way, at least not in a way that makes them
smaller.

To understand why, consider the case of a file of one byte, or 8 bits.
 There are 256 possible
files of this size, , 0001, 0010, ..., 1101, 1110,
111.

Since each code we send must generate a unique file (or what's the point, we
need 256 different codes
to represent each possible file), but the shortest general way to write 256
different codes is
still 8 bits long.  Now, we can use coding schemes and say that the one-bit
value 1 represents 
because that file happens a lot.  Then we could use 01 to represent
something else, but
we can't use 1 at the beginning again because we couldn't tell that from the
file named by 1.

Bottom line, for some codes to be shorter than the file they represent,
others must be longer...

So if files have a lot of repetition, you can get a win, but for random
data, not so much :(


RE: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Stefan Fouant
 -Original Message-
 From: Landon Stewart [mailto:lstew...@superb.net]
 Sent: Wednesday, May 18, 2011 4:08 PM
 To: nanog
 Subject: Had an idea - looking for a math buff to tell me if it's
 possible with today's technology.
 
 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through
 the
 interwebs you transmitted a mathematical equation to tell a computer on
 the
 other end how to *construct* that file.  First you'd feed the file into
 a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into
 an
 equation somehow.  Sure this would take time, I realize that.  The
 equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be
 the
 same pattern of bits.  Thus the same file would emerge on the other
 end.

Not exactly the same thing, but application acceleration of this sort has
been around for some time - 

http://www.riverbed.com/us/
http://www.juniper.net/us/en/products-services/application-acceleration/wxc-
series/
http://www.cisco.com/en/US/products/ps5680/Products_Sub_Category_Home.html

Stefan Fouant





Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Robert Bonomi


Wildly off-topic for the NANOG mailing-list, as it has -zero- relevance to
'network operations'

 Date: Wed, 18 May 2011 13:07:32 -0700
 Subject: Had an idea - looking for a math buff to tell me if it's possible
   with today's technology.
 From: Landon Stewart lstew...@superb.net
 To: nanog nanog@nanog.org

 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through the
 interwebs you transmitted a mathematical equation to tell a computer on the
 other end how to *construct* that file.  First you'd feed the file into a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
 equation somehow.  Sure this would take time, I realize that.  The equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be the
 same pattern of bits.  Thus the same file would emerge on the other end.

 The real question here is how long would it take for a regular computer to
 do this kind of math?

I have, on my computer, an encoder/decoder that does _exactly_ that.

Both the encoder and decoder are _amazingly_ fast -- as fast as a file copy,
in fact.

the average size of the tranmsitted files, across all possible input files
is exactly 100% of the size of the input files.  (one *cannot* do better
than that, across all possible inputs -- see the 'counting' problem, in
data-compression theory)

 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.  LOL

'Weird' is one word for it.  You might want to read up on the subject
of 'data compression', to get an idea of how things work.

See also polynominial curve-fitting, for the real-world limits of your
theory.
for the real-world limits of your
theory.






Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Steven Bellovin

On May 18, 2011, at 4:07 32PM, Landon Stewart wrote:

 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through the
 interwebs you transmitted a mathematical equation to tell a computer on the
 other end how to *construct* that file.  First you'd feed the file into a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
 equation somehow.  Sure this would take time, I realize that.  The equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be the
 same pattern of bits.  Thus the same file would emerge on the other end.
 
 The real question here is how long would it take for a regular computer to
 do this kind of math?
 
 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.  LOL

http://en.wikipedia.org/wiki/Kolmogorov_complexity

--Steve Bellovin, https://www.cs.columbia.edu/~smb








Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Christopher Morrow
On Wed, May 18, 2011 at 4:18 PM, Michael Holstein
michael.holst...@csuohio.edu wrote:

 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.


 It's easy .. the zeros are fatter than the ones.

no no no.. it's simply, since the OP posited a math solution, md5.
ship the size of file + hash, compute file on the other side. All
files can be moved anywhere regardless of the size of the file in a
single packet.


The solution is left as an exercise for the reader.



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Aria Stewart

On Wednesday, May 18, 2011 at 2:18 PM, Dorn Hetzel wrote: 
 On Wed, May 18, 2011 at 4:07 PM, Landon Stewart lstew...@superb.net wrote:
 
  Lets say you had a file that was 1,000,000,000 characters consisting of
  8,000,000,000bits. What if instead of transferring that file through the
  interwebs you transmitted a mathematical equation to tell a computer on the
  other end how to *construct* that file. First you'd feed the file into a
  cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
  equation somehow. Sure this would take time, I realize that. The equation
  would then be transmitted to the other computer where it would use its
  mad-math-skillz to *figure out the answer* which would theoretically be the
  same pattern of bits. Thus the same file would emerge on the other end.
  
  The real question here is how long would it take for a regular computer to
  do this kind of math?
  
  The real question is whether this is possible. And the short answer is No,
 at least not in general.
Exactly: What you run up against is that you can reduce extraneous information, 
and compress redundant information, but if you actually have dense information, 
you're not gonna get any better.

So easy to compress a billion bytes of JSON or XML significantly; not so much a 
billion bytes of already tightly coded movie.


Aria Stewart




Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread John Lee
The concept is called fractals where you can compress the image and send the
values and recreate the image. There was a body of work on the subject, I
would say in the mid to late eighties where two Georgia Tech professors
started a company doing it.

John (ISDN) Lee

On Wed, May 18, 2011 at 4:07 PM, Landon Stewart lstew...@superb.net wrote:

 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through the
 interwebs you transmitted a mathematical equation to tell a computer on the
 other end how to *construct* that file.  First you'd feed the file into a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
 equation somehow.  Sure this would take time, I realize that.  The equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be the
 same pattern of bits.  Thus the same file would emerge on the other end.

 The real question here is how long would it take for a regular computer to
 do this kind of math?

 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.  LOL


 --
 Landon Stewart lstew...@superb.net
 SuperbHosting.Net by Superb Internet Corp.
 Toll Free (US/Canada): 888-354-6128 x 4199
 Direct: 206-438-5879
 Web hosting and more Ahead of the Rest: http://www.superbhosting.net



RE: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Joe Loiacono
Stefan Fouant sfou...@shortestpathfirst.net wrote on 05/18/2011 
04:19:26 PM:

  Lets say you had a file that was 1,000,000,000 characters consisting 
of
 
 http://www.riverbed.com/us/
 
http://www.juniper.net/us/en/products-services/application-acceleration/wxc-

 series/
 
http://www.cisco.com/en/US/products/ps5680/Products_Sub_Category_Home.html

You also need to include Silver Peak.

http://www.silver-peak.com/

Saw a very interesting presentation on their techniques.

Joe


Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Leo Bicknell
In a message written on Wed, May 18, 2011 at 04:33:34PM -0400, Christopher 
Morrow wrote:
 no no no.. it's simply, since the OP posited a math solution, md5.
 ship the size of file + hash, compute file on the other side. All
 files can be moved anywhere regardless of the size of the file in a
 single packet.
 
 
 The solution is left as an exercise for the reader.

Bah, you should include the solution, it's so trivial.

Generate all possible files and then do an index lookup on the MD5.
It's a little CPU heavy, but darn simple to code.

You can even stop when you get a match, which turns out to be a HUGE
optimization. :)

-- 
   Leo Bicknell - bickn...@ufp.org - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/


pgpqsN6hKjrXD.pgp
Description: PGP signature


Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Philip Dorr
On Wed, May 18, 2011 at 3:33 PM, Christopher Morrow
morrowc.li...@gmail.com wrote:
 On Wed, May 18, 2011 at 4:18 PM, Michael Holstein
 michael.holst...@csuohio.edu wrote:

 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.


 It's easy .. the zeros are fatter than the ones.

 no no no.. it's simply, since the OP posited a math solution, md5.
 ship the size of file + hash, compute file on the other side. All
 files can be moved anywhere regardless of the size of the file in a
 single packet.


 The solution is left as an exercise for the reader.


You would need a lot of computing power to generate a file of any
decent size.  If you want to be evil then you could send just a md5
hash and a sha512 hash (or some other hash that would not have a
collision at the same time except when correct)



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Chris Owen
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On May 18, 2011, at 4:03 PM, Leo Bicknell wrote:

 Bah, you should include the solution, it's so trivial.
 
 Generate all possible files and then do an index lookup on the MD5.
 It's a little CPU heavy, but darn simple to code.

Isn't this essentially what Dropbox has been doing in many cases?

Chris

- --
- -
Chris Owen - Garden City (620) 275-1900 -  Lottery (noun):
President  - Wichita (316) 858-3000 -A stupidity tax
Hubris Communications Inc  www.hubris.net
- -


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (Darwin)
Comment: Public Key: http://home.hubris.net/owenc/pgpkey.txt
Comment: Public Key ID: 0xB513D9DD

iEYEARECAAYFAk3UOKIACgkQElUlCLUT2d3YoQCfee38nKuXD5O4C2w5VXUWszF1
EjcAmwfyytDgwmQDpJsQZSpl03ddGbVv
=3sX9
-END PGP SIGNATURE-



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Christopher Morrow
On Wed, May 18, 2011 at 4:47 PM, Joe Loiacono jloia...@csc.com wrote:

 You also need to include Silver Peak.


only if you like random failures.



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Heath Jones
I wonder if this is possible:

- Take a hash of the original file. Keep a counter.
- Generate data in some sequential method on sender side (for example simply
starting at 0 and iterating until you generate the same as the original
data)
- Each time you iterate, take the hash of the generated data. If it matches
the hash of the original file, increment counter.
- Send the hash and the counter value to recipient.
- Recipient performs same sequential generation method, stopping when
counter reached.

Any thoughts?

Heath


On 18 May 2011 21:07, Landon Stewart lstew...@superb.net wrote:

 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through the
 interwebs you transmitted a mathematical equation to tell a computer on the
 other end how to *construct* that file.  First you'd feed the file into a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
 equation somehow.  Sure this would take time, I realize that.  The equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be the
 same pattern of bits.  Thus the same file would emerge on the other end.

 The real question here is how long would it take for a regular computer to
 do this kind of math?

 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.  LOL


 --
 Landon Stewart lstew...@superb.net
 SuperbHosting.Net by Superb Internet Corp.
 Toll Free (US/Canada): 888-354-6128 x 4199
 Direct: 206-438-5879
 Web hosting and more Ahead of the Rest: http://www.superbhosting.net



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Valdis . Kletnieks
On Thu, 19 May 2011 00:26:26 BST, Heath Jones said:
 I wonder if this is possible:
 
 - Take a hash of the original file. Keep a counter.
 - Generate data in some sequential method on sender side (for example simply
 starting at 0 and iterating until you generate the same as the original
 data)
 - Each time you iterate, take the hash of the generated data. If it matches
 the hash of the original file, increment counter.
 - Send the hash and the counter value to recipient.
 - Recipient performs same sequential generation method, stopping when
 counter reached.

MD5 is a 128 bit hash.

2^128 is 340,282,366,920,938,463,463,374,607,431,768,211,456 - you're welcome
to iterate that many times to find a duplicate. You may get lucky and get a hit
in the first trillion or so attempts - but you may get unlucky and not get a
hit until the *last* few trillion attempts. On average you'll have to iterate
about half that huge number before you get a hit.

And it's lossy - if you hash all the possible 4K blocks with MD5, you'll find
that each of those 2^128 hashes has been hit about 256 times - and no
indication in the hash of *which* of the 256 colliding 4K blocks you have on
this iteration.  (The only reason that companies can do block-level
de-duplication by saving a hash as an index to one copy shared by all blocks
with the same hash value is because you have a *very small* fraction of the
possibilities covered, so if you saved a 4K block of data from somebody's
system32 folder under a given MD5 hash, it's *far* more likely that another
block with that same hash is from another copy of another identical system32
folder, than it is an actual accidental collision.)

Protip:  A good hash function is by definition one-way - given the data, it's
easy to generate the hash - but reversing it to find the pre-image (the data
that *generated* the hash) is massively difficult.



pgpcx4G19LjCd.pgp
Description: PGP signature


Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Heath Jones
My point here is it IS possible to transfer just a hash and counter value
and effectively generate identical data at the remote end.
The limit that will be hit is the difficulty of generating and comparing
hash values with current processing power.

I'm proposing iterating through generated data up until the actual data.
It's not even a storage issue, as once you have incremented the data you
don't need to store old data or hash values - just the counter. No massive
hash tables.

It's a CPU issue.

On 19 May 2011 00:42, valdis.kletni...@vt.edu wrote:

 On Thu, 19 May 2011 00:26:26 BST, Heath Jones said:
  I wonder if this is possible:
 
  - Take a hash of the original file. Keep a counter.
  - Generate data in some sequential method on sender side (for example
 simply
  starting at 0 and iterating until you generate the same as the original
  data)
  - Each time you iterate, take the hash of the generated data. If it
 matches
  the hash of the original file, increment counter.
  - Send the hash and the counter value to recipient.
  - Recipient performs same sequential generation method, stopping when
  counter reached.

 MD5 is a 128 bit hash.

 2^128 is 340,282,366,920,938,463,463,374,607,431,768,211,456 - you're
 welcome
 to iterate that many times to find a duplicate. You may get lucky and get a
 hit
 in the first trillion or so attempts - but you may get unlucky and not get
 a
 hit until the *last* few trillion attempts. On average you'll have to
 iterate
 about half that huge number before you get a hit.

 And it's lossy - if you hash all the possible 4K blocks with MD5, you'll
 find
 that each of those 2^128 hashes has been hit about 256 times - and no
 indication in the hash of *which* of the 256 colliding 4K blocks you have
 on
 this iteration.  (The only reason that companies can do block-level
 de-duplication by saving a hash as an index to one copy shared by all
 blocks
 with the same hash value is because you have a *very small* fraction of the
 possibilities covered, so if you saved a 4K block of data from somebody's
 system32 folder under a given MD5 hash, it's *far* more likely that another
 block with that same hash is from another copy of another identical
 system32
 folder, than it is an actual accidental collision.)

 Protip:  A good hash function is by definition one-way - given the data,
 it's
 easy to generate the hash - but reversing it to find the pre-image (the
 data
 that *generated* the hash) is massively difficult.




Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Chrisjfenton
try itu v.42bis 


Iridescent iPhone 
+1 972 757 8894



On May 18, 2011, at 15:07, Landon Stewart lstew...@superb.net wrote:

 Lets say you had a file that was 1,000,000,000 characters consisting of
 8,000,000,000bits.  What if instead of transferring that file through the
 interwebs you transmitted a mathematical equation to tell a computer on the
 other end how to *construct* that file.  First you'd feed the file into a
 cruncher of some type to reduce the pattern of 8,000,000,000 bits into an
 equation somehow.  Sure this would take time, I realize that.  The equation
 would then be transmitted to the other computer where it would use its
 mad-math-skillz to *figure out the answer* which would theoretically be the
 same pattern of bits.  Thus the same file would emerge on the other end.
 
 The real question here is how long would it take for a regular computer to
 do this kind of math?
 
 Just a weird idea I had.  If it's a good idea then please consider this
 intellectual property.  LOL
 
 
 -- 
 Landon Stewart lstew...@superb.net
 SuperbHosting.Net by Superb Internet Corp.
 Toll Free (US/Canada): 888-354-6128 x 4199
 Direct: 206-438-5879
 Web hosting and more Ahead of the Rest: http://www.superbhosting.net



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Christopher Morrow
On Wed, May 18, 2011 at 8:03 PM, Heath Jones hj1...@gmail.com wrote:
 My point here is it IS possible to transfer just a hash and counter value
 and effectively generate identical data at the remote end.
 The limit that will be hit is the difficulty of generating and comparing
 hash values with current processing power.

 I'm proposing iterating through generated data up until the actual data.
 It's not even a storage issue, as once you have incremented the data you
 don't need to store old data or hash values - just the counter. No massive
 hash tables.

 It's a CPU issue.


i'd note it took you many more packets than my example of roughly the
same thing.

if you really want to save bandwidth, my 1 packet answer is the best answer.



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Barry Shein

Compression is one result.

But this is sometimes referred to as the inverse problem: Given a
set of data tell me a function which fits it (or fits it to some
tolerance.) It's important in statistics and all kinds of data
analyses.

Another area is fourier transforms which basically sums sine waves of
different amp/freq until you reach the desired fit. This is also the
basis of a lot of noise filtering algorithms, throw out the
frequencies you don't want, such as 60HZ or 50HZ, or all those smaller
than you consider interesting, high-freq noise, or low freq noise,
whatever.

Another buzz term is data entropy, randomness. If the data were
perfectly random then there exists no such function which can be
represented in less bits than the original data, which is why you
can't compress a compressed file indefinitely and also why it's
recommended you compress files before encrypting them, it's hard to
begin cracking a file which is pretty close to random.

And this is what you do when you give something like a MARC or ISBN or
Dewey Decimal index to a librarian and s/he brings you the book you
want. Effectively you've represented the entire book as that small
number. Imagine if you had to recite the entire text of a book to
find it unambiguously! See: Transfinite Number Systems.

-- 
-Barry Shein

The World  | b...@theworld.com   | http://www.TheWorld.com
Purveyors to the Trade | Voice: 800-THE-WRLD| Dial-Up: US, PR, Canada
Software Tool  Die| Public Access Internet | SINCE 1989 *oo*



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Brett Frankenberger
On Thu, May 19, 2011 at 12:26:26AM +0100, Heath Jones wrote:
 I wonder if this is possible:
 
 - Take a hash of the original file. Keep a counter.
 - Generate data in some sequential method on sender side (for example simply
 starting at 0 and iterating until you generate the same as the original
 data)
 - Each time you iterate, take the hash of the generated data. If it matches
 the hash of the original file, increment counter.
 - Send the hash and the counter value to recipient.
 - Recipient performs same sequential generation method, stopping when
 counter reached.
 
 Any thoughts?

That will work.  Of course, the CPU usage will be overwhelming --
longer than the age of the universe to do a large file -- but,
theoretically, with enough CPU power, it will work.

For a 8,000,000,000 bit file and a 128 bit hash, you will need a
counter of at least 7,999,999,872 bits to cover the number of possible
collisions.

So you will need at leat 7,999,999,872 + 128 = 8,000,000,000 bits to
send your 8,000,000,000 bit file.  If your goal is to reduce the number
of bits you send, this wouldn't be a good choice.

 -- Brett



Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.

2011-05-18 Thread Heath Jones
Ha! I was wondering this the whole time - if the size of the counter would
make it a zero sum game. That sux! :)

On 19 May 2011 03:52, Brett Frankenberger rbf+na...@panix.com wrote:

 On Thu, May 19, 2011 at 12:26:26AM +0100, Heath Jones wrote:
  I wonder if this is possible:
 
  - Take a hash of the original file. Keep a counter.
  - Generate data in some sequential method on sender side (for example
 simply
  starting at 0 and iterating until you generate the same as the original
  data)
  - Each time you iterate, take the hash of the generated data. If it
 matches
  the hash of the original file, increment counter.
  - Send the hash and the counter value to recipient.
  - Recipient performs same sequential generation method, stopping when
  counter reached.
 
  Any thoughts?

 That will work.  Of course, the CPU usage will be overwhelming --
 longer than the age of the universe to do a large file -- but,
 theoretically, with enough CPU power, it will work.

 For a 8,000,000,000 bit file and a 128 bit hash, you will need a
 counter of at least 7,999,999,872 bits to cover the number of possible
 collisions.

 So you will need at leat 7,999,999,872 + 128 = 8,000,000,000 bits to
 send your 8,000,000,000 bit file.  If your goal is to reduce the number
 of bits you send, this wouldn't be a good choice.

 -- Brett