Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Wed, 18 May 2011 13:07:32 -0700 Landon Stewart lstew...@superb.net wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? Just a weird idea I had. If it's a good idea then please consider this intellectual property. LOL 42 :-)
Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.
On May 20, 2011, at 5:16 PM, Sudeep Khuraijam wrote: I could not help but admire nanog in its full form ;) and I cannot resist anymore. Allow me to suggest the EPR paradox machine. The cost of regenerating unpredictable information is inefficient by orders of magnitude, but wait... isn't it what we are trying to solve? On May 20, 2011, at 1:32 PM, Paul Timmins wrote: On 05/20/2011 03:34 PM, Paul Graydon wrote: On 05/20/2011 08:53 AM, Brett Frankenberger wrote: Even if those problems were solved, you'd need (on average) just as many bits to represent which digit of pi to start with as you'd need to represent the original message. -- Brett Not quite sure I follow that. Start at position xyz, carry on for 1 bits shouldn't be as long as telling it all 1 bits? Yes, it will be as long or longer (on average), because you have to represent position XYZ in some fashion, and send that representation to the decoder, and it can easily be longer than the original message. Suppose that it takes 20,000 bits to represent XYZ. How have you saved bits ? Having XYZ be longer than the original message is just as likely as having it be shorter. The same problem applies to the original suggestion. You will not (on average) save bits. If typical messages are not totally random, you can compress by considering the nature of that non-randomness, and tailoring your compression accordingly. These schemes are using random strings / hashes for their compression, and thus will (on average) not save bits even if a message is highly non-random. Regards Marshall Currently we have a compression algorithm for doing this already in widespread use. We create a list of numbers ranging from 1 to 255 and then provide an index into that array. We save space by assuming it's a single character. Sudeep Khuraijam | Netops | liveops | Office 408 8442511 | Mobile 408 666 9987 skhurai...@liveops.commailto:skhurai...@liveops.com | aim: skhuraijam
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
To do this, you only need 2 numbers: the nth digit of pi and the number of digits. Simply convert your message into a single extremely long integer. Somewhere, in the digits of pi, you will find a matching series of digits the same as your integer! Decompressing the number is relatively easy after some sort-of recent advances in our understanding of pi. Finding out what those 2 numbers are--- well, we still have a ways to go on that. Despite the ridiculousness of this example, it does illustrate to the author that there are extremes of compression. The single mathematical formula compression method is possible, even trivial. However, the computation time for compression may be unreasonably large. Here is another ridiculous way to compress data: Convert your data into a series of coordinates to a mandelbrot fractal set. The final picture is a fixed size, regardless of the size of your starting data set. From that final picture, you should be able to retrieve your original starting data set. Good luck!
Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.
On Fri, May 20, 2011 at 06:46:45PM +, Eu-Ming Lee wrote: To do this, you only need 2 numbers: the nth digit of pi and the number of digits. Simply convert your message into a single extremely long integer. Somewhere, in the digits of pi, you will find a matching series of digits the same as your integer! Decompressing the number is relatively easy after some sort-of recent advances in our understanding of pi. Finding out what those 2 numbers are--- well, we still have a ways to go on that. Even if those problems were solved, you'd need (on average) just as many bits to represent which digit of pi to start with as you'd need to represent the original message. -- Brett
Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.
On 05/20/2011 08:53 AM, Brett Frankenberger wrote: On Fri, May 20, 2011 at 06:46:45PM +, Eu-Ming Lee wrote: To do this, you only need 2 numbers: the nth digit of pi and the number of digits. Simply convert your message into a single extremely long integer. Somewhere, in the digits of pi, you will find a matching series of digits the same as your integer! Decompressing the number is relatively easy after some sort-of recent advances in our understanding of pi. Finding out what those 2 numbers are--- well, we still have a ways to go on that. Even if those problems were solved, you'd need (on average) just as many bits to represent which digit of pi to start with as you'd need to represent the original message. -- Brett Not quite sure I follow that. Start at position xyz, carry on for 1 bits shouldn't be as long as telling it all 1 bits? Paul
Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.
On Fri, May 20, 2011 at 09:34:59AM -1000, Paul Graydon said: Not quite sure I follow that. Start at position xyz, carry on for 1 bits shouldn't be as long as telling it all 1 bits? what position # do you think your exact 1 bits will appear at? (infact, mathies, whats the probability density function for string of digits length N appearing in pi's digits per M digits?) find M/N and there's your answer - might well be cheaper to express the 1 bits themselves, than a 100,000 bit long position # in pi. you cant exabyte-attack all possible integers, ya know. /kc -- Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Fri, May 20, 2011 at 09:34:59AM -1000, Paul Graydon wrote: On 05/20/2011 08:53 AM, Brett Frankenberger wrote: On Fri, May 20, 2011 at 06:46:45PM +, Eu-Ming Lee wrote: To do this, you only need 2 numbers: the nth digit of pi and the number of digits. Simply convert your message into a single extremely long integer. Somewhere, in the digits of pi, you will find a matching series of digits the same as your integer! Decompressing the number is relatively easy after some sort-of recent advances in our understanding of pi. Finding out what those 2 numbers are--- well, we still have a ways to go on that. Even if those problems were solved, you'd need (on average) just as many bits to represent which digit of pi to start with as you'd need to represent the original message. -- Brett Not quite sure I follow that. Start at position xyz, carry on for 1 bits shouldn't be as long as telling it all 1 bits? This depends strongly on the size of the number expressing position xyz. Pi is infinitely long, so there is no guarantee that for some random string which can be found starting at position xyz in, say, the binary, decimal, or hexadecimal expansion of pi, xyz can be expressed in fewer than 1 (or indeed any fixed number N) bits. -- Mike Andrews, W5EGO mi...@mikea.ath.cx Tired old sysadmin
Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.
On Fri, 20 May 2011 09:34:59 -1000, Paul Graydon said: Not quite sure I follow that. Start at position xyz, carry on for 1 bits shouldn't be as long as telling it all 1 bits? The problem is that the length of 'xyz' will probably be on the same order of magnitude as the length of your message - if it's only 2-3 digits long, you'll probably find a matching location in the first 100 or so digits. But if you need a run that's an exact match for an entire 20K email, you're going to have to go down a long ways to find a match. (protip - compressing the email text first is a big win here) pgpnzvqmPYJNH.pgp Description: PGP signature
Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.
On 05/20/2011 03:34 PM, Paul Graydon wrote: On 05/20/2011 08:53 AM, Brett Frankenberger wrote: Even if those problems were solved, you'd need (on average) just as many bits to represent which digit of pi to start with as you'd need to represent the original message. -- Brett Not quite sure I follow that. Start at position xyz, carry on for 1 bits shouldn't be as long as telling it all 1 bits? Currently we have a compression algorithm for doing this already in widespread use. We create a list of numbers ranging from 1 to 255 and then provide an index into that array. We save space by assuming it's a single character.
Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.
I could not help but admire nanog in its full form ;) and I cannot resist anymore. Allow me to suggest the EPR paradox machine. The cost of regenerating unpredictable information is inefficient by orders of magnitude, but wait... isn't it what we are trying to solve? On May 20, 2011, at 1:32 PM, Paul Timmins wrote: On 05/20/2011 03:34 PM, Paul Graydon wrote: On 05/20/2011 08:53 AM, Brett Frankenberger wrote: Even if those problems were solved, you'd need (on average) just as many bits to represent which digit of pi to start with as you'd need to represent the original message. -- Brett Not quite sure I follow that. Start at position xyz, carry on for 1 bits shouldn't be as long as telling it all 1 bits? Currently we have a compression algorithm for doing this already in widespread use. We create a list of numbers ranging from 1 to 255 and then provide an index into that array. We save space by assuming it's a single character. Sudeep Khuraijam | Netops | liveops | Office 408 8442511 | Mobile 408 666 9987 skhurai...@liveops.commailto:skhurai...@liveops.com | aim: skhuraijam
Re: Had an idea - looking for a math buff to tell me if it's possible?with today's technology.
On 5/20/2011 12:44 PM, Ken Chase wrote: On Fri, May 20, 2011 at 09:34:59AM -1000, Paul Graydon said: Not quite sure I follow that. Start at position xyz, carry on for 1 bits shouldn't be as long as telling it all 1 bits? what position # do you think your exact 1 bits will appear at? (infact, mathies, whats the probability density function for string of digits length N appearing in pi's digits per M digits?) find M/N and there's your answer - might well be cheaper to express the 1 bits themselves, than a 100,000 bit long position # in pi. Blah. I seriously hate extending this silliness but I can't resist pointing out something that might be useful to someone to solve a real problem someday. Who in their right mind would represent a string of 10**3000 numbers as the full string in what's supposed to be a compression algorithm? And yes, I'm pretty sure I just suggested the proper solution, as did the reference to a 255 bit array, but just in case ... Assume that your string starts at precisely digit number 18,000,000. Advance to position 2**24, advance 1,222,784 digits further, begin recording. Obviously better/more interesting models could be developed by someone who actually cared. :) Doug (you're welcome) -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
In a message written on Wed, May 18, 2011 at 09:52:22PM -0500, Brett Frankenberger wrote: That will work. Of course, the CPU usage will be overwhelming -- longer than the age of the universe to do a large file -- but, theoretically, with enough CPU power, it will work. You have a different definition of work than I do. If it can't finish before the universe ends I don't think it works. :) -- Leo Bicknell - bickn...@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ pgpU3xzDOd4rF.pgp Description: PGP signature
RE: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
-Original Message- From: Leo Bicknell [mailto:bickn...@ufp.org] Sent: 19 May 2011 14:10 To: nanog Subject: Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology. In a message written on Wed, May 18, 2011 at 09:52:22PM -0500, Brett Frankenberger wrote: That will work. Of course, the CPU usage will be overwhelming -- longer than the age of the universe to do a large file -- but, theoretically, with enough CPU power, it will work. You have a different definition of work than I do. If it can't finish before the universe ends I don't think it works. :) You obviously do not read enough SciFi. By then (whenever then is) sub picoseconds optical quantum computers will be able to solve such problems before you knew they were problems ;-) -- Leigh Porter __ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email __
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Wed, May 18, 2011 at 9:33 PM, Christopher Morrow morrowc.li...@gmail.com wrote: no no no.. it's simply, since the OP posited a math solution, md5. ship the size of file + hash, compute file on the other side. All files can be moved anywhere regardless of the size of the file in a single packet. only problem is that of hash collision then.
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Thu, May 19, 2011 at 5:05 AM, Vitkovsky, Adam avitkov...@emea.att.comwrote: inverse problem This is what I believe Landon meant in his original post Everybody started talking about compression -but that is I believe sending the result of the function -where both nodes know the function But how hard if at all possible is to figure out a function(or set of functions) and variables that describe the given data And than just send those functions and variables to the other node And let it to recompute the original file Complex function can be represented by simple numbers to shrink down the amount of data to be sent over the wire If the file is: 1048576 -than that coule be represneted via: 1*1 X=2 Y=10 Where both nodes would know that 1 = x^y Just wanted to say yes, this is entirely what I meant. Of course the smaller the file the more pointless it gets but still... If the file was 1GB instead of just 7 bytes I'm wondering if a regular old workstation could put it back together in any reasonable amount of time with the equation. -- Landon Stewart lstew...@superb.net SuperbHosting.Net by Superb Internet Corp. Toll Free (US/Canada): 888-354-6128 x 4199 Direct: 206-438-5879 Web hosting and more Ahead of the Rest: http://www.superbhosting.net
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On May 19, 2011, at 11:42 AM, Landon Stewart wrote: On Thu, May 19, 2011 at 5:05 AM, Vitkovsky, Adam avitkov...@emea.att.comwrote: inverse problem This is what I believe Landon meant in his original post Everybody started talking about compression -but that is I believe sending the result of the function -where both nodes know the function But how hard if at all possible is to figure out a function(or set of functions) and variables that describe the given data And than just send those functions and variables to the other node And let it to recompute the original file Complex function can be represented by simple numbers to shrink down the amount of data to be sent over the wire If the file is: 1048576 -than that coule be represneted via: 1*1 X=2 Y=10 Where both nodes would know that 1 = x^y Just wanted to say yes, this is entirely what I meant. Of course the smaller the file the more pointless it gets but still... If the file was 1GB instead of just 7 bytes I'm wondering if a regular old workstation could put it back together in any reasonable amount of time with the equation. While many folk have said You've just invented compression, I'm going to be a little more specific -- Wavelet compression. W -- Landon Stewart lstew...@superb.net SuperbHosting.Net by Superb Internet Corp. Toll Free (US/Canada): 888-354-6128 x 4199 Direct: 206-438-5879 Web hosting and more Ahead of the Rest: http://www.superbhosting.net
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Thu, May 19, 2011 at 4:34 PM, Warren Kumari war...@kumari.net wrote: Just wanted to say yes, this is entirely what I meant. Of course the smaller the file the more pointless it gets but still... If the file was 1GB instead of just 7 bytes I'm wondering if a regular old workstation could put it back together in any reasonable amount of time with the equation. While many folk have said You've just invented compression, I'm going to be a little more specific -- Wavelet compression. 'quantum entanglement'!!
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Thu, May 19, 2011, Warren Kumari wrote: Just wanted to say yes, this is entirely what I meant. Of course the smaller the file the more pointless it gets but still... If the file was 1GB instead of just 7 bytes I'm wondering if a regular old workstation could put it back together in any reasonable amount of time with the equation. While many folk have said You've just invented compression, I'm going to be a little more specific -- Wavelet compression. Well, yes. There's other types of function driven compression rather than dictionary driven compression (which is just function driven compression :-), eg iterated function systems. The problem is finding a method that works for a variety of data. From what I understand, (lossless) wavelet compression isn't fantastic for arbitrary types of data. I'd suggest the original poster pull up some literature introducing them to information theory and compression techniques in general. Heck, even the wikipedia article on lossless compression is a good starting point. I think once the original poster understands some of the basics of information theory and coding as it relates to representing say 1GB from 7 bytes as given above, they may be better equipped to ask more specific (and useful!) questions. HTH, Adrian
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
We call that Compression. -j On Wed, May 18, 2011 at 1:07 PM, Landon Stewart lstew...@superb.net wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? Just a weird idea I had. If it's a good idea then please consider this intellectual property. LOL -- Landon Stewart lstew...@superb.net SuperbHosting.Net by Superb Internet Corp. Toll Free (US/Canada): 888-354-6128 x 4199 Direct: 206-438-5879 Web hosting and more Ahead of the Rest: http://www.superbhosting.net
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
That's basically what compression is. Except rarely (read: never) does your Real Data (tm) fit just one equation, hence the various compression algorithms that look for patterns etc etc. -J On Wed, May 18, 2011 at 4:07 PM, Landon Stewart lstew...@superb.net wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? Just a weird idea I had. If it's a good idea then please consider this intellectual property. LOL -- Landon Stewart lstew...@superb.net SuperbHosting.Net by Superb Internet Corp. Toll Free (US/Canada): 888-354-6128 x 4199 Direct: 206-438-5879 Web hosting and more Ahead of the Rest: http://www.superbhosting.net
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
Just a weird idea I had. If it's a good idea then please consider this intellectual property. It's easy .. the zeros are fatter than the ones. http://dilbert.com/strips/comic/2004-12-09/ ~Mike.
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Wed, May 18, 2011 at 4:07 PM, Landon Stewart lstew...@superb.net wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? The real question is whether this is possible. And the short answer is No, at least not in general. Now if your file has patterns that make it compressible, you can make it smaller, but not all files can be compressed this way, at least not in a way that makes them smaller. To understand why, consider the case of a file of one byte, or 8 bits. There are 256 possible files of this size, , 0001, 0010, ..., 1101, 1110, 111. Since each code we send must generate a unique file (or what's the point, we need 256 different codes to represent each possible file), but the shortest general way to write 256 different codes is still 8 bits long. Now, we can use coding schemes and say that the one-bit value 1 represents because that file happens a lot. Then we could use 01 to represent something else, but we can't use 1 at the beginning again because we couldn't tell that from the file named by 1. Bottom line, for some codes to be shorter than the file they represent, others must be longer... So if files have a lot of repetition, you can get a win, but for random data, not so much :(
RE: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
-Original Message- From: Landon Stewart [mailto:lstew...@superb.net] Sent: Wednesday, May 18, 2011 4:08 PM To: nanog Subject: Had an idea - looking for a math buff to tell me if it's possible with today's technology. Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. Not exactly the same thing, but application acceleration of this sort has been around for some time - http://www.riverbed.com/us/ http://www.juniper.net/us/en/products-services/application-acceleration/wxc- series/ http://www.cisco.com/en/US/products/ps5680/Products_Sub_Category_Home.html Stefan Fouant
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
Wildly off-topic for the NANOG mailing-list, as it has -zero- relevance to 'network operations' Date: Wed, 18 May 2011 13:07:32 -0700 Subject: Had an idea - looking for a math buff to tell me if it's possible with today's technology. From: Landon Stewart lstew...@superb.net To: nanog nanog@nanog.org Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? I have, on my computer, an encoder/decoder that does _exactly_ that. Both the encoder and decoder are _amazingly_ fast -- as fast as a file copy, in fact. the average size of the tranmsitted files, across all possible input files is exactly 100% of the size of the input files. (one *cannot* do better than that, across all possible inputs -- see the 'counting' problem, in data-compression theory) Just a weird idea I had. If it's a good idea then please consider this intellectual property. LOL 'Weird' is one word for it. You might want to read up on the subject of 'data compression', to get an idea of how things work. See also polynominial curve-fitting, for the real-world limits of your theory. for the real-world limits of your theory.
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On May 18, 2011, at 4:07 32PM, Landon Stewart wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? Just a weird idea I had. If it's a good idea then please consider this intellectual property. LOL http://en.wikipedia.org/wiki/Kolmogorov_complexity --Steve Bellovin, https://www.cs.columbia.edu/~smb
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Wed, May 18, 2011 at 4:18 PM, Michael Holstein michael.holst...@csuohio.edu wrote: Just a weird idea I had. If it's a good idea then please consider this intellectual property. It's easy .. the zeros are fatter than the ones. no no no.. it's simply, since the OP posited a math solution, md5. ship the size of file + hash, compute file on the other side. All files can be moved anywhere regardless of the size of the file in a single packet. The solution is left as an exercise for the reader.
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Wednesday, May 18, 2011 at 2:18 PM, Dorn Hetzel wrote: On Wed, May 18, 2011 at 4:07 PM, Landon Stewart lstew...@superb.net wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? The real question is whether this is possible. And the short answer is No, at least not in general. Exactly: What you run up against is that you can reduce extraneous information, and compress redundant information, but if you actually have dense information, you're not gonna get any better. So easy to compress a billion bytes of JSON or XML significantly; not so much a billion bytes of already tightly coded movie. Aria Stewart
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
The concept is called fractals where you can compress the image and send the values and recreate the image. There was a body of work on the subject, I would say in the mid to late eighties where two Georgia Tech professors started a company doing it. John (ISDN) Lee On Wed, May 18, 2011 at 4:07 PM, Landon Stewart lstew...@superb.net wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? Just a weird idea I had. If it's a good idea then please consider this intellectual property. LOL -- Landon Stewart lstew...@superb.net SuperbHosting.Net by Superb Internet Corp. Toll Free (US/Canada): 888-354-6128 x 4199 Direct: 206-438-5879 Web hosting and more Ahead of the Rest: http://www.superbhosting.net
RE: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
Stefan Fouant sfou...@shortestpathfirst.net wrote on 05/18/2011 04:19:26 PM: Lets say you had a file that was 1,000,000,000 characters consisting of http://www.riverbed.com/us/ http://www.juniper.net/us/en/products-services/application-acceleration/wxc- series/ http://www.cisco.com/en/US/products/ps5680/Products_Sub_Category_Home.html You also need to include Silver Peak. http://www.silver-peak.com/ Saw a very interesting presentation on their techniques. Joe
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
In a message written on Wed, May 18, 2011 at 04:33:34PM -0400, Christopher Morrow wrote: no no no.. it's simply, since the OP posited a math solution, md5. ship the size of file + hash, compute file on the other side. All files can be moved anywhere regardless of the size of the file in a single packet. The solution is left as an exercise for the reader. Bah, you should include the solution, it's so trivial. Generate all possible files and then do an index lookup on the MD5. It's a little CPU heavy, but darn simple to code. You can even stop when you get a match, which turns out to be a HUGE optimization. :) -- Leo Bicknell - bickn...@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ pgpqsN6hKjrXD.pgp Description: PGP signature
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Wed, May 18, 2011 at 3:33 PM, Christopher Morrow morrowc.li...@gmail.com wrote: On Wed, May 18, 2011 at 4:18 PM, Michael Holstein michael.holst...@csuohio.edu wrote: Just a weird idea I had. If it's a good idea then please consider this intellectual property. It's easy .. the zeros are fatter than the ones. no no no.. it's simply, since the OP posited a math solution, md5. ship the size of file + hash, compute file on the other side. All files can be moved anywhere regardless of the size of the file in a single packet. The solution is left as an exercise for the reader. You would need a lot of computing power to generate a file of any decent size. If you want to be evil then you could send just a md5 hash and a sha512 hash (or some other hash that would not have a collision at the same time except when correct)
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On May 18, 2011, at 4:03 PM, Leo Bicknell wrote: Bah, you should include the solution, it's so trivial. Generate all possible files and then do an index lookup on the MD5. It's a little CPU heavy, but darn simple to code. Isn't this essentially what Dropbox has been doing in many cases? Chris - -- - - Chris Owen - Garden City (620) 275-1900 - Lottery (noun): President - Wichita (316) 858-3000 -A stupidity tax Hubris Communications Inc www.hubris.net - - -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (Darwin) Comment: Public Key: http://home.hubris.net/owenc/pgpkey.txt Comment: Public Key ID: 0xB513D9DD iEYEARECAAYFAk3UOKIACgkQElUlCLUT2d3YoQCfee38nKuXD5O4C2w5VXUWszF1 EjcAmwfyytDgwmQDpJsQZSpl03ddGbVv =3sX9 -END PGP SIGNATURE-
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Wed, May 18, 2011 at 4:47 PM, Joe Loiacono jloia...@csc.com wrote: You also need to include Silver Peak. only if you like random failures.
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
I wonder if this is possible: - Take a hash of the original file. Keep a counter. - Generate data in some sequential method on sender side (for example simply starting at 0 and iterating until you generate the same as the original data) - Each time you iterate, take the hash of the generated data. If it matches the hash of the original file, increment counter. - Send the hash and the counter value to recipient. - Recipient performs same sequential generation method, stopping when counter reached. Any thoughts? Heath On 18 May 2011 21:07, Landon Stewart lstew...@superb.net wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? Just a weird idea I had. If it's a good idea then please consider this intellectual property. LOL -- Landon Stewart lstew...@superb.net SuperbHosting.Net by Superb Internet Corp. Toll Free (US/Canada): 888-354-6128 x 4199 Direct: 206-438-5879 Web hosting and more Ahead of the Rest: http://www.superbhosting.net
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Thu, 19 May 2011 00:26:26 BST, Heath Jones said: I wonder if this is possible: - Take a hash of the original file. Keep a counter. - Generate data in some sequential method on sender side (for example simply starting at 0 and iterating until you generate the same as the original data) - Each time you iterate, take the hash of the generated data. If it matches the hash of the original file, increment counter. - Send the hash and the counter value to recipient. - Recipient performs same sequential generation method, stopping when counter reached. MD5 is a 128 bit hash. 2^128 is 340,282,366,920,938,463,463,374,607,431,768,211,456 - you're welcome to iterate that many times to find a duplicate. You may get lucky and get a hit in the first trillion or so attempts - but you may get unlucky and not get a hit until the *last* few trillion attempts. On average you'll have to iterate about half that huge number before you get a hit. And it's lossy - if you hash all the possible 4K blocks with MD5, you'll find that each of those 2^128 hashes has been hit about 256 times - and no indication in the hash of *which* of the 256 colliding 4K blocks you have on this iteration. (The only reason that companies can do block-level de-duplication by saving a hash as an index to one copy shared by all blocks with the same hash value is because you have a *very small* fraction of the possibilities covered, so if you saved a 4K block of data from somebody's system32 folder under a given MD5 hash, it's *far* more likely that another block with that same hash is from another copy of another identical system32 folder, than it is an actual accidental collision.) Protip: A good hash function is by definition one-way - given the data, it's easy to generate the hash - but reversing it to find the pre-image (the data that *generated* the hash) is massively difficult. pgpcx4G19LjCd.pgp Description: PGP signature
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
My point here is it IS possible to transfer just a hash and counter value and effectively generate identical data at the remote end. The limit that will be hit is the difficulty of generating and comparing hash values with current processing power. I'm proposing iterating through generated data up until the actual data. It's not even a storage issue, as once you have incremented the data you don't need to store old data or hash values - just the counter. No massive hash tables. It's a CPU issue. On 19 May 2011 00:42, valdis.kletni...@vt.edu wrote: On Thu, 19 May 2011 00:26:26 BST, Heath Jones said: I wonder if this is possible: - Take a hash of the original file. Keep a counter. - Generate data in some sequential method on sender side (for example simply starting at 0 and iterating until you generate the same as the original data) - Each time you iterate, take the hash of the generated data. If it matches the hash of the original file, increment counter. - Send the hash and the counter value to recipient. - Recipient performs same sequential generation method, stopping when counter reached. MD5 is a 128 bit hash. 2^128 is 340,282,366,920,938,463,463,374,607,431,768,211,456 - you're welcome to iterate that many times to find a duplicate. You may get lucky and get a hit in the first trillion or so attempts - but you may get unlucky and not get a hit until the *last* few trillion attempts. On average you'll have to iterate about half that huge number before you get a hit. And it's lossy - if you hash all the possible 4K blocks with MD5, you'll find that each of those 2^128 hashes has been hit about 256 times - and no indication in the hash of *which* of the 256 colliding 4K blocks you have on this iteration. (The only reason that companies can do block-level de-duplication by saving a hash as an index to one copy shared by all blocks with the same hash value is because you have a *very small* fraction of the possibilities covered, so if you saved a 4K block of data from somebody's system32 folder under a given MD5 hash, it's *far* more likely that another block with that same hash is from another copy of another identical system32 folder, than it is an actual accidental collision.) Protip: A good hash function is by definition one-way - given the data, it's easy to generate the hash - but reversing it to find the pre-image (the data that *generated* the hash) is massively difficult.
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
try itu v.42bis Iridescent iPhone +1 972 757 8894 On May 18, 2011, at 15:07, Landon Stewart lstew...@superb.net wrote: Lets say you had a file that was 1,000,000,000 characters consisting of 8,000,000,000bits. What if instead of transferring that file through the interwebs you transmitted a mathematical equation to tell a computer on the other end how to *construct* that file. First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure this would take time, I realize that. The equation would then be transmitted to the other computer where it would use its mad-math-skillz to *figure out the answer* which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end. The real question here is how long would it take for a regular computer to do this kind of math? Just a weird idea I had. If it's a good idea then please consider this intellectual property. LOL -- Landon Stewart lstew...@superb.net SuperbHosting.Net by Superb Internet Corp. Toll Free (US/Canada): 888-354-6128 x 4199 Direct: 206-438-5879 Web hosting and more Ahead of the Rest: http://www.superbhosting.net
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Wed, May 18, 2011 at 8:03 PM, Heath Jones hj1...@gmail.com wrote: My point here is it IS possible to transfer just a hash and counter value and effectively generate identical data at the remote end. The limit that will be hit is the difficulty of generating and comparing hash values with current processing power. I'm proposing iterating through generated data up until the actual data. It's not even a storage issue, as once you have incremented the data you don't need to store old data or hash values - just the counter. No massive hash tables. It's a CPU issue. i'd note it took you many more packets than my example of roughly the same thing. if you really want to save bandwidth, my 1 packet answer is the best answer.
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
Compression is one result. But this is sometimes referred to as the inverse problem: Given a set of data tell me a function which fits it (or fits it to some tolerance.) It's important in statistics and all kinds of data analyses. Another area is fourier transforms which basically sums sine waves of different amp/freq until you reach the desired fit. This is also the basis of a lot of noise filtering algorithms, throw out the frequencies you don't want, such as 60HZ or 50HZ, or all those smaller than you consider interesting, high-freq noise, or low freq noise, whatever. Another buzz term is data entropy, randomness. If the data were perfectly random then there exists no such function which can be represented in less bits than the original data, which is why you can't compress a compressed file indefinitely and also why it's recommended you compress files before encrypting them, it's hard to begin cracking a file which is pretty close to random. And this is what you do when you give something like a MARC or ISBN or Dewey Decimal index to a librarian and s/he brings you the book you want. Effectively you've represented the entire book as that small number. Imagine if you had to recite the entire text of a book to find it unambiguously! See: Transfinite Number Systems. -- -Barry Shein The World | b...@theworld.com | http://www.TheWorld.com Purveyors to the Trade | Voice: 800-THE-WRLD| Dial-Up: US, PR, Canada Software Tool Die| Public Access Internet | SINCE 1989 *oo*
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
On Thu, May 19, 2011 at 12:26:26AM +0100, Heath Jones wrote: I wonder if this is possible: - Take a hash of the original file. Keep a counter. - Generate data in some sequential method on sender side (for example simply starting at 0 and iterating until you generate the same as the original data) - Each time you iterate, take the hash of the generated data. If it matches the hash of the original file, increment counter. - Send the hash and the counter value to recipient. - Recipient performs same sequential generation method, stopping when counter reached. Any thoughts? That will work. Of course, the CPU usage will be overwhelming -- longer than the age of the universe to do a large file -- but, theoretically, with enough CPU power, it will work. For a 8,000,000,000 bit file and a 128 bit hash, you will need a counter of at least 7,999,999,872 bits to cover the number of possible collisions. So you will need at leat 7,999,999,872 + 128 = 8,000,000,000 bits to send your 8,000,000,000 bit file. If your goal is to reduce the number of bits you send, this wouldn't be a good choice. -- Brett
Re: Had an idea - looking for a math buff to tell me if it's possible with today's technology.
Ha! I was wondering this the whole time - if the size of the counter would make it a zero sum game. That sux! :) On 19 May 2011 03:52, Brett Frankenberger rbf+na...@panix.com wrote: On Thu, May 19, 2011 at 12:26:26AM +0100, Heath Jones wrote: I wonder if this is possible: - Take a hash of the original file. Keep a counter. - Generate data in some sequential method on sender side (for example simply starting at 0 and iterating until you generate the same as the original data) - Each time you iterate, take the hash of the generated data. If it matches the hash of the original file, increment counter. - Send the hash and the counter value to recipient. - Recipient performs same sequential generation method, stopping when counter reached. Any thoughts? That will work. Of course, the CPU usage will be overwhelming -- longer than the age of the universe to do a large file -- but, theoretically, with enough CPU power, it will work. For a 8,000,000,000 bit file and a 128 bit hash, you will need a counter of at least 7,999,999,872 bits to cover the number of possible collisions. So you will need at leat 7,999,999,872 + 128 = 8,000,000,000 bits to send your 8,000,000,000 bit file. If your goal is to reduce the number of bits you send, this wouldn't be a good choice. -- Brett