Re: Module for base 85 encoding

2008-11-25 Thread Scott Gifford
Darian Anthony Patrick [EMAIL PROTECTED] writes:

 Chris Dolan wrote:
 Yes. RFC 1924 specifies a way to convert an IPv6 address to ASCII, by
 treating it as a 128-bit integer, writing the number in base 85, then
 expressing each base 85 digit as an ASCII character.

 btoa and PDFs break up a stream of bytes into ASCII by treating it as
 32-bit integers (4 bytes become 5 ASCII characters), and (in the case of
 PDFs) then representing the last odd 1 to 3 bytes as 2 to 4 ASCII
 characters. They also use a different subset of printable ASCII from
 RFC 1924.

You may note that RFC 1924 was issued on April 1, 1996, so any
reference to it should be done with tongue firmly in cheek.

Highlights:

7. Implementation Issues

   Many current processors do not find 128 bit integer arithmetic, as
   required for this technique, a trivial operation.  This is not
   considered a serious drawback in the representation, but a flaw of
   the processor designs.

   It may be expected that future processors will address this defect,
   quite possibly before any significant IPv6 deployment has been
   accomplished.

8. Security Considerations

   By encoding addresses in this form, it is less likely that a casual
   observer will be able to immediately detect the binary form of the
   address, and thus will find it harder to make immediate use of the
   address.  As IPv6 addresses are not intended to be learned by
   humans, one reason for which being that they are expected to alter
   in comparatively short timespan, by human perception, the somewhat
   challenging nature of the addresses is seen as a feature.

   Further, the appearance of the address, as if it may be random
   gibberish in a compressed file, makes it much harder to detect by a
   packet sniffer programmed to look for bypassing addresses.

The second paragraph of (7) seems particularly prescient, given that
this RFC was written 12 years ago.  :-)

Scott.


Re: Module for base 85 encoding

2008-11-24 Thread Ricardo SIGNES
* Nicholas Clark [EMAIL PROTECTED] [2008-11-24T11:30:28]
 So I'm not sure what to call it.

String::Base85 seems reasonable.

-- 
rjbs


Re: Module for base 85 encoding

2008-11-24 Thread Darian Anthony Patrick
Nicholas Clark wrote:
 I've written a module that implements the base 85 encoding used by the old
 btoa program, and by PDFs as their Ascii85 encoding*
 
 I'm not sure what to call it. It has an interface functionally equivalent to
 MIME::Base64, but this isn't a MIME standard, so that's not the correct top
 level to live under. It is, arguably, an encoding module, but it isn't the
 interface of Encode, which is what modules under the Encode top level provide.
 
 So I'm not sure what to call it.
 
 Nicholas Clark
 
 * http://en.wikipedia.org/wiki/Ascii85

Maybe Math::Base85, following the placement of Math::Base36?

-- 
Darian Anthony Patrick [EMAIL PROTECTED]
=
88FC 044D 5144 BD3A DAF8 FD9F 8C9E DF14 9AD3 4117
=
* Signed and encrypted communications preferred.


Re: Module for base 85 encoding

2008-11-24 Thread Nicholas Clark
On Mon, Nov 24, 2008 at 11:49:04AM -0500, Darian Anthony Patrick wrote:
 Darian Anthony Patrick wrote:
  Nicholas Clark wrote:
  I've written a module that implements the base 85 encoding used by the old
  btoa program, and by PDFs as their Ascii85 encoding*
 
  I'm not sure what to call it. It has an interface functionally equivalent to
  MIME::Base64, but this isn't a MIME standard, so that's not the correct top
  level to live under. It is, arguably, an encoding module, but it isn't the
  interface of Encode, which is what modules under the Encode top level provide.
 
  So I'm not sure what to call it.
 
  Nicholas Clark
 
  * http://en.wikipedia.org/wiki/Ascii85
  
  Maybe Math::Base85, following the placement of Math::Base36?
  
 
 Hmm.  Looks like there already is a Math::Base85.  Is your
 implementation different from the existing Math::Base85?

Yes. RFC 1924 specifies a way to convert an IPv6 address to ASCII, by
treating it as a 128-bit integer, writing the number in base 85, then
expressing each base 85 digit as an ASCII character.
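
A minimal sketch of that RFC 1924 scheme, in Python for illustration only.
The function names are hypothetical and are not taken from Math::Base85 or
any other module; the 85-character alphabet and the sample address are the
ones given in RFC 1924 itself.

```python
import ipaddress

# The 85-character set listed in RFC 1924, in ascending digit order.
ALPHABET = (
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "!#$%&()*+-;<=>?@^_`{|}~"
)


def rfc1924_encode(address: str) -> str:
    """Treat the IPv6 address as one 128-bit integer and write it in base 85."""
    value = int(ipaddress.IPv6Address(address))
    digits = []
    # 85**20 > 2**128, so 20 digits always suffice; leading zeros are kept.
    for _ in range(20):
        value, d = divmod(value, 85)
        digits.append(ALPHABET[d])
    return "".join(reversed(digits))


def rfc1924_decode(text: str) -> str:
    """Inverse of rfc1924_encode; returns the compressed textual form."""
    value = 0
    for ch in text:
        value = value * 85 + ALPHABET.index(ch)
    return str(ipaddress.IPv6Address(value))


# Example address from RFC 1924.
print(rfc1924_encode("1080:0:0:0:8:800:200C:417A"))  # 4)+k&C#VzJ4br>0wv%Yp
```

Python integers are arbitrary precision, so the 128-bit arithmetic needs no
special handling here; in other languages a big-integer type would be needed.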

btoa and PDFs break up a stream of bytes into ASCII by treating it as 32-bit
integers (4 bytes become 5 ASCII characters), and (in the case of PDFs) then
representing the last odd 1 to 3 bytes as 2 to 4 ASCII characters. They also
use a different subset of printable ASCII from RFC 1924.
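
A rough sketch of the btoa/PDF style of encoding, again in Python and purely
illustrative. It is not the module under discussion: the name ascii85_encode
is hypothetical, the tail handling follows the PDF convention described above,
and the 'z' shorthand for an all-zero group and the '!' to 'u' character range
are assumptions taken from the Ascii85 format as commonly documented. PDF's
"~>" end marker is left out.

```python
import struct


def ascii85_encode(data: bytes) -> str:
    """Encode bytes as Ascii85: each 4-byte group becomes 5 characters."""
    out = []
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4]
        pad = 4 - len(chunk)  # 0 for a full group, 1 to 3 for the tail
        # Read the (zero-padded) group as one 32-bit big-endian integer.
        (value,) = struct.unpack(">I", chunk + b"\0" * pad)
        if value == 0 and pad == 0:
            out.append("z")  # shorthand for a full group of zero bytes
            continue
        digits = []
        for _ in range(5):
            value, d = divmod(value, 85)
            digits.append(chr(33 + d))  # digit 0 is '!', digit 84 is 'u'
        group = "".join(reversed(digits))
        # A tail of n bytes is written as n + 1 characters.
        out.append(group[:5 - pad])
    return "".join(out)


print(ascii85_encode(b"Man "))  # 9jqo^
```

Python's standard library already provides base64.a85encode for this format;
the hand-written loop is only there to make the 4-bytes-to-5-characters step
visible.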

Nicholas Clark


Re: Module for base 85 encoding

2008-11-24 Thread Chris Dolan
 On Mon, Nov 24, 2008 at 10:42:07AM -0600, Chris Dolan wrote:

 I don't have a good name recommendation, but I do know there is a
 PDF-specific implementation within this CPAN module:
   http://search.cpan.org/src/MHOSKEN/Text-PDF-0.29a/lib/Text/PDF/Filter.pm
 I use that filter within my own CAM::PDF module.

 I infer that you only use it for input, because as best I can tell, the
 output filter does not work:
 http://rt.cpan.org/Public/Bug/Display.html?id=41085

 Nicholas Clark

Hah, thanks for pointing that out, I hadn't noticed. You're right, I think
I've only used it for reading. It's not a commonly used feature in PDF
anyway since gzip is an obviously better filter.
Chris



Re: Module for base 85 encoding

2008-11-24 Thread Chris Dolan
 Yes. RFC 1924 specifies a way to convert an IPv6 address to ASCII, by
 treating it as a 128-bit integer, writing the number in base 85, then
 expressing each base 85 digit as an ASCII character.

 btoa and PDFs break up a stream of bytes into ASCII by treating it as
 32-bit integers (4 bytes become 5 ASCII characters), and (in the case of
 PDFs) then representing the last odd 1 to 3 bytes as 2 to 4 ASCII
 characters. They also use a different subset of printable ASCII from
 RFC 1924.

 Nicholas Clark

Given that, I second RJBS' suggestion of String::Base85 (or perhaps
String::Ascii85 for better searchability)
Chris



Re: Module for base 85 encoding

2008-11-24 Thread Darian Anthony Patrick
Chris Dolan wrote:
 Yes. RFC 1924 specifies a way to convert an IPv6 address to ASCII, by
 treating it as a 128-bit integer, writing the number in base 85, then
 expressing each base 85 digit as an ASCII character.

 btoa and PDFs break up a stream of bytes into ASCII by treating it as
 32-bit integers (4 bytes become 5 ASCII characters), and (in the case of
 PDFs) then representing the last odd 1 to 3 bytes as 2 to 4 ASCII
 characters. They also use a different subset of printable ASCII from
 RFC 1924.

 Nicholas Clark
 
 Given that, I second RJBS' suggestion of String::Base85 (or perhaps
 String::Ascii85 for better searchability)
 Chris
 

I concur with the latter (String::Ascii85) to draw attention to the fact
that it does not implement the RFC 1924 version.  Maybe also mention
that difference, and Math::Base85, in the docs.

-- 
Darian Anthony Patrick [EMAIL PROTECTED]
=
88FC 044D 5144 BD3A DAF8 FD9F 8C9E DF14 9AD3 4117
=
* Signed and encrypted communications preferred.


Re: Module for base 85 encoding

2008-11-24 Thread Nicholas Clark
On Mon, Nov 24, 2008 at 10:42:07AM -0600, Chris Dolan wrote:

 I don't have a good name recommendation, but I do know there is a
 PDF-specific implementation within this CPAN module:
   http://search.cpan.org/src/MHOSKEN/Text-PDF-0.29a/lib/Text/PDF/Filter.pm
 I use that filter within my own CAM::PDF module.

I infer that you only use it for input, because as best I can tell, the output
filter does not work: http://rt.cpan.org/Public/Bug/Display.html?id=41085

Nicholas Clark


Re: Module for base 85 encoding

2008-11-24 Thread David Nicol
On Mon, Nov 24, 2008 at 12:56 PM, Bill Ward [EMAIL PROTECTED] wrote:
 How about contacting the owner of Math::Base85 and seeing if you can somehow
 join forces?  It seems to me this would make sense to have a single module
 that can output in either format.

1: RFC 1924 is an April fool.  Those who are not aware of that need to be.
See http://en.wikipedia.org/wiki/Ascii85#RFC_1924_version

2: joining forces or subclassing Math::Base85 or otherwise using it
is silly as it would create a finished good containing more glue than
wood.  http://search.cpan.org/src/TMONROE/Math-Base85-0.2/Base85.pm
reveals two trivial functions, both quite readable and therefore not
particularly inspired in terms of efficiency hacks; for instance the
letter-to-number encoding could be done as an array of values indexed
by chr($d) or pack/unpack with the N specifier for translating
between binary strings and 32-bit big-endian, presumably both features
present in Mr. Clark's to-be-unveiled String::Base85 module.  Unless
he followed MIME::Base64's example closely and wrote them in C.
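
To illustrate the two ideas above (a lookup array indexed by character code,
and packing 32-bit big-endian words), here is a hedged Python sketch of an
Ascii85 decoder. It is not Mr. Clark's code, which has not been shown in this
thread; DIGIT, _pack_group and ascii85_decode are hypothetical names, and
struct.pack(">I", ...) plays the role of Perl's pack with the N specifier.

```python
import struct

# Lookup array indexed by character code: DIGIT[code] is the base 85 digit,
# or -1 for characters outside the '!' to 'u' range.
DIGIT = [-1] * 256
for d in range(85):
    DIGIT[33 + d] = d


def _pack_group(digits):
    """Combine five base 85 digits into four big-endian bytes."""
    value = 0
    for d in digits:
        value = value * 85 + d
    # struct.pack raises struct.error if the value does not fit in 32 bits.
    return struct.pack(">I", value)


def ascii85_decode(text: str) -> bytes:
    out = bytearray()
    group = []
    for code in text.encode("ascii"):
        if code == ord("z") and not group:
            out += b"\0\0\0\0"  # shorthand for a full group of zero bytes
            continue
        d = DIGIT[code]
        if d < 0:
            raise ValueError(f"invalid Ascii85 character: {chr(code)!r}")
        group.append(d)
        if len(group) == 5:
            out += _pack_group(group)
            group = []
    if group:
        if len(group) == 1:
            raise ValueError("a trailing group needs at least 2 characters")
        # Pad the tail with the highest digit, then drop the padding bytes.
        pad = 5 - len(group)
        out += _pack_group(group + [84] * pad)[:4 - pad]
    return bytes(out)


print(ascii85_decode("9jqo^"))  # b'Man '
```

The sketch does not skip whitespace or handle PDF's "~>" end marker; both
would need to be dealt with before calling it on real PDF streams.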



-- 
signature closed for repair


Re: Module for base 85 encoding

2008-11-24 Thread David Nicol
On Mon, Nov 24, 2008 at 1:26 PM, David Nicol [EMAIL PROTECTED] wrote:
 present in Mr. Clark's to-be-unveiled String::Base85 module.  Unless

I meant to write String::Ascii85, I stand with Darian Anthony Patrick
in that preference.