Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-14 Thread Jason Dusek
2012/3/12 Jeremy Shaw jer...@n-heptane.com:
On Sun, Mar 11, 2012 at 1:33 PM, Jason Dusek jason.du...@gmail.com wrote:
Well, to quote one example from RFC 3986:
  2.1.  Percent-Encoding
  A percent-encoding mechanism is used to represent a data octet in a
  component when that octet's

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-14 Thread Graham Klyne
Hi, I only just noticed this discussion. Essentially, I think you have arrived at the right conclusion regarding URIs. For more background, the IRI document makes interesting reading in this context: http://tools.ietf.org/html/rfc3987; esp. sections 2, 2.1. The IRI is defined in terms of

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-12 Thread Joey Hess
Jason Dusek wrote:
  :info System.Posix.Env.getEnvironment
  System.Posix.Env.getEnvironment :: IO [(String, String)] -- Defined in System.Posix.Env
  But there is no law that environment variables must be made of characters:
The recent ghc release provides

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-11 Thread Jason Dusek
2012/3/11 Jeremy Shaw jer...@n-heptane.com: Also, URIs are not defined in terms of octets, but in terms of characters. If you write a URI down on a piece of paper, what octets are you using? None; it's some scribbles on a paper. It is the characters that are important, not the bit

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-11 Thread Brandon Allbery
On Sun, Mar 11, 2012 at 14:33, Jason Dusek jason.du...@gmail.com wrote: The syntax of URIs is a mechanism for describing data octets, not Unicode code points. It is at variance to describe URIs in terms of Unicode code points. You might want to take a glance at RFC 3492, though. -- brandon

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-11 Thread Jason Dusek
2012/3/11 Brandon Allbery allber...@gmail.com: On Sun, Mar 11, 2012 at 14:33, Jason Dusek jason.du...@gmail.com wrote: The syntax of URIs is a mechanism for describing data octets, not Unicode code points. It is at variance to describe URIs in terms of Unicode code points. You might want

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-11 Thread Jason Dusek
2012/3/11 Thedward Blevins thedw...@barsoom.net: On Sun, Mar 11, 2012 at 13:33, Jason Dusek jason.du...@gmail.com wrote: The syntax of URIs is a mechanism for describing data octets, not Unicode code points. It is at variance to describe URIs in terms of Unicode code points. This claim is

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-11 Thread Brandon Allbery
On Sun, Mar 11, 2012 at 23:05, Jason Dusek jason.du...@gmail.com wrote: Although the intent of the spec is to represent characters, I contend it does not succeed in doing so. Is it wise to assume more semantics than are actually there? It is not; one of the reasons that many experts

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-11 Thread Jeremy Shaw
Argh. Email fail. Hopefully this time I have managed to reply-all to the list *and* keep the unicode properly intact. Sorry about any duplicates you may have received. On Sun, Mar 11, 2012 at 1:33 PM, Jason Dusek jason.du...@gmail.com wrote: 2012/3/11 Jeremy Shaw jer...@n-heptane.com: Also,

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-11 Thread Jason Dusek
2012/3/12 Jeremy Shaw jer...@n-heptane.com: The syntax of URIs is a mechanism for describing data octets, not Unicode code points. It is at variance to describe URIs in terms of Unicode code points. Not sure what you mean by this. As the RFC says, a URI is defined entirely by the identity

[Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-10 Thread Jason Dusek
The content of URIs is defined in terms of octets in the RFC, and all POSIX interfaces are byte streams and C strings, not character strings. Yet in Haskell, we find these objects exposed with String interfaces:
  :info Network.URI.URI
  data URI = URI {uriScheme :: String, uriAuthority

Re: [Haskell-cafe] Why so many strings in Network.URI, System.Posix and similar libraries?

2012-03-10 Thread Jeremy Shaw
It is mostly because those libraries are far older than Text and ByteString, so String was the only choice at the time. Modernizing them is good, but it would also break a lot of code. And in many core libraries, the functions are required to have String types in order to be Haskell 98 compliant.