[Haskell-cafe] ANNOUNCE: dbus 0.10
This release completes the merger of the dbus-core and dbus-client packages. The new package does everything they could do, but better.

Hackage: http://hackage.haskell.org/package/dbus
Homepage: https://john-millikin.com/software/haskell-dbus/
API reference: https://john-millikin.com/software/haskell-dbus/reference/haskell-dbus/0.10/
Examples: Included in the tarball, and also at https://john-millikin.com/branches/haskell-dbus/0.10/head:/examples/

Notable changes:

* The module hierarchy was simplified. Most users will only ever need the DBus and DBus.Client modules.
* Exports which were not used or useful have been removed.
* Much better documentation -- most exports now have a description, and several have examples.
* The source code isn't literate any more. Too many people expressed frustration at trying to contribute, so I just converted it all to standard flat .hs files.
* Better support for custom socket transports and authentication mechanisms.
* Added support for listening for socket connections. This allows users to implement both sides of a peer-to-peer session.
* Various improvements to performance, including moving from binary to cereal.
* Various minor improvements to correctness.

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Towards a single, unified API for incremental data processing
There are currently several APIs for processing strict monoidal values as if they were pieces of a larger, lazy value. Some of the most popular are based on Oleg's left-fold enumerators, including iteratee, enumerator, and iterIO. Other choices include comonads, conduits, and pipes. Despite having various internal implementations and semantics, these libraries generally export a similar-looking API. This is a terrible duplication of effort, and it causes dependent packages to be strongly tied to the underlying implementation.

I propose that a new package, tzinorot, be written to provide a single API based on Data.List. It should be pretty easy to use, requiring only a few common extensions to the type system. For example, the enumerator package's 'mapM' function could be generalized for use in tzinorot through a few simple modifications to the type signature:

--
-- enumerator mapM
mapM :: Monad m => (ao -> m ai) -> Enumeratee ao ai m b

-- tzinorot mapM
mapM :: (Monad m, Tzinorot t, ListLike l1 a1, ListLike l2 a2)
     => (l1 a1 -> m (l2 a2))
     -> t Void s (TzinorotItems (l1 a1)) (TzinorotItems (l2 a2)) m r
--

To make it easier to install and use the tzinorot package, it will depend on all of its supported implementations (iteratee, enumerator, conduits, pipes, etc), and use Michael Snoyman's cabala tool to manage dependency versions. See the cabala announcement for details on use: http://www.yesodweb.com/blog/2012/04/replacing-cabal
[Haskell-cafe] ANNOUNCE: options, an easy-to-use command-line option parser
Hackage: http://hackage.haskell.org/package/options
Home page: https://john-millikin.com/software/haskell-options/
API reference: https://john-millikin.com/software/haskell-options/reference/haskell-options/latest/

The options package lets library and application developers easily work with command-line options. The following example is a full program that can accept two options, --message and --quiet:

--
{-# LANGUAGE TemplateHaskell #-}
import Options

defineOptions "MainOptions" $ do
    stringOption "optMessage" "message" "Hello world!"
        "A message to show the user."
    boolOption "optQuiet" "quiet" False
        "Whether to be quiet."

main :: IO ()
main = runCommand $ \opts args -> do
    if optQuiet opts
        then return ()
        else putStrLn (optMessage opts)
--

--
$ ./hello
Hello world!
$ ./hello --message='ciao mondo'
ciao mondo
$ ./hello --quiet
$
--

In addition, this library will automatically create documentation options such as --help and --help-all:

--
$ ./hello --help
Help Options:
  -h, --help
    Show option summary.
  --help-all
    Show all help options.

Application Options:
  --message
    A message to show the user.
  --quiet
    Whether to be quiet.
--
Re: [Haskell-cafe] ANNOUNCE: system-filepath 0.4.5 and system-fileio 0.3.4
On Mon, Feb 6, 2012 at 10:05, Joey Hess j...@kitenet.net wrote:
> > John Millikin wrote:
> > That was my understanding also, then QuickCheck found a counter-example. It
> > turns out that there are cases where a valid path cannot be roundtripped in
> > the GHC 7.2 encoding. The issue is that [238,189,178] decodes to 0xEF72,
> > which is within the 0xEF00-0xEFFF range that GHC uses to represent
> > un-decodable bytes.
>
> How did you deal with this in system-filepath?

I used 0xEF00 as an escape character, to mean "the following char should be interpreted as a literal byte". A user pointed out that there is a problem with this solution also -- a path containing an actual U+EF00 will be considered invalid encoding. I'm going to change things over to use the Python 3 solution -- they use part of the UTF-16 surrogate pair range, so it's impossible for a valid path to contain their stand-in characters.

> Another user says that GHC 7.4 also changed its escape range to match Python 3,
> so it seems to be a pseudo-standard now.

That's really good. I'm going to add a 'posix_ghc704' rule to system-filepath, which should mean that only users running GHC 7.2 will have to worry about escape chars. Unfortunately, the text package refuses to store codepoints in that range (it replaces them with a placeholder), so I have to switch things over to use [Char]. (Yak sighted! Prepare lather!)

> While no code points in the Supplementary Special-purpose Plane are currently
> assigned (http://www.unicode.org/roadmaps/ssp/), it is worrying that it's used,
> especially if filenames in a non-unicode encoding could be interpreted as
> containing characters really within this plane. I wonder why maxBound :: Char
> was not increased, and the additional space after '\1114111' used for the
> un-decodable bytes?

There's probably a lot of code out there that assumes (maxBound :: Char) is also the maximum Unicode code point. It would be difficult to update, particularly when dealing with bindings to foreign libraries (like the text-icu package).
> Both Python 3 and GHC 7.4 are using codepoints in the UTF-16 surrogate pair
> range for this, and that seems like a pretty clean solution.
>
> > > For FFI, anything that deals with a FilePath should use this withFilePath,
> > > which GHC contains but doesn't export(?), rather than the old withCString
> > > or withCAString:
> > >
> > > import GHC.IO.Encoding (getFileSystemEncoding)
> > > import GHC.Foreign as GHC
> > >
> > > withFilePath :: FilePath -> (CString -> IO a) -> IO a
> > > withFilePath fp f = getFileSystemEncoding >>= \enc -> GHC.withCString enc fp f
> >
> > If code uses either withFilePort or withCString, then the filenames
>
> withFilePath?
>
> > written will depend on the user's locale. This is wrong. Filenames are either
> > non-encoded text strings (Windows), UTF8 (OSX), or arbitrary bytes (non-OSX
> > POSIX). They must not change depending on the locale.
>
> This is exactly how GHC 7.4 handles them. For example:
>
>     openDirStream :: FilePath -> IO DirStream
>     openDirStream name =
>       withFilePath name $ \s -> do
>         dirp <- throwErrnoPathIfNullRetry "openDirStream" name $ c_opendir s
>         return (DirStream dirp)
>
>     removeLink :: FilePath -> IO ()
>     removeLink name =
>       withFilePath name $ \s ->
>         throwErrnoPathIfMinus1_ "removeLink" name (c_unlink s)
>
> I do not see any locale-dependent behavior in the filename bytes read/written.

Perhaps I'm misunderstanding, but the definition of 'withFilePath' you provided is definitely locale-dependent. Unless getFileSystemEncoding is constant?

> > > Code that reads or writes a FilePath to a Handle (including even to
> > > stdout!) must take care to set the right encoding too:
> > >
> > >     fileEncoding :: Handle -> IO ()
> > >     fileEncoding h = hSetEncoding h =<< getFileSystemEncoding
> >
> > This is also wrong. A file path cannot be written to a handle with any hope
> > of correct behavior. If it's to be displayed to the user, a path should be
> > converted to text first, then displayed.
>
> Sure it can. See find(1). Its output can be read as FilePaths once the Handle
> is set up as above.
> If you prefer your program not crash with an encoding error when an arbitrary
> FilePath is putStr, but instead perhaps output bytes that are not valid in the
> current encoding, that's also a valid choice. You might be writing a program,
> like find, that again needs to output any possible FilePath including badly
> encoded ones.

A program like find(1) has two use cases:

1. Display paths to the user, as text.
2. Provide paths to another program, in the operating system's file path format.

These two goals are in conflict. It is not possible to implement a find(1) that performs both correctly in all locales. The best solution is to choose #2, and always write in the OS format, and hope the user's shell+terminal are capable of rendering it to a reasonable-looking path.

> Filesystem.Path.CurrentOS.toText is a nice option if you want validly encoded
> output though. Thanks for that!

Ah, that's not what toText is for. toText provides a human-readable representation of the path. It's
[Haskell-cafe] ANNOUNCE: system-filepath 0.4.5 and system-fileio 0.3.4
Both packages now have much-improved support for non-UTF8 paths on POSIX systems. There are no significant changes to Windows support in this release.

system-filepath 0.4.5:
Hackage: http://hackage.haskell.org/package/system-filepath-0.4.5
API reference: https://john-millikin.com/software/haskell-filesystem/reference/system-filepath/0.4.5/

system-fileio 0.3.4:
Hackage: http://hackage.haskell.org/package/system-fileio-0.3.4
API reference: https://john-millikin.com/software/haskell-filesystem/reference/system-fileio/0.3.4/Filesystem/

-

In GHC 7.2 and later, file path handling in the platform libraries was changed to treat all paths as text (encoded according to locale). This does not work well on POSIX systems, because POSIX paths are byte sequences. There is no guarantee that any particular path will be valid in the user's locale encoding.

system-filepath and system-fileio were modified to partially support this new behavior, but because the underlying libraries were unable to represent certain paths, they were still broken when built with GHC 7.2+. The changes in this release mean that they are now fully compatible (to the best of my knowledge) with GHC 7.2 and 7.4.

Important changes:

* system-filepath has been converted from GHC's escaping rules to its own, more compatible rules. This lets it support file paths that cannot be represented in GHC 7.2's escape format.
* The POSIX layer of system-fileio has been completely rewritten to use the FFI, rather than System.Directory. This allows it to work with arbitrary POSIX paths, including those that GHC itself cannot handle. The Windows layer still uses System.Directory, since it seems to work properly.
* The POSIX implementation of removeTree will no longer recurse into directory symlinks that it does not have permission to remove. This is a change in behavior from the directory package's implementation.
See http://www.haskell.org/pipermail/haskell-cafe/2012-January/098911.html for details and the reasoning behind the change. Since Windows does not support symlinks, I have not modified the Windows implementation (which uses removeDirectoryRecursive).
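For reference, the escaping scheme this thread keeps circling around (the one Python 3 and GHC 7.4 settled on) can be sketched in a few lines: each undecodable byte is mapped into the lone-surrogate range, which valid Unicode text can never contain. This is an illustrative sketch only, not system-filepath's actual rules:

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- PEP 383-style escaping: an undecodable byte B (always >= 0x80 in
-- practice) becomes the lone surrogate U+DC00+B. Correctly-encoded
-- text never decodes to a lone surrogate, so no valid path can
-- collide with an escape character.
escapeByte :: Word8 -> Char
escapeByte b = chr (0xDC00 + fromIntegral b)

unescapeByte :: Char -> Maybe Word8
unescapeByte c
    | n >= 0xDC80 && n <= 0xDCFF = Just (fromIntegral (n - 0xDC00))
    | otherwise                  = Nothing
  where n = ord c
```

Compare this with the 0xEF00-0xEFFF scheme discussed above, where a real path *can* decode into the escape range.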
Re: [Haskell-cafe] ANNOUNCE: system-filepath 0.4.5 and system-fileio 0.3.4
On Sun, Feb 5, 2012 at 18:49, Joey Hess j...@kitenet.net wrote:
> > John Millikin wrote:
> > In GHC 7.2 and later, file path handling in the platform libraries was
> > changed to treat all paths as text (encoded according to locale). This does
> > not work well on POSIX systems, because POSIX paths are byte sequences.
> > There is no guarantee that any particular path will be valid in the user's
> > locale encoding.
>
> I've been dealing with this change too, but my current understanding is that
> GHC's handling of encoding for FilePath is documented to allow arbitrary
> undecodable bytes to be round-tripped through it. As long as FilePaths are
> read using this file system encoding, any FilePath should be usable even if
> it does not match the user's encoding.

That was my understanding also, then QuickCheck found a counter-example. It turns out that there are cases where a valid path cannot be roundtripped in the GHC 7.2 encoding.

--
$ ~/ghc-7.0.4/bin/ghci
Prelude writeFile .txt test
Prelude readFile .txt test
Prelude
$ ~/ghc-7.2.1/bin/ghci
Prelude import System.Directory
Prelude System.Directory getDirectoryContents .
[\61347.txt,\61347.txt,..,.]
Prelude System.Directory readFile \61347.txt
*** Exception: .txt: openFile: does not exist (No such file or directory)
Prelude System.Directory
--

The issue is that [238,189,178] decodes to 0xEF72, which is within the 0xEF00-0xEFFF range that GHC uses to represent un-decodable bytes.

> For FFI, anything that deals with a FilePath should use this withFilePath,
> which GHC contains but doesn't export(?), rather than the old withCString or
> withCAString:
>
> import GHC.IO.Encoding (getFileSystemEncoding)
> import GHC.Foreign as GHC
>
> withFilePath :: FilePath -> (CString -> IO a) -> IO a
> withFilePath fp f = getFileSystemEncoding >>= \enc -> GHC.withCString enc fp f

If code uses either withFilePort or withCString, then the filenames written will depend on the user's locale. This is wrong.
Filenames are either non-encoded text strings (Windows), UTF8 (OSX), or arbitrary bytes (non-OSX POSIX). They must not change depending on the locale.

> Code that reads or writes a FilePath to a Handle (including even to stdout!)
> must take care to set the right encoding too:
>
> fileEncoding :: Handle -> IO ()
> fileEncoding h = hSetEncoding h =<< getFileSystemEncoding

This is also wrong. A file path cannot be written to a handle with any hope of correct behavior. If it's to be displayed to the user, a path should be converted to text first, then displayed.

> > * system-filepath has been converted from GHC's escaping rules to its own,
> >   more compatible rules. This lets it support file paths that cannot be
> >   represented in GHC 7.2's escape format.
>
> I'm doubtful about adding yet another encoding to the mix. Things are
> complicated enough already! And in my tests, GHC 7.4's FilePath encoding does
> allow arbitrary bytes in FilePaths.

Unlike the GHC encoding, this encoding is entirely internal, and should not change the API's behavior.

> BTW, GHC now also has RawFilePath. Parts of System.Directory could be usefully
> written to support that data type too. For example, the parent directory can
> be determined. Other things are more difficult to do with RawFilePath.

This is new in 7.4, and won't be backported, right? I tried compiling the new unix package in 7.2 to get proper file path support, but it failed with an error about some new language extension.
Re: [Haskell-cafe] ANNOUNCE: system-filepath 0.4.5 and system-fileio 0.3.4
On Sun, Feb 5, 2012 at 19:17, John Millikin jmilli...@gmail.com wrote:
> --
> $ ~/ghc-7.0.4/bin/ghci
> Prelude writeFile .txt test
> Prelude readFile .txt test
> Prelude

Sorry, that got a bit mangled in the email. Corrected version:

--
$ ~/ghc-7.0.4/bin/ghci
Prelude> writeFile "\xA3.txt" "test"
Prelude> readFile "\xA3.txt"
"test"
Prelude> writeFile "\xEE\xBE\xA3.txt" "test 2"
Prelude> readFile "\xEE\xBE\xA3.txt"
"test 2"
--
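To see concretely why the byte sequence above collides with GHC 7.2's escape range, here is a minimal decoder for the three-byte UTF-8 case (an illustration only, not GHC's implementation):

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Word (Word8)
import Numeric (showHex)

-- Decode a three-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx)
-- to its code point.
decode3 :: Word8 -> Word8 -> Word8 -> Int
decode3 b1 b2 b3 =
      ((fromIntegral b1 .&. 0x0F) `shiftL` 12)
  .|. ((fromIntegral b2 .&. 0x3F) `shiftL` 6)
  .|.  (fromIntegral b3 .&. 0x3F)

main :: IO ()
main = do
    -- [238,189,178] is a perfectly valid UTF-8 encoding of U+EF72,
    -- which lands inside the 0xEF00-0xEFFF range that GHC 7.2
    -- reserved for representing un-decodable bytes.
    putStrLn (showHex (decode3 238 189 178) "")  -- prints "ef72"
    -- [238,190,163] similarly decodes to U+EFA3 (decimal 61347),
    -- colliding with the escaped form of the single byte 0xA3.
    print (decode3 238 190 163)  -- prints 61347
```

This is exactly why both files in the getDirectoryContents example show up as "\61347.txt".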
[Haskell-cafe] Something's weird about System.Directory.removeDirectoryRecursive
http://hackage.haskell.org/packages/archive/directory/1.1.0.1/doc/html/System-Directory.html#v:removeDirectoryRecursive

The documentation says that removeDirectoryRecursive follows symlinks. However, the implementation doesn't (in most cases; see below).

$ mkdir test-removeDirectoryRecursive
$ cd test-removeDirectoryRecursive
$ mkdir a b
$ touch a/a.txt b/b.txt
$ ln -s $PWD/b a/
$ ln -s $PWD/b/b.txt a/
$ ls -l a
total 8.2k
-rw-rw-r-- 1 john john  0 2012-01-27 23:33 a.txt
lrwxrwxrwx 1 john john 65 2012-01-27 23:33 b -> /home/john/test-removeDirectoryRecursive/b
lrwxrwxrwx 1 john john 71 2012-01-27 23:34 b.txt -> /home/john/test-removeDirectoryRecursive/b/b.txt

# OK, a/ has a normal file and two symlinks in it. Let's recursively
# remove a/ and see what happens.

$ ghci
Prelude> import System.Directory
Prelude System.Directory> removeDirectoryRecursive "a"
Prelude System.Directory>
Leaving GHCi.

$ ls -l a b
ls: cannot access a: No such file or directory
b:
total 0
-rw-rw-r-- 1 john john 0 2012-01-27 23:33 b.txt

# a/ was removed -- good!
#
# b/ and its contents are untouched; good, but goes against the docs.

Now, there is one case where this function *will* follow symlinks. However, I believe it is a bug, because it produces odd behavior:

$ sudo mkdir a
[sudo] password for john:
$ sudo ln -s $PWD/b a/
$ ls -l a b
a:
total 4.1k
lrwxrwxrwx 1 root root 65 2012-01-27 23:38 b -> /home/john/test-removeDirectoryRecursive/b
b:
total 0
-rw-rw-r-- 1 john john 0 2012-01-27 23:33 b.txt

# Now a/ has a symlink, which cannot be deleted, because its containing
# directory is read-only to the current user.

$ rm a/b
rm: remove symbolic link `a/b'? y
rm: cannot remove `a/b': Permission denied

# What happens if removeDirectoryRecursive is called now?

$ ghci
Prelude> import System.Directory
Prelude System.Directory> removeDirectoryRecursive "a"
*** Exception: a/b: removeDirectory: permission denied (Permission denied)
Prelude System.Directory>
Leaving GHCi.
$ ls -l a b
a:
total 4.1k
lrwxrwxrwx 1 root root 65 2012-01-27 23:38 b -> /home/john/test-removeDirectoryRecursive/b
b:
total 0

# a/ is untouched, but b/ has been emptied!

So what is the expected behavior of this function? What should it do in the presence of symlinks?

IMO, the function should be documented as *not* following symlinks, and the directory check should be changed so that it returns False for a symlink-to-directory.
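The fix I'm suggesting amounts to something like the following sketch, which checks lstat (via the unix package) before recursing, so a symlink-to-directory is unlinked rather than descended into. This is an illustration of the idea, not the actual System.Directory code:

```haskell
import Control.Monad (forM_)
import System.Directory (getDirectoryContents, removeDirectory, removeFile)
import System.FilePath ((</>))
import System.Posix.Files (getSymbolicLinkStatus, isDirectory)

-- Recursively remove a directory WITHOUT following symlinks: the
-- symlink status check uses lstat, so isDirectory is True only for
-- real directories and False for symlinks pointing at directories.
removeTreeNoFollow :: FilePath -> IO ()
removeTreeNoFollow path = do
    names <- getDirectoryContents path
    forM_ [n | n <- names, n /= ".", n /= ".."] $ \name -> do
        let child = path </> name
        stat <- getSymbolicLinkStatus child
        if isDirectory stat
            then removeTreeNoFollow child
            else removeFile child  -- plain files and symlinks alike
    removeDirectory path
```

With this behavior, the b/ directory in the example above would survive intact even when a/ contains a symlink to it.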
Re: [Haskell-cafe] mapping an Enumerator
The presence of (Step b m r) is an artifact of Haskell's type system. It can be removed through use of language extensions and 'forall' to give a more aesthetically pleasing signature, but there should be no behavioral difference.

On Wed, Dec 21, 2011 at 03:26, Michael Snoyman mich...@snoyman.com wrote:
> On Wed, Dec 21, 2011 at 12:59 PM, Kannan Goundan kan...@cakoose.com wrote:
> > Michael Snoyman wrote:
> > > On Wed, Dec 21, 2011 at 12:35 PM, Kannan Goundan kan...@cakoose.com wrote:
> > > > I'm using the Data.Enumerator library. I'm trying to write a map
> > > > function that converts an Enumerator of one type to another. Something
> > > > like:
> > > >
> > > >     mapEnum :: Monad m => (a -> b) -> Enumerator a m r -> Enumerator b m r
> > > >
> > > > Any hints? (My exact use case is that I have a ByteString enumerator
> > > > and I need to pass it to something that requires a Blaze.Builder
> > > > enumerator.)
> > >
> > > You can use the Data.Enumerator.List.map function to create an
> > > Enumeratee, and then the Data.Enumerator $= operator to join them
> > > together. Something like:
> > >
> > >     mapEnum f enum = enum $= EL.map f
> >
> > I tried something like that, but the resulting type isn't quite what I'm
> > looking for:
> >
> >     mapEnum :: Monad m => (a -> b) -> Enumerator a m (Step b m r) -> Enumerator a m r
> >
> > (BTW, Michael, my exact use case is that I have ByteString enumerators,
> > but your HTTP-Enumerator library wants Blaze.Builder enumerators :-)
>
> Huh, I'm stumped ;). John: is this possible in enumerator?
>
> In general though: do you need precisely that type signature? Most of the
> time, Enumerators have polymorphic return types. It might be a problem from
> http-enumerator requiring (), but I *think* we can work around that.
>
> Michael
Re: [Haskell-cafe] ANNOUNCE: Anansi 0.4.2 (literate programming pre-processor)
On Tue, Dec 13, 2011 at 03:39, Magnus Therning mag...@therning.org wrote:
> 1. What to call files? I understand (C)WEB suggests using .w, and that noweb
>    uses .nw; what should I call anansi files?

I usually use .anansi, but it doesn't matter. You can use whatever extensions you like, or even none at all.

> 2. Combining anansi and pandoc works quite well for HTML, but it fails
>    miserably when trying to use the generated LaTeX:
>
>    markdown2pdf: ! LaTeX Error: Command \guillemotleft unavailable in encoding OT1.
>
>    Is there any good way to get around that?

The LaTeX loom is designed to output basic markup that can be turned into a PDF with minimum fuss. It probably won't work as-is for more advanced cases, such as when a user wants to use custom templates, or has to inter-operate with pseudo-LaTeX parsers like Pandoc. You could try copying the LaTeX loom into your own code, modifying it to generate the custom output format you want, and then running it as a #!runhaskell script.

> 3. Is there any editor support for anansi, syntax highlighting etc?

Not that I know of. Note that Anansi's syntax itself is very minimal, so what you need is an editor that can support formatting a file using multiple syntaxes. I don't know enough about editor modification to figure out which editors support such a feature, or how to enable it.
[Haskell-cafe] ANNOUNCE: Anansi 0.4.2 (literate programming pre-processor)
Anansi is a preprocessor for literate programs, in the model of NoWeb or nuweb. Literate programming allows both computer code and documentation to be generated from a single unified source.

Home page: https://john-millikin.com/software/anansi/
Hackage: http://hackage.haskell.org/package/anansi-0.4.2

-

This release has a couple of cool new features, suggested by Dirk Laurie.

Markdown loom
===

Markdown, a lightweight markup language similar to ReStructuredText, is used often on web forums. Use [[ :loom anansi.markdown ]] in your documents to enable it.

User-defined line pragmas
===

Users can now add, modify, and disable line pragmas in tangled output based on file extension. This makes it much easier to combine code and data in a single source file. By default, line pragmas are enabled for the C, C++, C#, Haskell, Perl, and Go languages.

Customizing the pragma format is easy. Use the ${line}, ${path}, and ${quoted-path} substitutions in an :option. This example code will insert comments into tangled Python code:

---
:# Insert comments into any tangled output file with a name ending in .py
:option anansi.line-pragma-py=# Generated from ${path} line ${line}
---

To disable line pragmas for a particular file type, just set its pragma format to an empty string:

---
:option anansi.line-pragma-pl=
---
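The ${...} substitution described above boils down to a simple template expansion. As a sketch (a hypothetical helper, not Anansi's internals -- and omitting ${quoted-path} for brevity):

```haskell
import Data.List (isPrefixOf)

-- Expand ${line} and ${path} placeholders in a line-pragma format
-- string, leaving all other text untouched.
formatPragma :: Integer -> FilePath -> String -> String
formatPragma line path = go
  where
    go s
        | "${line}" `isPrefixOf` s = show line ++ go (drop 7 s)
        | "${path}" `isPrefixOf` s = path ++ go (drop 7 s)
        | (c:rest) <- s            = c : go rest
        | otherwise                = ""
```

For example, applying it to the pragma format from the :option above, formatPragma 10 "main.py" "# Generated from ${path} line ${line}" yields "# Generated from main.py line 10".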
Re: [Haskell-cafe] Will changing associativity of enumerator's ($=) affect anyone? (also: enumerator mailing list)
enumerator 0.4.15, which includes this change, is now published.

Hackage: http://hackage.haskell.org/package/enumerator-0.4.15
Home page: https://john-millikin.com/software/enumerator/

Important changes since 0.4.14:

* Fix an error in UTF-16 decoding, which could cause truncated output if the first four bytes are a surrogate pair.
* Change associativity of ($=) from infixr 0 to infixl 1.
* When decoding invalid text, throw an error rather than silently truncate the output.

Full change log at https://john-millikin.com/software/enumerator/#changelog
[Haskell-cafe] Will changing associativity of enumerator's ($=) affect anyone? (also: enumerator mailing list)
A user recently suggested changing the associativity of ($=) from [[ infixr 0 ]] to [[ infixl 1 ]]. This allows the following expressions to be equivalent:

run $ enumerator $$ enumeratee =$ enumeratee =$ iteratee
run $ enumerator $= enumeratee $$ enumeratee =$ iteratee
run $ enumerator $= enumeratee $= enumeratee $$ iteratee

Although this is technically a backward-incompatible change, I feel it's small enough that it could go in a minor release *if nobody depends on the current behavior*. So, if anybody using 'enumerator' will have code broken by this change, please speak up. If I don't hear anything in a week or so, I'll assume it's all clear and will cut the new release.

-

Second, I was asked whether there's a mailing list for enumerator stuff. To my knowledge there isn't, so I started one on librelist. To subscribe and/or post, send an email to haskell.enumera...@librelist.com . Archives are available at http://librelist.com/browser/haskell.enumerator/ . I plan to make release announcements there (for releases not important enough for haskell-cafe), and it might be useful for general discussion.
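Why infixl 1 makes the last spelling parse the intended way can be seen with a toy model, where an "enumerator" is just a list, an "enumeratee" is an element transformer, and ($$) feeds the stream to a final consumer. This is an analogy only -- not the real enumerator types:

```haskell
infixl 1 $=
infixr 0 $$

-- Toy model: stream transformation binds tighter than the final
-- consumer, and chains of ($=) associate to the left.
($=) :: [a] -> (a -> b) -> [b]
xs $= f = map f xs

($$) :: [a] -> ([a] -> r) -> r
xs $$ k = k xs

main :: IO ()
main =
    -- With infixl 1 for ($=) and infixr 0 for ($$), this parses as
    --   (([1,2,3] $= (+1)) $= (*2)) $$ sum
    -- i.e. transform the stream twice, then consume it.
    print ([1,2,3] $= (+1) $= (*2) $$ sum)  -- prints 18
```

With the old infixr 0 fixity, ($=) and ($$) had equal precedence, so mixed chains like this needed explicit parentheses.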
Re: [Haskell-cafe] Package documentation complaints -- and a suggestion
The package summary is "Type-safe ADT-database mapping library.", which gives some idea about what it does.

In my experience, any package that starts its source files with

{-# LANGUAGE GADTs, TypeFamilies, ExistentialQuantification, StandaloneDeriving, TypeSynonymInstances, MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances, FlexibleContexts, OverlappingInstances, ScopedTypeVariables, GeneralizedNewtypeDeriving, UndecidableInstances, EmptyDataDecls #-}

is probably an experiment in what is possible, rather than a production-friendly library. Many people upload experimental packages to Hackage so that they can be used by other interested people, even though the packages are not ready/intended for mass consumption. A lack of documentation in such cases is understandable.

I wonder if it would be worth giving package uploaders control over whether their packages are shown on the package list? Packages can be manually hidden by emailing an admin, but that's a lot of trouble.
[Haskell-cafe] ANNOUNCE: knob, a library for memory-backed handles
http://hackage.haskell.org/package/knob

This is a small package which allows you to create memory-backed handles. I found it as a pattern in a few of my test suites, so I spent a day or so polishing it up before posting it to the internet. Feel free to play around with it and tell me about any issues that pop up, or things you wish it could do.

The interface is very simple; here's a taste:

--
import Data.ByteString (pack)
import Data.Knob
import System.IO

main = do
    -- for this example, we'll start with an empty knob
    knob <- newKnob (pack [])

    -- write to it
    withFileHandle knob "test.txt" WriteMode $ \h -> do
        hPutStrLn h "Hello world!"
    bytes <- Data.Knob.getContents knob
    putStrLn ("Wrote bytes: " ++ show bytes)

    -- read from it
    withFileHandle knob "test.txt" ReadMode $ \h -> do
        line <- hGetLine h
        putStrLn ("Got line: " ++ show line)
--
Re: [Haskell-cafe] Ambiguous module name `System.Directory'
Note that once you upgrade it (to >= 0.4), you'll still need to remove the older version to fix the error.

I wish cabal-install defaulted to hiding every package it installs. The current behavior of exposing every installed module is unreasonable and confusing. Packages should be namespaces, not just installation aliases.
Re: [Haskell-cafe] Lifting an enumerator
The type signature

    liftEnum :: (Monad m, MonadTrans t) => Enumerator a m b -> Enumerator a (t m) b

expands to:

    liftEnum :: (Monad m, MonadTrans t)
             => (Step a m b -> Iteratee a m b)
             -> Step a (t m) b -> Iteratee a (t m) b

So you could implement it iff you can define:

    lower :: (Monad m, MonadTrans t) => t m a -> m a

which is not possible given the standard MonadTrans, but maybe possible with a custom restricted typeclass such as your MonadTransControl.

On Wed, Aug 24, 2011 at 07:02, Michael Snoyman mich...@snoyman.com wrote:
> Hi all,
>
> Max asked earlier[1] how to create a new instance of a class in Persistent
> using a monad transformer. Without getting into the specific details of
> persistent, I wanted to pose a question based on a much more general question:
> how can we lift the inner monad of an enumerator? We can easily do so for an
> Iteratee[2], but there is nothing to allow it for an Enumerator.
>
> At first glance, this problem looks very similar to the shortcomings of
> MonadIO when dealing with callbacks. In that case, you cannot use liftIO on a
> function that takes an `IO a` as a parameter. A solution to this issue is
> monad-control[3], which can be used to allow exception catching, memory
> allocation, etc. So I'm wondering: can we come up with a similar solution to
> this issue with enumerators? I have a working solution for the specific case
> of the ErrorT monad[4], but it would be great to be able to generalize it.
> Bonus points if we could express this in terms of the typeclasses already
> provided by monad-control.
>
> Michael
>
> [1] http://groups.google.com/group/yesodweb/browse_thread/thread/be2a77217a7f3343
> [2] http://hackage.haskell.org/packages/archive/enumerator/0.4.14/doc/html/Data-Enumerator.html#v:liftTrans
> [3] http://hackage.haskell.org/package/monad-control
> [4] https://gist.github.com/1168128
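For intuition about why 'lower' needs a restricted class: it is trivially definable for a transformer like IdentityT, but has no definition for the general MonadTrans class, since most transformers add effects that a plain 'm a' cannot hold. A sketch using the transformers package (the class name here is hypothetical):

```haskell
import Control.Monad.Trans.Identity (IdentityT (..), runIdentityT)

-- A restricted class of transformers whose actions can be "lowered"
-- back into the base monad. MonadTrans alone only provides 'lift',
-- which goes the other direction.
class MonadLower t where
    lower :: Monad m => t m a -> m a

instance MonadLower IdentityT where
    lower = runIdentityT

-- No such instance is possible for, e.g., MaybeT or StateT s: their
-- actions carry failure or state that a plain 'm a' cannot express.
```

Given such a class, 'liftEnum' could be written with a 'MonadLower t' constraint in place of 'MonadTrans t'.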
Re: [Haskell-cafe] ANNOUNCE: Chell: A quiet test runner (low-output alternative to test-framework)
I have, but it's not quite what I'm looking for:

- I don't want to silence HUnit's output; I just don't want anything to show on the console when a test *passes*. Showing output on a failure is good.
- I'm not interested in BDD. Not to say it's not useful, but it doesn't match my style of testing (which uses mostly pass/fail assertions and properties).

On Thu, Aug 11, 2011 at 07:18, Greg Weber g...@gregweber.info wrote:
> Hi John,
>
> I am wondering if you have seen the hspec package? [1] It seems to solve all
> the problems you are solving with chell, including that it silences HUnit
> output. We are using it for all the Yesod tests now.
>
> Thanks, Greg Weber
>
> [1]: http://hackage.haskell.org/packages/archive/hspec/0.6.1/doc/html/Test-Hspec.html
Re: [Haskell-cafe] ANNOUNCE: Chell: A quiet test runner (low-output alternative to test-framework)
I tried, actually, but couldn't figure out how to separate running the test from printing its output. All the attempted patches turned into huge refactoring marathons.

When given the choice between sending a huge "replace all your code with my code" patch and just releasing a separate package, I prefer to do the second. There's usually a reason a library behaves as it does, and this way both behaviors are available to users (even if I find one frustrating).

On Wed, Aug 10, 2011 at 23:51, Max Bolingbroke batterseapo...@hotmail.com wrote:
> On 11 August 2011 05:17, John Millikin jmilli...@gmail.com wrote:
> > This is just a quick package I whipped up out of frustration with
> > test-framework scrolling an error message out of sight, for the millionth
> > time.
>
> Patches to make test-framework less noisy (either by default or with a flag)
> will be gratefully accepted, if anyone wants to give it a go :-)
>
> Max
Re: [Haskell-cafe] ANNOUNCE: Chell: A quiet test runner (low-output alternative to test-framework)
On Thu, Aug 11, 2011 at 07:52, Greg Weber g...@gregweber.info wrote:
> It silences HUnit's output, but will tell you what happens when there is a failure -- which I think is what you want. There are a few available output formatters if you don't like the default output, or you can write your own output formatter.

I'm a bit confused. From what I can tell, HUnit does not output *anything* just from running a test -- the result has to be printed manually. What are you silencing?

> BDD is really a red herring. Instead of using function names to name tests you can use strings, which are inherently more descriptive. In chell you already have `assertions "numbers"`; in hspec it would be `it "numbers"`. The preferred style is to remove `test test_Numbers` and the `test_Numbers` definition, which are redundant in this case, and instead place that inline where you define the suite, although that is optional. So I really can't tell any difference between BDD and pass/fail assertions. You still just use assertions in hspec.
Re: [Haskell-cafe] ANNOUNCE: Chell: A quiet test runner (low-output alternative to test-framework)
On Thu, Aug 11, 2011 at 08:17, Greg Weber g...@gregweber.info wrote:
> I am confused also, as to both what output you don't like that motivated chell and what exactly hspec silences :) Suffice to say I am able to get a small relevant error message on failure with hspec. I am adding the hspec maintainer to this e-mail -- he can answer any of your questions.

The output I didn't like wasn't coming from HUnit, it was coming from the test aggregator I used (test-framework). It prints one line per test case run, whether it passed or failed. That means every time I ran my test suite, it would print *thousands* of lines to the terminal. Any failure immediately scrolled up and out of sight, so I'd have to either Ctrl-C and hunt it down, or wait for the final report when all the tests had finished running.

Chell does the same thing as test-framework (aggregates tests into suites, runs them, reports results), but does so quietly. It only reports failed and aborted tests.
Re: [Haskell-cafe] ANNOUNCE: Chell: A quiet test runner (low-output alternative to test-framework)
Possible -- I ran into dependency conflicts between t-f/t-f-q/quickcheck when trying to migrate to test-framework 0.4, so I clamped all my test subprojects to 0.3.

On Thu, Aug 11, 2011 at 09:09, Nathan Howell nathan.d.how...@gmail.com wrote:
> Is this different than the --hide-successes flag for test-framework? Looks like it was added a few months back: https://github.com/batterseapower/test-framework/commit/afd7eeced9a4777293af1e17eadab4bf485fd98f
[Haskell-cafe] ANNOUNCE: Chell: A quiet test runner (low-output alternative to test-framework)
Homepage: https://john-millikin.com/software/chell/
Hackage: http://hackage.haskell.org/package/chell

This is just a quick package I whipped up out of frustration with test-framework scrolling an error message out of sight, for the millionth time. Chell has the same general purpose (aggregate your assertions + properties + whatever into a single executable), but only prints when a test fails or aborts. It also has a small built-in test library, similar to HUnit, so you don't need to depend on 2-3 separate libraries if you're just doing simple tests.

Cool features thereof:

* it reports the line number of failed assertions
* you can use $expect instead of $assert, so even if it fails, the test keeps going (all the failures are printed at the end)
* you can add notes to a test, which are saved in logs and reports. you can put in any sort of metadata you want (nice for figuring out why a test is failing)
* assertions for text diffs, so if you're testing two big chunks of text for equality you don't have to copy+paste to see what's different.
[Haskell-cafe] ANNOUNCE: dbus-core 0.9
D-Bus implementation for Haskell (dbus-core_0.9)

This is the first significant release of dbus-core in a while, and it contains major improvements to both the API and implementation. Users are advised to upgrade, as the 0.8 branch will no longer be receiving improvements or corrections.

* Release announcement: https://john-millikin.com/releases/dbus-core/0.9/
* API documentation: http://hackage.haskell.org/package/dbus-core
* Source code (literate PDF): https://john-millikin.com/downloads/dbus-core_0.9.pdf
* Source code (tarball): https://john-millikin.com/downloads/dbus-core_0.9.tar.gz

Changes in release 0.9
======================

The biggest change in this release is that dbus-client has been merged into dbus-core. The original purpose of dbus-client (as a very high-level, often-updated support library) never materialized, and the split packaging caused issues with dependencies and versioning. The parts of dbus-client that were commonly used were moved into the DBus.Client and DBus.Client.Simple modules. Users are highly encouraged to use these modules in their applications.

Other changes and improvements include:

* The Array, Dictionary, and Structure types are no longer needed for most users. You can cast containers directly to corresponding standard types, such as Vector and Map.
* Smart constructors for magic strings have been renamed so they're easier to read and type; for example, mkBusName is now just busName.
* Support for custom transports and authentication mechanisms.
* Removed some seldom-used modules, such as MatchRule and NameReservation. Their functionality has been merged into the new client modules.
* Performance improvements for marshaling and unmarshaling messages.

How to use the new simple client API
====================================

The new API makes calling and exporting methods very easy; take a look at these examples!
Getting a list of connected clients
-----------------------------------

{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.List (sort)
import DBus.Client.Simple

main :: IO ()
main = do
    client <- connectSession

    -- Request a list of connected clients from the bus
    bus <- proxy client "org.freedesktop.DBus" "/org/freedesktop/DBus"
    reply <- call bus "org.freedesktop.DBus" "ListNames" []

    -- org.freedesktop.DBus.ListNames returns a single value, which is
    -- a list of names.
    let Just names = fromVariant (reply !! 0)

    -- Print each name on a line, sorted so reserved names are below
    -- temporary names.
    mapM_ putStrLn (sort names)

Exporting methods
-----------------

{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Concurrent (threadDelay)
import Control.Monad (forever)
import DBus.Client.Simple

onEcho :: String -> IO String
onEcho str = do
    putStrLn str
    return str

onDivide :: Double -> Double -> IO Double
onDivide x y =
    if y == 0
        then throwError "com.example.DivisionByZero" "Divided by zero" []
        else return (x / y)

main :: IO ()
main = do
    client <- connectSession

    -- Request a unique name on the bus. If the name is already
    -- in use, continue without it.
    requestName client "com.example.ExportExample" []

    -- Export an example object at /divide
    export client "/divide"
        [ method "com.example.Echo" "Echo" onEcho
        , method "com.example.Math" "Divide" onDivide
        ]

    -- Wait forever for method calls
    forever (threadDelay 5)
[Haskell-cafe] Request for comments: dbus-core 0.9
I think I'm pretty close to releasing the next version of my D-Bus library, dbus-core. This is a big release that changes a lot of the API, so I'd like to see if anybody 1) has any problems with the new APIs or 2) has any suggested improvements.

haddock: https://john-millikin.com/temp/dbus-core_0.9/
tarball: https://john-millikin.com/temp/dbus-core_0.9.tar.gz
pdf: https://john-millikin.com/temp/dbus-core_0.9.pdf
examples: https://john-millikin.com/temp/dbus-core_0.9/examples/

Notable improvements in this release are:

• Merges dbus-core and dbus-client. The modules DBus.Client and DBus.Client.Simple provide a very nice, high-level API to common D-Bus operations. Note: I've reverted to the dbus-client_0.3 API for this, since not many users liked the monad-based API introduced in 0.4.
• The class Variable was split into IsAtom, IsValue, and IsVariant. Together these let the rest of the type conversion API go away -- you no longer have to worry about Array, Structure, or Dictionary (except in a few, rare cases).
• Support for custom transports (e.g., a pipe or in-memory buffer).
• Improved support for custom authentication mechanisms.

Also, while the docs are obviously not finished, I'm interested to know if anyone finds any parts of the library confusing -- those'll get extra doc love.
[Haskell-cafe] Imports in complex Setup.hs -- should we encourage/require PackageImports?
Several libraries (notably Pandoc and Gtk2hs) have very complex Setup.hs scripts, which import several external libraries. In my experience, these imports are very fragile, because Cabal does not enforce package visibility in Setup.hs. For example, a Setup.hs that imports Control.Monad.Trans will break if monads-tf is installed, and one that imports System.FilePath will break if system-filepath is installed. My typical solution when this happens is to manually tweak the GHC package database before installing, but this is annoying and does not help other users.

Based on a ticket in Cabal's Trac ( http://hackage.haskell.org/trac/hackage/ticket/326 ), custom Setup.hs scripts are discouraged by the Cabal developers. I assume this means there will not be much development effort put towards an integrated solution (such as using -hide-all-packages and build-depends: when compiling Setup.hs).

A possible solution is to ask developers with complex Setup.hs requirements to use the PackageImports language extension when importing external libraries. However, this places a burden on such developers, and I don't know whether it's portable to non-GHC compilers. It would also require an analysis of currently published Hackage packages to see which have such scripts.

Any ideas/comments? Has anyone by chance found a good solution to this?
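For readers who haven't used the extension, here is a minimal sketch of how a PackageImports-based fix would look. It pins the System.FilePath import from the example above to the filepath package; the program body is purely illustrative.

```haskell
{-# LANGUAGE PackageImports #-}
module Main (main) where

-- Pin System.FilePath to the "filepath" package, so this import keeps
-- working even when another installed package exposes a module with the
-- same name (the system-filepath conflict described above).
import "filepath" System.FilePath (takeExtension, (</>))

main :: IO ()
main = putStrLn (takeExtension ("dist" </> "Setup.hs"))
```

The same `"package-name" Module.Name` syntax would go into a Setup.hs; the open question in the post is whether non-GHC compilers accept it.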
Re: [Haskell-cafe] No fish, please
Is there any way to indicate to Hackage that it should not try to generate Haddock documentation? I'm concerned about two use cases for packages using a different docs system:

1) A user might see the commentless auto-generated haddock and believe the package is undocumented.
2) A user might find the haddock through Google, and not realize there's real documentation available elsewhere.

Purposefully causing Hackage's haddock step to fail will mark the package as a build failure, which gives a bad impression to potential users.
Re: [Haskell-cafe] Question about the Monad instance for Iteratee (from the enumerator package)
On Tuesday, April 26, 2011 7:19:25 AM UTC-7, John Lato wrote:
> I'd be interested to see the results of a shootout between iteratee and enumerator. I would expect them to be basically equivalent most of the time, with maybe two or three operations with a small (but consistent) difference one way or the other.

I did some basic benchmarks a few months ago; if I remember correctly, it depends almost entirely on how well GHC optimizes CPS on a particular platform. The relative performance was very similar to Lennart Kolmodin's benchmarks of binary at http://lennartkolmodin.blogspot.com/2011/02/binary-by-numbers.html . In particular, CPS/iteratee is faster on 32-bit, while state passing/enumerator is faster on 64-bit. This difference exists for almost all operations, and was on the order of 5-15% depending on the shape of the input. I couldn't figure out a good way to benchmark the libraries themselves when there's so much noise from the compiler.

I'm waiting for GHC7 to stabilise a bit before doing another benchmark -- the LLVM backend and new CPS handling should make both implementations faster, once they're a bit more mature. I really hope CPS becomes as fast on 64-bit as it is on 32-bit, because it's a much cleaner implementation.
Re: [Haskell-cafe] Data.Enumerator.Text.utf8 not constant memory?
*sigh* Another fine entry for john-millikin-is-an-idiot.txt

Thank you for the patch Felipe, and for the bug report Skirmantas. I have uploaded 0.4.10 to Hackage. My sincere apologies for the inconvenience.

On Mon, Apr 25, 2011 at 19:03, Felipe Almeida Lessa felipe.le...@gmail.com wrote:
> [CC'ing John Millikin, enumerator's maintainer]
>
> On Mon, Apr 25, 2011 at 7:10 PM, Skirmantas Kligys skirmantas.kli...@gmail.com wrote:
>> I expected to be able to do what SAX does in Java, i.e. to avoid loading the whole 2 gigabytes into memory. For warm-up, I wrote an iteratee to count lines in the file, and it does load the whole file into memory! After profiling, I see that the problem was Data.Enumerator.Text.utf8; it allocates up to 60 megabytes when run on a 40 megabyte test file.
>
> It seems to me that this is a bug in enumerator's "strict" fold not being strict at all =). The current version 0.4.9.1 of Data.Enumerator.List.fold is
>
>     -- | Consume the entire input stream with a strict left fold, one element
>     -- at a time.
>     --
>     -- Since: 0.4.8
>     fold :: Monad m => (b -> a -> b) -> b -> Iteratee a m b
>     fold step = continue . loop where
>         f = L.foldl' step
>         loop acc stream = case stream of
>             Chunks [] -> continue (loop acc)
>             Chunks xs -> continue (loop (f acc xs))
>             EOF -> yield acc EOF
>
> Note that the list fold is strict (f = Data.List.foldl' step), *however* the acc parameter of loop isn't strict at all! It just creates a big, fat thunk with references to all of your input =(. But the fix is extremely easy, just change the 'Chunks xs' line to
>
>     Chunks xs -> continue (loop $! f acc xs)
>
> Using only your iterLinesWc test with a 105 MiB file (a movie I had lying around), with enumerator's definition it takes 220 MiB of memory and 1.3~1.5 seconds according to +RTS -s. By doing only this very change above, it takes 2 MiB of memory (100x improvement :P) and 0.8~0.9 seconds.
>
> John Millikin, could you please apply the attached patch? =)
>
> Cheers, -- Felipe.
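The leak Felipe describes is the classic lazy-accumulator problem, and it can be reproduced with nothing but Data.List -- no enumerator required. A minimal sketch:

```haskell
module Main (main) where

import Data.List (foldl')

-- foldl builds one (+) thunk per element, exactly like the non-strict
-- `loop acc` in the fold above; foldl' forces the accumulator at each
-- step, which is what the `loop $! f acc xs` fix does for the iteratee.
lazySum, strictSum :: [Int] -> Int
lazySum   = foldl  (+) 0
strictSum = foldl' (+) 0

main :: IO ()
main = print (strictSum [1 .. 1000000])
```

With a constrained stack, the thunk chain built by lazySum can overflow where strictSum runs in constant space.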
Re: [Haskell-cafe] Why not Darcs?
My chief complaint is that it's built on patch theory, which is ill-defined and doesn't seem particularly useful. The Bazaar/Git/Mercurial DAG model is much easier to understand and work with. Possibly as a consequence of its shaky foundation, Darcs is much slower than the competition -- this becomes noticeable for even very small repositories, when doing a lot of branching and merging.

I think it's been kept alive in the Haskell community out of pure "eat our dogfood" instinct; IMO if having a VCS written in Haskell is important, it would be better to just write a new implementation of an existing tool. Of course, nobody cares that much about what language their VCS is written in, generally.

Beyond that, the feeling I get of the three major DVCS alternatives is:

git: Used by Linux kernel hackers, and Rails plugin developers who think they're more important than Linux kernel hackers
hg/bzr: Used by people who don't like git's UI, and flipped heads/tails when picking a DVCS (hg and bzr are basically equivalent)
Re: [Haskell-cafe] Question about the Monad instance for Iteratee (from the enumerator package)
John Lato's iteratee package is based on IterateeMCPS.hs [1]. I used IterateeM.hs for enumerator, because when I benchmarked them the non-CPS version was something like 10% faster on most operations. The new IterateeM.hs solves some problems with the old encoding, but I haven't switched enumerator to it yet because it would break backwards compatibility.

[1] http://okmij.org/ftp/Haskell/Iteratee/IterateeMCPS.hs
Re: [Haskell-cafe] Why not Darcs?
On Thursday, April 21, 2011 4:16:07 PM UTC-7, John Meacham wrote:
> Um, the patch theory is what makes darcs "just work". There is no need to understand it any more than you have to know VLSI design to understand how your computer works. The end result is that darcs repositories don't get corrupted and the order you integrate patches doesn't affect things, meaning cherrypicking is painless.

This is how it's *supposed* to work. My chief complaints with PT are:

- Metadata about branches and merges gets lost. This makes later examination of the merge history impossible, or at least unfeasibly difficult.
- Every commit needs --ask-deps, because the automatic dependency detector can only detect automatic changes (and not things like adding a new function in a different module).
- The order patches are integrated still matters (it's impossible for it to not matter), but there's no longer any direct support for ordering them, so large merges become very manual.
- If you ever merge in the wrong order, future merges will begin consuming more and more CPU time until the repository dies. Undoing this requires using darcs-fastconvert and performing manual surgery on the export files.
Re: [Haskell-cafe] Question about the Monad instance for Iteratee (from the enumerator package)
It's forbidden for an iteratee to yield extra input that it hasn't consumed; however, this is unfortunately not enforced by the type system.
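To see why the type system can't catch this, here is a simplified, self-contained model of the enumerator Step type. The names mirror the package, but this is an illustrative reconstruction, not the real definitions:

```haskell
module Main (main) where

-- Simplified model: Yield carries a result *and* the leftover input the
-- iteratee did not consume. Nothing in the types requires that leftover
-- to be a suffix of what the iteratee was actually fed.
data Stream a = Chunks [a] | EOF
    deriving (Show)

data Step a b
    = Continue (Stream a -> Step a b)
    | Yield b (Stream a)

-- Well-behaved: consume one element, yield the rest as leftover.
headStep :: Step Int (Maybe Int)
headStep = Continue go
  where
    go (Chunks [])     = headStep
    go (Chunks (x:xs)) = Yield (Just x) (Chunks xs)
    go EOF             = Yield Nothing EOF

-- Type-correct but forbidden: yields "leftover" input it was never given.
badStep :: Step Int (Maybe Int)
badStep = Yield Nothing (Chunks [1, 2, 3])

main :: IO ()
main =
    case headStep of
        Continue k -> case k (Chunks [10, 20, 30]) of
            Yield r leftover -> print (r, leftover)
            Continue _       -> putStrLn "still waiting for input"
        Yield _ _ -> putStrLn "unexpected"
```

badStep compiles without complaint, which is exactly the problem: the invariant is a convention, not a type.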
Re: [Haskell-cafe] HXT and xhtml page encoded in cp1251
Since the document claims it is HTML, you should be parsing it with an HTML parser. Try hxt-tagsoup -- specifically, the parseHtmlTagSoup arrow.
Re: [Haskell-cafe] using IO monad in Iteratee
Use enumHandle. enumFile deals with the common case of "read from the filesystem in IO". It can't deal with general MonadIO monads because there'd be no guarantee that the handle would actually be closed (e.g., an ErrorT IO might never run the cleanup). If you need a special monad, do something like:

    withBinaryFile $ \h -> runMyMonad (run_ (enumHandle h $$ myIter))
Re: [Haskell-cafe] Encoding-aware System.Directory functions
On Wednesday, March 30, 2011 12:18:48 PM UTC-7, Bas van Dijk wrote:
> It would also be great to have a package which combines the proper encoding/decoding of filepaths of the system-filepath package with the type-safety of the pathtype package: http://hackage.haskell.org/package/pathtype

Does that package actually work well? I don't see how it can; it's not possible to determine whether a path like /foo/bar or C:\foo\bar refers to a file or directory, so any user input has to be [[ Path ar fd ]]. And since the filesystem's out of our control, even functions like [[ checkType :: Path ar fd -> IO (Either (FilePath ar) (DirPath ar)) ]] can't provide any meaningful result. And that's before getting into UNIX symlinks, which can be files and directories at the same time.
Re: [Haskell-cafe] Encoding-aware System.Directory functions
On Wednesday, March 30, 2011 9:07:45 AM UTC-7, Michael Snoyman wrote:
> Thanks to you (and everyone else) for the informative responses. For now, I've simply hard-coded in UTF-8 encoding for all non-Windows systems. I'm not sure how this will play with OSes besides Windows and Linux (especially Mac), but it's a good stop-gap measure.

Linux, OSX, and (probably?) FreeBSD use UTF8. It's *possible* for a Linux file path to contain arbitrary bytes, but every application I've ever seen just gives up and writes [[invalid character]] symbols when confronted with such.

OSX's chief weirdness is that its GUI programs swap ':' and '/' when displaying filenames. So the file "hello:world.txt" will show up as "hello/world.txt" in Finder. It also performs Unicode normalization on your filenames, which is mostly harmless but can have unexpected results on unicode-naïve applications like rsync. I don't know how its normalization interacts with invalid file paths, or whether it even allows such paths to be written.

Windows's weirdness is its multi-root filesystem, and also that it distinguishes between absolute and non-relative paths: the Windows path /foo.txt is *not* absolute and *not* relative. I've never been able to figure out how Windows does Unicode; it seems to have a half-dozen APIs for it, all subtly different, and not a single damn one displays anything but ???.txt when I download anything east-Asian.

> I *do* think it would be incredibly useful to provide alternatives to all the standard operations on FilePath which used opaque datatypes and properly handled filename encoding. I noticed John Millikin's system-filepath package[1]. Do people have experience with it? It seems that adding a few functions like getDirectoryContents, plus adding a version of toString which performs some character decoding, would get us pretty far.
system-filepath was born of my frustration with the somewhat bizarre behavior of some functions in filepath; I designed it to match the Python os.path API pretty closely. I don't think it has any client code outside of my ~/bin, so changing its API radically shouldn't cause any drama.

I'd prefer filesystem manipulation functions be put in a separate library (perhaps system-directory?), to match the current filepath/directory split.

If it's to contain encoding-aware functions, I think they should be Text-only. The existing String-based functions are just to interact with legacy functions in System.IO, and should be either renamed to toChar8/fromChar8 or removed entirely. My vote is for the second -- if someone needs Char8 strings, they can convert from the ByteString version explicitly.

    -- | Try to decode a FilePath to Text, using the current locale encoding. If
    -- the filepath is invalid in the current locale, it is decoded as ASCII and
    -- any non-ASCII bytes are replaced with a placeholder.
    --
    -- The returned text is useful only for display to the user. It might not be
    -- possible to convert back to the same or any 'FilePath'.
    toText :: FilePath -> Text

    -- | Try to encode Text to a FilePath, using the current locale encoding. If
    -- the text cannot be represented in the current locale, returns 'Nothing'.
    fromText :: Text -> Maybe FilePath
Re: [Haskell-cafe] Encoding-aware System.Directory functions
On Wed, Mar 30, 2011 at 21:07, Ivan Lazar Miljenovic ivan.miljeno...@gmail.com wrote:
> On 31 March 2011 14:51, John Millikin jmilli...@gmail.com wrote:
>> Linux, OSX, and (probably?) FreeBSD use UTF8.
>
> For Linux, doesn't it depend upon the locale rather than forcing UTF-8?

In theory, yes. There are environment variables to specify the locale encoding, and some applications attempt to obey them. In practice, no. Both Qt and GTK+ use UTF8 internally, and react poorly when run on a non-UTF8 system. Every major distribution sets the locale encoding to UTF8. Setting a non-UTF8 encoding requires digging through various undocumented configuration files, and even then many applications will simply ignore it and use UTF8 anyway.
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
0.4.9 has been uploaded to Hackage, with the new operators. Changes are in the replied-to post (and also quoted below), plus the new operators proposed by Kazu Yamamoto. Here's the corresponding docs (they have examples!)

    -- | @enum =$ iter = 'joinI' (enum $$ iter)@
    --
    -- “Wraps” an iteratee /inner/ in an enumeratee /wrapper/.
    -- The resulting iteratee will consume /wrapper/’s input type and
    -- yield /inner/’s output type.
    --
    -- Note: if the inner iteratee yields leftover input when it finishes,
    -- that extra will be discarded.
    --
    -- As an example, consider an iteratee that converts a stream of
    -- UTF8-encoded bytes into a single 'TL.Text':
    --
    -- > consumeUTF8 :: Monad m => Iteratee ByteString m Text
    --
    -- It could be written with either 'joinI' or '(=$)':
    --
    -- > import Data.Enumerator.Text as ET
    -- >
    -- > consumeUTF8 = joinI (decode utf8 $$ ET.consume)
    -- > consumeUTF8 = decode utf8 =$ ET.consume
    --
    -- Since: 0.4.9

    -- | @enum $= enee = 'joinE' enum enee@
    --
    -- “Wraps” an enumerator /inner/ in an enumeratee /wrapper/.
    -- The resulting enumerator will generate /wrapper/’s output type.
    --
    -- As an example, consider an enumerator that yields line character counts
    -- for a text file (e.g. for source code readability checking):
    --
    -- > enumFileCounts :: FilePath -> Enumerator Int IO b
    --
    -- It could be written with either 'joinE' or '($=)':
    --
    -- > import Data.Text as T
    -- > import Data.Enumerator.List as EL
    -- > import Data.Enumerator.Text as ET
    -- >
    -- > enumFileCounts path = joinE (enumFile path) (EL.map T.length)
    -- > enumFileCounts path = enumFile path $= EL.map T.length
    --
    -- Since: 0.4.9

Minor release note -- 0.4.9 and 0.4.9.1 are the exact same code; I just forgot a @ in one of the new docs and had to re-upload so Hackage would haddock properly. There is no difference in behavior.

On Monday, March 28, 2011 10:50:45 PM UTC-7, John Millikin wrote:
> Since the release, a couple people have sent in feature requests, so I'm going to put out 0.4.9 in a day or so. New features will be:
>
> - tryIO: runs an IO computation, and converts any exceptions into ``throwError`` calls (requested by Kazu Yamamoto)
> - checkContinue: encapsulates a common pattern (loop (Continue k) = ...) when defining enumerators
> - mapAccum and mapAccumM: sort of like map and mapM, except the step function is stateful (requested by Long Huynh Huu)
>
> Anyone else out there sitting on a request? Please send them in -- I am always happy to receive them, even if they must be declined.
>
> ---
>
> Also, I would like to do a quick poll regarding operators.
>
> 1. It has been requested that I add operator aliases for joinI and joinE.
> 2. There have been complaints that the library defines too many operators (currently, 5).
>
> Do any existing enumerator users, or anyone for that matter, have an opinion either way? The proposed operators are:
>
>     infixr 0 =$
>     infixr 0 $=
>
>     (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m b
>     enum =$ iter = joinI (enum $$ iter)
>
>     ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> Enumerator ai m b
>     ($=) = joinE
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Sunday, March 27, 2011 9:45:23 PM UTC-7, Ertugrul Soeylemez wrote:
>> For setting a global timeout on an entire session, it's better to wrap the ``run_`` call with ``System.Timeout.timeout`` -- this is more efficient than testing the time on every chunk, and does not require a specialised enumerator.
>
> It may be more efficient, but I don't really like it. I like robust applications, and to me killing a thread is always a mistake, even if the thread is kill-safe.

``timeout`` doesn't kill the thread; it just returns ``Nothing`` if the computation took longer than expected.
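A minimal, self-contained illustration of the ``System.Timeout.timeout`` behavior under discussion (base only; the particular delay values are arbitrary):

```haskell
module Main (main) where

import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

main :: IO ()
main = do
    -- Finishes well inside the one-second limit: Just the result.
    fast <- timeout 1000000 (return (2 + 2 :: Int))
    print fast

    -- Sleeps past the 50 ms limit: aborted, and timeout returns Nothing.
    slow <- timeout 50000 (threadDelay 200000 >> return (0 :: Int))
    print slow
```

Strictly speaking, timeout interrupts the guarded computation with an asynchronous exception delivered to the same thread, so cleanup installed with bracket still runs -- which is the basis of the "it doesn't kill the thread" point above.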
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Since the release, a couple people have sent in feature requests, so I'm going to put out 0.4.9 in a day or so. New features will be:

- tryIO: runs an IO computation, and converts any exceptions into ``throwError`` calls (requested by Kazu Yamamoto)
- checkContinue: encapsulates a common pattern (loop (Continue k) = ...) when defining enumerators
- mapAccum and mapAccumM: sort of like map and mapM, except the step function is stateful (requested by Long Huynh Huu)

Anyone else out there sitting on a request? Please send them in -- I am always happy to receive them, even if they must be declined.

---

Also, I would like to do a quick poll regarding operators.

1. It has been requested that I add operator aliases for joinI and joinE.
2. There have been complaints that the library defines too many operators (currently, 5).

Do any existing enumerator users, or anyone for that matter, have an opinion either way? The proposed operators are:

    infixr 0 =$
    infixr 0 $=

    (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m b
    enum =$ iter = joinI (enum $$ iter)

    ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> Enumerator ai m b
    ($=) = joinE
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Sunday, March 27, 2011 8:38:38 AM UTC-7, John A. De Goes wrote:
> Enumeratees solve some use cases but not others. Let's say you want to incrementally compress a 2 GB file. If you use an enumeratee to do this, your transformer iteratee has to do IO. I'd prefer an abstraction to incrementally and purely produce the output from a stream of input.

There's no reason the transformer has to do IO. Right now a lot of the interesting enumerator-based packages are actually bindings to C libraries, so they are forced to use IO, but there's nothing inherent in the enumeratee design to require it. For example, the text codec enumeratees encode and decode in Data.Enumerator.Text are pure.

I'm working on ideas for writing pure enumeratees for bound libraries, but they will likely only work if the underlying library fully exposes its state, like zlib. Libraries with private or very complex internal states, such as libxml or expat, will probably never be implementable as pure enumeratees.
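The "pure, incremental transformer" idea can be modeled without enumerator at all. The Codec type below is an illustrative stand-in (not enumerator API): it consumes one chunk at a time, threads explicit state, and never touches IO -- which is all a pure enumeratee like decode utf8 fundamentally needs.

```haskell
module Main (main) where

-- A pure incremental transformer: feed it one chunk, get some output
-- and the transformer's next state. No monad, no IO.
newtype Codec i o = Codec { feed :: i -> ([o], Codec i o) }

-- Example: a running line counter over chunks of text, emitting the
-- running total after each chunk.
countLines :: Codec String Int
countLines = go 0
  where
    go n = Codec $ \chunk ->
        let n' = n + length (filter (== '\n') chunk)
        in ([n'], go n')

-- Drive a codec over a stream of chunks, collecting all output.
runCodec :: Codec i o -> [i] -> [o]
runCodec _ []       = []
runCodec c (x : xs) = let (os, c') = feed c x in os ++ runCodec c' xs

main :: IO ()
main = print (runCodec countLines ["one\ntwo", "\nthree\n"])
```

This only works when the transformation's state can be captured in a plain value, which is the zlib-vs-libxml distinction made above.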
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
If the receiver can only accept very small chunks, you can put a rechunking stage in between the compression and the iteratee:

    verySmallChunks :: Monad m => Enumeratee ByteString ByteString m b
    verySmallChunks = sequence (take 10)

Resending is slightly more complex -- if the other end can say "resend that last chunk", then it should be easy enough, but "resend the last 2 bytes of that chunk you sent 5 minutes ago" would be much harder. What is your use case?
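As a plain-list model of what that rechunking stage does (no enumerator involved; rechunk is a hypothetical name, not library API): regroup a stream of chunks into fixed-size pieces, regardless of the original chunk boundaries.

```haskell
module Main (main) where

-- Hypothetical helper: flatten incoming chunks and re-split them into
-- pieces of exactly n elements (the final piece may be shorter).
rechunk :: Int -> [[a]] -> [[a]]
rechunk n = go . concat
  where
    go [] = []
    go xs = let (piece, rest) = splitAt n xs in piece : go rest

main :: IO ()
main = print (rechunk 3 ["abcd", "ef", "ghij"])
```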
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Mar 26, 10:46 am, Michael Snoyman mich...@snoyman.com wrote:
> As far as the left-over data in a yield issue: does that require a breaking API change, or a change to the definition of >>= which would change semantics?

It requires a pretty serious API change, as the definition of 'Iteratee' itself is at fault. Unfortunately, Oleg's new definitions also have problems (they can yield extra on a continue step), so I'm at a bit of a loss as to what to do. Either way, the underlying primitives allow users to create iteratees with invalid/undefined behavior. Not very Haskell-y. All of the new high-level functions added in recent versions are part of an attempted workaround.

I'd like to move the Iteratee definitions themselves to a ``Data.Enumerator.Internal`` module, and add some words discouraging their direct use. There would still be some API breaks (the >>==, $$, and ==<< operators would go away) but at least clients wouldn't be subjected to a complete rewrite.

Since the API is being broken anyway, I'm also going to take the opportunity to change the Stream type so it can represent EOF + some data. That should allow lots of interesting behaviors, such as arbitrary lookahead.
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On 2011-03-26, Gregory Collins g...@gregorycollins.net wrote:

>> Since the API is being broken anyway, I'm also going to take the opportunity to change the Stream type so it can represent EOF + some data. That should allow lots of interesting behaviors, such as arbitrary lookahead.
>
> The thing which I find is missing the most from enumerator as it stands is not this -- it's the fact that iteratees sometimes need to allocate resources which need explicit manual deallocation (i.e. sockets, file descriptors, mmaps, etc), but because enumerators are running the show, there is no local way to ensure that the cleanup/bracket routines get run on error. This hurts composability, because you are forced to either allocate these resources outside the body of the enumerator (where you can bracket run_) or play finalizer-on-mvar tricks with the garbage collector. This kind of sucks.

I agree that it sucks, but it's a tradeoff of the left-fold enumerator design. Potential solutions are welcome.

> The iteratee package has an error constructor on the Stream type for this purpose; I think you could do that -- with the downside that you need to pattern-match against another constructor in mainline code, hurting performance -- or is there some other reasonable way to deal with it?

I don't think this would help. Remember that the iteratee has *no* control whatsoever over its lifetime. There is no guarantee that a higher-level enumerator or enumeratee will actually feed it data until it has enough; the computation can be interrupted at any level.

Looking at the iteratee package's Stream constructor, I think it doesn't do what you think it does. While it might help with resource management in a specific case, it won't help if (for example) an enumeratee above your iteratee decides to yield.
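The first workaround mentioned -- allocating the resource outside the enumerator so the whole run can be bracketed -- has this rough shape. This is a base-only sketch with a plain Handle; `runPipeline` is a hypothetical stand-in for a call to enumerator's run_:

```haskell
import Control.Exception (bracket)
import System.IO

-- Stand-in for an enumerator pipeline consuming the handle
-- (a real program would run an Iteratee here instead).
runPipeline :: Handle -> IO Int
runPipeline h = do
    contents <- hGetContents h
    let n = length contents
    n `seq` return n  -- force the result before the handle is closed

-- Allocate outside, bracket around the whole run: cleanup is
-- guaranteed even if the pipeline is interrupted by an exception.
main :: IO ()
main = do
    writeFile "input.txt" "hello"
    n <- bracket (openFile "input.txt" ReadMode) hClose runPipeline
    print n  -- prints 5
```

The cost, as the thread notes, is composability: the resource's lifetime is tied to the whole run, not to the iteratee that actually uses it.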
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Hello Ertugrul Söylemez,

Good idea -- I've added an ``enumSocketTimed`` and ``iterSocketTimed`` to the network-enumerator package at http://hackage.haskell.org/package/network-enumerator . ``enumSocketTimed`` is equivalent to your ``enumHandleTimeout``, but instead of Handle uses the more efficient Socket type. For setting a global timeout on an entire session, it's better to wrap the ``run_`` call with ``System.Timeout.timeout`` -- this is more efficient than testing the time on every chunk, and does not require a specialised enumerator.

The signatures/docs are:

-- | Enumerate binary data from a 'Socket', using 'recv'. The socket must
-- be connected.
--
-- The buffer size should be a small power of 2, such as 4096.
--
-- If any call to 'recv' takes longer than the timeout, 'enumSocketTimed'
-- will throw an error. To add a timeout for the entire session, wrap the
-- call to 'E.run' in 'timeout'.
--
-- Since: 0.1.2
enumSocketTimed :: MonadIO m
                => Integer -- ^ Buffer size
                -> Integer -- ^ Timeout, in microseconds
                -> S.Socket
                -> E.Enumerator B.ByteString m b

-- | Write data to a 'S.Socket', using 'sendMany'. The socket must be connected.
--
-- If any call to 'sendMany' takes longer than the timeout, 'iterSocketTimed'
-- will throw an error. To add a timeout for the entire session, wrap the
-- call to 'E.run' in 'timeout'.
--
-- Since: 0.1.2
iterSocketTimed :: MonadIO m
                => Integer -- ^ Timeout, in microseconds
                -> S.Socket
                -> E.Iteratee B.ByteString m ()
[Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Enumerators are an efficient, predictable, and safe alternative to lazy I/O. Discovered by Oleg Kiselyov, they allow large datasets to be processed in near-constant space by pure code. Although somewhat more complex to write, using enumerators instead of lazy I/O produces more correct programs.

http://hackage.haskell.org/package/enumerator
http://john-millikin.com/software/enumerator/

Hello -cafe,

It's been a while since the last point release of enumerator. This one is sufficiently large that I think folks might want to know about it, and since I try not to spam too many announcements, I'll give a quick rundown of major changes in other 0.4.x versions as well.

First, most of what I call "list analogues" -- enumerator-based versions of 'head', 'take', 'map', etc -- have been separated into three modules (Data.Enumerator.List, .Binary, and .Text) depending on what sorts of data they operate on. This separation has been an ongoing process throughout the 0.4.x releases, and I think it's now complete. The old names in Data.Enumerator will continue to exist in 0.4.x versions, but will be removed in 0.5.

Second, Gregory Collins and Ertugrul Soeylemez found a space leak in Iteratee's (>>=), which could cause eventual space exhaustion in some circumstances. If you use enumerators to process very large or infinite streams, you probably want to upgrade to version 0.4.7 or higher.

Third, the source code PDF has seen some substantial improvement -- if you're interested in how the library is implemented, or have insomnia, read it at http://john-millikin.com/software/enumerator/enumerator_0.4.8.pdf

Finally, there is a known issue in the current encoding of iteratees -- if an iteratee yields extra data but never consumed anything, that iteratee will violate the monad law of associativity. Oleg has updated his implementations to fix this problem, but since fixing it here would break a *lot* of dependent libraries, I'm holding off until the vague future of version 0.5.
Since iteratees that yield extra data they didn't consume are invalid anyway, I hope this problem will not cause too much inconvenience.

New features:

* Range-limited binary file enumeration (requested + initial patch by Bardur Arantsson)
* splitWhen, based on the split package http://hackage.haskell.org/package/split
* 0.4.6: Typeable instances for most types (requested by Michael Snoyman)
* 0.4.5: joinE, which simplifies enumerator/enumeratee composition (requested by Michael Snoyman)
Re: [Haskell-cafe] best way to deal with #defines when developing ffi's
When using c2hs, you can use the inline header commands to define global variables:

--
#c
const int HSMYLIBRARY_SOME_VAR = C_MY_VAR;
const char *HSMYLIBRARY_OTHER_VAR = C_OTHER_VAR;
#endc
--

And then just bind them with foreign imports, like any other variable:

--
foreign import ccall "HSMYLIBRARY_SOME_VAR"
    someVar :: CInt

foreign import ccall "HSMYLIBRARY_OTHER_VAR"
    otherVarPtr :: Ptr CChar

otherVar :: String
otherVar = unsafePerformIO (peekCString otherVarPtr)
--
[Haskell-cafe] Instancing Typeable for monad transformers?
Is there any reasonable way to do this if I want to cast a monadic value? For example:

castState :: (Typeable a, Typeable s, Typeable1 m, Typeable b) => a -> Maybe (StateT s m b)
castState = Data.Typeable.cast

None of the common monad transformers declare instances of Typeable, so I don't know if the concept itself even works. The use case here is that one of my library users wants to return an Iteratee from code running in hint, which requires any extracted values to be typeable.

My first attempt at an extension-free instance is something like this:

import Data.Enumerator
import Data.Typeable

instance (Typeable a, Typeable1 m) => Typeable1 (Iteratee a m) where
    typeOf1 i = rep where
        typed :: (a -> b) -> b -> a -> a
        typed _ _ a = a

        ia :: a -> Iteratee a m b
        ia = undefined

        im :: m c -> Iteratee a m b
        im = undefined

        rep = mkTyConApp (mkTyCon "Data.Enumerator.Iteratee") [tyA, tyM]
        tyA = typeOf (typed ia i undefined)
        tyM = typeOf1 (typed im i undefined)

which, besides being ugly, I have no idea if it's correct.
Re: [Haskell-cafe] GPL License of H-Matrix and prelude numeric
On Tue, Jan 25, 2011 at 18:55, Chris Smith cdsm...@gmail.com wrote:

> Licensing with the GPL has definite consequences; for example, that the great majority of Haskell libraries, which are BSD3 licensed, may not legitimately declare dependencies on it.

What are you talking about? Of course BSD3 libraries/applications can depend on GPL'd code. The only license Cabal allows that conflicts with the GPL is BSD4, which (to my knowledge) is not used by any software on Hackage.
Re: [Haskell-cafe] GPL License of H-Matrix and prelude numeric
On Tue, Jan 25, 2011 at 21:11, David Leimbach leim...@gmail.com wrote:

> I am not an IP lawyer, but this is my understanding of the GPL and its transitive relationship with bodies of work that aren't GPL'd. BSD3 doesn't really state anything about what it links with, but the GPL injects itself into the tree of stuff it's linked with via the derivative works clause. The consequence is that the entire derivative work becomes GPL'd as well, and those distributing the derivative work must adhere to the terms of the GPL for distribution and provide source.

This is correct. If you create a derivative work of BSD3 and GPL code (eg, a binary), it's covered by both the BSD3 and GPL. The GPL requires you to provide source code for that binary.

> This is somewhat in the tradition of commercial middleware requiring a royalty fee or some per-installation licensing when a work is distributed that uses a particular library with such terms. In other words transitive licensing properties are NOT unique to the GPL.

EVERY license propagates down the dependency chain. If a PublicDomain application depends on a BSD3 library, then compiled versions of that application must obey the BSD license's copyright, endorsement, and warranty clauses. It's disingenuous to suggest this only matters for GPL code.

> At least that's always been my understanding. A BSD3 library in isolation may still remain BSD3, but if it's not useful without being linked to a GPL'd library, then the point is kind of moot, except that someone is free to implement a replacement for the GPL'd part to avoid the transitive properties in the derivative work, in much the same way you could implement a free version of a commercial library (barring patent or other violations) to avoid transitive properties of the commercial license.
Licensing is a property of the code, not the package; Cabal's licensing field is only a useful shorthand for "most of the code here is covered by this license". Many people write BSD3 code that depends on GPL libraries, because they either 1) believe their code is not important enough to protect (eg, one-off libs) or 2) want to encourage commercial forks (eg, during exploratory design).

Sorry to sound so pedantic about this, but I often field emails along the lines of: "Hi! I really like some library and want to use it for a program I'm writing. But because it's GPL I can't (the program is BSD). Can you please relicense your library so BSD code can use it?" And then I have to email them back, and explain, and reassure them that the GPL is not actually the green-eyed bogeyman portrayed by some tabloids. At one point, I actually had to offer to sign/mail a physical letter just to convince some guy I wasn't trying to trick him.
Re: [Haskell-cafe] GPL License of H-Matrix and prelude numeric
On Tue, Jan 25, 2011 at 21:51, Chris Smith cdsm...@gmail.com wrote:

> On Tue, 2011-01-25 at 21:41 -0800, John Millikin wrote:
>> Licensing is a property of the code, not the package; Cabal's licensing field is only a useful shorthand for "most of the code here is covered by this license".
>
> That would be a very dangerous position to take. When the Cabal license field informs someone that something is licensed under the BSD, I think any reasonable person would assume that means ALL of the code is licensed under the BSD, and without added restrictions. We're being extremely unhelpful if developers have to read through every source file in every library they use, and all of its indirect dependencies, to make sure there's not an additional restriction in there somewhere.

I don't think a reasonable person would assume that. Given the almost universal habit of noting the license in source file comment headers, a reasonable programmer would know to check the status of any code he wants to copy into his own works.

For example, if I copy some BSD3 code into one of my GPL'd libraries, that code remains BSD3 (and owned by the original author). The Cabal field will still say GPL, because most of the code is GPL, but some of it is not. Alternatively, I could bundle a GPL'd test script with BSD3 code. The code itself (and hence the Cabal file) is BSD3, but not everything in the archive is.

The package's dependencies are irrelevant, unless the package's code was itself copied from one of its deps.
Re: [Haskell-cafe] GPL License of H-Matrix and prelude numeric
On Tue, Jan 25, 2011 at 22:07, Ivan Lazar Miljenovic ivan.miljeno...@gmail.com wrote:

> Voila: http://www.gnu.org/licenses/gpl-faq.html#IfLibraryIsGPL (Note: in the past they said otherwise.)

Important: "or a GPL-compatible license". BSD3, MIT, PublicDomain, Apache, etc, are all GPL-compatible. The only GPL-incompatible licenses Cabal supports are BSD4 and AllRightsReserved, which are not used by anything on Hackage.
Re: [Haskell-cafe] GPL License of H-Matrix and prelude numeric
On Tue, Jan 25, 2011 at 22:14, Ivan Lazar Miljenovic ivan.miljeno...@gmail.com wrote:

> However, my understanding is that this property is then transitive: if Foo is GPL, Bar depends on Foo and Baz depends on Bar, then Baz must also be released under a GPL-compatible license.

It's not really a "must", just a matter of practicality. If you compile/link together code with incompatible licenses (BSD4 + GPL, GPL2-only + GPL3-only) then the resulting binary can't be legally distributed at all (because doing so would violate at least one of the licenses). You can still license the source code however you want, and distribute it; the problem is only for binaries.
Re: [Haskell-cafe] GPL License of H-Matrix and prelude numeric
On Tue, Jan 25, 2011 at 22:20, Chris Smith cdsm...@gmail.com wrote:

> On Tue, 2011-01-25 at 21:48 -0800, John Millikin wrote:
>> Please cite where the FSF claims the GPL applies to unrelated works just because they can compile against GPL'd code. Keep in mind that if your claim is correct, then every significant Linux and BSD distro is committing massive copyright infringement.
>
> I'm not sure what you're doing here. You yourself just wrote in another email: "If you create a derivative work of BSD3 and GPL code (eg, a binary), it's covered by both the BSD3 and GPL." In practice, since the BSD3 contains no terms that aren't also contained in the GPL, that means it's covered by the GPL.

The specific claim I'm refuting is that if some library or application depends on GPL'd code, that library/application must itself be GPL-licensed. This claim is simply not true. The GPL only applies to derived works, such as binaries or copied code.

> If you're actually confused, I'd be happy to compose a longer response with more details. But since you've just said yourself the same thing I was saying, I feel rather as though running off to collect quotations from the FSF web site would be a wild goose chase and not much worth the time. What part of this do you think needs support? Is it the definition of derived work to include linking? Something else?

I think you're getting mixed up between a derived work and a dependent library/application. Using the following code as a guide: the library hmatrix is GPL, the dependent library matrixTools is X11, the dependent application matrixCalc is BSD3, and the derived work (the binary) is all three. Just because some of the packages depend on a GPL'd library doesn't mean *they* are GPL'd. They can be under whatever license the creator wants them to be.
import Data.List

data License = GPL | BSD3 | X11 | PublicDomain
    deriving (Show, Eq)

data Library = Library String License [Library] -- name, license, deps
    deriving (Show, Eq)

data Application = Application String License [Library] -- name, license, deps
    deriving (Show, Eq)

data Binary = Binary String [License] -- name, licenses
    deriving (Show, Eq)

compile :: Application -> Binary
compile (Application name license deps) = Binary ("bin-" ++ name) allLicenses where
    allLicenses = nub (license : concatMap libLicenses deps)
    libLicenses (Library _ l libdeps) = l : concatMap libLicenses libdeps

main = do
    let base = Library "base" BSD3 []

        -- hmatrix depends on base
        hmatrix = Library "hmatrix" GPL [base]

        -- matrixTools depends on hmatrix
        matrixTools = Library "matrix-tools" X11 [hmatrix]

        -- matrixCalc depends on base, hmatrix, and matrixTools
        matrixCalc = Application "matrix-calc" BSD3 [matrixTools, hmatrix, base]

        -- the compiled binary is a derived work of all four
        binary = compile matrixCalc
    print binary
Re: [Haskell-cafe] GPL License of H-Matrix and prelude numeric
On Tue, Jan 25, 2011 at 22:52, Chris Smith cdsm...@gmail.com wrote:

> On Tue, 2011-01-25 at 22:34 -0800, John Millikin wrote:
>> The specific claim I'm refuting is that if some library or application depends on GPL'd code, that library/application must itself be GPL-licensed. This claim is simply not true. The GPL only applies to derived works, such as binaries or copied code.
>
> Well, binaries (among other things) are pretty much exactly what's at issue here. I don't think anyone disputes that you can copy and paste sections of BSD3 licensed source code into a new project, but that wasn't the point brought up. If you actually install the thing from Hackage, you build a binary, which links in code from the GPLed library, and distributing the result is covered by the terms of the GPL.

It's not possible for a .cabal file to specify which license the final binaries will use -- it depends on which libraries are locally installed, which flags the build uses, and what the executables themselves link. The best Cabal could do is, after finishing "cabal build" on an executable, print out which licenses apply:

$ cabal build some-pkg --flags=enable-some-exc
...
Executable 'some-exc' has licenses: BSD3 GPL3 PublicDomain
Executable 'some-other-exc' has licenses: BSD3 GPL3
** Note: 'some-other-exc' links against external system libraries.
   Additional licenses may apply.
Executable 'another-exc' has licenses: MIT PublicDomain
...
$

Since it's impossible for Cabal/Hackage to work as you describe, it's only sensible to interpret the license field as applying to the source code archive.
Re: [Haskell-cafe] Adding a builder to the bytestring package?
Patch done and sent to the bytestring maintainers. For the interested, here's the benchmark chart for binary, cereal, and blaze-builder/bytestring: http://i.imgur.com/xw3TL.png

On Wed, Jan 19, 2011 at 15:30, Johan Tibell johan.tib...@gmail.com wrote:

> On Thu, Jan 20, 2011 at 12:16 AM, John Millikin jmilli...@gmail.com wrote:
>> blaze-builder already implements the binary builder interface, minus the putWord* functions. I think those would be trivial to reimplement on top of Write. Since it sounds like everyone agrees with / has already thought of moving Builder into bytestring, I'll start poking at a patch. Who is the current patch-reviewer for binary and bytestring?
>
> I'd suggest addressing the patch to Don Stewart, Duncan Coutts, and Lennart Kolmodin.
>
> Johan
Re: [Haskell-cafe] Adding a builder to the bytestring package?
No units -- I generated the chart with Progression, which by default normalises the data so the first library (here, binary) in each benchmark is 1.0. It can also generate absolute-time charts:

Runtime in seconds, grouped by benchmark: http://i.imgur.com/f0EOa.png
Runtime in seconds, grouped by library: http://i.imgur.com/PXW97.png

Benchmark source files attached, if you'd like to poke at them.

On Sun, Jan 23, 2011 at 17:21, Conrad Parker con...@metadecks.org wrote:

> On 24 January 2011 07:29, John Millikin jmilli...@gmail.com wrote:
>> Patch done and sent to the bytestring maintainers. For the interested, here's the benchmark chart for binary, cereal, and blaze-builder/bytestring: http://i.imgur.com/xw3TL.png
>
> Can has units?
>
> Conrad.

module Main where

import Control.DeepSeq
import Data.Monoid
import Criterion.Types
import Progression.Config
import Progression.Main
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Binary.Builder as Binary

instance NFData B.ByteString

instance NFData BL.ByteString where
    rnf a = rnf (BL.toChunks a)

bytes_100 :: B.ByteString
bytes_100 = B.replicate 100 0x61

build_strict :: Int -> B.ByteString
build_strict n = BL.toStrict (Binary.toLazyByteString builder) where
    chunks = replicate n (Binary.fromByteString bytes_100)
    builder = foldr Binary.append Binary.empty chunks

build_lazy :: Int -> BL.ByteString
build_lazy n = Binary.toLazyByteString builder where
    chunks = replicate n (Binary.fromByteString bytes_100)
    builder = foldr Binary.append Binary.empty chunks

build_strict_mconcat :: Int -> B.ByteString
build_strict_mconcat n = BL.toStrict (Binary.toLazyByteString builder) where
    chunks = replicate n (Binary.fromByteString bytes_100)
    builder = mconcat chunks

build_lazy_mconcat :: Int -> BL.ByteString
build_lazy_mconcat n = Binary.toLazyByteString builder where
    chunks = replicate n (Binary.fromByteString bytes_100)
    builder = mconcat chunks

benchmarks :: [Benchmark]
benchmarks =
    [ bench "strict" (nf build_strict 1000)
    , bench "strict +mconcat" (nf build_strict_mconcat 1000)
    , bench "lazy" (nf build_lazy 1000)
    , bench "lazy +mconcat" (nf build_lazy_mconcat 1000)
    ]

main :: IO ()
main = defaultMainWith config (bgroup "all" benchmarks)

config = Config
    { cfgMode = Nothing
    , cfgRun = RunSettings [] (Just "01_binary")
    , cfgGraph = mempty
    }

module Main where

import Control.DeepSeq
import Data.Monoid
import Criterion.Types
import Progression.Config
import Progression.Main
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Blaze.ByteString.Builder as Blaze
import qualified Blaze.ByteString.Builder.Internal.Types as Blaze

instance NFData B.ByteString

instance NFData BL.ByteString where
    rnf a = rnf (BL.toChunks a)

bytes_100 :: B.ByteString
bytes_100 = B.replicate 100 0x61

builderEmpty :: Blaze.Builder
builderEmpty = Blaze.Builder id
{-# INLINE builderEmpty #-}

builderAppend :: Blaze.Builder -> Blaze.Builder -> Blaze.Builder
builderAppend (Blaze.Builder b1) (Blaze.Builder b2) = Blaze.Builder (b1 . b2)
{-# INLINE builderAppend #-}

builderConcat :: [Blaze.Builder] -> Blaze.Builder
builderConcat = foldr builderAppend builderEmpty
{-# INLINE builderConcat #-}

build_strict :: Int -> B.ByteString
build_strict n = Blaze.toByteString builder where
    chunks = replicate n (Blaze.fromByteString bytes_100)
    builder = builderConcat chunks

build_lazy :: Int -> BL.ByteString
build_lazy n = Blaze.toLazyByteString builder where
    chunks = replicate n (Blaze.fromByteString bytes_100)
    builder = builderConcat chunks

build_strict_mconcat :: Int -> B.ByteString
build_strict_mconcat n = Blaze.toByteString builder where
    chunks = replicate n (Blaze.fromByteString bytes_100)
    builder = mconcat chunks

build_lazy_mconcat :: Int -> BL.ByteString
build_lazy_mconcat n = Blaze.toLazyByteString builder where
    chunks = replicate n (Blaze.fromByteString bytes_100)
    builder = mconcat chunks

benchmarks :: [Benchmark]
benchmarks =
    [ bench "strict" (nf build_strict 1000)
    , bench "strict +mconcat" (nf build_strict_mconcat 1000)
    , bench "lazy" (nf build_lazy 1000)
    , bench "lazy +mconcat" (nf build_lazy_mconcat 1000)
    ]

main :: IO ()
main = defaultMainWith config (bgroup "all" benchmarks)

config = Config
    { cfgMode = Nothing
    , cfgRun = RunSettings [] (Just "03_blaze")
    , cfgGraph = mempty
    }

module Main where

import Control.DeepSeq
import Data.Monoid
import Criterion.Types
import Progression.Config
import Progression.Main
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Serialize.Builder as Cereal

instance NFData B.ByteString

instance NFData BL.ByteString where
    rnf a = rnf (BL.toChunks a)

bytes_100 :: B.ByteString
bytes_100 = B.replicate 100 0x61

build_strict :: Int -> B.ByteString
build_strict n = Cereal.toByteString builder where
    chunks = replicate n (Cereal.fromByteString bytes_100)
    builder = foldr Cereal.append Cereal.empty chunks

build_lazy
[Haskell-cafe] Adding a builder to the bytestring package?
Most people who work with binary data have had to construct bytestrings at some point. The most common solution is to use a Builder, a monoid representing how to construct a bytestring. There are currently three packages (that I know of) which include builder implementations: binary, cereal, and blaze-builder. However, all of these libraries have additional dependencies beyond just bytestring. All three depend on array and containers, and blaze-builder additionally depends on text (and thus deepseq). Since the current implementation of GHC uses static linking, every additional dependency adds to the final size of a binary.

Obviously the Builder concept is very useful, as it has been implemented at least three times. How about adding it to the bytestring package itself? We could have a module Data.ByteString.Builder, with functions (at minimum):

toByteString :: Builder -> Data.ByteString.ByteString
toLazyByteString :: Builder -> Data.ByteString.Lazy.ByteString
fromByteString :: Data.ByteString.ByteString -> Builder
fromLazyByteString :: Data.ByteString.Lazy.ByteString -> Builder
empty :: Builder
append :: Builder -> Builder -> Builder

Plus whatever implementation details might be useful to expose. Existing libraries could then add their extra features (word -> Builder functions for binary and cereal, UTF/HTTP for blaze-builder) on top of the existing types.

Is this something the community is interested in? Is there any work currently aimed at this goal?
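For illustration, the core of such a Builder is just a difference list of chunks. This String-based sketch is hypothetical -- a real implementation would fill mutable buffers and produce a lazy ByteString -- but it shows why empty and append compose in O(1):

```haskell
-- Hypothetical String-based sketch of the proposed interface.
-- A Builder is a function that prepends its output onto whatever
-- comes after it, so append is just function composition.
newtype Builder = Builder (String -> String)

empty :: Builder
empty = Builder id

append :: Builder -> Builder -> Builder
append (Builder f) (Builder g) = Builder (f . g)

fromString :: String -> Builder
fromString s = Builder (s ++)

toString :: Builder -> String
toString (Builder f) = f ""

main :: IO ()
main = putStrLn (toString (fromString "foo" `append` (fromString "bar" `append` empty)))
-- prints: foobar
```

Because append never inspects the accumulated output, deeply nested appends cost nothing until toString runs -- the same property that makes the binary, cereal, and blaze-builder builders efficient.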
Re: [Haskell-cafe] Adding a builder to the bytestring package?
On Wed, Jan 19, 2011 at 12:04, Johan Tibell johan.tib...@gmail.com wrote:

> On Wed, Jan 19, 2011 at 8:37 PM, Michael Snoyman mich...@snoyman.com wrote:
>> Isn't Simon Meier working on migrating his code from blaze-builder into binary?
>
> So I heard (although not directly from Simon). I think it would be nice to port the blaze-builder implementation to binary, but to keep binary's current interface (for now).

From the perspective of a library user, if there's to be code shuffling, I'd much rather have it be one-time (blaze-builder -> bytestring) than having multiple merges/ports going on. Especially since building bytestrings is a much more generic operation than binary serialisation.

Regarding the interface, I think that as long as the same *basic* operations are available, it's fine to have extra operations as well. Several of blaze-builder's special-case functions (toByteStringIO, fromWriteList) allow more efficient operation than the generic interface.

>> I agree with John that it would make more sense to go in bytestring. Assuming that happens, would the builder from text end up being based on it?
>
> ByteString and Text don't share an underlying data structure at the moment (one uses pinned ForeignPtrs and one unpinned ByteArray#s), so they can't use the same builder efficiently. Some day, perhaps.

Can any of the blaze-builder optimizations be translated to the Text builder? When I benchmark it against binary and cereal, blaze-builder is approximately 2-3 times faster for most use cases.
Re: [Haskell-cafe] Adding a builder to the bytestring package?
On Wed, Jan 19, 2011 at 14:06, Johan Tibell johan.tib...@gmail.com wrote:

> On Wed, Jan 19, 2011 at 10:30 PM, Michael Snoyman mich...@snoyman.com wrote:
>> What's the advantage to moving it into binary as opposed to bytestring?
>
> To test that the implementation can indeed be ported to that interface. We could of course skip that step if we want to.

blaze-builder already implements the binary builder interface, minus the putWord* functions. I think those would be trivial to reimplement on top of Write. Since it sounds like everyone agrees with / has already thought of moving Builder into bytestring, I'll start poking at a patch. Who is the current patch-reviewer for binary and bytestring?
Re: [Haskell-cafe] Monad transformer: apply StateT to List monad
Lifting 'f' into StateT -- you get a list of (result, state) pairs. Since the state is never modified, the second half of each pair is identical:

--
import Control.Monad.State

f :: Int -> [Int]
f n = [0..n]

-- lifting 'f' into State; I use 'Char' for the state so you
-- can see which param it is
liftedF :: Int -> StateT Char [] Int
liftedF n = lift (f n)

-- prints [(0,'a'),(1,'a'),(2,'a'),(3,'a'),(4,'a')]
--
-- 4 is 'n', 'a' is the state
main = print (runStateT (liftedF 4) 'a')
--

Lifting 'tick' into ListT -- you get a single pair; the first half is a list with one value, which is whatever 'tick' returned:

--
import Control.Monad.List

type GeneratorState = State Int

tick :: GeneratorState Int
tick = do
    n <- get
    put (n + 1)
    return n

liftedTick :: ListT GeneratorState Int
liftedTick = lift tick

-- prints ([4],5)
--
-- 4 is the initial state, 5 is the final state
main = print (runState (runListT liftedTick) 4)
--

Generally, monad transformers aren't used to add new functionality to existing monadic computations. Instead, they're used with a generic "Monad m =>" (or similar) constraint, and modify how that generic result is returned. For example, a modified version of 'tick' can have any monad (including lists) applied to it:

--
tick :: Monad m => StateT Int m Int
tick = do
    n <- get
    put (n + 1)
    return n

-- prints [(0,1),(1,2),(2,3)]
main = print ([0,1,2] >>= runStateT tick)
--

On Thu, Jan 13, 2011 at 16:38, michael rice nowg...@yahoo.com wrote:

> Hi Daniel,
>
> What I need to see is a function, say g, that lifts the function f (in the List monad) into the StateT monad, applies it to the monad's value, say 1, and returns a result [0,1]. Or, alternatively, code that lifts a function in the State monad, say tick
>
> import Control.Monad.State
>
> type GeneratorState = State Int
>
> tick :: GeneratorState Int
> tick = do
>     n <- get
>     put (n+1)
>     return n
>
> into the ListT monad and applies it to a list, say lst = [0,1,2], producing [(0,1),(1,2),(2,3)]. Both would be very helpful.
> Or maybe I'm missing the concept of monad transformers altogether and putting them together improperly, like trying to use a spreadsheet to write a letter?
>
> Michael
Re: [Haskell-cafe] haskell-2010 binary IO
Haskell supports binary IO via openBinaryFile, hGetBuf, and hPutBuf. Advanced types like ByteString or Binary are not part of Haskell 2010, I assume because they're too complex to be part of the language standard.

On Thu, Dec 9, 2010 at 23:14, Permjacov Evgeniy permea...@gmail.com wrote:

> Does haskell 2010 include binary IO? If no, what was the reason?
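A minimal round-trip using just those standard System.IO primitives looks like this (the file name "out.bin" is arbitrary):

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Storable (peekByteOff, pokeByteOff)
import System.IO

main :: IO ()
main = do
    -- write four raw bytes with hPutBuf
    allocaBytes 4 $ \buf -> do
        mapM_ (\(i, b) -> pokeByteOff buf i (b :: Word8)) (zip [0 ..] [1, 2, 3, 4])
        h <- openBinaryFile "out.bin" WriteMode
        hPutBuf h buf 4
        hClose h
    -- read them back with hGetBuf
    allocaBytes 4 $ \buf -> do
        h <- openBinaryFile "out.bin" ReadMode
        n <- hGetBuf h buf 4
        hClose h
        bytes <- mapM (\i -> peekByteOff buf i :: IO Word8) [0 .. n - 1]
        print bytes  -- prints [1,2,3,4]
```

This is clearly much clumsier than ByteString's hGet/hPut, which is why everyone uses the library types in practice even though they are outside the standard.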
Re: [Haskell-cafe] A home-brew iteration-alike library: some extension questions
On Thu, Dec 9, 2010 at 12:43, Michael Snoyman mich...@snoyman.com wrote: For the record, enumerator (and I believe iteratee as well) uses transformers, not mtl. transformers itself is Haskell98; all FunDep code is separated out to monads-fd. Michael iteratee also uses 'transformers', but requires several extensions; see http://hackage.haskell.org/packages/archive/iteratee/0.6.0.1/doc/html/src/Data-Iteratee-Base.html It seems silly to avoid extensions, though; every non-trivial package on Hackage depends on them, either directly or via a dependency. For example, though 'enumerator' requires no extensions itself, it depends on both 'text' and 'bytestring', which require a ton of them.
Re: [Haskell-cafe] Strange error when using Attoparsec and Enumerator
I swear attoparsec-enumerator is going to give me grey hair; the error you're receiving is because iterParser itself is divergent. Fixed in 0.2.0.3, with my sincere apologies. On Sun, Dec 5, 2010 at 09:14, Crutcher Dunnavant crutc...@gmail.com wrote: I have spent a good chunk of the past week tracing code, trying to solve this problem. I'm seeing an error when using Enumerator and Attoparsec that I can't explain. This is a reduced form of the problem. In general, I've observed that debugging broken iteratees and enumerators is very hard. We probably want some tooling around that; I'm looking at an identity enumeratee with Debug.Trace shoved in, or something like that, not sure yet what would help.

    {- Haskell 2010, ghc 6.12.3
       array-0.3.0.2
       attoparsec-0.8.2.0
       attoparsec-enumerator-0.2.0.2
       bytestring-0.9.1.7
       containers-0.4.0.0
       deepseq-1.1.0.2
       enumerator-0.4.2
       text-0.10.0.0
       transformers-0.2.2.0
    -}
    import Control.Applicative ((<|>))
    import qualified Data.Attoparsec.Char8 as AP
    import qualified Data.Attoparsec.Combinator as APC
    import qualified Data.Attoparsec.Enumerator as APE
    import qualified Data.ByteString.Char8 as B
    import qualified Data.Enumerator as E
    import Data.Enumerator (($$))
    import System.IO as IO

    parseLine :: AP.Parser B.ByteString
    parseLine = do
        AP.char '+'
        return . B.pack =<< APC.manyTill AP.anyChar endOfLineOrInput

    endOfLineOrInput :: AP.Parser ()
    endOfLineOrInput = AP.endOfInput <|> AP.endOfLine

    pp :: Show a => AP.Parser a -> String -> IO ()
    pp p s = do
        result <- E.run $ E.enumList 1 [ B.pack s ]
            $$ E.sequence (APE.iterParser p)
            $$ E.printChunks False
        case result of
            (Right _) -> return ()
            (Left e)  -> IO.hPutStrLn stderr $ show e

    main = pp parseLine "+OK"

    {- Observed output:
       ["OK"]
       *** Exception: enumEOF: divergent iteratee

       Problems with this:
       1) I didn't write an iteratee, enumerator, or enumeratee in this code. Something's wrong.
2) If the parser is divergent, I _should_ be getting the error message: iterParser: divergent parser -} -- Crutcher Dunnavant crutc...@gmail.com
Re: [Haskell-cafe] How to put a string into Data.Binary.Put
Use one of the Char8 modules, depending on whether you want a strict or lazy bytestring:

    import qualified Data.ByteString.Lazy.Char8 as BS

    message :: BS.ByteString
    message = BS.pack "SOME STRING"

See the docs at: http://hackage.haskell.org/packages/archive/bytestring/0.9.1.7/doc/html/Data-ByteString-Char8.html http://hackage.haskell.org/packages/archive/bytestring/0.9.1.7/doc/html/Data-ByteString-Lazy-Char8.html Mapping over putWord8 is much slower than putting a single bytestring; if you want to put a string, pack it first:

    putString :: String -> Put
    putString str = putLazyByteString (BS.pack str)

    -- alternative: probably faster
    import qualified Data.ByteString.Char8 as B

    putString :: String -> Put
    putString str = putByteString (B.pack str)

On Sat, Nov 6, 2010 at 05:30, C K Kashyap ckkash...@gmail.com wrote: Hi, I was trying to put a String in a ByteString

    import qualified Data.ByteString.Lazy as BS

    message :: BS.ByteString
    message = runPut $ do
        let string = "SOME STRING"
        map (putWord8 . fromIntegral . ord) string -- this of course generates [Put]

How can I convert the list of Puts such that it could be used in the Put monad? For now I used the workaround of first converting the string to ByteString like this --

    stringToByteString :: String -> BS.ByteString
    stringToByteString str = BS.pack (map (fromIntegral . ord) str)

and then using putLazyByteString inside the Put monad. -- Regards, Kashyap
Re: [Haskell-cafe] Edit Hackage
On Thu, Oct 28, 2010 at 12:34, Andrew Coppin andrewcop...@btinternet.com wrote: Today I uploaded a package to Hackage, and rediscovered something that you already know: I'm an idiot. More specifically, I copied the Cabal description from another package and then updated all the fields. Except that I forgot to update one. And now I have a package which I've erroneously placed in completely the wrong category. Don't worry; of all the various ways to screw up a Hackage upload, setting the wrong category is just about the least important. Wait until you've got a few dozen packages in the air, and your hair will turn grey ;) Is there any danger that at some point in the future, it might be possible to edit a package /after/ it has been released on Hackage? There are several reasons why you might wish to do this (beyond realising you did something wrong five minutes after hitting the upload button): Sadly, the current Hackage maintainers follow the "immutability is good" school of design. A few aspects of packages can be modified, but most (those contained in the .cabal file) cannot. The maintainer might change. The homepage might move. Both of these are handled by uploading a package with a _._._.Z version number; in general, package version numbers are:

    X.X.Y.Z

    X.X - the package's major version. Bump this when there's a
          backwards-incompatible change (e.g., dependent packages might break)
    Y   - minor version. Bump this when the package changes, but in a
          backwards-compatible way.
    Z   - patch version. Bump this when you just change something that doesn't
          affect the code itself (comments, documentation, cabal properties)

A better package might come along, making the existing one obsolete. The only way to mark packages as obsolete, as far as I know, is to email a Hackage administrator. Or you might want to stick a message on the package saying "hey, this version has a serious bug, please use the next version up instead!" This would be useful; maybe a feature for Hackage 2?
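As a worked illustration of that numbering scheme (these version numbers and changes are invented, not from any real package):

```
1.4.0.2 -> 1.4.0.3   fixed a typo in the haddocks        (patch:  Z bump)
1.4.0.3 -> 1.4.1.0   added a new exported function       (minor:  Y bump)
1.4.1.0 -> 2.0.0.0   renamed a module, dependents break  (major:  X.X bump)
```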
Re: Re[2]: [Haskell-cafe] Big Arrays
On Mon, Oct 4, 2010 at 01:51, Bulat Ziganshin bulat.zigans...@gmail.com wrote: Hello John, Monday, October 4, 2010, 7:57:13 AM, you wrote: Sure it does; a 32-bit system can address much more than 2**30 elements. Artificially limiting how much memory can be allocated by depending on a poorly-specced type like 'Int' is a poor design decision in Haskell and GHC. Do you understand that this "poor design decision" makes array access several times faster, and doesn't limit anything except for very rare huge Bool arrays? I don't see how using 'Int' instead of 'Word' makes array access several times faster. Could you elaborate on that? The important limited use case is an array of Word8 -- by using 'Int', byte buffers are artificially limited to half of their potential maximum size.
Re: [Haskell-cafe] Big Arrays
On Sun, Oct 3, 2010 at 19:09, Bryan O'Sullivan b...@serpentine.com wrote: On Sun, Oct 3, 2010 at 11:54 AM, Henry Laxen nadine.and.he...@pobox.com wrote: I am trying to create a (relatively) big array, but it seems I cannot make anything larger than 2^30 or so. Here is the code: Use a 64-bit machine, where Int is 64 bits wide. Trying to create a larger array on a 32-bit machine doesn't make any sense. Sure it does; a 32-bit system can address much more than 2**30 elements. Artificially limiting how much memory can be allocated by depending on a poorly-specced type like 'Int' is a poor design decision in Haskell and GHC. OP: for this particular use case (unboxed Word64), an easy solution is to have some structure like

    data BigArray a = BigArray (UArray Word32 a)
                               (UArray Word32 a)
                               (UArray Word32 a)
                               (UArray Word32 a)

with each array containing 2^30 elements. You'll need to write custom indexing and modification functions, to process the index and pass it to the appropriate array. Something like:

    idxBig :: BigArray a -> Word32 -> (UArray Word32 a, Word32)
    idxBig (BigArray a0 a1 a2 a3) i
        | i < 2^30        = (a0, i)
        | i < 2^31        = (a1, i - 2^30)
        | i < 2^30 + 2^31 = (a2, i - 2^31)
        | otherwise       = (a3, i - 2^31 - 2^30)

Then wrap the existing array functions:

    idx :: BigArray a -> Word32 -> a
    idx arr i = let (arr', i') = idxBig arr i in arr' ! i'
Re: [Haskell-cafe] Why can't libraries/frameworks like wxHaskell/gtk2hs/... be used with newer versions of ghc/wxWidgets/GTK+/... ?
On Mon, Sep 27, 2010 at 10:55, cas...@istar.ca wrote: Why can't libraries/frameworks like wxHaskell/gtk2hs/... be used with newer versions of ghc/wxWidgets/GTK+/... ? Haskell libraries statically link many parts of the Haskell runtime; you can't combine two libraries compiled with different versions of GHC. Any bindings (like wxHaskell or gtk2hs) should work fine with new versions of their original libraries, assuming upstream has maintained backwards compatibility.
Re: [Haskell-cafe] ANN: ieee version 0.7
On Mon, Sep 20, 2010 at 22:11, Conrad Parker con...@metadecks.org wrote: I've been using unsafeCoerce:

    getFloat64be :: Get Double
    getFloat64be = do
        n <- getWord64be
        return (unsafeCoerce n :: Double)

    putFloat64be :: Double -> Put
    putFloat64be n = putWord64be (unsafeCoerce n :: Word64)

but only tested it with quickcheck -- it passes about 10^7 checks, comparing roundtrips in combination with the previous data-binary-ieee754 versions. However could that sometimes behave incorrectly? QuickCheck only generates a subset of possible floating point values; when I tested unsafeCoerce, it sometimes gave incorrect results when dealing with edge cases like signaling NaNs. Should the d-b-ieee754-0.4.2 versions with castPtr etc. be even faster? It should be slightly slower, but not nearly as slow as the bitfield-based parsing. On Tue, Sep 21, 2010 at 07:10, Daniel Fischer daniel.is.fisc...@web.de wrote: And I'd expect it to be a heck of a lot faster than the previous implementation. Have you done any benchmarks? Only very rough ones -- a few basic Criterion checks, but nothing extensive. Numbers for put/get of 64-bit big-endian:

                          getWord    getFloat    putWord    putFloat
    Bitfields (0.4.1)       59 ns     8385 ns    1840 ns    11448 ns
    poke/peek (0.4.2)       59 ns      305 ns    1840 ns      744 ns
    unsafeCoerce            59 ns       61 ns    1840 ns      642 ns

Note: I don't know why the cast-based versions can put a Double faster than a Word64; Float is (as expected) slower than Word32. Some special-case GHC optimization? One problem I see with both unsafeCoerce and poke/peek is endianness. Will the bit-pattern of a double be interpreted as the same uint64_t on little-endian and on big-endian machines? In other words, is the byte order for doubles endianness-dependent too? If yes, that's fine; if no, it would break between machines of different endianness. Endianness only matters when marshaling bytes into a single value -- Data.Binary.Get/Put handles that. Once the data is encoded as a Word, endianness is no longer relevant.
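The kind of roundtrip property discussed here might look like the sketch below (a self-contained version using the unsafeCoerce definitions from the quoted code; note that (==) is False for NaN, so the property silently skips exactly the bit-pattern edge cases where unsafeCoerce was observed to misbehave):

```haskell
import Data.Binary.Get (Get, getWord64be, runGet)
import Data.Binary.Put (Put, putWord64be, runPut)
import Data.Word (Word64)
import Test.QuickCheck
import Unsafe.Coerce (unsafeCoerce)

getFloat64be :: Get Double
getFloat64be = fmap unsafeCoerce getWord64be

putFloat64be :: Double -> Put
putFloat64be d = putWord64be (unsafeCoerce d :: Word64)

-- QuickCheck's default Double generator rarely (if ever) produces
-- NaNs or denormals, so passing this says little about those cases
prop_roundtrip :: Double -> Bool
prop_roundtrip d = runGet getFloat64be (runPut (putFloat64be d)) == d

main :: IO ()
main = quickCheck prop_roundtrip
```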
Re: [Haskell-cafe] ANN: ieee version 0.7
On Tue, Sep 21, 2010 at 12:08, Daniel Fischer daniel.is.fisc...@web.de wrote: Endianness only matters when marshaling bytes into a single value -- Data.Binary.Get/Put handles that. Once the data is encoded as a Word, endianness is no longer relevant. I mean, take e.g. 2^62 :: Word64. If you poke that to memory, on a big-endian machine, you'd get the byte sequence 40 00 00 00 00 00 00 00 while on a little-endian, you'd get 00 00 00 00 00 00 00 40, right? If both bit-patterns are interpreted the same as doubles, sign-bit = 0, exponent-bits = 0x400 = 1024, mantissa = 0, thus yielding 1.0*2^(1024 - 1023) = 2.0, fine. But if on a little-endian machine, the floating point handling is not little-endian and the number is interpreted as sign-bit = 0, exponent-bits = 0, mantissa = 0x40, hence (1 + 2^(-46))*2^(-1023), havoc. I simply didn't know whether that could happen. According to http://en.wikipedia.org/wiki/Endianness#Floating-point_and_endianness it could. On the other hand, no standard for transferring floating point values has been made. This means that floating point data written on one machine may not be readable on another, so if it breaks on weird machines, it's at least a general problem (and not Haskell's). Oh, I misunderstood the question -- you're asking about architectures on which floating-point and fixed-point numbers use a different endianness? I don't think it's worth worrying about, unless you want to use Haskell for number crunching on a PDP-11. If you do need to implement IEEE754 parsing for unusual endians (like 3-4-1-2), parse the word yourself and then use 'wordToFloat' and friends to convert it.
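If one ever did need that, the suggested approach might look like this sketch (the "3-4-1-2" byte layout chosen here is hypothetical; wordToFloat is from data-binary-ieee754):

```haskell
import Data.Binary.Get (Get, getWord8)
import Data.Binary.IEEE754 (wordToFloat)
import Data.Bits (shiftL, (.|.))
import Data.Word (Word32)

-- assemble the Word32 by hand for a hypothetical middle-endian
-- ("3-4-1-2") layout, then reinterpret the bits as a Float
getFloat32Mixed :: Get Float
getFloat32Mixed = do
    b0 <- getWord8
    b1 <- getWord8
    b2 <- getWord8
    b3 <- getWord8
    let w =     (fromIntegral b2 :: Word32) `shiftL` 24
            .|. fromIntegral b3 `shiftL` 16
            .|. fromIntegral b0 `shiftL` 8
            .|. fromIntegral b1
    return (wordToFloat w)
```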
Re: [Haskell-cafe] ANN: ieee version 0.7
On Mon, Sep 20, 2010 at 03:22, Daniel Fischer daniel.is.fisc...@web.de wrote: unsafeCoerce is not supposed to work for casts between Integral and Floating types. If you try to unsafeCoerce# between unboxed types, say Double# and Word64#, you're likely to get a compile failure (ghc panic). If you unsafeCoerce between the boxed types, it will probably work, but there are no guarantees. There's a feature request for unboxed coercion (i.e. reinterpretation of the bit-pattern): http://hackage.haskell.org/trac/ghc/ticket/4092 Interesting -- in that bug report, Simon Mar says that converting the value using pointers will work correctly. I've changed d-b-ieee754 over to use this method (v 0.4.2); the tests are still passing, so I'll call it success.
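The pointer-based conversion in question looks roughly like the following (a sketch; data-binary-ieee754 0.4.2's actual implementation may differ in detail):

```haskell
import Data.Word (Word64)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (castPtr)
import Foreign.Storable (peek, poke)
import System.IO.Unsafe (unsafePerformIO)

-- write the Double into a temporary buffer, then read the same bytes
-- back as a Word64; this preserves the exact bit pattern (including NaNs)
doubleToWord64 :: Double -> Word64
doubleToWord64 d = unsafePerformIO $ alloca $ \ptr -> do
    poke ptr d
    peek (castPtr ptr)
```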
Re: [Haskell-cafe] ANN: ieee version 0.7
On Sun, Sep 19, 2010 at 20:16, Conrad Parker con...@metadecks.org wrote: Anyway, good work. Does this have any overlap with data-binary-ieee754? There was some recent discussion here about the encoding speed in that package. I should probably make it more clear that data-binary-ieee754 is for special use cases; for most people, using something like this will be much faster since it doesn't have to poke around the individual bits:

    putFloat32be :: Float -> Put
    putFloat32be = putWord32be . unsafeCoerce

I needed real IEEE754 binary support for round-trip parsing, where (for example) maintaining the particular bit pattern of a NaN is important. For 99% of people, the unsafe method will work fine.
Re: [Haskell-cafe] ANNOUNCE: text 0.9.0.0 and text-icu 0.4.0.0
On Sun, Sep 19, 2010 at 21:17, Bryan O'Sullivan b...@serpentine.com wrote: I've issued new releases of the text and text-icu packages, the fast, comprehensive Unicode text manipulation libraries. http://hackage.haskell.org/package/text http://hackage.haskell.org/package/text-icu What's new in text-0.9? All I see in darcs is a newtype'd param in the Foreign module.
Re: [Haskell-cafe] ANN: ieee version 0.7
On Sun, Sep 19, 2010 at 21:43, Patrick Perry patpe...@gmail.com wrote: I needed real IEEE754 binary support for round-trip parsing, where (for example) maintaining the particular bit pattern of a NaN is important. For 99% of people, the unsafe method will work fine. How does a C-style cast not preserve the bit pattern of a NaN? Again, sorry if this is a stupid question. It's not a stupid question, and I don't know the answer. But if you plug a C-style cast into the data-binary-ieee754 unit tests, some of them (the fiddly ones, like roundtripping -NaN) will fail. Presumably, this is due to some optimization deep in the bowels of GHC, but I don't understand even a fraction of what goes on in there. For what it's worth, d-b-ieee754 was the very first Haskell library I ever wrote -- and it shows. If anybody knows how to make unsafeCoerce (or equivalent) roundtrip-safe, I would love to rip out all the ugly and make it sane.
Re: [Haskell-cafe] Re: Fwd: Semantics of iteratees, enumerators, enumeratees?
On Mon, Sep 6, 2010 at 22:49, Ben midfi...@gmail.com wrote: Sorry to be late coming into this conversation. Something that has bothered me (which I have mentioned to John Lato privately) is that it is very easy to write non-compositional code due to the chunking. For example, there is a standard function

    map :: (a -> b) -> Enumeratee a b c

whose meaning I hope is clear: use the function to transform the type of a stream and pass it to an iteratee. However, last I checked, the versions provided in both the iteratee and enumerator packages fail to satisfy the equation

    map f (it1 >> it2) == (map f it1) >> (map f it2)

because of chunking, essentially. You can check this with f == id and it1 and it2 both head:

    let r = runIdentity . runIteratee

    runIdentity $ run $ enumList 10 [1..100] $ r $ joinI $ map id $ r (head >> head)
    -- Right (Just 2)

    runIdentity $ run $ enumList 10 [1..100] $ r $ joinI $ (map id $ r head) >> (map id $ r head)
    -- Right (Just 11)

It is possible to fix this behavior, but it complicates the obvious definitions a lot. Chunking doesn't have anything to do with this, and an iteratee encoding without input chunking would exhibit the same problem. You're running into an (annoying? dangerous?) subtlety in enumeratees. In the particular case of map/head, it's possible to construct an iteratee with the expected behavior by altering the definition of 'map'. However, if the composition is more complicated (like map/parse-json), this alteration becomes impossible. Remember that an enumeratee's return value contains two levels of extra input. The outer layer is from the enumeratee (map), while the inner is from the iteratee (head). The iteratee is allowed to consume an arbitrary amount of input before yielding, and depending on its purpose it might yield extra input from a previous stream. Perhaps the problem is that 'map' is the wrong name? It might make users expect that it composes horizontally rather than vertically.
Normally this incorrect interpretation would be caught by the type checker, but using (>>) allows the code to compile. Anyway, the correct way to encode @(map f it1) >> (map f it2)@, using the above style, is:

    (map id (r head) >>= returnI) >> (map id (r head) >>= returnI)

so the full expression becomes:

    runIdentity $ run $ enumList 10 [1..100] $ r $ (map id (r head) >>= returnI) >> (map id (r head) >>= returnI)

which ought to return the correct value (untested; I have no Haskell compiler on this computer).
Re: [Haskell-cafe] ANNOUNCE: Haddock version 2.8.0
On Fri, Sep 3, 2010 at 23:02, David Menendez d...@zednenem.com wrote: Yes, using foreign namespaces is one of the things recommended against when serving XHTML as text/html. This says nothing about documents following the recommendations in Appendix C. I'm not debating that it's *possible* to serve HTML with an XHTML mimetype and still see something rendered to the screen. Hundreds of thousands of sites do so every day. But to call this XHTML is absurd. I agree, if by absurd you mean consistent with the letter and spirit of the XHTML recommendation. Content served as text/html is not XHTML, any more than content served as text/plain or image/jpg is. *IF* XHTML could be served using text/html, then my example pages would render identically in browsers with XHTML support. Appendix C is a guideline on how to make the same byte content render *something* when treated as HTML or XHTML; it's intended as a low-fidelity fallback for user agents without support for XHTML (IE). It is *not* a means by which HTML may be labelled XHTML for the sake of buzzword compliance. You seem to be under the common misconception that XHTML is merely an alternative encoding of HTML. This is incorrect. XHTML has a different DOM, different CSS support, and different syntax. HTML and XHTML are like Java and C# -- beneath a superficial resemblance, distinct.
Re: [Haskell-cafe] ANNOUNCE: Haddock version 2.8.0
On Sat, Sep 4, 2010 at 14:46, Jeremy Shaw jer...@n-heptane.com wrote: So the choices are:

1. only focus on getting the xhtml 1.0 served as application/xml working correctly, and ie users get nothing.
2. create xhtml 1.0 that would work correctly if served as application/xml, but serve it as text/html, and ignore the fact that some stuff might not be rendering correctly when treated as text/html.
3. create xhtml documents which render correctly whether served as application/xml or text/html, but then only serve them as text/html anyway.
4. forget about how the xhtml documents render as application/xml, and only focus on how they render as text/html.

5. Do as my patch does; default to HTML 4 (supported by all browsers), and allow users to generate correct XHTML if they want/need to.
Re: [Haskell-cafe] Confused about ByteString, UTF8, Data.Text and sockets, still.
On Fri, Sep 3, 2010 at 05:04, JP Moresmau jpmores...@gmail.com wrote: I have replaced JSon by AttoJson (there was also JSONb, which seems quite similar), which allows me to work solely with ByteStrings, bypassing the calls to utf8-string completely. Performance has improved noticeably. I'm worried that I've lost full UTF8 compatibility, though, haven't I? No double byte characters will work in that setup? It should be easy enough to test; generate a file with non-ASCII characters in it and see if it's parsed correctly. I assume it will be, though you won't be able to perform String operations on the resulting decoded data unless you manually decode it. Slightly more worrisome is that AttoJson doesn't look like it works with non-UTF8 JSON -- you might have compatibility problems unless you implement manual decoding. I've written a binding to YAJL (a C-based JSON parser) which might be faster for you, if the input is very large -- though it still suffers from the assume UTF8 problem. http://hackage.haskell.org/package/yajl Is Data.Text an alternative? Can I use that everywhere, including for dealing with sockets (the API only mentions Handle). Use 'Network.Socket.socketToHandle' to convert sockets to handles: http://hackage.haskell.org/packages/archive/network/2.2.1.7/doc/html/Network-Socket.html#v%3AsocketToHandle
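A sketch of that conversion, using the network API of that era (the port number and handle settings here are invented for the example):

```haskell
import Network.Socket
import System.IO

-- accept one connection, then treat the socket as an ordinary Handle
main :: IO ()
main = do
    sock <- socket AF_INET Stream defaultProtocol
    bindSocket sock (SockAddrInet 8080 iNADDR_ANY)
    listen sock 5
    (conn, _) <- accept sock
    h <- socketToHandle conn ReadWriteMode
    hSetBuffering h NoBuffering
    hPutStr h "hello\n"
    hClose h
```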
Re: [Haskell-cafe] ANNOUNCE: Haddock version 2.8.0
On Fri, Sep 3, 2010 at 20:39, Albert Y. C. Lai tre...@vex.net wrote: In theory, what does file extension matter? Media type is the dictator. The normative Section 5.1 permits the choice of application/xhtml+xml or text/html. While the latter entails extra requirements in the informative Appendix C, as far as I can see (after all IDs are repaired) they are all met. In a cunning combination of theory and practice in our reality, the file extension .html implies the media type text/html unless the server specifies otherwise. But since text/html is allowed in theory, so is .html allowed in practice. Indeed, Internet Explorer plays along just fine with text/html; it stops only when you claim application/xhtml+xml. For example http://www.vex.net/~trebla/xhtml10.html works. This is a correct use of xhtml 1.0, and I fully endorse it. It's not correct. Here's the exact same XHTML document (verify by viewing the source), served with different mimetypes: http://ianen.org/temp/inline-svg.html http://ianen.org/temp/inline-svg.xhtml Notice that the version served as HTML does not render properly. This is because the browser is treating it as HTML with an unknown doctype, not as XHTML. I'm not debating that it's *possible* to serve HTML with an XHTML mimetype and still see something rendered to the screen. Hundreds of thousands of sites do so every day. But to call this XHTML is absurd.
Re: [Haskell-cafe] ANNOUNCE: Haddock version 2.8.0
On Thu, Sep 2, 2010 at 21:14, Mark Lentczner ma...@glyphic.com wrote: I choose to switch Haddock's output from HTML to XHTML mostly because I have found the consistency of rendering cross-browser to be greater and easier to achieve with XHTML. I'm not alone in this opinion: Many well respected web design authorities have promoted, and publish their own sites, in XHTML.[1] Even Microsoft's own developer web site uses it[2]! You're not generating anything browsers will parse as XHTML, so I find it unlikely that attaching the correct doctype will cause problems. I am extremely skeptical that using an HTML4 doctype will render incorrectly when an unrecognized doctype works as expected across all browsers. [1] See, for example: http://www.alistapart.com/ http://www.csszengarden.com/ http://www.quirksmode.org/ http://happycog.com/ http://www.w3.org/ all of which are published as XHTML [2] See: http://msdn.microsoft.com/en-us/default.aspx Browsers treat any data sent using the text/html MIME-type as HTML. Those pages are being served as HTML, so browsers treat them as HTML with an unknown doctype. In particular, CSS and JS behavior on these sites will be that of HTML, *not* XHTML. Firefox will show you how the page is being rendered (HTML or XHTML) in the page info dialog. I don't know of any similar feature in Chrome. [5] I can't find any evidence for your assertion that Internet Explorer doesn't support XHTML, or the way Haddock names the files (and hence URLs). IE versions 8 and below (I've not tested IE9) will not render XHTML -- they pop up a save as dialog box. You're welcome to verify this by opening an XHTML page (such as http://ianen.org/haskell/dbus/ ) in IE. You may be confused, because the pages you mentioned earlier *are* rendering in IE. However, they are not being rendered as XHTML -- again, browsers are rendering them as HTML with an unrecognized doctype. 
Haddock is generating files with an .html extension, which causes webservers to serve it using text/html, the incorrect MIME-type. The correct extension for XHTML content is .xhtml. For some reason, it is common to use XHTML doctypes in HTML documents -- I assume because people think the X makes it more modern. However, this is incorrect behavior. It is better to serve a page correctly as HTML4 than incorrectly as tag soup.
Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support
Is there a summary of the API changes available? I see a new module, but Precis is choking on Data.Text and Data.Text.Lazy, so I'm not sure what existing signatures have been modified. Don't forget, you can always improve the text library yourself. I love to receive patches, requests for improvement, and bug reports. Are there any areas in particular you'd like help with, for either library? I'm happy to assist any effort which will help reduce use of String. [aside] does anybody know how to get a list of what packages somebody's uploaded to Hackage? I think I've updated all mine for the new text version dependency, but I'm worried I forgot some.
Re: [Haskell-cafe] Quick Question for QuickCheck2
Define a custom character generator, which produces characters with your desired values:

    myRange :: Gen Char
    myRange = elements (['A'..'Z'] ++ ['a'..'z'] ++ "~...@#$%^*()")

You can use forAll to run tests with a specific generator:

    forAll myRange $ \c -> chr (ord c) == c

On Mon, Aug 30, 2010 at 08:12, Sebastian Höhn sebastian.ho...@googlemail.com wrote: Hello, perhaps I am just blind or is it a difficult issue: I would like to generate Char values in a given range for QuickCheck2. There is this simple example from the Haskell book:

    instance Arbitrary Char where
        arbitrary = elements (['A'..'Z'] ++ ['a'..'z'] ++ "~...@#$%^*()")

This does not work in QuickCheck2 since the instance is already defined. How do I achieve this behaviour in QC2? Thanks for helping. Sebastian
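Since QuickCheck 2 already defines an Arbitrary instance for Char, the other common workaround besides forAll is a newtype wrapper (a sketch; the name MyChar is invented for this example):

```haskell
import Test.QuickCheck

newtype MyChar = MyChar Char
    deriving (Show)

-- restrict generation to the desired range without clashing with
-- the built-in Arbitrary Char instance
instance Arbitrary MyChar where
    arbitrary = fmap MyChar (elements (['A'..'Z'] ++ ['a'..'z']))

prop_inRange :: MyChar -> Bool
prop_inRange (MyChar c) = c `elem` (['A'..'Z'] ++ ['a'..'z'])
```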
Re: [Haskell-cafe] Quick Question for QuickCheck2
Update your cabal package list, and then install QuickCheck. Optionally, you can use a version specifier:

    cabal update
    cabal install 'QuickCheck >= 2'

This should make QuickCheck 2 the default in GHCi. If it doesn't, you may need to specify the version:

    ghci -package QuickCheck-2.2

For Cabal-packaged libraries/applications, simply update your version requirements. On Mon, Aug 30, 2010 at 09:06, Lyndon Maydwell maydw...@gmail.com wrote: I'm just trying these examples, and I can't figure out how to import quickcheck2 rather than quickcheck1. I've looked around but I can't seem to find any information on this. How do I do it? Thanks!
Re: [Haskell-cafe] Re: Hackage on Linux
On Thu, Aug 26, 2010 at 20:51, Richard O'Keefe o...@cs.otago.ac.nz wrote: Maybe Linux is different. One thing is NOT different, and that is Linux upgrades *DO* reliably break programs that use dynamic linking. Dynamic libraries get - left out - changed incompatibly - moved some place else - changed compatibly but the version number altered so the dynamic linker doesn't believe it, or the foolsXXkind people who built the program wired in a demand for a particular version. Indeed, every Linux upgrade I've had, I've found myself screaming in frustration because programs *weren't* statically linked. Upgrading Linux should never, ever cause applications to stop working unless they were designed incorrectly in the first place. Low-level system libraries like glibc are the only code which needs to access Linux directly. However, most of the problems you mentioned (removed/modified dynamic libraries) are not part of Linux at all. If your distribution has poor quality control, you should consider switching to a better one -- I've heard good news about both Debian and RHEL in this area. Desktop-oriented distributions, such as Ubuntu or Fedora, are not suitable for long-term (> 6 years or so) installations. Haskell, of course, takes ABI pickiness to an absolute maximum. One of my most wished-for features is a way to provide C-style stable ABIs for Haskell shared libraries, so I could (for example) upgrade a support library and have every installed application pick it up.
Re: [Haskell-cafe] Fwd: Semantics of iteratees, enumerators, enumeratees?
On Wed, Aug 25, 2010 at 01:33, John Lato jwl...@gmail.com wrote: Is this really true? Consider iteratees that don't have a sensible default value (e.g. head) and an empty stream. You could argue that they should really return a Maybe, but then they wouldn't be divergent in other formulations either. Although I do find it interesting that EOF is no longer part of the stream at all. That may open up some possibilities.

Divergent iteratees, using the current libraries, will simply throw an exception like "enumEOF: divergent iteratee". There's no way to get useful values out of them. Disallowing returning Continue when given an EOF prevents this invalid state.

Also, I found this confusing because you're using Result as a data constructor for the Step type, but also as a separate type constructor. I expect this could lead to very confusing error messages (What do you mean 'Result b a' doesn't have type 'Result'?)

Oh, sorry, those constructors should be something like this (the system on which I wrote that email has no Haskell compiler, so I couldn't verify types before sending):

    data Step a b
        = Continue (a -> Step a b) (Result a b)
        | GotResult (Result a b)

The goal is to let the iteratee signal three states:

* Can accept more input, but terminating the stream now is acceptable
* Requires more input, and terminating the stream now is an error
* Cannot accept more input

I find this unclear as well, because you've unpacked the continue parameter but not the eof. I would prefer to see this as:

    type Enumerator a b = (a -> Step a b) -> Result a b -> Step a b

However, is it useful to do so? That is, would there ever be a case where you would want to use branches from separate iteratees? If not, then why bother unpacking instead of just using

    type Enumerator a b = Step a b -> Step a b

When an enumerator terminates, it needs to pass control to the next enumerator (the final enumerator is enumEOF).
Thus, the second step parameter is actually the next enumerator to run in the chain (aka the calling enumerator).
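The three-state design discussed in this thread can be made concrete. Below is a compilable sketch using the constructors from the email (Continue/GotResult); the driver runList and the example sumStep are illustrative names of my own, not part of any released library.

```haskell
-- Sketch of the three-state Step type from this thread. Continue carries
-- both a continuation (for more input) and a Result (used if the stream
-- ends here); GotResult means no more input can be accepted.
data Result a b = Yield b [a] | Error String deriving (Eq, Show)

data Step a b
    = Continue (a -> Step a b) (Result a b)
    | GotResult (Result a b)

-- Feed a list of inputs to a step, then signal EOF by taking the step's
-- Result branch. Note there is no way for a step to demand more input
-- at EOF: the divergent-iteratee state is unrepresentable.
runList :: [a] -> Step a b -> Result a b
runList _      (GotResult r)  = r
runList []     (Continue _ r) = r
runList (x:xs) (Continue k _) = runList xs (k x)

-- Example: sum all inputs; stopping at any point is acceptable,
-- so the EOF branch yields the running total.
sumStep :: Int -> Step Int Int
sumStep acc = Continue (\x -> sumStep (acc + x)) (Yield acc [])
```

With this encoding, runList [1,2,3] (sumStep 0) evaluates to Yield 6 [], and an empty stream is not an error for sumStep because its Continue constructor supplies an EOF result.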
Re: [Haskell-cafe] Low level problem with Data.Text.IO
For debugging the error, we'll need to know what your locale's encoding is. You can see this by echoing the $LANG environment variable. For example:

    $ echo $LANG
    en_US.UTF-8

means my encoding is UTF-8. Haskell doesn't currently have any decoding libraries with good error handling (that I know of), so you might need to use an external library or program. My preference is Python, since it has very descriptive errors. I'll load a file, attempt to decode it with my locale encoding, and then see what errors pop up:

    $ python
    >>> content = open("testfile", "rb").read()
    >>> text = content.decode("utf-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'utf8' codec can't decode byte 0x9d in position 1: unexpected code byte

The exact error will help us generate a test file to reproduce the problem. If you don't see any error, then the bug will be more difficult to track down. Compile your program into a binary (ghc --make ReadFiles.hs) and then run it with gdb, setting a breakpoint in the malloc_error_break procedure:

    $ ghc --make ReadFiles.hs
    $ gdb ./ReadFiles
    (gdb) break malloc_error_break
    (gdb) run testfile
    ... program runs ...
    BREAKPOINT
    (gdb) bt
    <stack trace here; copy and paste it into an email for us>

The stack trace might help narrow down where the memory corruption is occurring.

-

If you don't care much about debugging, and just want to read the file: the first step is to figure out what encoding the file's in. Data.Text.IO is intended for decoding files in the system's local encoding (typically UTF-8), not general-purpose "this file has letters in it" IO. Web browsers are pretty good at auto-detecting encodings. For example, if you load the file into Firefox and then look at the (View -> Character Encoding) menu, which option is selected? Next, you'll need to read the file in as bytes and then decode it. Use Data.ByteString.hGetContents to read it in.
If it's encoded in one of the common UTF encodings (UTF-8, UTF-16, UTF-32), then you can use the functions in Data.Text.Encoding to convert from the file's bytes to text. If it's an unusual encoding (windows-1250, shift_jis, gbk, etc) then you'll need a decoding library like text-icu. Create the proper decoder, feed in the bytes, receive text. If all else fails, you can use this function to decode the file as iso8859-1, but it'll be too slow to use on any file larger than a few dozen megabytes. Furthermore, it will likely cause any special characters in the file to become corrupted.

    import Data.ByteString.Char8 as B8
    import Data.Text as T

    iso8859_1 :: ByteString -> Text
    iso8859_1 = T.pack . B8.unpack

If any corruption occurs, please reply with *what* characters were corrupted; this might help us reproduce the error.
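For the common UTF-8 case, newer versions of the text package expose decodeUtf8', which returns an Either instead of throwing when it hits a bad byte. A hedged sketch of the read-bytes-then-decode approach described above (readUtf8File is an illustrative name, not a library function):

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Read a file as raw bytes, then decode explicitly, so a stray byte
-- produces a Left we can report instead of a runtime exception.
readUtf8File :: FilePath -> IO (Either String T.Text)
readUtf8File path = do
    bytes <- B.readFile path          -- no decoding happens here
    return $ case TE.decodeUtf8' bytes of
        Left err   -> Left (show err) -- names the byte that failed
        Right text -> Right text
```

This keeps the "what encoding is this file actually in" question explicit, instead of letting Data.Text.IO guess from the locale.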
Re: [Haskell-cafe] How to generate dependend values with QuickCheck
You're generating two random values, where you probably want to just generate one and then calculate the second from it. Try this:

    instance Arbitrary QCExample where
        arbitrary = do
            i1 <- arbitrary
            return (QCExample i1 (i1 * 2))

2010/8/25 Jürgen Nicklisch-Franken j...@arcor.de: I want to generate values, so that I have some arbitrary object, which has a certain set of signals, and each set of signals has a certain set of sensor value pairs, where the values are arbitrary. However, to demonstrate my problem I have an easy example, where I want to generate an instance of

    data QCExample = QCExample Int Int deriving Show

and the second value should always be the double of the first. So I tried this:

    instance Arbitrary QCExample where
        arbitrary =
            let i1 = arbitrary
                i2 = fmap (* 2) i1
            in liftM2 QCExample i1 i2

but showing some of the generated test cases in ghci does not give me what I expected:

    > let gen :: Gen QCExample = arbitrary
    > Test.QuickCheck.Gen.sample gen
    QCExample (-2) 0
    QCExample (-4) (-6)
    QCExample 3 30
    ...

I know that I can filter, but this would be too inefficient. Jürgen
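To see why computing the second field from the first makes the invariant hold by construction, here is a self-contained sketch. The Gen below is *not* QuickCheck's Gen, just a toy deterministic stand-in (a function from an Int seed), so the pattern can be demonstrated without any library; all names here are illustrative.

```haskell
-- A toy stand-in for QuickCheck's Gen: a value computed from a seed.
newtype Gen a = Gen { runGen :: Int -> a }

instance Functor Gen where
    fmap f (Gen g) = Gen (f . g)

instance Applicative Gen where
    pure x = Gen (const x)
    Gen f <*> Gen g = Gen (\s -> f s (g s))

instance Monad Gen where
    Gen g >>= f = Gen (\s -> runGen (f (g s)) (s + 1))

-- A crude pseudo-random Int; enough for illustration.
arbitraryInt :: Gen Int
arbitraryInt = Gen (\s -> (s * 2654435761) `mod` 201 - 100)

data QCExample = QCExample Int Int deriving Show

-- Generate one value, then *calculate* the second from it. Because i2
-- is derived (not independently generated), the invariant cannot break.
genExample :: Gen QCExample
genExample = do
    i1 <- arbitraryInt
    return (QCExample i1 (i1 * 2))
```

The broken version in the original mail binds i2 to a *fresh* run of the generator, so the two fields come from different random draws; deriving the second field inside the same do-block removes that second draw entirely.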
Re: [Haskell-cafe] Re: ANNOUNCE: enumerator, an alternative iteratee package
On Tue, Aug 24, 2010 at 05:12, John Lato jwl...@gmail.com wrote: Oleg included the error state to enable short-circuiting of computation, and I guess everyone just left it in. Recently I've been wondering if it should be removed, though, in favor of requiring explicit (i.e. explicit in the type sig) exceptions for everything. I'm not sure if that would be more or less complicated.

The first few iterations of the enumerator package used explicit error types, but this made it very difficult to combine enumerator-based libraries. The only way I could get everything to work correctly was to always set the error type to SomeException, which is equivalent to (and more verbose than) making SomeException the only supported type. I believe relying on the MonadError class would have the same drawback, since (as far as I know) it's not possible to create MonadError instances which can accept multiple error types.
Re: [Haskell-cafe] Fwd: Semantics of iteratees, enumerators, enumeratees?
Here's my (uneducated, half-baked) two cents: There's really no need for an Iteratee type at all, aside from the utility of defining Functor/Monad/etc instances for it. The core type is Step, which one can define (ignoring errors) as:

    data Step a b = Continue (a -> Step a b) | Yield b [a]

Input chunking is simply an implementation detail, but it's important that the Yield case be allowed to contain (>= 0) inputs. This allows steps to consume multiple values before deciding what to generate. In this representation, enumerators are functions from a Continue to a Step:

    type Enumerator a b = (a -> Step a b) -> Step a b

I'll leave off discussion of enumeratees, since they're just a specialised type of enumerator.

-

Things become a bit more complicated when error handling is added. Specifically, steps must have some response to EOF:

    data Step a b = Continue (a -> Step a b) (Result a b) | Result a b
    data Result a b = Yield b [a] | Error String

In this representation, Continue has two branches: one for receiving more data, and another to be returned if there is no more input. This avoids the divergent iteratee problem, since it's not possible for Continue to be returned in response to EOF. Enumerators are similarly modified, except they are allowed to return Continue when their inner data source runs out. Therefore, both the continue and eof parameters are Steps:

    type Enumerator a b = (a -> Step a b) -> Step a b -> Step a b

-

Finally, support for monads is added. I don't know if denotational semantics typically considers monads, but I feel they're important when discussing enumerators/iteratees. After all, the entire point of the iteratee abstraction is to serve as an alternative to lazy IO.
    data Step a m b = Continue (a -> m (Step a m b)) (m (Result a b)) | Result a b
    data Result a b = Yield b [a] | Error String

    type Enumerator m a b = (a -> m (Step a m b)) -> m (Step a m b) -> m (Step a m b)

This is mostly the same as the second representation, except that it makes obvious at which point each value is calculated from the underlying monad. From here, it's trivial to define the Iteratee type, if desired:

    type Iteratee a m b = m (Step a m b)

    data Step a m b = Continue (a -> Iteratee a m b) (m (Result a b)) | Result a b
    data Result a b = Yield b [a] | Error String

    type Enumerator m a b = (a -> Iteratee a m b) -> Iteratee a m b -> Iteratee a m b

-

Note: the data types I've arrived at here are significantly different from those defined by Oleg. Given my relative level of competency with logical programming in general and Haskell in particular, I suspect his definitions are superior in some way I do not understand.
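The first, error-free representation from this mail can be run directly. Here is a small compilable sketch; enumList and sumN are illustrative names of mine (and enumList takes a Step rather than a bare continuation, purely to keep the sketch short), not part of the proposal itself.

```haskell
-- The error-free Step from the mail: either awaiting input, or yielding
-- a result together with any unconsumed inputs.
data Step a b = Continue (a -> Step a b) | Yield b [a]

-- A list-backed "enumerator": feed elements until the step yields or
-- the list is exhausted.
enumList :: [a] -> Step a b -> Step a b
enumList _      s@(Yield _ _) = s
enumList []     s             = s
enumList (x:xs) (Continue k)  = enumList xs (k x)

-- A step that sums its first n inputs and then yields.
sumN :: Int -> Int -> Step Int Int
sumN 0 acc = Yield acc []
sumN n acc = Continue (\x -> sumN (n - 1) (acc + x))
```

For example, enumList [1,2,3,4,5] (sumN 3 0) consumes only the first three elements and then yields 6; the remaining input is simply never fed to the step.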
Re: [Haskell] Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package
After fielding some more questions regarding error handling, it turns out that my earlier mail was in error (hah) -- error handling is much more complicated than I thought. When I gave each iteratee its own error type, I was expecting that each pipeline would have only one or two sources of errors -- for example, a parser, or a file reader. However, in reality, it's likely that every single element in a pipeline can produce an error. For example, in a JSON/XML/etc reformatter (enumFile, parseEvents, formatEvents, iterFile), errors could be SomeException, ParseError, or FormatError. Furthermore, while it's easy to change an iteratee's error type with just (e1 -> e2), changing an enumerator or enumeratee *also* requires (e2 -> e1). In other words, to avoid loss of error information, the two types have to be basically the same thing anyway.

I would like to avoid hard-coding the error type to SomeException, because it forces libraries to use unsafe/unportable language features (dynamic typing and casting). However, given the apparent practical requirement that all iteratees have the same error type, it seems like there's no other choice. So, my questions:

1. Has anybody here successfully created / used / heard of an iteratee implementation with independent error types?
2. Do alternative Haskell implementations (JHC, UHC, Hugs, etc) support DeriveDataTypeable? If not, is there any more portable way to define exceptions?
3. Has anybody actually written any libraries which use the existing enumerator error handling API? I don't mind rewriting my own uploads, since this whole mess is my own damn fault, but I don't want to inconvenience anybody else.
Re: [Haskell] Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package
On Sat, Aug 21, 2010 at 23:14, Paulo Tanimoto ptanim...@gmail.com wrote: One question: enumFile has type

    enumFile :: FilePath -> Enumerator SomeException ByteString IO b

and iterParser has type

    iterParser :: Monad m => Parser a -> Iteratee ParseError ByteString m a

How do we use both together? Something in these lines won't type-check, because the error types are different:

    E.run (E.enumFile file E.$$ (E.iterParser p))

Forgot to mention that -- use the mapError function from enumerator-0.2.1, thusly: http://ianen.org/haskell/enumerator/api-docs/Data-Enumerator.html#v%3AmapError

    parser :: Parser Foo

    toExc :: Show a => a -> E.SomeException
    toExc = E.SomeException . E.ErrorCall . show

    main :: IO ()
    main = do
        run (enumFile "parsetest.txt" $$ mapError toExc $$ iterParser parser) >>= print

You don't have to map to SomeException -- any type will do. For example, in a complex pipeline with real error handling at the other end, you might want a custom error type so you'll know at what stage the error occurred.
Re: [Haskell-cafe] Re: Error in enumerator when using interpreter instead of compiler
It's certainly a bug in iterFile -- I think it'll have to be modified to close the file on EOF, not after returning a continuation. Semi-working in the compiled version is probably just a quirk of the garbage collector and/or OS.
Re: [Haskell-cafe] Re: Error in enumerator when using interpreter instead of compiler
Well, now I know why the iteratee package never defined something like iterFile -- it's not really possible. The only way to open handles within an iteratee prevents exception-safe release. enumerator-0.3 will remove the iterFile functions. iterHandle will remain, to be used as in your second example.
Re: [Haskell] Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package
I think the docs are wrong, or perhaps we're misunderstanding them. Magnus is correct. Attached is a test program which listens on two ports, 42000 (blocking IO) and 42001 (non-blocking). You can use netcat, telnet, etc, to send it data. The behavior is as Magnus describes: bytes from hGetNonBlocking are available immediately, while hGet waits for a full buffer (or EOF) before returning. This behavior obviously makes hGet unsuitable for enumHandle; my apologies for not understanding the problem sooner.

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Monad (forever, unless)
    import Control.Monad.Fix (fix)
    import qualified Data.ByteString as B
    import Network
    import System.IO

    main :: IO ()
    main = do
        blockingSock <- listenOn (PortNumber 42000)
        nonblockingSock <- listenOn (PortNumber 42001)
        forkIO $ acceptLoop B.hGet blockingSock "Blocking"
        forkIO $ acceptLoop nonblockingGet nonblockingSock "Non-blocking"
        forever $ threadDelay 100

    nonblockingGet :: Handle -> Int -> IO B.ByteString
    nonblockingGet h n = do
        hasInput <- catch (hWaitForInput h (-1)) (\_ -> return False)
        if hasInput
            then B.hGetNonBlocking h n
            else return B.empty

    acceptLoop :: (Handle -> Int -> IO B.ByteString) -> Socket -> String -> IO ()
    acceptLoop get sock label = fix $ \loop -> do
        (h, _, _) <- accept sock
        putStrLn $ label ++ ": client connected"
        bytesLoop (get h)
        putStrLn $ label ++ ": EOF"
        loop

    bytesLoop :: (Int -> IO B.ByteString) -> IO ()
    bytesLoop get = fix $ \loop -> do
        bytes <- get 20
        unless (B.null bytes) $ do
            putStrLn $ "bytes = " ++ show bytes
            loop
Re: [Haskell] Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package
On Sat, Aug 21, 2010 at 11:35, Gregory Collins g...@gregorycollins.net wrote: John Millikin jmilli...@gmail.com writes: I think the docs are wrong, or perhaps we're misunderstanding them. Magnus is correct. Attached is a test program which listens on two ports, 42000 (blocking IO) and 42001 (non-blocking). You can use netcat, telnet, etc, to send it data. The behavior is as Magnus describes: bytes from hGetNonBlocking are available immediately, while hGet waits for a full buffer (or EOF) before returning. hSetBuffering handle NoBuffering? The implementation as it is is fine IMO. Disabling buffering doesn't change the behavior -- hGet h 20 still doesn't return until the handle has at least 20 bytes of input available.
Re: [Haskell] Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package
On Sat, Aug 21, 2010 at 11:58, Judah Jacobson judah.jacob...@gmail.com wrote: You should note that in ghc >= 6.12, hWaitForInput tries to decode the next character of input based on the Handle's encoding. As a result, it will block if the next multibyte sequence is incomplete, and it will throw an error if a multibyte sequence gets split between two chunks. I worked around this problem in Haskeline by temporarily setting stdin to BinaryMode; you may want to do something similar. Also, this issue caused a bug in bytestring with ghc-6.12: http://hackage.haskell.org/trac/ghc/ticket/3808 which will be resolved by the new function 'hGetBufSome' (in ghc-6.14) that blocks only when there's no data to read: http://hackage.haskell.org/trac/ghc/ticket/4046 That function might be useful for your package, though not portable to other implementations or older GHC versions.

You should not be reading bytestrings from text-mode handles. The more I think about it, the more I think having a single Handle type for both text and binary data causes problems. There should be some separation so users don't accidentally use a text handle with binary functions, and vice-versa:

    openFile :: FilePath -> IOMode -> IO TextHandle
    openBinaryFile :: FilePath -> IOMode -> IO BinaryHandle
    hGetBuf :: BinaryHandle -> Ptr a -> Int -> IO Int
    Data.ByteString.hGet :: BinaryHandle -> Int -> IO ByteString
    -- etc

then the enumerators would simply require the correct handle type:

    Data.Enumerator.IO.enumHandle :: BinaryHandle -> Enumerator SomeException ByteString IO b
    Data.Enumerator.Text.enumHandle :: TextHandle -> Enumerator SomeException Text IO b

I suppose the enumerators could verify the handle mode and throw an exception if it's incorrect -- at least that way, it will fail consistently rather than only in rare occasions.
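The separation proposed above can be approximated today with newtype wrappers around the ordinary Handle. This is purely a hypothetical sketch -- none of these names exist in base -- showing how the type checker could then reject byte reads on text handles:

```haskell
import qualified Data.ByteString as B
import System.IO

-- Wrapper types: the smart constructors below are the only way in, so
-- byte-oriented functions can demand a BinaryHandle at compile time.
newtype TextHandle   = TextHandle   Handle
newtype BinaryHandle = BinaryHandle Handle

openText :: FilePath -> IOMode -> IO TextHandle
openText path mode = fmap TextHandle (openFile path mode)

openBinary :: FilePath -> IOMode -> IO BinaryHandle
openBinary path mode = do
    h <- openFile path mode
    hSetBinaryMode h True     -- disable newline/encoding translation
    return (BinaryHandle h)

-- Reading raw bytes is now a type error on a TextHandle.
hGetBytes :: BinaryHandle -> Int -> IO B.ByteString
hGetBytes (BinaryHandle h) n = B.hGet h n
```

A stray `hGetBytes someTextHandle 20` would then fail at compile time instead of silently mixing decoded and raw data at runtime.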
Re: [Haskell] Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package
On Sat, Aug 21, 2010 at 12:44, Magnus Therning mag...@therning.org wrote: As an aside, has anyone written the code necessary to convert a parser, such as e.g. attoparsec, into an enumerator-iteratee[1]? This sort of conversion is trivial. For an example, I've uploaded the attoparsec-enumerator package at http://hackage.haskell.org/package/attoparsec-enumerator -- iterParser is about 20 lines, excluding the module header and imports.
Re: [Haskell] Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package
On Sat, Aug 21, 2010 at 14:17, Paulo Tanimoto ptanim...@gmail.com wrote: Cool, but is there a reason it won't work with version 0.2 you just released?

    build-depends: [...] , enumerator >= 0.1 && < 0.2

I noticed that when installing it. Hah ... forgot to save the vim buffer. Corrected version uploaded. Sorry about that.